Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add with nocopy is still requiring path argument #5984

Closed
mitra42 opened this issue Feb 11, 2019 · 18 comments
Closed

add with nocopy is still requiring path argument #5984

mitra42 opened this issue Feb 11, 2019 · 18 comments

Comments

@mitra42
Copy link

mitra42 commented Feb 11, 2019

Version information:

go-ipfs version: 0.4.18-
Repo version: 7
System version: amd64/darwin
Golang version: go1.11.1

Type: bug ?

Description:

URLs like

http://127.0.0.1:5001/api/v0/add?nocopy=true&arg=/Users/mitra/temp/archiveorg/commute/commute.mp4

Are failing with File argument 'path' is required

The docs unfortunately appear to be buggy [https://docs.ipfs.io/reference/api/http/#api-v0-add] says there is a required arg= parameter, but the curl example omits that, so I'm not sure to trust the docs, but it also says. Argument “path” is of file type. This endpoint expects a file in the body of the request as ‘multipart/form-data’. which is unclear, but suggests it needs the contents of the file supplied inline.

Apart from the bug in the docs (missing arg from example, or erroneously specified as a required argument), I think there is a bug where even if the path is supplied to arg=, and nocopy is provided, then it is still expecting content inline.

Its possible I'm misreading something, but there seem to be no examples anywhere surfacable with google of the HTTP API being used to add an url to filestore with nocopy=true

@parkan
Copy link

parkan commented Feb 11, 2019

in the normal (non-filestore) case, the meaning of "argument" is that the file is POSTed in the form body (this can definitely be phrased a lot better)

EDIT: hmm ok yeah the behavior of add doesn't seem consistent with the docs, I'll open an issue on the appropriate repo

I'm not entirely sure how the HTTP API works with filestore, even conceptually, given that you may not have any insight into the filesystem that the daemon operates on

@Stebalien
Copy link
Member

Stebalien commented Feb 11, 2019

The filestore case is the same as the normal add case (send a bunch of files as a multipart mime body) except that, instead of actually copying the data, we insert a filestore reference.

I believe (?) the motivation here is that it allows renaming. That is, the names of the files/directories added to go-ipfs don't need to match those on the filesystem.

@parkan
Copy link

parkan commented Feb 11, 2019

@Stebalien how does the reference know where on the filesystem the actual blocks reside? if we're just posting a multipart form with bytes it doesn't have the file metadata

@parkan
Copy link

parkan commented Feb 11, 2019

ohh I see, you specify the path and send the file separately:

$ echo "filez" > test
$ curl 'http://127.0.0.1:5001/api/v0/add?nocopy=true&arg=test' -F file=@test
{"Name":"test","Hash":"zb2rhmqiJJ7fpnwE1p6F6khKSrmTrVVN2CZBtVB3B4K9Cpdx2","Size":"6"}
$ ipfs cat zb2rhmqiJJ7fpnwE1p6F6khKSrmTrVVN2CZBtVB3B4K9Cpdx2
filez

is that right?

@Stebalien
Copy link
Member

Stebalien commented Feb 11, 2019

Almost. It takes file arguments so these aren't specified in the URL parameters. To specify the path, you actually need to set an Abspath header. See: #4558

I think the right approach is:

 curl 'http://127.0.0.1:5001/api/v0/add?nocopy=true' -F 'file=@test;headers="Abspath: /path/to/test"'

@parkan
Copy link

parkan commented Feb 11, 2019

ahh ok this isn't really well documented -- I'm going to PR the HTTP API repo

@parkan
Copy link

parkan commented Feb 11, 2019

@Stebalien hmm getting this with 0.4.18:

$ curl 'http://127.0.0.1:5001/api/v0/add?nocopy=true' -F 'file=@test;headers="Abspath: /tmp/test"'
Warning: skip unknown form field: headers="Abspath: /tmp/test
{"Name":"test","Hash":"zb2rhh6JScQtJzrxHPfm6e2nB271yKNDPu9P5Hp1RCj1wV6fw","Size":"7"}

EDIT: the correct way to set the header with curl is

curl 'http://127.0.0.1:5001/api/v0/add?nocopy=true' -F 'file=@test' -H 'Abspath: /tmp/test'

EDIT2: but wait, this is a custom header inside the multipart form fragment... I don't know how to set those

@parkan
Copy link

parkan commented Feb 12, 2019

using custom headers inside multipart forms is illegal (https://tools.ietf.org/html/rfc7578#section-4.8), which means this approach will not be supported by HTTP client libraries

EDIT: actually only old client libs, see below!

@hsanjuan
Copy link
Contributor

hsanjuan commented Feb 12, 2019

using custom headers inside multipart forms is illegal (https://tools.ietf.org/html/rfc7578#section-4.8), which means this approach will not be supported by HTTP client libraries

That seems to affect only Content-* headers?

@Stebalien
Copy link
Member

Stebalien commented Feb 12, 2019

ahh ok this isn't really well documented -- I'm going to PR the HTTP API repo

Unfortunately, I'm pretty sure those docs are auto-generated. I'm not sure what the right approach is.

EDIT: the correct way to set the header with curl is

You're missing a " but I also think you may have an old version of curl. The following works for me:

curl 'http://127.0.0.1:5001/api/v0/add?nocopy=true' -F 'file=@/tmp/test;headers="Abspath: /tmp/test'

@magik6k
Copy link
Member

magik6k commented Feb 12, 2019

That seems to affect only Content-* headers?

As noted in #4558 (comment), this is the case:

This restriction only applies to Content- headers, and was already sort-of defined in https://www.ietf.org/rfc/rfc2046.txt:

5.1 [...] The only header fields that have defined meaning for body parts are
   those the names of which begin with "Content-".  All other header
   fields may be ignored in body parts.  Although they should generally
   be retained if at all possible, they may be discarded by gateways if
   necessary.  Such other fields are permitted to appear in body parts
   but must not be depended on.  "X-" fields may be created for
   experimental or private purposes, with the recognition that the
   information they contain may be lost at some gateways.

This also sounds a bit like we should use X-, but other than that it doesn't prohibit this use.

@parkan
Copy link

parkan commented Feb 12, 2019

could you show me the output from your curl with --trace-ascii -?

@magik6k
Copy link
Member

magik6k commented Feb 12, 2019

$ curl 'http://127.0.0.1:5001/api/v0/add?nocopy=true' -F 'file=@test;headers="Abspath: /tmp/test"' --trace-ascii -

== Info:   Trying 127.0.0.1...
== Info: TCP_NODELAY set
== Info: Connected to 127.0.0.1 (127.0.0.1) port 5001 (#0)
=> Send header, 208 bytes (0xd0)
0000: POST /api/v0/add?nocopy=true HTTP/1.1
0027: Host: 127.0.0.1:5001
003d: User-Agent: curl/7.63.0
0056: Accept: */*
0063: Content-Length: 222
0078: Content-Type: multipart/form-data; boundary=--------------------
00b8: ----1718eb6ba03c1a34
00ce: 
=> Send data, 222 bytes (0xde)
0000: --------------------------1718eb6ba03c1a34
002c: Content-Disposition: form-data; name="file"; filename="test"
006a: Content-Type: application/octet-stream
0092: Abspath: /tmp/test
00a6: 
00a8: Hello.
00b0: --------------------------1718eb6ba03c1a34--
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
<= Recv header, 83 bytes (0x53)
0000: Access-Control-Allow-Headers: X-Stream-Output, X-Chunked-Output,
0040:  X-Content-Length
<= Recv header, 84 bytes (0x54)
0000: Access-Control-Expose-Headers: X-Stream-Output, X-Chunked-Output
0040: , X-Content-Length
<= Recv header, 32 bytes (0x20)
0000: Content-Type: application/json
<= Recv header, 28 bytes (0x1c)
0000: Server: go-ipfs/0.4.19-dev
<= Recv header, 25 bytes (0x19)
0000: Trailer: X-Stream-Error
<= Recv header, 14 bytes (0xe)
0000: Vary: Origin
<= Recv header, 21 bytes (0x15)
0000: X-Chunked-Output: 1
<= Recv header, 37 bytes (0x25)
0000: Date: Tue, 12 Feb 2019 01:24:18 GMT
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 92 bytes (0x5c)
0000: 56
0004: {"Name":"test","Hash":"zb2rhdYtXM8X3Jfsm6VrmXnmcSqtfgHZbhYRJ32EN
0044: kmARL78K","Size":"6"}.
{"Name":"test","Hash":"zb2rhdYtXM8X3Jfsm6VrmXnmcSqtfgHZbhYRJ32ENkmARL78K","Size":"6"}
<= Recv data, 5 bytes (0x5)
0000: 0
0003: 
== Info: Connection #0 to host 127.0.0.1 left intact

(There is another valid example in #4558 (comment))

@magik6k
Copy link
Member

magik6k commented Feb 12, 2019

For curl --version:

curl 7.63.0 (x86_64-pc-linux-gnu) libcurl/7.63.0 OpenSSL/1.1.1a zlib/1.2.11 libidn2/2.1.0 libpsl/0.20.2 (+libidn2/2.1.0) libssh2/1.8.0 nghttp2/1.36.0
Release-Date: 2018-12-12
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL

@parkan
Copy link

parkan commented Feb 12, 2019

yes ok it seems that 7.47 (latest that this machine/Ubuntu 16.04 LTS has, sadly) quietly does not support this, sadly I can't find the corresponding fix in curl changelogs

would probably prefer "X-IPFS-FilestorePath" or the like but I guess it's easier this way because it matches the FileInfo property?

@hsanjuan is there a reasonably painless way to add this to the API docs? happy to get it nicely written up if there's a way to add an override

@parkan
Copy link

parkan commented Feb 12, 2019

found it, looks like > 7.57 is what you need for custom form headers

curl/curl@fec7a85

the interesting thing is that this seems to be aimed at MIME mail messages but also happens to work for our use case 😄

they also seem to be in favor of the X-... convention

@mitra42
Copy link
Author

mitra42 commented Feb 12, 2019

And of course, the issue isn't using Curl. Curl is just for testing, the reason for doing this is going to be for example inside Javascript, where its going to be **** hard to construct this weird query in JS, to set an inner custom header on a multipart mime.

I', womdering why the Http-api add call has been constructed to need BOTH the path to the file and the actual contents, i.e. why doesn't:
GET http://localhost:5001/api/v0/add?arg=/foo/bar&nostore=true work ? Since if IPFS can't read /foo/bar then storing the file reference is going to fail anyway.

Note @parkan this is no longer urgent for me though I think its a "design bug", I gave up on filestore (between this and the problem with not liking cross-filesystem paths) and switched to urlstore for what I needed.

@parkan
Copy link

parkan commented Feb 12, 2019

@mitra42 given that filestore is still considered experimental, it makes sense that the interface is patterned after the normal add (which, sensibly, makes no assumptions about the filesystem)

I expect future API versions (like ipfsx) to have a cleaner approach based on these learnings

will close this for now (though I'm still hoping to clear up the existing API docs a bit)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants