Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: add support for nocopy, raw-leaves to add method #159

Merged
merged 5 commits into from
Mar 2, 2019

Conversation

radfish
Copy link
Contributor

@radfish radfish commented Feb 3, 2019

nocopy is accessible form the CLI, so let's make it accessible from API as well.

@AuHau
Copy link
Member

AuHau commented Feb 3, 2019

I believe that the target branch should be py-ipfs-http-client and not master.
The py-ipfs-http-client will soon replace master.

@radfish radfish changed the base branch from master to py-ipfs-http-client February 3, 2019 20:00
@radfish
Copy link
Contributor Author

radfish commented Feb 3, 2019

Ok. Rebased and re-tested.

@ntninja
Copy link
Contributor

ntninja commented Feb 5, 2019

Looks good and I'd love to just merge this. However… Two things:

  • As far as I know, until recently the nocopy parameter did not work correctly when using the API since the API would stream a copy of the local file using HTTP to the daemon which would then blindly add it to its block store – nocopy flag or not. Could you please check that this works with the recent fix in go-ipfs? The linked issues has good “steps to reproduce”, just use the Python API (ipfshttpclient.connect().add("filepath", nocopy=True)), rather then curl for uploading.
  • And while we're at testing: Please also add a test that uses this new parameter to test/functional/files.py. Ideally the test would be similar to what was described in the linked issue but a basic test will do too.

@radfish radfish force-pushed the PR--add-nocopy branch 2 times, most recently from 826799e to 530b8bc Compare February 12, 2019 03:42
@radfish
Copy link
Contributor Author

radfish commented Feb 12, 2019

  1. Confirmed, tested on go-ipfs 77cd41a. I did not notice that blocks were not actually added to the filestore, thanks for pointing it out. This time I checked with filestore ls and dag get, but see notes on testing nocopy below. This PR adds the Abspath header for absolute paths -- see rationale in commit message below.

  2. Added tests for the new arguments. Tests testing error path depend on the error report added to the server in 77cd41a, so perhaps can't be released until go-ipfs is released with that fix.

Note that server is still broken: it does not return an error if the path is invalid, it again silently ignores nocopy and adds to blockstore in this case. Also note that, relative paths are invalid (neither relative to server user home nor relative to working dir of server is works), despite paths in the filestore being recorded relative to the user home. ipfs/kubo#5987

Finally note that adding a file with nocopy whose blocks happen to already be in the blockstore, is going to succeed, but those blocks won't be (also) added to the filestore, they won't be listed as duplicates by filestore dups. So, if you test nocopy but don't see the file in the filestore, that may be normal. (I don't find this behavior intuitive for getting your data into the filestore, so I patched my IPFS filestore/filestore.go Has method to allow duplicate adds to the filestore, but also need to patch the block rm command to support removing from blockstore only, in order to be able to remove the data in duplicated in the blockstore.).

commit 530b8bc99153b5567422f37cfc4ec26b13a6f0c9 (HEAD -> PR--add-nocopy, rf/PR--add-nocopy)
Author: redfish <redfish@galactica.pw>
Date:   Sun Feb 3 13:51:22 2019 -0500

    client: add nocopy, raw-leaves options to add cmd
    
    The nocopy command is for adding files by reference to their
    location in the filesystem. The nocopy implies raw-leaves.
    
    Requires bug fix in go-ipfs 77cd41a (see issue #4558), ver >0.4.18
    (strictly greater).
    
    The nocopy option implies raw-leaves, and the default for raw-leaves
    depends on nocopy. This dynamic default is the behavior of the HTTP API
    tested via curl.
    
    Adding files to the filestore via nocopy requires that the path
    is absolute, not relative. The multipart stream component appens
    the Abspath header when it has the absolute path to the file.
    
    In IPFS daemon used for tests, enable the filestore feature.
    
    The tests could leverage the filestore API, but that part of the API is
    not yet implemented, so leaving parts of the test commented out.

commit db87134d4ba02ee955464055a9eb4f81ff450350
Author: redfish <redfish@galactica.pw>
Date:   Sat Feb 9 20:22:54 2019 -0500

    multipart: add Abspath header when have abs path
    
    This header is required for adding files by reference, i.e.  adding into
    the filestore via the add command with nocopy=True option.
    
    We can't unconditionally add the header, because we don't always have
    the path to the file (e.g. not when the caller of add() passes us a file
    descriptor only). Unconditionally adding a header with a blank value
    does not seem good. We only append the Abspath header if we have a path
    and if it is an absolute path.  Note that Abspath value must be an
    absolute path -- the nocopy option only makes sense with an absolute
    path.
    
    Regardless of this client side decision for when to append the header,
    though, the server should return an error if a relative path or a
    non-existant path is passed in Abspath. At the very least, it should
    return an error nocopy is passed with a relative path or non-existant
    path or a path outside of the daemon's working directory.  Currently,
    the server does NOT return an error, instead it silently ignores nocopy
    and adds the file to the blockstore instead -- this is a bug.
    
    Luckily, the server does return an error if nocopy is passed but Abspath
    header was not set. The client-side logic described above reduces the
    case of nocopy with relative path to nocopy without Abspath header, and
    thus the caller will get an error in all cases.

commit 4db4de7a79cd5ffd9c1215713cebba66bac12ab9
Author: redfish <redfish@galactica.pw>
Date:   Sat Feb 9 23:20:16 2019 -0500

    multipart: fix warning about regexp flags
    
    ./ipfshttpclient/multipart.py:362: DeprecationWarning: Flags not at
    the start of the expression '^.*\\Z(?ms)$'

commit 2cc4d051c6d3c858059178335446b5905f788a4e
Author: redfish <redfish@galactica.pw>
Date:   Sat Feb 9 23:23:16 2019 -0500

    test: unit: multipart: use regexp literals
    
    Use r'' when defining literals that form a regular expression.
    Fixes warning about invalid escape characters, like \S.

@radfish radfish changed the title client: add nocopy argument to add method client: add support for nocopy, raw-leaves to add method Feb 12, 2019
Copy link
Contributor

@ntninja ntninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One tiny thing regarding the tests only: We try to leave the daemon in a similar state after the tests as when it was started, so you need to add the relevant fixtures to your tests.

Thank you for this submission!

test/functional/test_files.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ntninja ntninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… and the CI will of course need to be made happy again. I added some comments about specifics, but you'll also need to:

  • Rebase onto latest py-ipfs-http-client to make timeouts on Python 3.5+ go away (my mistake)
    • GIT commands: git remote add upstream https://github.com/ipfs/py-ipfs-api.git; git fetch upstream py-ipfs-http-client && git rebase upstream/py-ipfs-http-client
  • Fix the code style issues (run tox -e codestyle to see)

Again, thanks for your contribution!

test/unit/test_multipart.py Outdated Show resolved Hide resolved
test/functional/test_files.py Show resolved Hide resolved
Use r'' when defining literals that form a regular expression.
Fixes warning about invalid escape characters, like \S.
./ipfshttpclient/multipart.py:362: DeprecationWarning: Flags not at
the start of the expression '^.*\\Z(?ms)$'
This header is required for adding files by reference, i.e.  adding into
the filestore via the add command with nocopy=True option.

We can't unconditionally add the header, because we don't always have
the path to the file (e.g. not when the caller of add() passes us a file
descriptor only). Unconditionally adding a header with a blank value
does not seem good. We only append the Abspath header if we have a path
and if it is an absolute path.  Note that Abspath value must be an
absolute path -- the nocopy option only makes sense with an absolute
path.

Regardless of this client side decision for when to append the header,
though, the server should return an error if a relative path or a
non-existant path is passed in Abspath. At the very least, it should
return an error nocopy is passed with a relative path or non-existant
path or a path outside of the daemon's working directory.  Currently,
the server does NOT return an error, instead it silently ignores nocopy
and adds the file to the blockstore instead -- this is a bug.

Luckily, the server does return an error if nocopy is passed but Abspath
header was not set. The client-side logic described above reduces the
case of nocopy with relative path to nocopy without Abspath header, and
thus the caller will get an error in all cases.
The nocopy command is for adding files by reference to their
location in the filesystem. The nocopy implies raw-leaves.

Requires bug fix in go-ipfs 77cd41a (see issue #4558), ver >0.4.18
(strictly greater).

The nocopy option implies raw-leaves, and the default for raw-leaves
depends on nocopy. This dynamic default is the behavior of the HTTP API
tested via curl.

Adding files to the filestore via nocopy requires that the path
is absolute, not relative. The multipart stream component appens
the Abspath header when it has the absolute path to the file.

In IPFS daemon used for tests, enable the filestore feature.

The tests could leverage the filestore API, but that part of the API is
not yet implemented, so leaving parts of the test commented out.
Requires go-ipfs patch that is not yet released.
Please revert this commit when the cited commit makes it into the
release and the go-ipfs on buildbots for ipfsapi is updated.
@radfish
Copy link
Contributor Author

radfish commented Mar 2, 2019

Ok, applied all the fixes. Thanks for specific patches.
CI timed out on the stat test.

Also, skipped the test that tests the error path. Please revert the last commit when go-ipfs is updated.

@ntninja
Copy link
Contributor

ntninja commented Mar 2, 2019

@radfish: Thank you! This looks perfect! 😁

@ntninja ntninja merged commit ec70912 into ipfs-shipyard:py-ipfs-http-client Mar 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants