Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

annexrepo.whereis: Implement batch=True #5533

Merged
merged 4 commits into from Mar 29, 2021

Conversation

kyleam
Copy link
Contributor

@kyleam kyleam commented Mar 25, 2021

This series fills in the batch=True functionality of AnnexRepo.whereis, replacing the noop to-do warning from when the batch parameter was introduced in v0.3 (2016).

Note that this PR's base is master, but the branch is on top of maint. That is so that the bug fix in the first commit can be merged to maint (if desired).

Closes #5457.

@codecov
Copy link

codecov bot commented Mar 25, 2021

Codecov Report

Merging #5533 (417f292) into master (bc80aba) will decrease coverage by 0.08%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5533      +/-   ##
==========================================
- Coverage   90.17%   90.09%   -0.09%     
==========================================
  Files         296      299       +3     
  Lines       42022    42470     +448     
==========================================
+ Hits        37895    38265     +370     
- Misses       4127     4205      +78     
Impacted Files Coverage Δ
...atalad/interface/tests/test_add_archive_content.py 99.21% <100.00%> (-0.01%) ⬇️
datalad/support/annexrepo.py 90.61% <100.00%> (-0.80%) ⬇️
datalad/support/tests/test_annexrepo.py 97.65% <100.00%> (+0.17%) ⬆️
datalad/ui/utils.py 56.25% <0.00%> (-21.88%) ⬇️
datalad/support/archive_utils_patool.py 29.87% <0.00%> (-10.39%) ⬇️
datalad/support/tests/test_path.py 92.50% <0.00%> (-5.00%) ⬇️
datalad/tests/test_archives.py 91.15% <0.00%> (-4.09%) ⬇️
datalad/customremotes/tests/test_ria_utils.py 96.87% <0.00%> (-3.13%) ⬇️
datalad/local/copy_file.py 93.61% <0.00%> (-2.56%) ⬇️
datalad/customremotes/tests/test_datalad.py 97.87% <0.00%> (-2.13%) ⬇️
... and 53 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc80aba...417f292. Read the comment docs.

@kyleam
Copy link
Contributor Author

kyleam commented Mar 25, 2021

There's at least one test failure to deal with: test_add_archive_content.TestAddArchiveOptions.test_add_delete_after_and_drop

https://travis-ci.com/github/datalad/datalad/jobs/493853157#L1780

@kyleam kyleam marked this pull request as draft March 25, 2021 21:40
@yarikoptic
Copy link
Member

FWIW, looks good to me ATM changes wise.

When a file doesn't have enough copies, whereis() exits with a
non-zero status, and the "success" field in those records is set to
false.  The "full" output value is the only one that raises an error
in this case because it contains an assertion that, if "success" is
present in the output, it is true.

Drop that assertion for consistent behavior with the other output
modes.

Note that for invalid options, a CommandError is raised upstream.  See
256823e (RF: Maintain non-exception behavior of whereis() and
fsck(), 2020-12-08).
This assertion is too strict for run-time code given that it throws an
exception for unexpected output that isn't necessarily fatal.
_whereis_json_to_dict() retrieves the "whereis" field with
dict.get("whereis").  It then iterates over this directly, so if the
whereis output from git-annex unexpectedly doesn't have a "whereis"
field, a type error would be signaled.

Change this to a direct key lookup so that the failure is more
informative if the assumption that the "whereis" field is always
present turns out to be wrong.  (There hasn't been an indication that
it is wrong: `for remote in j.get('whereis')` was introduced in 2015
and a type error hasn't yet been reported.)

We could also change this to a more permissive .get("whereis", []),
though that would likely just hide away an underlying issue.  Note
that 'git annex whereis' has an empty "whereis" field when a file
doesn't have any known locations.
git-annex-whereis has had a --batch option since 6.20160126.

Closes datalad#5457.
@kyleam kyleam marked this pull request as ready for review March 26, 2021 13:34
@yarikoptic
Copy link
Member

still looks good to me, so let's proceed. Thank you @kyleam!

@yarikoptic yarikoptic merged commit 3381658 into datalad:master Mar 29, 2021
@kyleam kyleam deleted the whereis-batch branch March 30, 2021 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AnnexRepo.whereis: needs batch=True handling
2 participants