annexrepo.whereis: Implement batch=True #5533
Codecov Report
@@ Coverage Diff @@
## master #5533 +/- ##
==========================================
- Coverage 90.17% 90.09% -0.09%
==========================================
Files 296 299 +3
Lines 42022 42470 +448
==========================================
+ Hits 37895 38265 +370
- Misses 4127 4205 +78
There's at least one test failure to deal with: https://travis-ci.com/github/datalad/datalad/jobs/493853157#L1780
FWIW, looks good to me ATM changes wise.
When a file doesn't have enough copies, whereis() exits with a non-zero status, and the "success" field in those records is set to false. The "full" output value is the only one that raises an error in this case because it contains an assertion that, if "success" is present in the output, it is true. Drop that assertion for consistent behavior with the other output modes. Note that for invalid options, a CommandError is raised upstream. See 256823e (RF: Maintain non-exception behavior of whereis() and fsck(), 2020-12-08).
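A minimal sketch of the behavior described above, not the code in this PR: records are converted without asserting on "success", so a record with "success" set to false is passed through instead of raising. The function name `records_to_full` and the sample records are hypothetical, and the record layout ("file", "success", "whereis" with "uuid" entries) is an assumption about the JSON output of `git annex whereis`.

```python
def records_to_full(records):
    """Map each file to its whereis details without asserting on "success".

    `records` is assumed to be a list of dicts parsed from the JSON
    output of `git annex whereis`; the field names are assumptions.
    """
    result = {}
    for rec in records:
        # Deliberately no `assert rec.get("success", True)` here: a false
        # "success" (e.g. not enough copies) is reported, not raised.
        result[rec["file"]] = {
            "success": rec.get("success"),
            "remotes": [remote["uuid"] for remote in rec["whereis"]],
        }
    return result


# Hypothetical records: the second one lacks enough copies.
sample = [
    {"file": "a.dat", "success": True, "whereis": [{"uuid": "u1"}, {"uuid": "u2"}]},
    {"file": "b.dat", "success": False, "whereis": [{"uuid": "u1"}]},
]
full = records_to_full(sample)
```

With this shape the caller can inspect `full["b.dat"]["success"]` and decide what to do, which matches the non-exception behavior of the other output modes.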
This assertion is too strict for run-time code given that it throws an exception for unexpected output that isn't necessarily fatal.
_whereis_json_to_dict() retrieves the "whereis" field with dict.get("whereis"). It then iterates over this directly, so if the whereis output from git-annex unexpectedly doesn't have a "whereis" field, a type error would be signaled. Change this to a direct key lookup so that the failure is more informative if the assumption that the "whereis" field is always present turns out to be wrong. (There hasn't been an indication that it is wrong: `for remote in j.get('whereis')` was introduced in 2015 and a type error hasn't yet been reported.) We could also change this to a more permissive .get("whereis", []), though that would likely just hide away an underlying issue. Note that 'git annex whereis' has an empty "whereis" field when a file doesn't have any known locations.
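To illustrate the difference in failure modes, here is a small standalone sketch. The helper names are hypothetical and the record layout is assumed; it only contrasts iterating over `dict.get("whereis")` with a direct key lookup when the field is missing.

```python
def remotes_via_get(record):
    # If "whereis" is missing, .get() returns None and iterating over it
    # fails with a TypeError that does not name the missing field.
    return [remote["uuid"] for remote in record.get("whereis")]


def remotes_via_lookup(record):
    # A direct lookup fails with a KeyError that names the missing key.
    return [remote["uuid"] for remote in record["whereis"]]


def failure_type(func, record):
    """Return the name of the exception raised by func(record), or None."""
    try:
        func(record)
    except Exception as exc:
        return type(exc).__name__
    return None


missing = {"file": "a.dat"}                   # no "whereis" field at all
no_locations = {"file": "a.dat", "whereis": []}  # field present but empty
```

Both helpers return an empty list for `no_locations`, so the stricter lookup only changes behavior in the case that is not expected to occur.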
git-annex-whereis has had a --batch option since 6.20160126. Closes datalad#5457.
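For readers unfamiliar with batch mode, the following is a rough sketch of driving `git annex whereis --json --batch` from Python with a one-shot subprocess. It is not how `AnnexRepo.whereis` implements `batch=True`; all function names are hypothetical. It assumes one output line per input path, and that a blank output line means no record was produced for that path (for example, a file that is not annexed).

```python
import json
import subprocess


def build_batch_input(paths):
    # Batch mode reads one path per line from stdin.
    return "".join(f"{path}\n" for path in paths)


def parse_batch_output(stdout):
    # One output line per input path is assumed; a blank line is mapped
    # to None so results stay aligned with the input order.
    return [json.loads(line) if line.strip() else None
            for line in stdout.splitlines()]


def whereis_batch(paths, cwd="."):
    """Query several paths with a single git-annex process (sketch)."""
    proc = subprocess.run(
        ["git", "annex", "whereis", "--json", "--batch"],
        input=build_batch_input(paths),
        capture_output=True,
        text=True,
        cwd=cwd,
    )
    return parse_batch_output(proc.stdout)
```

The point of batch mode is to avoid starting a new git-annex process for every file; a long-lived process that is fed paths incrementally would serve repeated calls better than this one-shot version.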
Still looks good to me, so let's proceed. Thank you @kyleam!
This series fills in the `batch=True` functionality of `AnnexRepo.whereis`, replacing the noop to-do warning from when the `batch` parameter was introduced in v0.3 (2016).

Note that this PR's base is master, but the branch is on top of maint. That is so that the bug fix in the first commit can be merged to maint (if desired).
Closes #5457.