Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NF: AnnexRepo.call_annex*() #5163

Merged
merged 30 commits into from Dec 11, 2020
Merged

NF: AnnexRepo.call_annex*() #5163

merged 30 commits into from Dec 11, 2020

Conversation

mih
Copy link
Member

@mih mih commented Nov 16, 2020

Analog to GitRepo.call_git*() and a requirement for #5152

Issues touched upon:

TODO

  • check that tests pass, mostly adjust for new behavior to raise whenever a command fails, and not hide errors in results BF: make AnnexRepo.add() raise an IncompleteResultsException if any non-success #5048
  • adjust drop command (and maybe helpers) to anticipate command error when calling AnnexRepo.drop(). ATM @mih thinks that calling AnnexRepo._call_annex_records(['drop']) is cleaner than going through the various hoops that AnnexRepo.drop() implies.
  • crosscheck with Investigate call_git*(["annex", ...]) use #4406
  • complete docstrings
  • investigate removal of _annex_custom_command() -> AnnexRepo._annex_custom_command() is being removed datalad-crawler#84
  • remove _annex_custom_command()
  • re the removal of _run_*annex_command*() I can only see a single use outside -core:
    ./ukbiobank/datalad_ukbiobank/update.py:            for rec in repo._run_annex_command_json('drop', **drop_kwargs):
    
    for @mih it is OK to rip out immediately (i.e. keep the respective commits in this PR, and fix ukbiobank in parallel)
  • discuss a public partner API for _call_annex_records()
  • return mechanism for records delivered before command failure. Right now they end up in CommandError.kwargs['stdout_json']
  • discuss which post-error recovery/annotation functionality to shift from _call_annex_records() into _call_annex()
  • should we anticipate a future generator capability of the Runner and make call_annex_records() a call_annex_records_() right away?

@codecov
Copy link

codecov bot commented Nov 17, 2020

Codecov Report

Merging #5163 (3ee22d0) into master (8f078cb) will decrease coverage by 0.27%.
The diff coverage is 94.42%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5163      +/-   ##
==========================================
- Coverage   90.00%   89.72%   -0.28%     
==========================================
  Files         301      301              
  Lines       42426    42413      -13     
==========================================
- Hits        38184    38054     -130     
- Misses       4242     4359     +117     
Impacted Files Coverage Δ
datalad/metadata/aggregate.py 74.48% <0.00%> (ø)
datalad/plugin/addurls.py 95.55% <0.00%> (ø)
datalad/plugin/export_to_figshare.py 26.54% <0.00%> (ø)
datalad/tests/test_direct_mode.py 44.89% <0.00%> (ø)
datalad/customremotes/tests/test_datalad.py 98.07% <50.00%> (-1.93%) ⬇️
datalad/distribution/siblings.py 79.52% <50.00%> (ø)
...atalad/interface/tests/test_add_archive_content.py 99.22% <66.66%> (-0.78%) ⬇️
datalad/support/annexrepo.py 89.53% <96.02%> (-0.02%) ⬇️
datalad/cmd.py 85.94% <100.00%> (-6.33%) ⬇️
datalad/core/distributed/clone.py 89.88% <100.00%> (ø)
... and 36 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8f078cb...f933e1f. Read the comment docs.

@kyleam
Copy link
Contributor

kyleam commented Nov 17, 2020

There are failures in the DATALAD_LOG_LEVEL=2 build. It looks like all of the failures might boil down to
rm_urls use of '--', file_, url. With the log level <= 8, --debug is appended, which git-annex complains about because it's after -- file url. (To my surprise, git-annex seems fine with -c k=v values there).

https://travis-ci.com/github/datalad/datalad/jobs/441808237

@mih
Copy link
Member Author

mih commented Nov 17, 2020

Yes, saw that. Thx! I wanted to be clever, but wasn't. Will revert to the prev. setup.

@kyleam
Copy link
Contributor

kyleam commented Nov 19, 2020

This series looks really nice.

I'm pretty wary about changing the behavior of things like AnnexRepo.drop and AnnexRepo.fsck without adding a compatibility kludge and warning about the upcoming change. Going from the first behavior below to second is a big change:

$ python -c "from datalad.support.annexrepo import AnnexRepo; ar = AnnexRepo('.'); from pprint import pprint; pprint(ar.fsck([]))" 
[{'command': 'fsck',
  'dead': [],
  'error-messages': ['  Only 1 of 3 trustworthy copies exist of one',
                     '  Back it up with git-annex copy.'],
  'file': 'one',
  'input': ['one'],
  'key': 'SHA256E-s4--2c8b08da5ce60398e1f19af0e5dccc744df274b826abe585eaba68c525434806',
  'note': 'checksum...',
  'success': False,
  'untrusted': []}]
$ python -c "from datalad.support.annexrepo import AnnexRepo; ar = AnnexRepo('.'); from pprint import pprint; pprint(ar.fsck([]))" 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/kyle/src/python/datalad/datalad/support/annexrepo.py", line 2708, in fsck
    git_options=git_options,
  File "/home/kyle/src/python/datalad/datalad/support/annexrepo.py", line 1073, in _call_annex_records
    raise e
  File "/home/kyle/src/python/datalad/datalad/support/annexrepo.py", line 999, in _call_annex_records
    **kwargs,
  File "/home/kyle/src/python/datalad/datalad/support/annexrepo.py", line 971, in _call_annex
    **kwargs)
  File "/home/kyle/src/python/datalad/datalad/cmd.py", line 518, in run
    **results,
datalad.support.exceptions.CommandError: CommandError: 'git annex fsck --json --json-error-messages -c annex.dotfiles=true -c annex.retry=3' failed with exitcode 1 under /tmp/po-OUCQPk6 [err: 'git-annex: fsck: 1 failed'] [info keys: stdout_json]

discuss a public partner API for _call_annex_records()

Based on the added callers aside from push outside of annexrepo.py, perhaps we could get be with cmd and files, adding as needed later?

@mih
Copy link
Member Author

mih commented Nov 20, 2020

I'm pretty wary about changing the behavior of things like AnnexRepo.drop and AnnexRepo.fsck without adding a compatibility kludge and warning about the upcoming change.

Yes, I am uncertain about the best approach here, too. But if we take #5048 seriously and exercise it in full, all these methods must raise an exception. Few things that I know about this so far:

  • the way that results up to a failure are provided is suboptimal (CommandError.kwargs['stdout_json'])
  • we should adjust the code base to the new behavior in full, such that any compatibility kludges are not exercised by us

I have not settled on a good place to put such a kludge. I think most error handling that is still scattered around should move into _call_annex(), and it (or _call_annex_records()) should get a private switch that catches the CommandError to maintain previous behavior. We turn it on for 0.14, switch the default to off in 0.15, and remove the kludge in 0.16.

discuss a public partner API for _call_annex_records()

Based on the added callers aside from push outside of annexrepo.py, perhaps we could get be with cmd and files, adding as needed later?

Yes, that matches my POV. Thx for your thoughts!

@kyleam
Copy link
Contributor

kyleam commented Nov 20, 2020

I'm pretty wary about changing the behavior of things like AnnexRepo.drop and AnnexRepo.fsck without adding a compatibility kludge and warning about the upcoming change.

Yes, I am uncertain about the best approach here, too. But if we take #5048 seriously and exercise it in full, all these methods must raise an exception.

If we were starting fresh, I would probably share that view, though I think always raising could be a bit awkard. For example, it seems very likely that a caller of whereis is inspecting the result; making the caller go through an additional try-except dance seems unnecessary. But perhaps it's worth it for consistency.

As for whether it's worth changing the behavior now, I don't know.

Few things that I know about this so far:

  • the way that results up to a failure are provided is suboptimal (CommandError.kwargs['stdout_json'])

Hmm, I don't see a better alternative than attaching them to the exception, so the only improvement that comes ot mind is tweaking CommandErrors constructor so that stdout_json is stored as an attribute.

  • we should adjust the code base to the new behavior in full, such that any compatibility kludges are not exercised by us

I have not settled on a good place to put such a kludge.

Yeah, it's of course easy for me to say that we should, but how to go about it does seem tricky...

I think most error handling that is still scattered around should move into _call_annex(), and it (or _call_annex_records()) should get a private switch that catches the CommandError to maintain previous behavior. We turn it on for 0.14, switch the default to off in 0.15, and remove the kludge in 0.16.

How's the switch flipped in this case? Above you say core code shouldn't use it (and I agree, our code base shouldn't be emitting our own warnings), but if it's just a parameter on one of the _call_annex* methods, fsck would have to use the old code path in order retain its current behavior.

I haven't come up with any workable ideas that wouldn't be essentially adding a parameter to each top-level method that calls _records and phasing that through, perhaps all the way to a noop. That'd be quite ugly, and require updates of all callers in our code base.

I think a real alternative is to say the consistency isn't worth it at this point (in terms of coming up with the approach or making third-party code update their scripts), unless it's carried in separately with something like new core repo classes.

Anyway, that is of course all just my two cents. Thanks for all of your work on this.

datalad/cmd.py Show resolved Hide resolved
@mih
Copy link
Member Author

mih commented Dec 7, 2020

Trying to get back to this project. Pushed a rebase on present master and addressed two of @kyleam's more recent comments.

@mih
Copy link
Member Author

mih commented Dec 7, 2020

We discussed how the repo methods' error behavior should be (relative to the high-level API and considering established usage). However, we failed to document the consensus here or in the meeting notes. Does anyone remember?

@kyleam
Copy link
Contributor

kyleam commented Dec 7, 2020

I think the consensus was something along the lines of "most of these repo methods should raise exceptions and that they don't can be considered a bug; in the few methods where we think inspection of the output is the main goal (perhaps things like whereis and fsck [*]), specific methods can catch and retain the current behavior".

[*] I don't think there was agreement about whether these shouldn't raise.

@mih mih force-pushed the rf-callannex branch 2 times, most recently from f2e90bd to bd7ae00 Compare December 8, 2020 12:07
kyleam added a commit to kyleam/datalad-crawler that referenced this pull request Dec 9, 2020
test_openfmri_pipeline2() fails in datalad/datalad#5163 (NF:
AnnexRepo.call_annex*()) because get() now raises an exception if
the git-annex-get call fails.

https://github.com/datalad/datalad/runs/1517140012
@kyleam
Copy link
Contributor

kyleam commented Dec 9, 2020

Hopefully all of the remaining travis failures are addressed by the last push, and the -crawler failure should be covered by datalad/datalad-crawler#87.

@mih
Copy link
Member Author

mih commented Dec 9, 2020

I love you too!

kyleam added a commit to kyleam/datalad-crawler that referenced this pull request Dec 9, 2020
test_openfmri_pipeline2() fails in datalad/datalad#5163 (NF:
AnnexRepo.call_annex*()) because get() now raises an exception if
the git-annex-get call fails.

https://github.com/datalad/datalad/runs/1517140012
@mih
Copy link
Member Author

mih commented Dec 10, 2020

Re potential call_annex_records() API: Cases that do not fit (cmd, files=None) are:

  • push: needs custom protocol parameterization and stdin
  • get: needs custom protocol parameterization
  • _get_expected_files(): needs merge_annex_branches flag
  • add: needs custom protocol parameterization and jobs parameter
  • addurl: needs progress flag
  • drop: needs jobs
  • info: needs merge_annex_branches
  • fsck: needs git_options
  • copy: needs jobs, progress, and custom protocol parameterization

Given this assessment, I concur with @kyleam that going with (cmd, files=None) as a start is a good thing.

I would not touch jobs now, and until there is a plan for #3415

As a next step it could make sense to formalize the progress reportin request (progress) and the expected amount (total_nbytes) into a single parameter and start supporting that. But that is not cooked well enough right now IMHO.

mih and others added 13 commits December 10, 2020 14:32
And avoid custom calls and other workaround identified in dataladgh-4406.
There wasn't any benefit anymore and this way another normalize_paths
usage can be avoided.

The big question is, whether the implemented approach is a good general
paradigm to approach the desired behavior of having low-level methods
raise (see dataladgh-5048), while higher-level commands must abide to the `on_failure`
instructions.
…nks obsolete

The latter is now candidate for removal (last usage in
_git_custom_command())

Thx @kyleam for the hint!
From a more specific _call_annex_records()
There is no use case, so far, for a need of a different protocol vs.
improving AnnexJsonProtocol.
Blind attempt, this is not tested on my machine, and seamingly untested
in our CIs too.
@mih mih marked this pull request as ready for review December 10, 2020 13:34
@mih
Copy link
Member Author

mih commented Dec 10, 2020

I believe this is ready.

This was referenced Dec 10, 2020
I believe this was addressed a few commits back (RF: Move generic
error detection into _call_annex()).

[ci skip]
Copy link
Contributor

@kyleam kyleam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for the updates. Reading through again (not as carefully), I only spotted two minor docstring/comments issues (push incoming).


RF: Replace last _run_annex_command_json() call

Blind attempt, this is not tested on my machine, and seamingly untested
in our CIs too.

Thanks. This codepath is only triggered with fake dates. (I should add a test case that patches DATALAD_FAKE__DATES.) Anyway, your blind substitutions look good and pass on my machine under DATALAD_FAKE__DATES=1.

kyleam added a commit to kyleam/datalad that referenced this pull request Dec 10, 2020
The --key functionality added in 4591591 (NF: addurls: Support
creating content-less tree from known checksums, 2020-11-25) has two
code paths, batch and non-batch, because batch commands are avoided
when fake dates are enabled [1].  However, as pointed out by @mih in
dataladgh-5163, the non-batch code path is not covered in the CI runs.

Extend test_addurls_from_key() to cover datasets create with
fake_dates=True.

[1] 4d7ada2 (RF: annexrepo: Disable batching when fake dates are
    enabled, 2018-04-13)
@mih
Copy link
Member Author

mih commented Dec 11, 2020

Alrighty. This took a long time, but I think the outcome is quite nice and pleasantly mirrors the GitRepo solution. Thanks for the feedback over the past months, and to @kyleam for his contributions specifically.

@mih mih merged commit 51d7ed0 into datalad:master Dec 11, 2020
@mih mih deleted the rf-callannex branch December 11, 2020 06:16
kyleam added a commit to kyleam/datalad that referenced this pull request Dec 18, 2020
As of 46d78a8 (2020-11-18), AnnexRepo.get() uses
_call_annex_records(), which unlike _run_annex_command_json(), raises
a CommandError when git-annex exits with a non-zero status (dataladgh-5163).
Update `datalad get` to catch the exception, extract the json records,
and yield them as results.

Thanks to @adswa for the test case.

Fixes datalad#5260.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants