-
Notifications
You must be signed in to change notification settings - Fork 112
Change a key-value pair in drop result record #6625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Replaced error_messages with error_message because the generic results renderer (from datalad.interface.utils) uses error_message. If there are multiple error messages, they will be joined into one multi-line string (as done e.g. by annexjson2result from datalad.interface.results).
@@ -415,7 +415,9 @@ def _drop_dataset(ds, paths, what, reckless, recursive, recursion_limit, jobs): | |||
res['message'] = message | |||
error_messages = r.get('error-messages') | |||
if error_messages: | |||
res['error_messages'] = error_messages | |||
res['error_message'] = '\n'.join( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick comment: I'd suggest +=
, since both error_message
and error_messages
could be present. That may indicate there's a problem in the command implementation, but we'd probably not swallow information in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean by +=
- to include both the original error_messages
list and the error_message
string in the res
dictionary? Or to append the string to res['error_message
]? The latter wouldn't matter, because res
is instantiated with a dict(...)
moments earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to append, but you are right - here it doesn't matter. However, r
could have an error_message
already that we ignore here. That may not currently being the case, but that can change whenever things change at a lower-level function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at the code, indeed the error_message
shouldn't end up being there. And indeed it might ;) if there is an assumption that it is not there:
res['error_message'] = '\n'.join( | |
assert 'error_message' not in res | |
res['error_message'] = '\n'.join( |
to ensure that, so that if code changes, tests (if case is covered) would alert us that assumption is broken
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is esssentially saying that the following code:
d = dict(a=5)
# <imagine 7 lines of code with no reference to `d`>
d['b'] = 7
should become
d = dict(a=5)
# <imagine 7 lines of code with no reference to `d`>
assert 'b' not in d
d['b'] = 7
if this is to become a standard approach, we have to add a few thousand patch to the code base ASAP ;-)
The test expects that when calling drop with allkeys, potential error messages reported by git-annex will be passed on to the drop's result record (using 'error_message' key).
Code Climate has analyzed commit cf2fd32 and detected 0 issues on this pull request. View more on Code Climate. |
Codecov Report
@@ Coverage Diff @@
## master #6625 +/- ##
=======================================
Coverage 91.16% 91.17%
=======================================
Files 353 353
Lines 44438 44491 +53
=======================================
+ Hits 40513 40564 +51
- Misses 3925 3927 +2
Continue to review full report at Codecov.
|
# https://github.com/datalad/datalad/issues/6577 | ||
@with_tempfile | ||
def test_drop_allkeys_result_contains_annex_error_messages(path): | ||
# when calling drop with allkeys, expect git-annex error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually: what=allkeys
is irrelevant from my POV. This may be, where it currently happens, but the change should generally ensure that we get whatever comes from annex regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am kind of fine with this, but why not simply address what the issue suggests? Let generic_result_renderer
also take error_messages
into account? There may be more spots than just drop, where this could happen (now or in the future).
If you take my suggestion instead, then nothing changes in the result records - just in rendering. It can then go against |
@@ -415,7 +415,9 @@ def _drop_dataset(ds, paths, what, reckless, recursive, recursion_limit, jobs): | |||
res['message'] = message | |||
error_messages = r.get('error-messages') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR: I think we would benefit from some AnnexRepo.annex_to_datalad_error
@staticmethod
helper or alike to be reused across all those uses.
uff... venting about my quality time to understand what we are talking about. We would benefit from typing the result record into some @dataclass
and adding type checking... First I have spent time hunting for error-message
(not error_message
) to find none!
I did find "error-messages"
$> git grep -p "[\"']error-messages"
CHANGELOG.md=only supported Python flavor.
CHANGELOG.md: relay information from the "error-messages" field. ([#3931][])
datalad/distributed/drop.py=def _drop_dataset(ds, paths, what, reckless, recursive, recursion_limit, jobs):
datalad/distributed/drop.py: error_messages = r.get('error-messages')
datalad/distributed/tests/test_ora_http.py=def test_read_access(store_path, store_url, ds_path):
datalad/distributed/tests/test_ora_http.py: assert_equal(r["error-messages"], [])
datalad/interface/results.py=def annexjson2result(d, ds, **kwargs):
datalad/interface/results.py: if d.get('error-messages', None):
datalad/interface/results.py: res['error_message'] = '\n'.join(m.strip() for m in d['error-messages'])
datalad/local/copy_file.py=def _place_filekey(finfo, str_src, dest, str_dest, dest_repo):
datalad/local/copy_file.py: m for r in res for m in r.get('error-messages', [])),
datalad/runner/exception.py=def _format_json_error_messages(recs: List[Dict]):
datalad/runner/exception.py: '\n'.join(r.get('error-messages', [])),
datalad/runner/tests/test_exception.py=def get_json_objects(object_count, message_count) -> List[Dict]:
datalad/runner/tests/test_exception.py: "error-messages": [
datalad/support/annex_utils.py=def _fake_json_for_non_existing(paths, cmd):
datalad/support/annex_utils.py: "error-messages": ["File unknown to git"] # Note,
datalad/support/annexrepo.py=class AnnexRepo(GitRepo, RepoInterface):
datalad/support/annexrepo.py: for k in j if k != 'file' and k != 'error-messages'})
datalad/support/annexrepo.py: if j.get('error-messages', None):
datalad/support/annexrepo.py: rec['error_message'] = '\n'.join(m.strip() for m in j['error-messages'])
datalad/support/annexrepo.py: message='\n'.join(r['error-messages'])
datalad/support/annexrepo.py: if 'error-messages' in r else None,
datalad/support/annexrepo.py=class AnnexJsonProtocol(WitlessProtocol):
datalad/support/annexrepo.py: for err_msg in action.pop('error-messages', []):
datalad/support/gitrepo.py=class GitRepo(CoreGitRepo):
datalad/support/gitrepo.py: message='\n'.join(r['error-messages'])
datalad/support/gitrepo.py: if 'error-messages' in r else None,
datalad/support/tests/test_annexrepo.py=def test_annex_drop(src, dst):
datalad/support/tests/test_annexrepo.py: # CommandError has to pull the errors from the JSON record 'error-messages'
datalad/support/tests/test_annexrepo.py=def test_error_reporting(path):
datalad/support/tests/test_annexrepo.py: 'error-messages': ['File unknown to git'],
datalad/support/tests/test_annexrepo.py=def test_save_noperms(path):
datalad/support/tests/test_annexrepo.py: assert_in('permission denied', res[0]['error-messages'][0])
confused, and re-reading spotted the difference in -
vs _
, so indeed there is error_message
but really just few:
$> git grep -p "[\"']error_message[^s]"
datalad/interface/results.py=def get_status_dict(action=None, ds=None, path=None, type=None, logger=None,
datalad/interface/results.py: d['error_message'] = error_message
datalad/interface/results.py=def annexjson2result(d, ds, **kwargs):
datalad/interface/results.py: res['error_message'] = '\n'.join(m.strip() for m in d['error-messages'])
datalad/interface/utils.py=def generic_result_renderer(res):
datalad/interface/utils.py: res['error_message'][0] % res['error_message'][1:]
datalad/interface/utils.py: if isinstance(res['error_message'], tuple) else res[
datalad/interface/utils.py: 'error_message']), ac.RED)
datalad/interface/utils.py: if res.get('error_message', None) and res.get('status', None) != 'ok' else ''))
datalad/support/annexrepo.py=class AnnexRepo(GitRepo, RepoInterface):
datalad/support/annexrepo.py: rec['error_message'] = '\n'.join(m.strip() for m in j['error-messages'])
and only then, looking at above duplications of conversions from error-messages
to error_message
that we are talking about duplicated handling of error-messages
we receive from direct invocations of git-annex (now spread out through code base instead of handled uniformly within AnnexRepo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a closer look! Sorry if I didn't make it clearer about the spelling.
Let me just add that error_messages
(underscore / plural) which started this issue, and which this PR changes into error_message
, does not exist outside drop
at all (below I'm looking at master, but usage in annexrepo has been since removed in maint), therefore I decided to change drop
and not the renderer:
❱ git grep -p "[^_format_json_]error_messages"
datalad/distributed/drop.py=def _drop_dataset(ds, paths, what, reckless, recursive, recursion_limit, jobs):
datalad/distributed/drop.py: error_messages = r.get('error-messages')
datalad/distributed/drop.py: if error_messages:
datalad/distributed/drop.py: res['error_messages'] = error_messages
datalad/support/annexrepo.py=class AnnexRepo(GitRepo, RepoInterface):
datalad/support/annexrepo.py: for k in j if k != 'file' and k != 'error_messages'})
datalad/support/annexrepo.py: # change annex' `error_messages` into singular to match result
I think we would benefit from some AnnexRepo.annex_to_datalad_error @staticmethod helper or alike to be reused across all those uses
There is, in a way, annexjson2result
in datalad/interface/results.py
, which maybe could be reused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approval for the current diff. It fixes a real problem, in a non-intrusive way. Many things could be done better, in many place and new issues should be filed. But here is a fix, it works, it is better than before. Nothing should block it for another two weeks.
Unless stopped, I will merge tomorrow morning CEST |
🚀 PR was released in |
When used with
what=(allkeys/datasets/all)
andreckless != kill
,drop
created a result record in which a list of error messages coming from git-annex was placed in the field namederror_messages
. This led to the generic results renderer (which expectserror_message
and/ormessage
instead) not showing the error message to the user, as described in #6577.This PR rewrites the error messages into a string under
res['error_message']
(if there are multiple messages, they are joined into one multiline string, followingannexjson2result
fromdatalad.interface.results
). With that, results renderer behaves as expected.Before (I'm mocking both note and error-messages produced by git-annex, although in the original issue the note was empty):
and after:
Note. An alternative solution (taking #6577 literally) would be to equip the results renderer with the ability to understand a list of error messages, but I think the solution above is more in line with the rest of the code (
error_messages
is not used anywhere else,error-messages
is, but usually to process git annex output, often converted intoerror_message
as above), and good as an immediate UX fix.Note 2. The added test is fairly specific - it tests directly for the thing I wanted to change (annex's
error-messages
being rewritten to result'serror_message
so that the renderer gets what it wants) but I'm not 100% sure if that's the thing to be tested in the long run.Note 3. Should this PR be against master or maint?
TODO:
_drop_allkeys
/_call_annex_record_items
) to always return an errorpatch
context managerChangelog
🐛 Bug Fixes
drop
not being shown under some circumstances. Fixesgeneric_result_renderer
should not ignoreerror_messages
#6577