Skip to content

Conversation

@yarikoptic
Copy link
Member

Initially observed while giving a live demo (kheh kheh -- always the best time
to find bugs) on DANDI datasets such as
http://github.com/dandisets/000029, e.g.:

(git-annex)lena:/tmp/000029[tags/0.210806.2112^0]git
$> datalad get sub-anm36996*
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
[ERROR  ]   download failed: Not Found
Total:   0%|                                                             | 0.00/20.7M [00:12<?, ? Bytes/s]
Get sub-anm369964/sub-anm .. 9964_behavior+ecephys.nwb:  41%|█▋  | 3.11M/7.66M [00:10<00:15, 295k Bytes/s]

which is due to the fact that the first URL in git-annex for the files is
no longer available, so annex falls to the second available URL, but it keeps
repeating the same list error-messages in its --json-error-messages:

$> 'git' 'annex' 'get' '--json' '--json-error-messages' '--json-progress' '-c' 'annex.dotfiles=true' '--' 'sub-anm369962/sub-anm369962_behavior+ecephys.nwb'
{"byte-progress":4989429,"action":{"command":"get","note":"from web...","input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"},"total-size":6644036,"percent-progress":"75.1%"}
{"byte-progress":5546485,"action":{"command":"get","note":"from web...","input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"},"total-size":6644036,"percent-progress":"83.48%"}
...
{"byte-progress":6451701,"action":{"command":"get","note":"from web...","input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"},"total-size":6644036,"percent-progress":"97.11%"}
{"command":"get","note":"from web...\nchecksum...","success":true,"input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"}

Not sure yet if there could be a json record from annex which would have
"action" but without byte-progress reporting, so decided to just condition on
it being not a progress report for logging an error (work @mih is doing might
eventually remove all those ERROR logs anyways). With this patch we would just
get the cleaner

$> datalad get sub-anm369962/sub-anm369962_behavior+ecephys.nwb
get(ok): sub-anm369962/sub-anm369962_behavior+ecephys.nwb (file) [download failed: Not Found]

where this non-consequential/non-informative and possibly confusing to the user
error is reported, but how could we address that (e.g. not report any errors if
command eventually succedeed?) is a separate question.

PS. relevant issues in dandisets, which whenever addressed would eliminate this
problem, and thus would cause this issue to not reproduce using aforementioned
commands:

Initially observed while giving a live demo (kheh kheh -- always the best time
to find bugs) on DANDI datasets such as
http://github.com/dandisets/000029, e.g.:

	(git-annex)lena:/tmp/000029[tags/0.210806.2112^0]git
	$> datalad get sub-anm36996*
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	[ERROR  ]   download failed: Not Found
	Total:   0%|                                                             | 0.00/20.7M [00:12<?, ? Bytes/s]
	Get sub-anm369964/sub-anm .. 9964_behavior+ecephys.nwb:  41%|█▋  | 3.11M/7.66M [00:10<00:15, 295k Bytes/s]

which is due to the fact that the first URL in git-annex for the files is
no longer available, so annex falls to the second available URL, but it keeps
repeating the same list error-messages in its --json-error-messages:

	$> 'git' 'annex' 'get' '--json' '--json-error-messages' '--json-progress' '-c' 'annex.dotfiles=true' '--' 'sub-anm369962/sub-anm369962_behavior+ecephys.nwb'
	{"byte-progress":4989429,"action":{"command":"get","note":"from web...","input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"},"total-size":6644036,"percent-progress":"75.1%"}
	{"byte-progress":5546485,"action":{"command":"get","note":"from web...","input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"},"total-size":6644036,"percent-progress":"83.48%"}
    ...
	{"byte-progress":6451701,"action":{"command":"get","note":"from web...","input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"},"total-size":6644036,"percent-progress":"97.11%"}
	{"command":"get","note":"from web...\nchecksum...","success":true,"input":["sub-anm369962/sub-anm369962_behavior+ecephys.nwb"],"key":"SHA256E-s6644036--d01c43576e3aeceed53a052defa8bbbf15b885146f518acf63d228cee44576ac.nwb","error-messages":["  download failed: Not Found"],"file":"sub-anm369962/sub-anm369962_behavior+ecephys.nwb"}

Not sure yet if there could be a json record from annex which would have
"action" but without byte-progress reporting, so decided to just condition on
it being not a progress report for logging an error (work @mih is doing might
eventually remove all those ERROR logs anyways).  With this patch we would just
get the cleaner

	$> datalad get sub-anm369962/sub-anm369962_behavior+ecephys.nwb
	get(ok): sub-anm369962/sub-anm369962_behavior+ecephys.nwb (file) [download failed: Not Found]

where this non-consequential/non-informative and possibly confusing to the user
error is reported, but how could we address that (e.g. not report any errors if
command eventually succedeed?) is a separate question.

PS. relevant issues in dandisets, which whenever addressed would eliminate this
problem, and thus would cause this issue to not reproduce using aforementioned
commands:

- dandi/dandisets#78
- dandi/dandisets#79
@yarikoptic yarikoptic added the semver-patch Increment the patch version when merged label Aug 26, 2021
@yarikoptic yarikoptic added this to the 0.14.8 milestone Aug 26, 2021
@yarikoptic yarikoptic added the merge-if-ok OP considers this work done, and requests review/merge label Aug 26, 2021
@codecov
Copy link

codecov bot commented Aug 26, 2021

Codecov Report

Merging #5936 (0e18fe8) into maint (dfdd352) will decrease coverage by 34.65%.
The diff coverage is 100.00%.

❗ Current head 0e18fe8 differs from pull request most recent head 0fd7dbf. Consider uploading reports for the commit 0fd7dbf to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##            maint    #5936       +/-   ##
===========================================
- Coverage   90.28%   55.63%   -34.66%     
===========================================
  Files         300      297        -3     
  Lines       42516    42480       -36     
===========================================
- Hits        38387    23632    -14755     
- Misses       4129    18848    +14719     
Impacted Files Coverage Δ
datalad/support/annexrepo.py 78.99% <100.00%> (-12.43%) ⬇️
datalad/tests/test_api.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/tests/test_base.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/tests/test_config.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/ui/tests/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/tests/test__main__.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/tests/test_strings.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/ui/tests/test_base.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/support/tests/utils.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/tests/test_testrepos.py 0.00% <0.00%> (-100.00%) ⬇️
... and 205 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dfdd352...0fd7dbf. Read the comment docs.

@yarikoptic
Copy link
Member Author

travis -- unrelated error, restarted that one run. appveyor -- test_parallel.test_gracefull_death also unrelated (need to let that test case RiP I am afraid -- too flaky)

@yarikoptic
Copy link
Member Author

ok, no objections stated, and it is imho quite straightforward and goes along with the idea of reduced logging, so I will proceed

@yarikoptic yarikoptic merged commit 6fe0ef9 into datalad:maint Sep 10, 2021
@yarikoptic yarikoptic deleted the bf-annex-get-error-reporting branch October 8, 2021 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-if-ok OP considers this work done, and requests review/merge semver-patch Increment the patch version when merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant