Skip to content

[BF] Prevent double yielding of impossible get result#7093

Merged
yarikoptic merged 6 commits intodatalad:maintfrom
jsheunis:bf/issue-5537
Oct 24, 2022
Merged

[BF] Prevent double yielding of impossible get result#7093
yarikoptic merged 6 commits intodatalad:maintfrom
jsheunis:bf/issue-5537

Conversation

@jsheunis
Copy link
Copy Markdown
Member

@jsheunis jsheunis commented Oct 17, 2022

Fixes #5537.

This PR prevents double yielding of an impossible get result on a path that does not exist by keeping track of impossible get results and skipping additional yields inside _install_targetpath().

The issue reported this problem occurring twice:

  • when called from a superdataset without interim subdatasets in the problematic path installed: this is addressed by changing the bottomup flag to True in the Subdatasets.__call__().
  • when called from a superdataset with interim subdatasets in the problematic path already installed: this addressed by tracking the paths for which _install_targetpath() has already been called, using the new helper function _check_error_reported() and the tracking dict error_reported

These changes were rerun locally using the example provided in the issue, and succeeded to give single impossible results in both cases. Tests all succeeded locally with python -m pytest -s -v --pyargs datalad/core/distributed datalad/distribution datalad/distributed

Changelog

🐛 Bug Fixes

@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 17, 2022

Codecov Report

Base: 76.01% // Head: 90.96% // Increases project coverage by +14.94% 🎉

Coverage data is based on head (19dd9ef) compared to base (dccc798).
Patch coverage: 88.63% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##            maint    #7093       +/-   ##
===========================================
+ Coverage   76.01%   90.96%   +14.94%     
===========================================
  Files         355      355               
  Lines       59101    46458    -12643     
  Branches     6318     6326        +8     
===========================================
- Hits        44925    42259     -2666     
+ Misses      14162     4184     -9978     
- Partials       14       15        +1     
Impacted Files Coverage Δ
datalad/tests/utils_pytest.py 89.91% <69.23%> (-0.22%) ⬇️
datalad/distribution/get.py 96.53% <92.30%> (-0.23%) ⬇️
datalad/core/distributed/tests/test_clone.py 97.59% <100.00%> (+21.28%) ⬆️
datalad/distribution/tests/test_get.py 100.00% <100.00%> (+29.42%) ⬆️
datalad/utils.py 88.10% <100.00%> (+12.09%) ⬆️
datalad/metadata/metadata.py 89.41% <0.00%> (-0.72%) ⬇️
datalad/api.py 92.68% <0.00%> (-0.66%) ⬇️
datalad/support/path.py 95.40% <0.00%> (-0.60%) ⬇️
datalad/interface/utils.py 95.98% <0.00%> (-0.27%) ⬇️
datalad/runner/runner.py 100.00% <0.00%> (ø)
... and 167 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@yarikoptic yarikoptic added semver-patch Increment the patch version when merged CHANGELOG-missing When a PR's description does not contain a changelog item, yet. labels Oct 17, 2022
@github-actions github-actions Bot removed the CHANGELOG-missing When a PR's description does not contain a changelog item, yet. label Oct 17, 2022
Copy link
Copy Markdown
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some suggestions but I think it would warrant a test (wasn't some test triggering it already? just add pdb at return True and see if some test hits it, then add a test there for this fix. If not a single test was hitting it -- I think it is worthwhile adding such a use case if we "value" this fix. Otherwise it would be tricky to see why it was added in retrospective.

@@ -871,7 +891,7 @@ def __call__(
# as is
dataset=dataset,
# always come from the top to get sensible generator behavior
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this change changes this behavior:

  • I wonder why top-down was considered to be more "sensible" (may be smth in commit which added)
  • comment should be adjusted according to the change if this change is needed (seems to be orthogonal to the actual fix described)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up, I missed the comment and will update

Comment thread datalad/distribution/get.py Outdated
Comment thread changelog.d/pr-7093.md Outdated
@jsheunis
Copy link
Copy Markdown
Member Author

jsheunis commented Oct 18, 2022

Thanks for the review @yarikoptic. Responding individually to comments.

left some suggestions but I think it would warrant a test (wasn't some test triggering it already? just add pdb at return True and see if some test hits it, then add a test there for this fix. If not a single test was hitting it -- I think it is worthwhile adding such a use case if we "value" this fix. Otherwise it would be tricky to see why it was added in retrospective.

Indeed, no tests hit that spot, so a good idea to add a test.

Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
Comment thread datalad/distribution/get.py
Comment thread datalad/distribution/get.py Outdated
Comment thread datalad/distribution/get.py Outdated
jsheunis and others added 3 commits October 18, 2022 14:15
- helper function and invocation name change
- update bottomup comment

Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
to make sure `get` does not yield redundant `impossible` results if a
not existing path is given.
Setting this to bottom-up seemed sensible to prevent futile trials of
getting a file several times in the context of dataladgh-7093, but it turned
out to prevent getting deep levels altogether. Since the double
reporting dataladgh-7093 is about, is catched either way - revert that change.
@bpoldrack
Copy link
Copy Markdown
Member

@jsheunis, @yarikoptic: I added a test (fails on maint w/o this PR) and reverted the change to bottomup. Failing test in datalad-deprecated showed, that it was misguided.

@yarikoptic
Copy link
Copy Markdown
Member

hm, travis stalled with

[gw0] [ 98%] PASSED ../datalad/ui/tests/test_dialog.py::test_message_pbar_state_logging_is_demoted 
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received
The build has been terminated

in the first matrix run. For some reason travis refuses to log me in atm to restart, can't waste time on that . Hard to say where it stalled since we run with -n 2 thus in parallel. But it is odd since I do not think we saw such stalls before :-/

@yarikoptic
Copy link
Copy Markdown
Member

I don't see how this fix could relate to stalling, I found many of other stalled runs in the historical logs, didn't investigate, let's proceed!

@yarikoptic yarikoptic merged commit 8ae3539 into datalad:maint Oct 24, 2022
@yarikoptic-gitmate
Copy link
Copy Markdown
Collaborator

PR released in 0.17.8

adswa pushed a commit to adswa/datalad that referenced this pull request Oct 26, 2022
Setting this to bottom-up seemed sensible to prevent futile trials of
getting a file several times in the context of dataladgh-7093, but it turned
out to prevent getting deep levels altogether. Since the double
reporting dataladgh-7093 is about, is catched either way - revert that change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

semver-patch Increment the patch version when merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants