Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_content_{annex,}info: Restrict full report to paths=None #5693

Merged
merged 3 commits into from May 25, 2021

Conversation

kyleam
Copy link
Contributor

@kyleam kyleam commented May 22, 2021

This series makes

get_content_info(paths=[], ...)

different from

get_content_info(paths=None, ...)

That, in turn, makes GitRepo.diff and GitRepo.status match their documented behavior.

See gh-3292 and the second commit message for the details.

diff() converts the keys of an ordered dict into the `paths` argument
for a diffstatus() call and two get_content_annexinfo() calls.  For
the diffstatus() call, it maps any empty dictionary to paths=None,
while for the get_content_annexinfo() calls it maps the empty dict to
an empty .keys().

With the next commit, get_content_annexinfo() will distinguish an
empty list from None, so pass None for these calls too to keep the
current behavior.  Also, pull the argument into a variable to make it
more obvious to the reader that all calls receive the same argument.
GitRepo.get_content_info() and AnnexRepo.get_content_annexinfo()
report on all paths in a tree if the `paths` argument is false-y,
including if it is an empty list.

This behavior requires callers to guard against the empty list case if
they really want to restrict results to the paths in the list.  That
means it's usually not safe to do something like

  get_content_info(paths=[<some filtering list comprehension>], ...)

which was the source of the bug in dataladgh-3288.

This behavior is also problematic because GitRepo.diff()
GitRepo.status() claim to treat None and an empty list differently in
their docstrings.  However, this isn't the case because underneath
they use get_content_info().

Update get_content_info() and get_content_annexinfo() to distinguish
None versus an empty list.  It's a breaking change in behavior.
However, it seems unlikely that many third party callers are using
get_content_{,annex}info and even more unlikely that those that do
rely on an empty list reporting the full tree.  So, risk the breakage
to align diff() and status() with their documented behavior and to
avoid future bugs like dataladgh-3288.

Closes datalad#3292.
@codecov
Copy link

codecov bot commented May 22, 2021

Codecov Report

Merging #5693 (5a27faf) into master (f63ed56) will decrease coverage by 4.18%.
The diff coverage is 90.47%.

❗ Current head 5a27faf differs from pull request most recent head d524cd9. Consider uploading reports for the commit d524cd9 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5693      +/-   ##
==========================================
- Coverage   90.53%   86.34%   -4.19%     
==========================================
  Files         308      305       -3     
  Lines       42036    42037       +1     
==========================================
- Hits        38057    36298    -1759     
- Misses       3979     5739    +1760     
Impacted Files Coverage Δ
datalad/support/tests/test_fileinfo.py 94.07% <87.50%> (-5.93%) ⬇️
datalad/core/local/diff.py 90.72% <100.00%> (+0.09%) ⬆️
datalad/support/annexrepo.py 89.34% <100.00%> (-2.30%) ⬇️
datalad/support/gitrepo.py 91.69% <100.00%> (-0.07%) ⬇️
datalad/plugin/wtf.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/plugin/addurls.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/plugin/no_annex.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/plugin/add_readme.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/plugin/check_dates.py 0.00% <0.00%> (-100.00%) ⬇️
datalad/support/tests/utils.py 0.00% <0.00%> (-100.00%) ⬇️
... and 146 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f63ed56...d524cd9. Read the comment docs.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a cursory review with 1 minor recommendation in the test.

@with_tempfile
def test_status_paths_empty_list(path):
ds = Dataset(path).create()
assert_false(ds.repo.status(paths=[]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_false(ds.repo.status(paths=[]))
assert_equal(ds.repo.status(paths=[]), [])

to make explicit and not wonder if could be None (or is it None?)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an empty OrderedDict. I've updated the test.

Copy link
Member

@mih mih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much appreciated sanity improvement! Thx!

@mih mih merged commit 173b6b0 into datalad:master May 25, 2021
@kyleam kyleam deleted the content-info-none-vs-empty branch May 25, 2021 13:21
@yarikoptic yarikoptic added the semver-minor Increment the minor version when merged label Jun 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-minor Increment the minor version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants