Skip to content

MNT: Factor out helper function used in core and next from ria functionality #6706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 13, 2022

Conversation

adswa
Copy link
Member

@adswa adswa commented May 25, 2022

create-sibling-ria has in-line code to check for datasets with matching sibling names that datalad-next refactored into a helper function for create_sibling_webdav (datalad/datalad-next#17, #6644).
This PR adds the helper function created in datalad-next to datalad core. I went with datalad/distribution/utils.py as a location, but there might be better places for it - please chime in.
The helper is then used in create-sibling-ria instead of the previous in-line code to make the code more modular. A companion PR in datalad-next may import the helper from datalad core.

Fixes #6644

Changelog

🏠 Internal

  • in-line code of create-sibling-ria has been refactored to an internal helper to check for siblings with particular names across dataset hierarchies in datalad-next, and is reintroduced into core to modularize the code base further

…onality

create-sibling-ria had in-line code to check for datasets with matching sibling names
that datalad-next refactored into a helper function.
This PR adds the helper function created in datalad-next to datalad/distribution/utils
and uses it instead of the previous in-line code to make the code more modular.
A companion PR in datalad-next may import the helper from datalad core.
@adswa adswa added the semver-internal Changes only affect the internal API label May 25, 2022
adswa added a commit to adswa/datalad-next that referenced this pull request May 25, 2022
@codecov
Copy link

codecov bot commented May 25, 2022

Codecov Report

Merging #6706 (56b8e79) into master (c21a31c) will increase coverage by 1.61%.
The diff coverage is 75.86%.

@@            Coverage Diff             @@
##           master    #6706      +/-   ##
==========================================
+ Coverage   88.81%   90.42%   +1.61%     
==========================================
  Files         353      353              
  Lines       45733    45819      +86     
==========================================
+ Hits        40617    41432     +815     
+ Misses       5116     4387     -729     
Impacted Files Coverage Δ
datalad/distributed/create_sibling_ria.py 81.76% <40.00%> (+2.09%) ⬆️
datalad/distribution/utils.py 92.00% <83.33%> (-8.00%) ⬇️
datalad/customremotes/datalad.py 61.64% <0.00%> (-2.74%) ⬇️
datalad/distribution/tests/test_install.py 99.81% <0.00%> (-0.19%) ⬇️
datalad/support/tests/test_annexrepo.py 97.75% <0.00%> (-0.13%) ⬇️
datalad/local/unlock.py 100.00% <0.00%> (ø)
datalad/core/local/diff.py 90.38% <0.00%> (ø)
datalad/local/copy_file.py 93.72% <0.00%> (ø)
datalad/local/export_archive.py 100.00% <0.00%> (ø)
datalad/core/distributed/push.py 92.54% <0.00%> (ø)
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c21a31c...56b8e79. Read the comment docs.

@adswa
Copy link
Member Author

adswa commented May 30, 2022

Can I ask @mih @christian-monch or @mslw for a review of this?

Copy link
Contributor

@mslw mslw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @adswa for missing the notification!

Comparing by eye I can confirm that this is an exact reflection of how it is done in datalad-next's create_sibling_webdav (now tried and tested). Compared to the current create_sibling_ria, it introduces the use of ds.foreach_dataset() instead of looping over ds.siblings - as far as I understand it is an improvement.

I left one small comment on a comment block.

It seems to me that the PR achieves its goal, but I wasn't actively involved in the code in next, so while LGTM, I'd defer approval to those who were @mih @christian-monch

@mslw
Copy link
Contributor

mslw commented Jun 10, 2022

As a side note and more like a question for others (not within the scope of this PR) - the other (much older) function in distribution/utils also starts with an underscore, so I completely agree with _yield_ds_w_matching_siblings. But for functions that both get imported (also across submodules), wouldn't it be more conventional to use no underscore?

@adswa
Copy link
Member Author

adswa commented Jun 10, 2022

As a side note and more like a question for others (not within the scope of this PR) - the other (much older) function in distribution/utils also starts with an underscore, so I completely agree with _yield_ds_w_matching_siblings. But for functions that both get imported (also across submodules), wouldn't it be more conventional to use no underscore?

The more experienced Pythonistas can correct me, but as far as I know functions with underscores signal 'I'm an internal function' to users and don't come with as much of a promise of stability. Changes to such a function do not necessitate a new release or deprecation period; anyone outside who uses them knows not to rely on them in the same way as the exposed API.

@codeclimate
Copy link

codeclimate bot commented Jun 10, 2022

Code Climate has analyzed commit 56b8e79 and detected 1 issue on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 1

View more on Code Climate.

@mslw
Copy link
Contributor

mslw commented Jun 10, 2022

As a side note and more like a question for others (not within the scope of this PR) - the other (much older) function in distribution/utils also starts with an underscore, so I completely agree with _yield_ds_w_matching_siblings. But for functions that both get imported (also across submodules), wouldn't it be more conventional to use no underscore?

The more experienced Pythonistas can correct me, but as far as I know functions with underscores signal 'I'm an internal function' to users and don't come with as much of a promise of stability. Changes to such a function do not necessitate a new release or deprecation period; anyone outside who uses them knows not to rely on them in the same way as the exposed API.

Thanks for this comment! I understood it in a very similar way (internal) but more as in "not meant to be called from outside its class/module, although ofc there's nothing stopping you". I guess your perspective is perfectly complementary (in why it's not meant to be called from outside). FTR I checked, and PEP 8 calls this convention a weak “internal use” indicator with a comment: e.g. from M import * does not import objects whose names start with an underscore (didn't know that!).

But as I said that was a side note or question, not important for the gist of the PR.

Copy link
Member

@bpoldrack bpoldrack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks much, @adswa!

Re _:

Yes, @mslw , technically your understanding is the usual python convention. However, we do use it the way @adswa described as well. I think ideally both notions converge by refactoring accordingly. Now that it is in a central place, the function may be become "public", but I'd delay and see how it may be useful for more 'create-sibling-*` commands and how that may still change it's signature/behavior.

@bpoldrack bpoldrack merged commit 0654290 into datalad:master Jun 13, 2022
@mih mih mentioned this pull request Jul 1, 2022
40 tasks
mih pushed a commit to datalad/datalad-next that referenced this pull request Jul 8, 2022
mih pushed a commit to datalad/datalad-next that referenced this pull request Jul 8, 2022
mih added a commit to datalad/datalad-next that referenced this pull request Jul 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-internal Changes only affect the internal API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import existing-sibling check from datalad-next for create-sibling-ria
3 participants