Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BF: gitrepo: Revert changes to store branch in .gitmodules #4375

Merged
merged 1 commit into from Apr 3, 2020

Conversation

kyleam
Copy link
Contributor

@kyleam kyleam commented Apr 2, 2020

As of v0.12.0, specifically 2fda83e (ENH: Record the active branch
of a subdataset in the parent, 2019-10-20, gh-3817), we record the
current branch in the parent repository as the value for
submodule.<name>.branch in .gitmodules when saving a new submodule.

There are a few problems with this:

  • The current branch in the parent is recorded. It seems unlikely
    that that was the intent because there is no reason to assume a
    branch with that name exists in the submodule repository or, if it
    does, to assume that the parent and submodule branches are
    necessarily coupled. 2fda83e mentions gh-1424
    (Datalad.recall_state() -> load command) as the motivation,
    which makes it seem likely the current branch in the submodule,
    not the parent, was supposed be recorded.

    This was mentioned in a currently open PR:
    #4275 (comment)

  • Discussion of recording the branch at gh-1424 suggests that the
    idea is to record this information with every save, but 2fda83e
    records it only when the submodule is initially added. 2fda83e
    doesn't say explicitly that its intent was to do it with every
    save (just that the change moves in the direction of gh-1424), but
    still it doesn't seem a particularly useful incremental step to
    have a one-shot record of the current branch at the time the
    submodule is added.

  • It's not clear that submodule.<name>.branch is a good spot to
    record the branch information required by gh-1424.
    submodule.<name>.branch is about which remote branch is used
    when --remote is passed to submodule update. The goal in
    gh-1424 seems to be tracking what the current local branch was
    at the time of a save. Given these different purposes, it seems
    like it'd be a good idea to track this information in a different
    way (perhaps an entry in .gitmodules with a different key).

Given these issues, let's revert the changes from 2fda83e, along
with the changes from the follow-up commit ee50107 (BF: Do not
record adjust branch in submodule config, 2019-10-21).

Fixes #4373.

As of v0.12.0, specifically 2fda83e (ENH: Record the active branch
of a subdataset in the parent, 2019-10-20, dataladgh-3817), we record the
current branch in the _parent_ repository as the value for
`submodule.<name>.branch` in .gitmodules when saving a new submodule.

There are a few problems with this:

  * The current branch in the parent is recorded.  It seems unlikely
    that that was the intent because there is no reason to assume a
    branch with that name exists in the submodule repository or, if it
    does, to assume that the parent and submodule branches are
    necessarily coupled.  2fda83e mentions dataladgh-1424
    (`Datalad.recall_state() -> `load` command) as the motivation,
    which makes it seem likely the current branch in the submodule,
    not the parent, was supposed be recorded.

    This was mentioned in a currently open PR:
    <datalad#4275 (comment)>

  * Discussion of recording the branch at dataladgh-1424 suggests that the
    idea is to record this information with every save, but 2fda83e
    records it only when the submodule is initially added.  2fda83e
    doesn't say explicitly that its intent was to do it with every
    save (just that the change moves in the direction of dataladgh-1424), but
    still it doesn't seem a particularly useful incremental step to
    have a one-shot record of the current branch at the time the
    submodule is added.

  * It's not clear that `submodule.<name>.branch` is a good spot to
    record the branch information required by dataladgh-1424.
    `submodule.<name>.branch` is about which _remote_ branch is used
    when `--remote` is passed to `submodule update`.  The goal in
    dataladgh-1424 seems to be tracking what the current _local_ branch was
    at the time of a save.  Given these different purposes, it seems
    like it'd be a good idea to track this information in a different
    way (perhaps an entry in .gitmodules with a different key).

Given these issues, let's revert the changes from 2fda83e, along
with the changes from the follow-up commit ee50107 (BF: Do not
record adjust branch in submodule config, 2019-10-21).

Fixes datalad#4373.
@codecov
Copy link

@codecov codecov bot commented Apr 2, 2020

Codecov Report

Merging #4375 into maint will increase coverage by 47.08%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##            maint    #4375       +/-   ##
===========================================
+ Coverage   42.53%   89.61%   +47.08%     
===========================================
  Files         275      275               
  Lines       36933    36931        -2     
===========================================
+ Hits        15708    33095    +17387     
+ Misses      21225     3836    -17389     
Impacted Files Coverage Δ
datalad/core/local/tests/test_save.py 96.76% <ø> (+93.03%) ⬆️
datalad/support/gitrepo.py 89.64% <ø> (+19.95%) ⬆️
datalad/core/local/create.py 94.77% <0.00%> (+0.74%) ⬆️
datalad/core/local/tests/test_create.py 100.00% <0.00%> (+0.90%) ⬆️
datalad/ui/progressbars.py 83.10% <0.00%> (+1.35%) ⬆️
datalad/support/external_versions.py 95.62% <0.00%> (+1.45%) ⬆️
datalad/dochelpers.py 87.40% <0.00%> (+1.48%) ⬆️
datalad/support/tests/test_network.py 100.00% <0.00%> (+1.83%) ⬆️
datalad/core/local/status.py 98.03% <0.00%> (+1.96%) ⬆️
datalad/support/tests/test_external_versions.py 91.94% <0.00%> (+2.01%) ⬆️
... and 187 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f9cfd11...8b47be1. Read the comment docs.

@yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented Apr 3, 2020

I think reverting this for now and asap would be the best strategy or we will breed "broken" datasets when someone does use branches. Hopefully #4275 would get finalized and released soon to provide proper handling

@yarikoptic yarikoptic added this to the 0.12.6 milestone Apr 3, 2020
@kyleam kyleam merged commit 9186ff1 into datalad:maint Apr 3, 2020
11 checks passed
@kyleam
Copy link
Contributor Author

@kyleam kyleam commented Apr 3, 2020

@yarikoptic Thanks for weighing in. Given that @mih expressed confusion about this behavior in gh-4275, I'll go ahead and merge this without his explicit blessing.

Hopefully #4275 would get finalized and released soon to provide proper handling

IIRC what is happening there is pretty orthogonal (there just happened to be a drive-by comment left by that PR), so perhaps an attempt at gh-1424 is more likely to provide that handling.

@kyleam kyleam deleted the gitmodules-branch branch Apr 3, 2020
@mih
Copy link
Member

@mih mih commented Jun 6, 2020

FTR: @adswa here is the last point of a chain of developments that maintain the fact that submodule branches are not recorded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants