Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gitrepo: Direct files to git with annex.gitaddtoannex #5545

Merged
merged 2 commits into from Apr 7, 2021

Conversation

kyleam
Copy link
Contributor

@kyleam kyleam commented Apr 6, 2021

  • Update GitRepo.add_ to use annex.gitaddtoannex instead of annex.largefiles to direct files to Git. See the message of second commit for the background.

  • Update the associated regression test for a change in git-annex's behavior.

Back when this test was added in ad0158c (BF(v7): gitrepo: Avoid
adding files to annex, 2019-09-23), 'git add' in a v6+ repo would add
a file to the annex unless annex.largefiles instructed otherwise.
With git-annex 7.20191024, 'git add' doesn't add files to the annex
_unless_ annex.largefiles is set.

Configure annex.largefiles for the test repo so that this test remains
an effective regression test for the GitRepo.add_() change in
ad0158c.
As mentioned in d3c9c9e (BF: annexrepo: Use annex-add's
--force-large when available, 2020-03-11), using
`annex.largefiles={anything,nothing}` for one-off calls to send files
into the annex or git is discouraged because it leads to issues like
the one described in be12952 (TST: annexrepo: Ignore racy "is
clean?" failure, 2020-01-02) and the related git-annex issue.

Now that our minimum version is above 7.20191024, use
annex.gitaddtoannex instead.

https://git-annex.branchable.com/bugs/A_case_where_file_tracked_by_git_unexpectedly_becomes_annex_pointer_file/
@codecov
Copy link

codecov bot commented Apr 6, 2021

Codecov Report

Merging #5545 (a9a674c) into master (f2f95b3) will increase coverage by 0.06%.
The diff coverage is 100.00%.

❗ Current head a9a674c differs from pull request most recent head 7527b72. Consider uploading reports for the commit 7527b72 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5545      +/-   ##
==========================================
+ Coverage   88.52%   88.58%   +0.06%     
==========================================
  Files         296      296              
  Lines       42467    42485      +18     
==========================================
+ Hits        37592    37637      +45     
+ Misses       4875     4848      -27     
Impacted Files Coverage Δ
datalad/support/gitrepo.py 91.61% <ø> (ø)
datalad/support/tests/test_gitrepo.py 99.80% <100.00%> (ø)
datalad/customremotes/tests/test_archives.py 89.26% <0.00%> (-0.61%) ⬇️
datalad/__init__.py 80.40% <0.00%> (-0.42%) ⬇️
datalad/plugin/export_to_figshare.py 26.54% <0.00%> (-0.17%) ⬇️
datalad/support/digests.py 100.00% <0.00%> (ø)
datalad/interface/tests/test_ls_webui.py 100.00% <0.00%> (ø)
datalad/interface/add_archive_content.py 90.09% <0.00%> (+0.04%) ⬆️
datalad/support/tests/test_annexrepo.py 97.58% <0.00%> (+0.07%) ⬆️
datalad/customremotes/base.py 82.15% <0.00%> (+0.14%) ⬆️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2f95b3...7527b72. Read the comment docs.

Copy link
Member

@bpoldrack bpoldrack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@yarikoptic
Copy link
Member

Just to make sure I grasp situation correctly: with this change we will avoid possible race condition in git-annex, but we will change behavior of GitRepo.add if repository was not initiated with e.g. -c text2git or some other procedure which would setup the annex.largefiles in .gitattributes. Given that it was only a single dedicated test which behavior has changed, no other changes in typical use cases (through datalad.save) should be effected, and thus it should not come as a surprise in change in behavior to typical users (unless they use GitRepo.add specifically). Correct?

@kyleam
Copy link
Contributor Author

kyleam commented Apr 7, 2021

Thanks, @bpoldrack and @yarikoptic, for taking a look.

@yarikoptic:

we will change behavior of GitRepo.add if repository was not initiated with e.g. -c text2git or some other procedure which would setup the annex.largefiles in .gitattributes.

No change in behavior is intended. Both the old setting (annex.largefiles=nothing) and new one (annex.gitaddtoannex=false) are used to make GitRepo.add_ always send files to Git regardless of whether the repository has git-annex initialized or has largefiles configured.

Given that it was only a single dedicated test which behavior has changed, no other changes in typical use cases (through datalad.save) should be effected

Right, it's a weird state to get into: an annex repo needs to be instantiated as a GitRepo. This isn't relevant in the typical situation where AnnexRepo is used.

This original change came in ad0158c (BF(v7): gitrepo: Avoid adding files to annex, 2019-09-23), part of gh-3648. This issue was discovered when troubleshooting a crawler test failure, though it looks like it ended up not being the reason for it (or at least not the sole reason).

and thus it should not come as a surprise in change in behavior to typical users (unless they use GitRepo.add specifically). Correct?

As mentioned above, this shouldn't change the behavior of GitRepo.add_.

The change in behavior noted in the PR description is about git-annex 7.20191024 changing its effect on git add. That resulted in the test from ad0158c no longer being an effective regression test; test_gitrepo_add_to_git_with_annex_v7 still passes if you drop the -c annex.largefiles=nothing bit added in ad0158c.

Copy link
Member

@mih mih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for this change!

@yarikoptic
Copy link
Member

Thank you @kyleam for the details, and @mih for blessing, let's proceed then!

@yarikoptic yarikoptic merged commit 3fd5c21 into datalad:master Apr 7, 2021
@kyleam kyleam deleted the gitaddtoannex branch April 7, 2021 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants