Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clone supports passing options to git-clone (Replacement of #5662) #6218

merged 7 commits into from
Nov 20, 2021


Copy link

@mih mih commented Nov 19, 2021

This substitutes #5662, because I cannot push to Kyle's fork to resolve conflicts. Please see that PR for extensive information on the motivation for this change. Beyond the original purpose "Support cloning a specified branch or tag" it does:

The 'source' parameter of clone() includes EnsureNone in its
constraints, but downstream code doesn't handle a None value.
GitRepo.clone() calls git-clone within a loop so that it can retry the
same git-clone invocation to work around a known failure.  It defines
a "base command" outside of the loop and then extends that with the
path, url, and to_options(**clone_options) result within the loop.
This outside/within-loop split is potentially confusing to the reader
because the values that are tacked on per-iteration don't change.

Construct the entire command ahead of the loop.
GitRepo.clone() takes clone options as a dictionary (GitPython-style)
and converts them to a list of command line arguments.  Add support
for passing the list of arguments so that datalad-clone can directly
relay options from the command line.
git-clone has several options that are useful to expose, including
--branch for specifying a branch or tag to clone as well as --filter
for partial clones and --sparse for sparse checkouts.  (However, some
options, like --single-branch, don't play well with annex

Follow the approach of datalad-create, and extend datalad-clone to
take arbitrary git-clone options.

Re: datalad#2109
Closes datalad#4035
Re: datalad#5286
The previous commit taught datalad-clone how to pass arbitrary
git-clone options to GitRepo.clone().  For datalad-install, we could
also allow arbitrary git-clone options.  However, given that
datalad-install wraps datalad-get in addition to datalad-clone, it's
not obvious that clone options should take precedence over arbitrary
git-annex-get options.

Instead add a dedicated --branch argument.  Callers that need the more
complicated options can use datalad-clone.  If it becomes clear that
datalad-install should support arbitrary clone options, the main cost
of the conversion is that --branch, like other arbitrary arguments,
would need to be positioned after install's other arguments.
@mih mih changed the title Support passing options to git-clone (Replacement of #5662) clone supports passing options to git-clone (Replacement of #5662) Nov 19, 2021
Copy link

Going through the original discussion, stumbled upon this: #5662 (comment) by @mih

If there is need for homogenization, we could investigate to strip the support for embedded versions from the RIA code. WDYT @bpoldrack ?

There's one potential issue I can see with stripping support for this: By now we store the originally given URL in .gitmodules under datalad-url. This would be the complete RIA URL in such a case, hence stripping support would invalidate existing entries in datasets. That's not necessarily an issue in the sense that we wouldn't know what to clone (git knows the commit of a submodule), but we must not fail on parsing such a RIA URL.

Other than that: LGTM.

@mih mih added the semver-minor Increment the minor version when merged label Nov 19, 2021
Copy link
Member Author

mih commented Nov 19, 2021

This PR does not do stripping, and I have no plans to do such thing without a dedicated issue and discussion. Promised!

Copy link
Member Author

mih commented Nov 19, 2021

How about this (going from #5286 (comment))

What if we execute the proposed

git fetch --depth 1 origin +refs/heads/git-annex:refs/remotes/origin/git-annex

whenever git_clone_opts where provided, but no git-annex branch manifested locally after a clone? This would make it possible to use --depth 1 with datalad clone and make for pretty speedy clones for large datasets.

Would do in a follow-up PR, if it turns out to be useful.


Copy link

codecov bot commented Nov 19, 2021

Codecov Report

Merging #6218 (716813b) into master (13fc6ac) will decrease coverage by 15.05%.
The diff coverage is 65.85%.

❗ Current head 716813b differs from pull request most recent head 6db0f31. Consider uploading reports for the commit 6db0f31 to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##           master    #6218       +/-   ##
- Coverage   89.27%   74.21%   -15.06%     
  Files         320      320               
  Lines       41444    41742      +298     
- Hits        37000    30980     -6020     
- Misses       4444    10762     +6318     
Impacted Files Coverage Δ
datalad/core/distributed/tests/ 0.00% <0.00%> (-97.08%) ⬇️
datalad/distribution/ 98.87% <ø> (ø)
datalad/core/distributed/ 68.35% <75.00%> (-20.84%) ⬇️
datalad/distribution/tests/ 100.00% <100.00%> (+0.21%) ⬆️
datalad/support/ 89.91% <100.00%> (-0.78%) ⬇️
datalad/tests/ 100.00% <100.00%> (ø)
datalad/ 0.00% <0.00%> (-100.00%) ⬇️
datalad/plugin/ 0.00% <0.00%> (-100.00%) ⬇️
datalad/plugin/ 0.00% <0.00%> (-100.00%) ⬇️
datalad/interface/ 0.00% <0.00%> (-100.00%) ⬇️
... and 151 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 13fc6ac...6db0f31. Read the comment docs.

@mih mih merged commit 1fcbbb5 into datalad:master Nov 20, 2021
@mih mih deleted the clone-install-branch branch November 20, 2021 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
semver-minor Increment the minor version when merged
None yet
3 participants