clone: --filter=tree:0 implies fetch.recurseSubmodules=no #797
The partial clone feature has several modes, but only a few are quick for a server to process using reachability bitmaps:

* Blobless: --filter=blob:none downloads all commits and trees and fetches necessary blobs on demand.

* Treeless: --filter=tree:0 downloads all commits and fetches necessary trees and blobs on demand.

The treeless mode is most similar to a shallow clone in total size (it only adds the commit objects for the full history). This makes treeless clones an interesting replacement for shallow clones. A user can run more commands in a treeless clone than in a shallow clone, especially 'git log' (no pathspec). In particular, servers can still serve 'git fetch' requests quickly by calculating the difference between commit wants and haves using bitmaps.

I was testing this feature with this in mind, and I knew that some trees would be downloaded multiple times when checking out a new branch, but I did not expect to discover a significant issue with 'git fetch', at least in repositories with submodules.

I was testing these commands:

    $ git clone --filter=tree:0 --single-branch --branch=master \
      https://github.com/git/git
    $ git -C git fetch origin "+refs/heads/*:refs/remotes/origin/*"

This fetch command started downloading several pack-files of trees before completing the command. I never let it finish since I got so impatient with the repeated downloads. During debugging, I found that the stack triggering promisor_remote_get_direct() was going through fetch_populated_submodules(). Notice that I did not recurse my submodules in the original clone, so the sha1collisiondetection submodule is not initialized. Even so, my 'git fetch' was scanning commits for updates to submodules.

I decided that even if I did populate the submodules, the nature of treeless clones makes me not want to care about the contents of commits other than those that I am explicitly navigating to.

This loop of tree fetches can be avoided by adding --no-recurse-submodules to the 'git fetch' command or setting fetch.recurseSubmodules=no.

To make this as painless as possible for future users of treeless clones, automatically set fetch.recurseSubmodules=no at clone time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
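For anyone who wants to apply the workaround by hand, here is a minimal sketch of both options named above (the persistent fetch.recurseSubmodules=no setting and the one-off --no-recurse-submodules flag). It is not part of the patch. The throwaway directories under "$work" and the tiny local "server" repository are assumptions made only so the script runs without network access; in practice the clone URL would be a real remote such as the one in the commands above.

```shell
#!/bin/sh
# Sketch only: manual workaround in a treeless clone.
# "$work", "server" and "clone" are hypothetical throwaway names.
set -e

work=$(mktemp -d)

# Build a tiny local "server" repository that accepts partial clone filters.
git init -q "$work/server"
git -C "$work/server" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"
git -C "$work/server" config uploadpack.allowFilter true

# Treeless clone: all commits up front, trees and blobs on demand.
git clone -q --filter=tree:0 "file://$work/server" "$work/clone"

# Persistent workaround: stop 'git fetch' from scanning fetched commits
# for submodule updates in this repository.
git -C "$work/clone" config fetch.recurseSubmodules no

# One-off alternative for a single fetch.
git -C "$work/clone" fetch -q --no-recurse-submodules origin

# Show the value that later fetches will pick up.
git -C "$work/clone" config --get fetch.recurseSubmodules
```

The config setting is usually the more convenient of the two, since it also covers fetches started indirectly (for example by 'git pull'), while the flag has to be repeated on every command line.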
/submit
Submitted as pull.797.git.1605904586929.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
On the Git mailing list, Jeff King wrote (reply to this):
On the Git mailing list, Philippe Blain wrote (reply to this):
On the Git mailing list, Derrick Stolee wrote (reply to this):
This branch is now known as |
This patch series was integrated into seen via git@1e15d8c.
On the Git mailing list, Jeff King wrote (reply to this):
This patch series was integrated into seen via git@fc484f3.
I'm going to drop this for now.
While testing different partial clone options, I stumbled across this one. My initial thought was that we were parsing commits and loading their root trees unnecessarily, but I see that doesn't happen after this change.
Here are some recent discussions about using --filter=tree:0:
[1] https://lore.kernel.org/git/aa7b89ee-08aa-7943-6a00-28dcf344426e@syntevo.com/
[2] https://lore.kernel.org/git/cover.1588633810.git.me@ttaylorr.com/
[3] https://lore.kernel.org/git/58274817-7ac6-b6ae-0d10-22485dfe5e0e@syntevo.com/
Thanks,
-Stolee
cc: Jonathan Tan <jonathantanmy@google.com>
cc: Taylor Blau <me@ttaylorr.com>
cc: Jeff King <peff@peff.net>
cc: Philippe Blain <levraiphilippeblain@gmail.com>
cc: Derrick Stolee <stolee@gmail.com>