Reduce bandwidth for installation with git #9603

Closed
kumaraditya303 opened this issue Feb 12, 2021 · 8 comments
Labels: C: build logic, C: vcs, type: enhancement

Comments

@kumaraditya303

What's the problem this feature will solve?

When installing a project from a git URL, pip clones the full repository, including all of its history, which takes a lot of time and bandwidth.

Describe the solution you'd like

When installing a project from a git URL, pip should use a treeless clone or a blobless clone.

Repositories with a large history would benefit a lot.
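For readers unfamiliar with the two clone types, here is a minimal sketch of how the clone command differs in each case. The helper `build_clone_args` and the mode names are hypothetical and are not part of pip; only the `--filter=blob:none` and `--filter=tree:0` options come from git itself.

```python
from typing import Dict, List, Optional

# Map a readable mode name to the value passed to git's --filter option.
# None means a regular full clone (no --filter flag at all).
PARTIAL_CLONE_FILTERS: Dict[str, Optional[str]] = {
    "full": None,
    "blobless": "blob:none",  # all commits and trees, blobs on demand
    "treeless": "tree:0",     # all commits, trees and blobs on demand
}


def build_clone_args(url: str, dest: str, mode: str = "full") -> List[str]:
    """Return the argv for cloning `url` into `dest` (hypothetical helper)."""
    if mode not in PARTIAL_CLONE_FILTERS:
        raise ValueError(f"unknown clone mode: {mode!r}")
    args = ["git", "clone"]
    filter_spec = PARTIAL_CLONE_FILTERS[mode]
    if filter_spec is not None:
        args.append(f"--filter={filter_spec}")
    args += [url, dest]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)`. Building the list separately keeps the choice of filter easy to inspect without touching the network.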

Additional context

https://github.blog/2020-12-22-git-clone-a-data-driven-study-on-cloning-behaviors/

@RonnyPfannschmidt
Contributor

Creating incomplete clones regularly causes issues with build tools that use git metadata.
If this is ever supported, it should be an opt-in in the VCS URL.

@kumaraditya303
Author

creating incomplete clones regularly creates issues with build tools that use git metadata
if ever supported this should be a opt in for the vcs url

Which git metadata is affected?

@uranusjr
Member

There are tools that rely on git tags and commit hashes to infer a package’s version, and a non-full clone may break them. In fact, we already tried some of the methods, and ended up reverting to the current behaviour because every attempt broke something.

I’ll close this as wont-fix. Don’t use VCS URLs if the performance is an issue for you.
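To illustrate why version-inferring tools depend on tags and commit hashes: such tools commonly read the output of `git describe --tags --long`, which has the form `<tag>-<distance>-g<abbreviated hash>`. The sketch below parses that format. The names `Described` and `parse_describe` are hypothetical and do not reproduce any particular tool's code.

```python
import re
from typing import NamedTuple


class Described(NamedTuple):
    tag: str       # nearest reachable tag
    distance: int  # number of commits since that tag
    commit: str    # abbreviated commit hash (without the leading "g")


# The greedy tag group lets tags that themselves contain hyphens parse correctly.
_DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<commit>[0-9a-f]+)$"
)


def parse_describe(output: str) -> Described:
    """Parse the output of `git describe --tags --long` (hypothetical helper)."""
    match = _DESCRIBE_RE.match(output.strip())
    if match is None:
        raise ValueError(f"unexpected describe output: {output!r}")
    return Described(match["tag"], int(match["distance"]), match["commit"])
```

If the clone lacks the tags or the commits between the tag and the checkout, `git describe` has nothing correct to report, which is the kind of breakage described above.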

@uranusjr uranusjr added the resolution: no action When the resolution is to not do anything label Feb 12, 2021
@kumaraditya303
Author

kumaraditya303 commented Feb 12, 2021

You must be referring to clones with limited depth (shallow clones), but there are alternatives, notably the treeless clone, which downloads all the commit history and tags and fetches trees and blobs only on demand.
Ref
@uranusjr

@uranusjr
Member

I would certainly be interested in reading a proof of concept if you think it would work.

@kumaraditya303
Author

kumaraditya303 commented Feb 14, 2021

A solution is proposed in #9607, so please reopen this issue, @uranusjr. Also, issues should be given sufficient time for discussion rather than being closed immediately.

@sbidoul reopened this May 30, 2021
@kumaraditya303
Author

kumaraditya303 commented May 30, 2021

@sbidoul
Regarding your question:

I personally think this is something worth exploring. I had a look again at this and I found this relatively recent article at github.com which gives food for thought.
This article seems to point to some performance limitations with treeless clones and submodules? Perhaps blobless clones would be good compromise?

As per the GitHub article:

git clone --filter=tree:0 creates a treeless clone. These clones download all reachable commits while fetching trees and blobs on-demand. These clones are best for build environments where the repository will be deleted after a single build, but you still need access to commit history.

Treeless clones are best suited to this issue, as pip only has to build a wheel from the checkout.

Another question I have is about minimum git client and server version. It would be interesting to know the minimum requirements to determine if we need to fallback to the full clone in some circumstances?

So far I haven't had any issues with it, but this should be checked; at least GitHub and Bitbucket work fine.
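The fallback raised in the question above could be sketched as follows: parse the client's `git version` output and add a partial-clone filter only when the client is new enough. The threshold `ASSUMED_MIN_GIT` is an assumption chosen for illustration, since the thread does not state a minimum version, and both function names are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed threshold for illustration only; the thread does not state one.
ASSUMED_MIN_GIT: Tuple[int, int] = (2, 17)


def parse_git_version(output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the output of `git version`."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"cannot parse git version from {output!r}")
    return int(match.group(1)), int(match.group(2))


def clone_filter_args(
    version_output: str, minimum: Tuple[int, int] = ASSUMED_MIN_GIT
) -> List[str]:
    """Return extra `git clone` arguments, or [] to fall back to a full clone."""
    if parse_git_version(version_output) >= minimum:
        return ["--filter=blob:none"]
    return []
```

This covers only the client side. Whether a given server honours the filter is a separate question that this sketch does not address.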

@uranusjr added the C: build logic, C: vcs, and type: enhancement labels and removed the resolution: no action label May 30, 2021
@sbidoul
Member

sbidoul commented Aug 15, 2021

This was implemented with --filter=blob:none in #9086, so closing this one.

@sbidoul closed this as completed Aug 15, 2021
@github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2021
4 participants