Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Git dependencies #283

Merged
merged 6 commits into from
Nov 2, 2023
Merged

Add support for Git dependencies #283

merged 6 commits into from
Nov 2, 2023

Conversation

charliermarsh
Copy link
Member

@charliermarsh charliermarsh commented Nov 2, 2023

Summary

This PR adds support for Git dependencies, like:

flask @ git+https://github.com/pallets/flask.git

Right now, they're only supported in the resolver (and not the installer), since the installer doesn't yet support source distributions at all.

The general approach here is based on Cargo's Git implementation. Specifically, I adapted Cargo's git module to perform the cloning, which is based on libgit2.

As compared to Cargo's implementation, I made the following changes:

  • Removed any unnecessary code.
  • Fixed any Clippy errors for our stricter ruleset.
  • Removed the dependency on curl, in favor of reqwest which we use elsewhere.
  • Removed the ability to use gix. Cargo allows the use of gix as an experimental flag, but it only supports a small subset of the operations. When Cargo fully adopts gix, we should plan to do the same.
  • Removed Cargo's host key checking. We need to re-add this! I'll do it shortly.
  • Removed Cargo's progress bars. We should re-add this too, but we use indicatif and Cargo had their own thing.

There are a few follow-ups to consider:

  • Adding support in the installer.
  • When we lock, we should write out the Git URL that includes the exact SHA. This lets us cache in perpetuity and avoids dependencies changing without re-locking.
  • When we resolve, we should always try to refresh Git dependencies. (Right now, we skip if the wheel was already built.)

I'll work on the latter two in follow-up PRs.

Closes #202.

@charliermarsh charliermarsh force-pushed the charlie/vcs branch 3 times, most recently from bf0a2c8 to 2519334 Compare November 2, 2023 01:31
@@ -0,0 +1,1367 @@
/// Git support is derived from Cargo's implementation.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does the actual Git operations. It's largely derived from Cargo.

.join(short_id.as_str());
db.copy_to(actual_rev, &checkout_path, self.strategy, &self.client)?;

Ok(checkout_path)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cool thing about Cargo's implementation is that they maintain a "database" of repositories. So every repository that you ever check out is included in there. Then, peer to it, they have a directory of checkouts, which are created by checking out the commit in the database and hardlinking from the database to the checkout. So it's fast to switch between commits of the same repository.

@charliermarsh charliermarsh force-pushed the charlie/vcs branch 7 times, most recently from e075b5f to bed2256 Compare November 2, 2023 04:08
@charliermarsh charliermarsh marked this pull request as ready for review November 2, 2023 04:09

debug!("Performing a Git fetch for: {remote_url}");
match strategy {
FetchStrategy::Cli => fetch_with_cli(repo, remote_url, &refspecs, tags),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are so many issues about Cargo not working as expected with SSH dependencies that I'm wondering if we should just make this the default (rust-lang/cargo#2078, rust-lang/cargo#3381, etc.).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This would not remove our dependency on libgit2 since this is just the fetch.)

Copy link
Member

@konstin konstin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the parts copied from cargo.

It's a pity that we copy the code and there's no reusable crate.

crates/puffin-build/src/lib.rs Show resolved Hide resolved
crates/puffin-build/src/lib.rs Outdated Show resolved Hide resolved
crates/puffin-resolver/src/resolver.rs Outdated Show resolved Hide resolved
@charliermarsh
Copy link
Member Author

@konstin - I agree although I'm actually happy with how little code we had to copy in the end. It may not feel that way from the review, but I was able to pare it down a lot. Now, we're really only borrowing the actual Git operations and logic around when to refresh, when to reset, recursively cloning submodules, etc., which at least feels somewhat domain agnostic.

@charliermarsh charliermarsh merged commit 62c474d into main Nov 2, 2023
3 checks passed
@charliermarsh charliermarsh deleted the charlie/vcs branch November 2, 2023 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support VCS dependencies
2 participants