Skip to content

Use gix for cargo package #15534

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Byron
Copy link
Member

@Byron Byron commented May 16, 2025

This should also help fixing these spurious "cannot package because some excluded file is untracked" issues.

Tasks

  • step-by-step conversion of vcs.rs
  • use proper feature toggle
  • cleanup final check by myself
  • move split & rename into its own commit. Probably squash all changes except for the gix upgrade.
    • I like to have the major stages of this PR conserved.
  • upgrade to a gix release including various improvements GitoxideLabs/gitoxide#2016
    • This was done in master already.
  • fix tests by fixing gix - submodules() call isn't bare-repo safe.
  • fix failure on Windows
    • gix status seems to go through a symlink, arriving at the wrong conclusion, on Windows.
  • fix performance regression on aws-sdk-rust.

Notes for the Reviewer

Related to GitoxideLabs/gitoxide#106.

@rustbot rustbot added A-git Area: anything dealing with git Command-package labels May 16, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for kicking of this!

Regardless the approach taken here. I wonder if it is possible to avoid checking repo state completely. In PathSource::list_files we already walked through it. And maybe we can attach extra info in PathEntry. Of course it'll slow down list_files a bit, but with that we could perhaps resolve #14941?.

It is just a bit relevant to this PR though. We can still merge this when ready.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about time I get started with this - this setup here is super-hacky, but it's designed to allow a very step-wise conversion, hopefully allowing to work on it 15 minutes at a time.

Regarding #14941, I hope with #14955 I have found a representative issue. Right now I can't tell how that should be solved, so I'd probably go for having a gitoxide alternative for cargo package here and tackle the related issues another time. Ideally, the new implementation will be a bit faster and I hope I can test that on the aws-sdk-rust repository.

Copy link
Member Author

@Byron Byron May 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't resist to profile the aws-sdk-rust publishing step in Cargo and here is a picture of the run so far.

Screenshot 2025-05-17 at 12 54 31 Screenshot 2025-05-17 at 12 55 45

I don't know if this will ever finish, and I don't know how they can even get this published, maybe it just takes a very long time.

The culprit, maybe among other things, may be repo.status_file() calls for which there isn't an equivalent in gitoxide anyway. Thus I think this will look very differently when this PR is done and maybe that's an avenue for aws-sdk-rust to publish much faster.

I will definitely test that once it's working and share results :).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I linked to the wrong issue. Was meant to say #14955

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that dirty_files_outside_pkg_root is probably the reason for the individual git_status_file calls.
The gitoxide implementation can certainly be backported to the git2 one if there is demand, so potential gains could be available sooner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I realize that even with these improvements, and without --no-vcs-info or similar it will still do one git-status costing about 1s each for each of these 404 crates, wasting nearly 7 minutes. Nonetheless, it's probably acceptable compared to the over three ours that it took otherwise (which would be equivalent to ~10800 git status checks).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think #15416 is also properly fixed now, which also means the output is different in 2 tests when using the gitoxide implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Really appreciate the effort here!

We are probably not going to backports, as the backporting process involves a bit more work, and usually reserved for some regression that has no way out.

Also, given the cargo team has agreed that the dirtiness check is best effort and informational only, I think once the PR is ready we can merge it without FCP.

Thanks again, really a great job!

@Byron Byron force-pushed the gix-status-for-cargo-package branch 3 times, most recently from 7a409fd to f5d1c6b Compare May 17, 2025 19:11
@rustbot

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. label May 18, 2025
@Byron Byron force-pushed the gix-status-for-cargo-package branch from f5d1c6b to 39f3f02 Compare May 18, 2025 07:27
@Byron Byron force-pushed the gix-status-for-cargo-package branch from 39f3f02 to 071475b Compare May 18, 2025 09:11
@rustbot rustbot added the A-infrastructure Area: infrastructure around the cargo repo, ci, releases, etc. label May 18, 2025
@Byron Byron force-pushed the gix-status-for-cargo-package branch 2 times, most recently from 7105c4f to 0e60ce6 Compare May 18, 2025 14:21
@Byron Byron force-pushed the gix-status-for-cargo-package branch from fde1e43 to b79bd00 Compare May 18, 2025 16:28
Copy link
Member Author

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready for a first review. Only gitoxide needs a release which is planned for the 22nd, which is when this PR becomes formally mergeable as it can point to the new release instead of the Git repository.

Especially after RustConf where I was quite apologetic and feared gitoxide could be removed again for not making progress, I thought maybe we should instead consider stabilizing the well-working parts earlier, such as in:

  • nightly with opt-in
  • stable with opt-out
  • stable, with git2 codepath removed.

cargo package would be candidate for it as the new implementation is more correct and as fast as it can be.

What do you think?

use std::collections::HashSet;
use std::path::{Path, PathBuf};
use tracing::debug;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the structure and comments as similar as I could, compared to git2, so it's quite straightforward to compare.

Comment on lines 67 to 68
// TODO: Either remove this whole block, or have a test.
// It's hard to have no Cargo.toml here?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me how this could possibly happen - there is no test for it either. From looking at git blame it wasn't obvious what motivated this check.
Probably nothing wrong with keeping it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only some parts of the package is under git? Like src is a git repo.

Comment on lines 67 to 68
// TODO: Either remove this whole block, or have a test.
// It's hard to have no Cargo.toml here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only some parts of the package is under git? Like src is a git repo.

Comment on lines 294 to 297
if relative_package_root.is_some_and(|pkg_root| !rel_path.starts_with(pkg_root)) {
dirty_files_outside_of_package_root.push(path);
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity. If we adopt the same optimization in git2 version, how the performance would look like?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only some parts of the package is under git? Like src is a git repo.

It seemed that what the code really was testing is if the file exists, and not if it was tracked by Git. Thus I made it an explicit 'file exists' check, which seemed strange as Cargo won't function without such a manifest. Removing the whole block doesn't fail CI either, so it's a bit unclear why it's there. There is still no tool I know that could do forensics and find everything that Git knows about it, that would certainly be interesting.

Copy link
Member Author

@Byron Byron May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity. If we adopt the same optimization in git2 version, how the performance would look like?

The 'architecture' changes here could all be applied to the git2 implementation, with the same effects. Performance problems should be solved then.

Edit: If the team decides not to fast-track gix here, then I think the AWS team could trivially contribute a port of this to solve their problems. Of course, I vote for gix as I believe it's more correct when dealing with ignored files, which should solve a problem I am having when releasing gitoxide 😅 (it claims an ignored file is dirty, despite it being ignored).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are mostly doing the same thing except using gitoxide. Do I understand correctly?
Apart from the dirty_files_outside_of_package_root work, have you found any other stuff to optmize?

Copy link
Member Author

@Byron Byron May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The driver of these changes is the lack of single-file APIs, and I noticed no other (previously unknown) optimisation opportunities.
tldr; I don't think there are simple optimisations left, except for maybe the N^2 portion, but even there it's hard to make it show up in a profiler.

But now that I am thinking about it, there might be one: I a workspace like aws-sdk-rust I believe this code runs once for each crate, meaning there will be a full git status each time. It would certainly require more complex changes to be able to cache these between the publish calls. Each one takes a little less than a second there, so it's quite some time in aggregate.
But before that, I'd probably optimize the fetch crates stage to run only once, it takes most of the time.

@weihanglo
Copy link
Member

While the PR is not ready to merge yet, I am kicking off the process to get a consensus on how we want to merge this.

@rfcbot fcp merge

Proposal: Switch to gix for cargo package VCS dirtiness check completely, without any fallback to libgit2.

Reasons:

@rfcbot
Copy link
Collaborator

rfcbot commented May 20, 2025

Team member @weihanglo has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period An FCP proposal has started, but not yet signed off. disposition-merge FCP with intent to merge labels May 20, 2025
@rustbot

This comment has been minimized.

@rfcbot rfcbot added final-comment-period FCP — a period for last comments before action is taken and removed proposed-final-comment-period An FCP proposal has started, but not yet signed off. labels May 26, 2025
@rfcbot
Copy link
Collaborator

rfcbot commented May 26, 2025

🔔 This is now entering its final comment period, as per the review above. 🔔

@Byron Byron force-pushed the gix-status-for-cargo-package branch from b79bd00 to 383795e Compare May 27, 2025 02:15
@rfcbot rfcbot added finished-final-comment-period FCP complete and removed final-comment-period FCP — a period for last comments before action is taken labels Jun 5, 2025
@rfcbot
Copy link
Collaborator

rfcbot commented Jun 5, 2025

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Byron added 2 commits July 14, 2025 05:42
This should also help fixing these spurious "cannot package because
some excluded file is untracked" issues.
While at it, adjust two test expectations to deal with improved detail
of status checks.
@Byron Byron force-pushed the gix-status-for-cargo-package branch from 276aa24 to 80f48cc Compare July 14, 2025 03:56
@Byron Byron force-pushed the gix-status-for-cargo-package branch from 80f48cc to 46ad7af Compare July 14, 2025 03:57
@Byron
Copy link
Member Author

Byron commented Jul 14, 2025

It's too bad this had to wait for so long but I decided to finally get it off my queue, and replaced the git2 implementation with the gix one entirely as suggested.

Did I miss anything that was part of the review?

Further, the Cargo test-suite discovered a bug that makes submodule listings fail.

I will fix that in gix, but also feel like it could be caught here to not further delay merging and wait until gix is released on the 22nd or so. What do you think?

@weihanglo
Copy link
Member

weihanglo commented Jul 14, 2025

If it is an existing bug, we can wait and fix it later. Regardless, do whatever makes sense and is easy to follow on both review and implementation side.

@Byron Byron force-pushed the gix-status-for-cargo-package branch from 9529113 to 1667fb6 Compare July 15, 2025 03:49
@Byron
Copy link
Member Author

Byron commented Jul 15, 2025

I realised that the gitoxide issue was already fixed 2 months ago, and made a release instead.

Unfortunately, there still is a failure on Windows related to how gix status sees symlinks - apparently it doesn't detect it's a link and walks right through, somewhat rightfully considering the file in the submodule untracked.

For today I am out of time, but I'd hope to debug this more tomorrow.

This will fix an issue with submodules that couldn't be queried
in an unborn repository.
@Byron Byron force-pushed the gix-status-for-cargo-package branch from bae72aa to dcf1b40 Compare July 15, 2025 04:27
@Byron Byron marked this pull request as ready for review July 16, 2025 03:01
@rustbot
Copy link
Collaborator

rustbot commented Jul 16, 2025

r? @weihanglo

rustbot has assigned @weihanglo.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 16, 2025
@Byron Byron force-pushed the gix-status-for-cargo-package branch from dcf1b40 to d41d177 Compare July 16, 2025 03:03
@Byron
Copy link
Member Author

Byron commented Jul 16, 2025

Without me doing anything, the checks have passed.

My last commit (fix test that fails on CI, now removed) added some debug output to gather more dat for today, and that just passed for no good reason.

Let's see how prone it now is to failure. The test-setup is simple as well and it is something that should be reproducible in gitoxide as well, it's just the question how much effort it takes to reproduce on a local Windows VM, or if it reproduces at all. In the worst case, flakiness on CI only, it would be much tougher to fix, particularly because the issue had a smell of "filesystem race condition".

I marked the PR as ready for review nonetheless, in case CI passes, and will be back to investigate this phantom otherwise.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!! Just some nits and let me know what you think :)

})
.tree_index_track_renames(TrackRenames::Disabled)
.index_worktree_submodules(None)
.into_iter(None /* pathspec patterns */)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the recent benchmark? Just compared with 6833aa7 against aws-sdk-rust and this PR slows down the process. Haven't checked what caused that. I assume it might be we collecting status for the entire repo for each workspace member, instead of using pathspec?

cargo package --allow-dirty --workspace --no-verify --no-metadata --offline

I ran this btw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CARGO_LOG_PROFILE=1 cargo package --allow-dirty -p aws-sigv4 --no-verify --no-metadata --offline

master branch: 6833aa7

master-6833aa7.json

master

This PR d41d177

gix-pr

pr15534-d41d177.json

Copy link
Member Author

@Byron Byron Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have time today as planned to look into this, but just had to leave a comment here nonetheless.

This is a very interesting find! On the latest main of the aws-sdk-rust repository, the status time for the whole repository is ~600ms for me (hot cache). It's strange that this takes so long in your measurements and also, that the git2 version that uses status_file on each file is fast(er) all of the sudden. It's not at all what I measured in my run.

So I will definitely look into this.
Edit: I can definitely reproduce the slowdown, awesome catch!

Copy link
Member Author

@Byron Byron Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with hyperfine:

❯ hyperfine -w3 '/Users/byron/dev/github.com/rust-lang/cargo/target/release/cargo package --no-verify --no-metadata -p aws-smithy-runtime-api --offline' 'cargo package --no-verify --no-metadata -p aws-smithy-runtime-api --offline'
Benchmark 1: /Users/byron/dev/github.com/rust-lang/cargo/target/release/cargo package --no-verify --no-metadata -p aws-smithy-runtime-api --offline
  Time (mean ± σ):     967.5 ms ±  75.5 ms    [User: 718.7 ms, System: 3491.2 ms]
  Range (min … max):   923.2 ms … 1111.0 ms    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: cargo package --no-verify --no-metadata -p aws-smithy-runtime-api --offline
  Time (mean ± σ):     593.1 ms ±   7.2 ms    [User: 506.7 ms, System: 93.1 ms]
  Range (min … max):   585.3 ms … 610.4 ms    10 runs

Summary
  cargo package --no-verify --no-metadata -p aws-smithy-runtime-api --offline ran
    1.63 ± 0.13 times faster than /Users/byron/dev/github.com/rust-lang/cargo/target/release/cargo package --no-verify --no-metadata -p aws-smithy-runtime-api --offline

Even though the result is less pronounced, it's still 1.6x slower than it was before. The timings also fluctuate much more, probably a side-effect of spawning many threads for the full status check.

The profile run shows the same. Here we see how gitoxide splits up the status run into parallel operations, with the dirwalk to find untracked files being the slowest at ~700ms wallclock, the index check is parallelized and finishes in ~300ms wallclock, and finally there is the tree-index diff with ~80ms.

Screenshot 2025-07-18 at 05 17 20

Also, there is no other bottleneck when looking at the actual package function.

Screenshot 2025-07-18 at 05 24 59

This means in theory, if the root of the project would be provided here…

.into_iter(None /* pathspec patterns */)

…one should be able to save a lot of time.

Initially I thought pathspecs weren't supported, but realised that dirty_files_outside_package_root is the reason it runs the status on the whole repository.

To me it feels inevitable to run the whole status, but the previous implementation certainly shows how it could be done. I also wonder where the performance problems are that I encountered.

My intuition here would be to eat the cost when publishing single repositories, but when multiple are done, it could do the full-repo status just once, while using a pathspec to do a per-project status instead.

I am looking at this loop:

for (pkg, cli_features) in sorted_pkgs {
let opts = PackageOpts {
cli_features: cli_features.clone(),
to_package: ops::Packages::Default,
..opts.clone()
};
let ar_files = prepare_archive(ws, &pkg, &opts)?;

If that was done, I'd think that a package operation per project could be about 200ms, down from 960ms or 600ms respectively.

For today I am out of time, but if you think this is acceptable (i.e. possible regression for single-packages, but speedup for multiple packages), then I'd implement it as described here. The last 'record' to package all of aws-sdk-rust is 12m (without --offline), it could probably be 10m without that extra expense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, just a walkby comment: it seemed to me like you were comparing yourself to distributed (not locally built) cargo? If that's the case, I just wanted to note that we build Cargo on CI with lto=thin, which is otherwise not configured in the Cargo repo. The LTO config should extend also to the built C/C++ code, so to libgit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-git Area: anything dealing with git A-infrastructure Area: infrastructure around the cargo repo, ci, releases, etc. Command-package disposition-merge FCP with intent to merge finished-final-comment-period FCP complete S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-cargo Team: Cargo to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants