Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
vendor: implement --versioned-dirs #7631
For example, re-vendoring cargo's dependencies it makes a big difference in wallclock
Without explicit versions, initial vendor
With explicit versions, initial vendor:
Its interesting to look at the syscall summary:
Without explicit versions:
with explicit versions:
Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.
Thanks for the PR! I'm a little surprised though that a motivator for this change is performance, vendoring is typically a pretty rare operation, right? This was originally added to
Since this is a proposal for a new stable feature, would it be possible to add tests for this as well, exercising this to make sure it works right?
Yeah, performance is starting to be a blocker with 900+ packages. It probably wouldn't be for normal use, but since I'm developing a tool for vendored package management using cargo vendor as an underpinning, I'm noticing it pretty strongly. I'm on PTO right now so I haven't checked to see how much this PR improves it, but I was surprised at how much difference it make to vendoring cargo's deps.
The main motivations are:
I'll add some tests and docs.
Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Implement --explicit-version from standalone cargo-vendor. This helps with vendoring performance as it avoids redundantly deleting and re-copying already vendored packages. For example, when re-vendoring cargo's dependencies it makes a big improvement on wallclock time. For initial vendoring it makes no difference, but re-vendoring (ie, when most or all dependencies haven't changed) without explicit versions is actually slightly slower (5.8s -> 6s), but with explicit versions it goes from 5.8s -> 1.6s. Timings: Without explicit versions, initial vendor real 0m5.810s user 0m0.924s sys 0m2.491s Re-vendor: real 0m6.083s user 0m0.937s sys 0m2.654s With explicit versions, initial vendor: real 0m5.810s user 0m0.937s sys 0m2.461s Re-vendor: real 0m1.567s user 0m0.578s sys 0m0.967s The summaries of syscalls executed shows why: Revendoring without explicit versions: ``` % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 25.17 1.104699 18 59432 1065 openat 19.86 0.871574 21 41156 13825 unlink 13.64 0.598739 2 210510 lstat 9.02 0.395948 29 13208 copy_file_range 8.00 0.351242 11 30245 read 6.36 0.279005 3 72487 4476 statx 5.35 0.235027 6 37219 write 4.02 0.176267 3 58368 close ``` with explicit versions: ``` 29.38 0.419068 15 27798 13825 unlink 25.52 0.364021 1 209586 lstat 20.67 0.294788 16 17967 1032 openat 10.42 0.148586 4 35646 write 3.53 0.050350 3 13825 chmod 3.14 0.044786 2 16701 1622 statx 2.19 0.031171 1 16936 close 1.86 0.026538 24 1078 rmdir ``` Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.