Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vendor: implement --versioned-dirs #7631

Open
wants to merge 5 commits into
base: master
from
Open

Conversation

@jsgf
Copy link
Contributor

jsgf commented Nov 26, 2019

Implement --explicit-version from standalone cargo-vendor. This helps with vendoring
performance as it avoids redundantly deleting and re-copying already vendored packages.

For example, re-vendoring cargo's dependencies it makes a big difference in wallclock
time. For initial vendoring it makes no difference, but re-vendoring (ie, when most or all dependencies haven't changed) without explicit versions is actually slightly slower
(5.8s -> 6s), but with explicit versions it goes from 5.8s -> 1.6s.

Timings:

Without explicit versions, initial vendor
real 0m5.810s
user 0m0.924s
sys 0m2.491s

Re-vendor:
real 0m6.083s
user 0m0.937s
sys 0m2.654s

With explicit versions, initial vendor:
real 0m5.810s
user 0m0.937s
sys 0m2.461s

Re-vendor:
real 0m1.567s
user 0m0.578s
sys 0m0.967s

Its interesting to look at the syscall summary:

Without explicit versions:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.17    1.104699          18     59432      1065 openat
 19.86    0.871574          21     41156     13825 unlink
 13.64    0.598739           2    210510           lstat
  9.02    0.395948          29     13208           copy_file_range
  8.00    0.351242          11     30245           read
  6.36    0.279005           3     72487      4476 statx
  5.35    0.235027           6     37219           write
  4.02    0.176267           3     58368           close

with explicit versions:

 29.38    0.419068          15     27798     13825 unlink
 25.52    0.364021           1    209586           lstat
 20.67    0.294788          16     17967      1032 openat
 10.42    0.148586           4     35646           write
  3.53    0.050350           3     13825           chmod
  3.14    0.044786           2     16701      1622 statx
  2.19    0.031171           1     16936           close
  1.86    0.026538          24      1078           rmdir

Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.

@rust-highfive

This comment has been minimized.

Copy link

rust-highfive commented Nov 26, 2019

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@jsgf jsgf force-pushed the jsgf:explicit-version branch 2 times, most recently from 54e299c to d866352 Nov 26, 2019
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Nov 26, 2019

Thanks for the PR! I'm a little surprised though that a motivator for this change is performance, vendoring is typically a pretty rare operation, right? This was originally added to cargo vendor specifically for the directory structure rather than performance

Since this is a proposal for a new stable feature, would it be possible to add tests for this as well, exercising this to make sure it works right?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Nov 26, 2019

I think it'd also be good to write up some documentation to explain this feature and why it might be used.

@jsgf

This comment has been minimized.

Copy link
Contributor Author

jsgf commented Nov 26, 2019

Yeah, performance is starting to be a blocker with 900+ packages. It probably wouldn't be for normal use, but since I'm developing a tool for vendored package management using cargo vendor as an underpinning, I'm noticing it pretty strongly. I'm on PTO right now so I haven't checked to see how much this PR improves it, but I was surprised at how much difference it make to vendoring cargo's deps.

The main motivations are:

  • Keep directory names stable. The current behaviour causes spurious deltas in source control for no-op changes (both for the vendored crates, and paths that reference them).
  • Consistent naming. It's actually quite annoying that the most recent version can't be easily determined by eye when looking in the directory.
  • Update performance, particularly for small changes. cargo vendor should be O(change) rather than O(total size of vendored set). I suspect most vendored package directories are suffixless (70% for cargo), so its currently closer to the latter than the former.

I'll add some tests and docs.

@jsgf jsgf force-pushed the jsgf:explicit-version branch 2 times, most recently from 20c7cc4 to a39a151 Nov 26, 2019
@jsgf

This comment has been minimized.

Copy link
Contributor Author

jsgf commented Nov 27, 2019

Test and doc added. LMK if you think there's more to be tested/documented.

Copy link
Member

alexcrichton left a comment

Ok thanks! I'll post an fcp merge to see what the team thinks, since this'll be a new stable feature

.sp
\fB\-\-explicit\-version\fP
.RS 4
Normally versions are only added to disamiguate multiple versions of the same package.

This comment has been minimized.

Copy link
@alexcrichton

alexcrichton Dec 2, 2019

Member

"disambiguate"

@alexcrichton alexcrichton added the T-cargo label Dec 2, 2019
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Dec 2, 2019

@rfcbot fcp merge

@rfcbot

This comment has been minimized.

Copy link
Collaborator

rfcbot commented Dec 2, 2019

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

src/etc/man/cargo-vendor.1 Show resolved Hide resolved
src/bin/cargo/commands/vendor.rs Show resolved Hide resolved
@jsgf jsgf force-pushed the jsgf:explicit-version branch from 42df95c to 0c63364 Dec 3, 2019
@jsgf jsgf changed the title vendor: implement --explicit-version vendor: implement --versioned-dirs Dec 4, 2019
@jsgf

This comment has been minimized.

Copy link
Contributor Author

jsgf commented Dec 4, 2019

I changed the option to --versioned-dirs

src/etc/man/cargo-vendor.1 Outdated Show resolved Hide resolved
@rfcbot

This comment has been minimized.

Copy link
Collaborator

rfcbot commented Dec 5, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

jsgf added 4 commits Nov 26, 2019
Implement --explicit-version from standalone cargo-vendor. This helps with
vendoring performance as it avoids redundantly deleting and re-copying
already vendored packages.

For example, when re-vendoring cargo's dependencies it makes a big
improvement on wallclock time. For initial vendoring it makes no
difference, but re-vendoring (ie, when most or all dependencies haven't
changed) without explicit versions is actually slightly slower (5.8s ->
6s), but with explicit versions it goes from 5.8s -> 1.6s.

Timings:

Without explicit versions, initial vendor
real	0m5.810s
user	0m0.924s
sys	0m2.491s

Re-vendor:
real	0m6.083s
user	0m0.937s
sys	0m2.654s

With explicit versions, initial vendor:
real	0m5.810s
user	0m0.937s
sys	0m2.461s

Re-vendor:
real	0m1.567s
user	0m0.578s
sys	0m0.967s

The summaries of syscalls executed shows why:

Revendoring without explicit versions:
```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.17    1.104699          18     59432      1065 openat
 19.86    0.871574          21     41156     13825 unlink
 13.64    0.598739           2    210510           lstat
  9.02    0.395948          29     13208           copy_file_range
  8.00    0.351242          11     30245           read
  6.36    0.279005           3     72487      4476 statx
  5.35    0.235027           6     37219           write
  4.02    0.176267           3     58368           close
```
with explicit versions:
```
 29.38    0.419068          15     27798     13825 unlink
 25.52    0.364021           1    209586           lstat
 20.67    0.294788          16     17967      1032 openat
 10.42    0.148586           4     35646           write
  3.53    0.050350           3     13825           chmod
  3.14    0.044786           2     16701      1622 statx
  2.19    0.031171           1     16936           close
  1.86    0.026538          24      1078           rmdir
```

Specifically, there are a lot fewer opens, copy_file_ranges, and unlinks.
@jsgf jsgf force-pushed the jsgf:explicit-version branch from 5ac129d to fd80795 Dec 6, 2019
@jsgf

This comment has been minimized.

Copy link
Contributor Author

jsgf commented Dec 10, 2019

@joshtriplett @ehuss Documentation updated, and I think all comments addressed.

@jsgf jsgf requested a review from joshtriplett Dec 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.