Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manual PEP 440 version parser #373

Closed
konstin opened this issue Nov 9, 2023 · 3 comments · Fixed by #789
Closed

Manual PEP 440 version parser #373

konstin opened this issue Nov 9, 2023 · 3 comments · Fixed by #789
Assignees
Labels
performance Potential performance improvement
Milestone

Comments

@konstin
Copy link
Member

konstin commented Nov 9, 2023

Benchmarking the top 1k pypi packages serially, we spend 8% our in parsing version numbers with regex captures, solving the top 8k in parallel it's 14%. We should instead move to a hand-written PEP 440 parser, which would also much improve the error messages.

https://github.com/astral-sh/puffin/blob/6144de0a7ee67ca9ebb1348543ae643de7492e3b/crates/pep440-rs/src/version.rs#L692-L701

@charliermarsh
Copy link
Member

This could be a fun one for @BurntSushi.

@charliermarsh charliermarsh added the performance Potential performance improvement label Nov 9, 2023
@charliermarsh charliermarsh added this to the Future milestone Nov 9, 2023
@konstin
Copy link
Member Author

konstin commented Nov 9, 2023

@BurntSushi
Copy link
Member

I agree that getting rid of the regex captures and hand-rolling a version parser is probably the right tact to take here for the interim. But I wanted to pop up a level too. So I wrote this issue: #396

BurntSushi added a commit that referenced this issue Jan 4, 2024
This rewrites the parser to avoid regexes. We do things by hand directly
on a `&[u8]`, and also use more structured errors internally (although
we don't expose those yet). This is good for about a 1.1x improvement:

    $ hyperfine -w10 \
        'puffin pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in' \
        'puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in'
    Benchmark 1: puffin pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in
      Time (mean ± σ):     317.3 ms ±  18.1 ms    [User: 268.4 ms, System: 106.8 ms]
      Range (min … max):   297.4 ms … 360.1 ms    10 runs

    Benchmark 2: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in
      Time (mean ± σ):     345.0 ms ±  12.4 ms    [User: 293.5 ms, System: 109.7 ms]
      Range (min … max):   329.9 ms … 372.8 ms    10 runs

    Summary
      puffin pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in ran
        1.09 ± 0.07 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in

The improvement here is rather small because the fast path that was
added recently was doing a lot of the heavy lifting.

In subsequent commits, we will:

* Expose the error type so that we don't just return a `String`.
* Add back a fast path for parsing just `w.x.y.z`.
* Tweak the representation of `Version` to make its `Ord` impl faster.
  In particular, this is still showing up as a bottleneck from
  resolution.

Fixes #373
BurntSushi added a commit that referenced this issue Jan 5, 2024
This rewrites the parser to avoid regexes. We do things by hand directly
on a `&[u8]`, and also use more structured errors internally (although
we don't expose those yet). This is good for about a 1.1x improvement:

    $ hyperfine -w10 \
        'puffin pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in' \
        'puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in'
    Benchmark 1: puffin pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in
      Time (mean ± σ):     317.3 ms ±  18.1 ms    [User: 268.4 ms, System: 106.8 ms]
      Range (min … max):   297.4 ms … 360.1 ms    10 runs

    Benchmark 2: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in
      Time (mean ± σ):     345.0 ms ±  12.4 ms    [User: 293.5 ms, System: 109.7 ms]
      Range (min … max):   329.9 ms … 372.8 ms    10 runs

    Summary
      puffin pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in ran
        1.09 ± 0.07 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null /m/astral/tmp/requirements.in

The improvement here is rather small because the fast path that was
added recently was doing a lot of the heavy lifting.

In subsequent commits, we will:

* Expose the error type so that we don't just return a `String`.
* Add back a fast path for parsing just `w.x.y.z`.
* Tweak the representation of `Version` to make its `Ord` impl faster.
  In particular, this is still showing up as a bottleneck from
  resolution.

Fixes #373
BurntSushi added a commit that referenced this issue Jan 5, 2024
This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s

real    34.579
user    34.004
sys     0.413
maxmem  2867 MB
faults  0
```

And now with this PR:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s

real    9.218
user    8.919
sys     0.165
maxmem  463 MB
faults  0
```

This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:

```
$ hyperfine -w10 \
    "puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
    "puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     337.5 ms ±   3.9 ms    [User: 310.5 ms, System: 73.2 ms]
  Range (min … max):   333.6 ms … 343.4 ms    10 runs

Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     189.8 ms ±   3.0 ms    [User: 168.1 ms, System: 78.4 ms]
  Range (min … max):   185.0 ms … 196.2 ms    15 runs

Summary
  puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
    1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```

There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.

Fixes #373, Closes #396
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Potential performance improvement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants