Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add linehaul info to uv-client #2493

Merged
merged 6 commits into from
Mar 18, 2024
Merged

Conversation

samypr100
Copy link
Contributor

@samypr100 samypr100 commented Mar 17, 2024

Summary

Closes #1958

This adds linehaul metadata to uv's user-agent when pep 508 markers are provided to the RegistryClientBuilder. Thanks to #2381, we were able to leverage most information from markers and avoid inconsistency.

Linehaul is meant to be accompanying metadata pip sends in it's user agent when talking to registries. You can see this output by running something like python -c 'from pip._internal.network.session import user_agent; print(user_agent())'.
In PyPI, this metadata processed by the linehaul-cloud-function. More info about linehaul can be found in #1958.

Below are some examples from pip:

  • Linux GHA: pip/24.0 {"ci":true,"cpu":"x86_64","distro":{"id":"jammy","libc":{"lib":"glibc","version":"2.35"},"name":"Ubuntu","version":"22.04"},"implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL 3.0.2 15 Mar 2022","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Linux","release":"6.5.0-1016-azure"}}
  • Windows GHA: pip/24.0 {"ci":true,"cpu":"AMD64","implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL 3.0.13 30 Jan 2024","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Windows","release":"2022Server"}}
  • OSX GHA: pip/24.0 {"ci":true,"cpu":"arm64","distro":{"name":"macOS","version":"14.2.1"},"implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL 3.0.13 30 Jan 2024","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Darwin","release":"23.2.0"}}

Here's how uv results look like (sorry for the keys not having the same order):

  • Linux GHA: uv/0.1.21 {"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":"Linux","release":"6.5.0-1016-azure"},"cpu":"x86_64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
  • Windows GHA: uv/0.1.21 {"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":null,"system":{"name":"Windows","release":"2022Server"},"cpu":"AMD64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
  • OSX GHA: uv/0.1.21 {"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":{"name":"macOS","version":"14.2.1","id":null,"libc":null},"system":{"name":"Darwin","release":"23.2.0"},"cpu":"arm64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

Distro information (such as the one pip uses from pip._vendor import distro to retrieve instead of platform module) was not retrieved from markers. Instead, the linux release codename/name/version uses sys-info crate, adding about 50us of extra overhead on linux. The distro osx version re-used the mac_os version implementation from #2381 which adds about 20us of overhead on osx. I tried to use other crates to avoid re-introducing mac_os.rs but most of them didn't yield satisfactory performance (40ms-60ms~) or had the wrong values needed (e.g. darwin version vs osx version).

I also didn't add rustc retrieval as those seem to add substantial overhead due to querying rustc. PyPy version detection was also not added to avoid adding extra overhead to support PyPy for linehaul. All other behavior was kept 1-1 to match what pip's linehaul implementation does (as of 24.0). This also aligns with what was discussed in #1958.

Test Plan

Added new integration test to uv-client.

@samypr100 samypr100 marked this pull request as ready for review March 17, 2024 03:40
@konstin
Copy link
Member

konstin commented Mar 18, 2024

I also didn't add libc retrieval or rustc retrieval as those seem to add substantial overhead due to querying ldd or rustc.

We now have this information for the python interpreter itself, i added it.

Distro information (such as the one pip uses from pip._vendor import distro to retrieve instead of platform module) was not retrieved from markers. Instead, the linux release codename/name/version uses sys-info crate, adding about 50us of extra overhead on linux. The distro osx version re-used the mac_os version implementation from #2381 which adds about 20us of overhead on osx. I tried to use other crates to avoid re-introducing mac_os.rs but most of them didn't yield satisfactory performance (40ms-60ms~) or had the wrong values needed (e.g. darwin version vs osx version).

Great performance work! On my machine (ubuntu) we're at 0.000s and even relative to the 2ms warm cache black resolve the linehaul instantiation is fast.

Warm cache resolve black
Warm cache resolve black

@konstin konstin enabled auto-merge (squash) March 18, 2024 10:38
@konstin konstin merged commit 42973cd into astral-sh:main Mar 18, 2024
30 checks passed
konstin added a commit that referenced this pull request Mar 18, 2024
@konstin konstin added the enhancement New feature or request label Mar 18, 2024
konstin added a commit that referenced this pull request Mar 18, 2024
@samypr100 samypr100 deleted the linehaul branch March 18, 2024 12:45
use serde::Deserialize;

/// Get the macOS version from the SystemVersion.plist file.
pub(crate) fn get_mac_os_version() -> Result<String, PlatformError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? I thought we got it from the interpreter now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's the same thing, thanks #2509

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, it's not quite just major.minor, it should include the product version from the plist as-is based on my testing with pip.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to hand this off to a mac user who can test this on an actual mac

Copy link
Member

@charliermarsh charliermarsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

konstin added a commit that referenced this pull request Mar 20, 2024
See
#2493 (review).

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
@MichaReiser
Copy link
Member

I tried to query the big-data table for uv downloads but there are no uv records yet (but the latest ingested data is from today). I think the reason for it is that we need to add a uv parser in here https://github.com/pypi/linehaul-cloud-function/blob/main/linehaul/ua/parser.py

@samypr100
Copy link
Contributor Author

Yes, a parser can now be added since the data is being sent as of 0.1.22 ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include linehaul information in user agent
4 participants