Support remote https:// requirements files (#1332) #2081

Merged
merged 21 commits into from Mar 6, 2024

Conversation

jannisko
Contributor

@jannisko jannisko commented Feb 29, 2024

Summary

Allow using http(s) URLs for constraints and requirements files handed to the CLI, by handling paths starting with http:// or https:// differently. This allows commands such as: uv pip install -c https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.8.txt requests.

closes #1332

Test Plan

Testing install using a constraints.txt file hosted on github in the airflow repository:
https://github.com/jannisko/uv/blob/fbdc2eba8e5e8fb4160baae495daff8fb48df13f/crates/uv/tests/pip_install.rs#L1440-L1484

Advice Needed

  • filesystem/http dispatch is implemented at a relatively low level (at crates/uv-fs/src/lib.rs#read_to_string). Should I change some naming here so it is obvious that the function is able to dispatch?
  • I kept the CLI argument for -c and -r as a PathBuf, even though now it is technically either a path or a url. We could either keep this as is for now, or implement a new enum for this case? The enum could then handle dispatch to files/http.
  • Using another abstraction layer like https://docs.rs/object_store/latest/object_store/ for the files/urls/[s3] could work as well, though I ran into a bug during testing which I couldn't debug
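
For reference, the enum idea from the second bullet could look roughly like the sketch below. `RequirementsSource` and `classify` are hypothetical names for illustration, not types from uv; the only behavior taken from this PR is that arguments starting with http:// or https:// are treated as remote.

```rust
use std::path::PathBuf;

/// Hypothetical type: where a `-r` / `-c` argument points.
#[derive(Debug, PartialEq)]
enum RequirementsSource {
    /// A file on the local filesystem.
    Local(PathBuf),
    /// A file served over http:// or https://, kept as the string the user typed.
    Remote(String),
}

/// Classify a raw CLI argument by its scheme prefix.
fn classify(arg: &str) -> RequirementsSource {
    if arg.starts_with("http://") || arg.starts_with("https://") {
        RequirementsSource::Remote(arg.to_string())
    } else {
        RequirementsSource::Local(PathBuf::from(arg))
    }
}

fn main() {
    for arg in ["constraints.txt", "https://example.com/constraints.txt"] {
        println!("{arg} -> {:?}", classify(arg));
    }
}
```

Anything without one of the two prefixes, including other schemes, falls through to the local branch in this sketch.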

@charliermarsh charliermarsh self-assigned this Feb 29, 2024
@charliermarsh charliermarsh added the configuration Settings and such label Feb 29, 2024
@charliermarsh
Member

Thanks! I've been conflicted on whether to support this but ultimately it does seem reasonable.

> For now, only manually tested. I need advice on how to test this feature. Is a test using a permalink to a GitHub repo considered stable enough for this case or do we need a temp http server for tests?

Using a permalink to a GitHub repo is totally fine and consistent with some of our other tests.

@jannisko
Contributor Author

jannisko commented Mar 1, 2024

Ready for review now in my eyes. I tried replacing the PathBuf with something more expressive, like a LocalOrHttpPath enum, but this turned into a mess. Paths really are a good way of handling this use case IMO. Even nested requirements/constraints files work out of the box via https, as long as they are also located in the same relative position on the server.

e.g.
/inner/requirements.txt:

requests
-r ../requirements.txt

/requirements.txt:

ruff

$ cargo run -- pip install -r http://localhost:8000/inner/requirements.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.19s
     Running `target/debug/uv pip install -r 'http://localhost:8000/inner/requirements.txt'`
Resolved 6 packages in 614ms
Downloaded 1 package in 2.56s
Installed 6 packages in 30ms
 + certifi==2024.2.2
 + charset-normalizer==3.3.2
 + idna==3.6
 + requests==2.31.0
 + ruff==0.3.0
 + urllib3==2.2.1
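
The nested lookup above comes down to ordinary path joining. A minimal sketch, assuming a Unix-like system where `/` is the path separator; `resolve_nested` is a hypothetical helper written for this comment, not a function from uv:

```rust
use std::path::{Path, PathBuf};

/// Resolve a nested `-r` entry against the file that contains it, the way a
/// relative path is resolved on disk: take the parent "directory" and join.
fn resolve_nested(parent_file: &str, nested: &str) -> PathBuf {
    let dir = Path::new(parent_file)
        .parent()
        .unwrap_or_else(|| Path::new(""));
    dir.join(nested)
}

fn main() {
    let resolved = resolve_nested(
        "http://localhost:8000/inner/requirements.txt",
        "../requirements.txt",
    );
    // The `..` segment is kept as written; nothing is normalized here.
    println!("{}", resolved.display());
}
```

`join` only appends, so the `..` segment stays in the resulting string; collapsing it is left to whatever parses the URL afterwards.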

@jannisko jannisko changed the title from "support remote http[s] constraint.txt files (#1332)" to "support remote http[s] constraints/requirements/overrides.txt files (#1332)" Mar 1, 2024
@charliermarsh
Member

Okay cool, thank you! Acknowledging that it's now blocked on me reviewing.

potiuk added a commit to potiuk/airflow that referenced this pull request Mar 2, 2024
With the change to switch to uv, we skipped constraints being used
in CI image - in effect all PRs were not using constraints, but they
were using not constrained dependencies but the lowest-direct
mode of installation, so direct dependencies would not be upgraded
in such case, only the transitive ones, so the risk of failure was
anyhow small even if someone released a new, breaking dependency.

The reason is that `uv` currently does not support installing
constraints from URL. We had been silently falling back to the
"no-constraints" way in such case (this is the default mode if for any
reason the constraint build fails).

It introduced the risk that in case a 3rd-party breaking dependency
was released it would also start breaking regular PRs,
not only the "canary" build.

We fix it by downloading constraints locally when they are remote and
using them from there.

While this is being worked on in astral-sh/uv#2081
and likely to land in uv 0.1.14, it's also a good idea to actually
download the constraints and keep them around - this might be handy
if you want to later use constraints to install a "golden" set of
dependencies without the necessity to build the right URL - you can always
use `${HOME}/constraints.txt`.

This PR fixes it and also changes the fallback mechanism to perform the
lowest-direct upgrade only in case the constraint build fails, rather
than always running the lowest-direct upgrade even if the constraints install
works fine - this will make sure that most PRs are using exactly the
constraint version of the dependencies (at least the version of
constraints that were generated last time when pyproject.toml changed).
@potiuk

potiuk commented Mar 2, 2024

Just to add to the weight here: this is also a highly used feature in apache-airflow - if users of Airflow would like to use the recommended installation option, they cannot use it directly with uv now - they need to download the constraints first.

This caused a small hiccup in our CI images in Airflow (as I did not notice that remote constraints installation was not working with uv). I fixed it in apache/airflow#37845 (and generally downloading constraints to the CI image is a good idea) - but users trying to do a reproducible installation of Airflow should be able to use the URL, so big cheering on that one :).

potiuk added a commit to apache/airflow that referenced this pull request Mar 2, 2024
@@ -386,13 +390,14 @@ impl RequirementsTxt {
end,
} => {
let sub_file = requirements_dir.join(filename);
Member

So the join here is fine even if filename is a URL?

Contributor Author

Yup works really well from my testing. It is basically handled like a relative path starting with the http: directory. The only sketchy thing I could find for now is that it doesn't survive a round trip through .components():
PathBuf::from_iter(PathBuf::from_str("http://test/abc.txt").unwrap().components()) = http:/test/abc.txt (one / missing).
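
The round trip can be reproduced with the standard library alone. A small sketch, assuming a Unix-like system (on Windows the rebuilt path would use `\` as separator); `round_trip` is just a helper name for this example:

```rust
use std::path::{Path, PathBuf};

/// Rebuild a path from its components, which drops repeated separators.
fn round_trip(raw: &str) -> PathBuf {
    Path::new(raw).components().collect()
}

fn main() {
    let raw = "http://test/abc.txt";
    // Untouched, the path keeps the string exactly as given.
    println!("as given: {}", Path::new(raw).display());
    // After going through `.components()`, `//` has collapsed to `/`.
    println!("rebuilt:  {}", round_trip(raw).display());
}
```

Note that comparing the two as `Path` values would not catch this, since `Path` equality is defined over components; the difference only shows up in the string form.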

@@ -320,24 +321,25 @@ pub struct RequirementsTxt {
impl RequirementsTxt {
/// See module level documentation
#[instrument(skip_all, fields(requirements_txt = requirements_txt.as_ref().as_os_str().to_str()))]
-pub fn parse(
+pub async fn parse(
requirements_txt: impl AsRef<Path>,
Member

On the question of using Path: Part of me feels like we should be using VerbatimUrl for this (used elsewhere in this crate). VerbatimUrl can represent both URLs and paths, but it preserves the user-provided representation, which we leverage to ensure that we don't expand environment variables (like secrets) when we write results back out to requirements.txt. I think we would need a join method on it, which would perhaps be somewhat awkward... But would you mind giving it a try, and seeing if it fits? Alternatively, we can punt it to another PR -- I would be open to merging without it.
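
To make the idea concrete, here is a rough sketch of a type that keeps the user-provided text next to an expanded form, plus a naive join. This is not uv's `VerbatimUrl`; `Verbatim`, `parse`, and `join_sibling` are hypothetical, and the `${NAME}` expansion rules are an assumption made for the example.

```rust
/// Hypothetical stand-in, not uv's `VerbatimUrl`: keeps the text the user
/// wrote next to the form with `${NAME}` references expanded.
#[derive(Debug, Clone, PartialEq)]
struct Verbatim {
    given: String,
    expanded: String,
}

impl Verbatim {
    /// Expand `${NAME}` using `lookup`; unknown or unterminated references
    /// are left as written.
    fn parse(given: &str, lookup: impl Fn(&str) -> Option<String>) -> Self {
        let mut expanded = String::new();
        let mut rest = given;
        while let Some(start) = rest.find("${") {
            let Some(len) = rest[start + 2..].find('}') else { break };
            let name = &rest[start + 2..start + 2 + len];
            expanded.push_str(&rest[..start]);
            match lookup(name) {
                Some(value) => expanded.push_str(&value),
                None => expanded.push_str(&rest[start..start + len + 3]),
            }
            rest = &rest[start + len + 3..];
        }
        expanded.push_str(rest);
        Self { given: given.to_string(), expanded }
    }

    /// Naive sibling join: swap the last `/`-separated segment on both forms.
    fn join_sibling(&self, name: &str) -> Self {
        let swap = |s: &str| match s.rfind('/') {
            Some(i) => format!("{}/{name}", &s[..i]),
            None => name.to_string(),
        };
        Self { given: swap(&self.given), expanded: swap(&self.expanded) }
    }
}

fn main() {
    let lookup = |name: &str| (name == "TOKEN").then(|| "s3cret".to_string());
    let base = Verbatim::parse("https://${TOKEN}@example.com/inner/requirements.txt", lookup);
    let nested = base.join_sibling("constraints.txt");
    println!("write back: {}", nested.given);
    println!("fetch from: {}", nested.expanded);
}
```

The awkward part is visible in `join_sibling`: every path operation has to be applied to both representations so that the written-back form never contains the expanded secret.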

Contributor Author

I'd like to give it a try in a separate PR!

Member

@charliermarsh charliermarsh left a comment

This looks good, thanks! Some comments below, only one blocking.

Member

@charliermarsh charliermarsh left a comment

Oh, one other critical point: how does this work with authentication? Does pip support reading from URLs that require authentication?

We probably need to pass through our RegistryClient or something similarly-constructed. That would also ensure that this behavior respects the --offline flag.

@jannisko
Contributor Author

jannisko commented Mar 4, 2024

> Oh, one other critical point: how does this work with authentication? Does pip support reading from URLs that require authentication?

Pip seems to ask for interactive input when the url requires auth:

$ pip install -r http://localhost:8000/requirements.txt
User for localhost:8000: jannis
Password: 
Requirement already satisfied: ruff in /Users/jannis_kowalick/.pyenv/versions/3.10.9/lib/python3.10/site-packages (from -r http://localhost:8000/requirements.txt (line 1)) (0.2.1)

With my current implementation, uv would interpret the "no auth header" message as a package name, which seems dangerous:

$ uv pip install -r http://localhost:8000/requirements.txt
error: Unsupported requirement in http://localhost:8000/requirements.txt at position 0
  Caused by: URL requirement must be preceded by a package name. Add the name of the package before the URL (e.g., `package_name @ https://...`).
no auth header received
^^

I'll try to figure out how to mirror pip's behavior, and take a look at the RegistryClient as well.
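
One way to avoid parsing an error page as requirements is to check the response status before handing the body to the parser, and to refuse the request outright when offline. A minimal sketch under stated assumptions: `Response`, `FetchError`, and `read_remote` are hypothetical, the `fetch` closure stands in for a real HTTP client, and the 401 status is assumed for the test server above.

```rust
/// Hypothetical response shape; a real HTTP client exposes something similar.
struct Response {
    status: u16,
    body: String,
}

#[derive(Debug, PartialEq)]
enum FetchError {
    /// Network access was disabled (e.g. by an `--offline`-style flag).
    Offline(String),
    /// The server answered with a non-success status.
    Status { url: String, status: u16 },
}

/// Return the body only for 2xx responses, so an error page is never parsed
/// as a requirements file.
fn read_remote(
    url: &str,
    offline: bool,
    fetch: impl Fn(&str) -> Response,
) -> Result<String, FetchError> {
    if offline {
        return Err(FetchError::Offline(url.to_string()));
    }
    let response = fetch(url);
    if !(200..300).contains(&response.status) {
        return Err(FetchError::Status { url: url.to_string(), status: response.status });
    }
    Ok(response.body)
}

fn main() {
    // Stand-in for a server that demands credentials (status code assumed).
    let fetch = |_: &str| Response { status: 401, body: "no auth header received".to_string() };
    match read_remote("http://localhost:8000/requirements.txt", false, fetch) {
        Ok(body) => println!("parse: {body}"),
        Err(err) => println!("error: {err:?}"),
    }
}
```

With this shape the "no auth header received" body never reaches the requirements parser; the caller sees a status error instead.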

this change also causes pyproject.toml files to be excluded from the
remote file logic
abhishekbhakat pushed a commit to abhishekbhakat/my_airflow that referenced this pull request Mar 5, 2024
@jannisko
Contributor Author

jannisko commented Mar 5, 2024

@charliermarsh I moved the http handling into requirements-txt and I'm handing the RegistryClient over in there now too. Let me know what you think!

@charliermarsh charliermarsh self-requested a review March 5, 2024 23:13
Member

@charliermarsh charliermarsh left a comment

Thank you!

@charliermarsh
Member

@jannisko -- I mostly made cosmetic changes, but I did make one behavioral change, which is that I removed dialoguer. I'd like to see if this comes up in practice for users before adding support for interactively prompting credentials -- it adds complexity, and it's also yet another dependency in our tree. If we need to restore, we'll always have it here in the Git history.

@charliermarsh charliermarsh changed the title from "support remote http[s] constraints/requirements/overrides.txt files (#1332)" to "Support remote https:// requirements files (#1332)" Mar 6, 2024
@charliermarsh charliermarsh enabled auto-merge (squash) March 6, 2024 04:03
@jannisko
Contributor Author

jannisko commented Mar 6, 2024

Thanks a lot @charliermarsh for all the feedback!

kdeldycke added a commit to kdeldycke/workflows that referenced this pull request Mar 16, 2024
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024
Labels
configuration Settings and such
Development

Successfully merging this pull request may close these issues.

Support remote constraints files
3 participants