Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support conflicting URL in separate forks #4435

Merged
merged 10 commits into from
Jun 26, 2024
Merged

Conversation

konstin
Copy link
Member

@konstin konstin commented Jun 21, 2024

Downstack PR: #4481

Introduction

We support forking the dependency resolution to support conflicting registry requirements for different platforms, say on package range is required for an older python version while a newer is required for newer python versions, or dependencies that are different per platform. We need to extend this support to direct URL requirements.

dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'"
]

This did not work because Urls was built on the assumption that there is a single allowed URL per package. We collect all allowed URL ahead of resolution by following direct URL dependencies (including path dependencies) transitively, i.e. a registry distribution can't require a URL.

The same package can have Registry and URL requirements

Consider the following two cases:

requirements.in:

werkzeug==2.0.0
werkzeug @ https://files.pythonhosted.org/packages/ff/1d/960bb4017c68674a1cb099534840f18d3def3ce44aed12b5ed8b78e0153e/Werkzeug-2.0.0-py3-none-any.whl

pyproject.toml:

dependencies = [
  "iniconfig == 1.1.1 ; python_version < '3.12'",
  "iniconfig @ git+https://github.com/pytest-dev/iniconfig@93f5930e668c0d1ddf4597e38dd0dea4e2665e7a ; python_version >= '3.12'",
]

In the first case, we want the URL to override the registry dependency, in the second case we want to fork and have one branch use the registry and the other the URL. We have to know about this in PubGrubRequirement::from_registry_requirement, but we only fork after the current method.

Consider the following case too:

a:

c==1.0.0
b @ https://b.zip

b:

c @ https://c_new.zip ; python_version >= '3.12'",
c @ https://c_old.zip ; python_version < '3.12'",

When we convert the requirements of a, we can't know the url of c yet. The solution is to remove the Url from PubGrubPackage: The Url is redundant with PackageName, there can be only one url per package name per fork. We now do the following: We track the urls from requirements in PubGrubDependency. After forking, we call add_package_version_dependencies where we apply override URLs, check if the URL is allowed and check if the url is unique in this fork. When we request a distribution, we ask the fork urls for the real URL. Since we prioritize url dependencies over registry dependencies and skip packages with Urls entries in pre-visiting, we know that when fetching a package, we know if it has a url or not.

URL conflicts

pyproject.toml (invalid):

dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/44/39/e96292c7f7068e58877f476908c5974dc76c37c623f1fa332fe4ed6dfbec/iniconfig-1.1.0.tar.gz",
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'",
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
]

On the fork state, we keep ForkUrls that check for conflicts after forking, rejecting the third case because we added two packages of the same name with different URLs.

We need to flatten out the requirements before transformation into pubgrub requirements to get the full list of other requirements which may contain a URL, which was changed in a previous PR: #4430.

Complex Example

a:

dependencies = [
  # Force a split
  "anyio==4.3.0 ; python_version >= '3.12'",
  "anyio==4.2.0 ; python_version < '3.12'",
  # Include URLs transitively
  "b"
]

b:

dependencies = [
  # Only one is used in each split.
  "b1 ; python_version < '3.12'",
  "b2 ; python_version >= '3.12'",
  "b3 ; python_version >= '3.12'",
]

b1:

dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl",
]

b2:

dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl",
]

b3:

dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/44/39/e96292c7f7068e58877f476908c5974dc76c37c623f1fa332fe4ed6dfbec/iniconfig-1.1.0.tar.gz",
]

In this example, all packages are url requirements (directory requirements) and the root package is a. We first split on a, b being in each split. In the first fork, we reach b1, the fork URLs are empty, we insert the iniconfig 1.1.1 URL, and then we skip over b2 and b3 since the mark is disjoint with the fork markers. In the second fork, we skip over b1, visit b2, insert the iniconfig 2.0.0 URL into the again empty fork URLs, then visit b3 and try to insert the iniconfig 1.1.0 URL. At this point we find a conflict for the iniconfig URL and error.

Closing

The git tests are slow, but they make the best example for different URL types i could find.

Part of #3927. This PR does not handle Locals or pre-releases yet.

@konstin konstin added the preview Experimental behavior label Jun 21, 2024
@konstin konstin requested a review from BurntSushi June 21, 2024 16:10
@konstin konstin marked this pull request as draft June 24, 2024 12:03
Copy link

codspeed-hq bot commented Jun 24, 2024

CodSpeed Performance Report

Merging #4435 will improve performances by 23.65%

Comparing konsti/branching-url-support (7647172) with main (ca92b55)

Summary

⚡ 1 improvements
✅ 12 untouched benchmarks

Benchmarks breakdown

Benchmark main konsti/branching-url-support Change
resolve_warm_airflow 1.3 s 1 s +23.65%

@konstin konstin force-pushed the konsti/branching-url-support branch from b1c9cfd to 714df58 Compare June 24, 2024 13:07
@konstin konstin force-pushed the konsti/fork-before-transforming branch from 4f77bc4 to 4965708 Compare June 24, 2024 13:08
@konstin konstin force-pushed the konsti/branching-url-support branch from 714df58 to aaa0acb Compare June 24, 2024 13:12
konstin added a commit that referenced this pull request Jun 24, 2024
I caused this error during development and having the name of the task
on it is helpful for debugging.

Split out from #4435
konstin added a commit that referenced this pull request Jun 24, 2024
I have to add yet another indentation level to the prerelease-available check in `PubGrubReportFormatter::hints` for #4435, so i've broken the code into methods and decreased indentation in this split out refactoring-only change.
konstin added a commit that referenced this pull request Jun 24, 2024
I have to add yet another indentation level to the prerelease-available
check in `PubGrubReportFormatter::hints` for #4435, so i've broken the
code into methods and decreased indentation in this split out
refactoring-only change.
@konstin konstin force-pushed the konsti/branching-url-support branch from 3a08323 to 8386382 Compare June 24, 2024 17:32
@konstin konstin force-pushed the konsti/fork-before-transforming branch from 4965708 to 8fb9c69 Compare June 24, 2024 17:33
konstin added a commit that referenced this pull request Jun 24, 2024
In the last PR (#4430), we flatten the requirements. In the next PR (#4435), we want to pass `Url` around next to `PubGrubPackage` and `Range<Version>` to keep track of which `Requirement`s added a url across forking. This PR is a refactoring split out from #4435 that rolls the dependency conversion into a single iterator and introduces a new `PubGrubDependency` struct as abstraction over `(PubGrubPackage, Range<Version>)` (or `(PubGrubPackage, Range<Version>, VerbatimParsedUrl)` in the next PR), and it removes the now unnecessary `PubGrubDependencies` abstraction.
konstin added a commit that referenced this pull request Jun 24, 2024
In the last PR (#4430), we flatten the requirements. In the next PR (#4435), we want to pass `Url` around next to `PubGrubPackage` and `Range<Version>` to keep track of which `Requirement`s added a url across forking. This PR is a refactoring split out from #4435 that rolls the dependency conversion into a single iterator and introduces a new `PubGrubDependency` struct as abstraction over `(PubGrubPackage, Range<Version>)` (or `(PubGrubPackage, Range<Version>, VerbatimParsedUrl)` in the next PR), and it removes the now unnecessary `PubGrubDependencies` abstraction.
@konstin konstin force-pushed the konsti/branching-url-support branch 2 times, most recently from 61e1fcc to 0a33469 Compare June 24, 2024 18:06
@konstin konstin changed the base branch from konsti/fork-before-transforming to konsti/PubGrubDependencies June 24, 2024 18:07
@konstin konstin force-pushed the konsti/branching-url-support branch from 0a33469 to 01883ea Compare June 24, 2024 20:07
// since we weren't sure whether it might also be a URL requirement when
// transforming the requirements. For that case, we do another request here
// (idempotent due to caching).
self.request_package(&state.next, url, &request_sink)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only necessary if url is None?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that's the majority of the time anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need it if fork_urls return None and Urls return something, that is the case when it's ambiguous. I didn't add that check specifically because request_package already checks if the request was previously registered

@konstin
Copy link
Member Author

konstin commented Jun 25, 2024

The part I understand least is the whole override URLs thing.

Whenever we encounter an override URL, that takes precedence over everything else. We also only allow only a single URL per package from overrides, not supporting splitting on overrides. Does that make it clearer?

How do you think this will extend to locals and/or pre-releases?

Yes, i think we have to make the same changes for them too.

konstin added a commit that referenced this pull request Jun 25, 2024
In the last PR (#4430), we flatten the requirements. In the next PR (#4435), we want to pass `Url` around next to `PubGrubPackage` and `Range<Version>` to keep track of which `Requirement`s added a url across forking. This PR is a refactoring split out from #4435 that rolls the dependency conversion into a single iterator and introduces a new `PubGrubDependency` struct as abstraction over `(PubGrubPackage, Range<Version>)` (or `(PubGrubPackage, Range<Version>, VerbatimParsedUrl)` in the next PR), and it removes the now unnecessary `PubGrubDependencies` abstraction.
@konstin konstin force-pushed the konsti/PubGrubDependencies branch from 8362f3b to 9b7e47c Compare June 25, 2024 19:34
@konstin konstin force-pushed the konsti/branching-url-support branch from 30f4c2f to 0101939 Compare June 25, 2024 19:49
konstin added a commit that referenced this pull request Jun 25, 2024
In the last PR (#4430), we flatten the requirements. In the next PR (#4435), we want to pass `Url` around next to `PubGrubPackage` and `Range<Version>` to keep track of which `Requirement`s added a url across forking. This PR is a refactoring split out from #4435 that rolls the dependency conversion into a single iterator and introduces a new `PubGrubDependency` struct as abstraction over `(PubGrubPackage, Range<Version>)` (or `(PubGrubPackage, Range<Version>, VerbatimParsedUrl)` in the next PR), and it removes the now unnecessary `PubGrubDependencies` abstraction.
@konstin konstin force-pushed the konsti/PubGrubDependencies branch from 9b7e47c to c75b822 Compare June 25, 2024 22:03
konstin added a commit that referenced this pull request Jun 25, 2024
In the last PR (#4430), we flatten the requirements. In the next PR
(#4435), we want to pass `Url` around next to `PubGrubPackage` and
`Range<Version>` to keep track of which `Requirement`s added a url
across forking. This PR is a refactoring split out from #4435 that rolls
the dependency conversion into a single iterator and introduces a new
`PubGrubDependency` struct as abstraction over `(PubGrubPackage,
Range<Version>)` (or `(PubGrubPackage, Range<Version>,
VerbatimParsedUrl)` in the next PR), and it removes the now unnecessary
`PubGrubDependencies` abstraction.
Base automatically changed from konsti/PubGrubDependencies to main June 25, 2024 22:11
@konstin konstin force-pushed the konsti/branching-url-support branch from d21ab02 to cb32b21 Compare June 26, 2024 11:43
@konstin konstin merged commit d9dbb8a into main Jun 26, 2024
47 checks passed
@konstin konstin deleted the konsti/branching-url-support branch June 26, 2024 11:58
konstin added a commit that referenced this pull request Jun 26, 2024
`ResolverState::choose_version` had become huge, with an odd match due to the url handling from #4435. This refactoring breaks it into `choose_version`, `choose_version_registry` and `choose_version_url`. No functional changes.
konstin added a commit that referenced this pull request Jun 26, 2024
`ResolverState::choose_version` had become huge, with an odd match due to the url handling from #4435. This refactoring breaks it into `choose_version`, `choose_version_registry` and `choose_version_url`. No functional changes.
konstin added a commit that referenced this pull request Jun 26, 2024
`ResolverState::choose_version` had become huge, with an odd match due to the url handling from #4435. This refactoring breaks it into `choose_version`, `choose_version_registry` and `choose_version_url`. No functional changes.
konstin added a commit that referenced this pull request Jun 26, 2024
`ResolverState::choose_version` had become huge, with an odd match due
to the url handling from #4435. This refactoring breaks it into
`choose_version`, `choose_version_registry` and `choose_version_url`. No
functional changes.
ibraheemdev added a commit that referenced this pull request Jul 16, 2024
## Summary

Currently, the `Locals` type relies on there being a single local
version for a given package. With marker expressions this may not be
true, a similar problem to #4435.
This changes the `Locals` type to `ForkLocals`, which tracks locals for
a given fork. Local versions are now tracked on `PubGrubRequirement`
before forking.

Resolves #4580.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Experimental behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants