Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic tool.uv.sources support #3263

Merged
merged 31 commits into from
May 3, 2024
Merged

Conversation

konstin
Copy link
Member

@konstin konstin commented Apr 25, 2024

Introduction

PEP 621 is limited. Specifically, it lacks

  • Relative path support
  • Editable support
  • Workspace support
  • Index pinning or any sort of index specification

The semantics of urls are a custom extension, PEP 440 does not specify how to use git references or subdirectories, instead pip has a custom stringly format. We need to somehow support these while still stying compatible with PEP 621.

tool.uv.source

Drawing inspiration from cargo, poetry and rye, we add tool.uv.sources or (for now stub only) tool.uv.workspace:

[project]
name = "albatross"
version = "0.1.0"
dependencies = [
  "tqdm >=4.66.2,<5",
  "torch ==2.2.2",
  "transformers[torch] >=4.39.3,<5",
  "importlib_metadata >=7.1.0,<8; python_version < '3.10'",
  "mollymawk ==0.1.0"
]

[tool.uv.sources]
tqdm = { git = "https://github.com/tqdm/tqdm", rev = "cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44" }
importlib_metadata = { url = "https://github.com/python/importlib_metadata/archive/refs/tags/v7.1.0.zip" }
torch = { index = "torch-cu118" }
mollymawk = { workspace = true }

[tool.uv.workspace]
include = [
  "packages/mollymawk"
]

[tool.uv.indexes]
torch-cu118 = "https://download.pytorch.org/whl/cu118"

See docs/specifying_dependencies.md for a detailed explanation of the format. The basic gist is that project.dependencies is what ends up on pypi, while tool.uv.sources are your non-published additions. We do support the full range or PEP 508, we just hide it in the docs and prefer the exploded table for easier readability and less confusing with actual url parts.

This format should eventually be able to subsume requirements.txt's current use cases. While we will continue to support the legacy uv pip interface, this is a piece of the uv's own top level interface. Together with uv run and a lockfile format, you should only need to write pyproject.toml and do uv run, which generates/uses/updates your lockfile behind the scenes, no more pip-style requirements involved. It also lays the groundwork for implementing index pinning.

Changes

This PR implements:

  • Reading and lowering project.dependencies, project.optional-dependencies and tool.uv.sources into a new requirements format, including:
    • Git dependencies
    • Url dependencies
    • Path dependencies, including relative and editable
  • pip install integration
  • Error reporting for invalid tool.uv.sources
  • Json schema integration (works in pycharm, see below)
  • Draft user-level docs (see docs/specifying_dependencies.md)

It does not implement:

  • No pip compile testing, deprioritizing towards our own lockfile
  • Index pinning (stub definitions only)
  • Development dependencies
  • Workspace support (stub definitions only)
  • Overrides in pyproject.toml
  • Patching/replacing dependencies

One technically breaking change is that we now require user provided pyproject.toml to be valid wrt to PEP 621. Included files still fall back to PEP 517. That means pip install -r requirements.txt requires it to be valid while pip install -r requirements.txt with -e . as content falls back to PEP 517 as before.

Implementation

The pep508 requirement is replaced by a new UvRequirement (name up for bikeshedding, not particularly attached to the uv prefix). The still existing pep508_rs::Requirement type is a url format copied from pip's requirements.txt and doesn't appropriately capture all features we want/need to support. The bulk of the diff is changing the requirement type throughout the codebase.

We still use VerbatimUrl in many places, where we would expect a parsed/decomposed url type, specifically:

  • Reading core metadata except top level pyproject.toml files, we fail a step later instead if the url isn't supported.
  • Allowed Urls.
  • PackageId with a custom CanonicalUrl comparison, instead of canonicalizing urls eagerly.
  • PubGrubPackage: We eventually convert the VerbatimUrl back to a Dist (Dist::from_url), instead of remembering the url.
  • Source dist types: We use verbatim url even though we know and require that these are supported urls we can and have parsed.

I tried to make improve the situation be replacing VerbatimUrl, but these changes would require massive invasive changes (see e.g. #3253). A main problem is the ref VersionOrUrl and applying overrides, which assume the same requirement/url type everywhere. In its current form, this PR increases this tech debt.

I've tried to split off PRs and commits, but the main refactoring is still a single monolith commit to make it compile and the tests pass.

Demo

Adding https://gist.githubusercontent.com/konstin/68f3e5ea5e8f1a30a0c9096ae72925c1/raw/d1ae3b85d5a0f3c4bb562c7dfd38c3dafc04ec08/pyproject.json as json schema (v7) to pycharm for pyproject.toml, you can try the IDE support already:

pycharm

dove.webm

@konstin konstin added the enhancement New feature or request label Apr 25, 2024
@konstin konstin force-pushed the konsti/tool-uv-sources-analysis branch 2 times, most recently from 46d6c68 to 33883ed Compare April 26, 2024 12:35
@konstin
Copy link
Member Author

konstin commented Apr 26, 2024

@konstin konstin added the preview Experimental behavior label Apr 26, 2024
crates/distribution-types/src/error.rs Outdated Show resolved Hide resolved
crates/distribution-types/src/installed.rs Show resolved Hide resolved
crates/distribution-types/src/lib.rs Outdated Show resolved Hide resolved
crates/distribution-types/src/parsed_url.rs Show resolved Hide resolved
crates/distribution-types/src/traits.rs Outdated Show resolved Hide resolved
crates/uv-requirements/src/specification.rs Outdated Show resolved Hide resolved
crates/uv-resolver/src/error.rs Outdated Show resolved Hide resolved
crates/uv-resolver/src/resolution.rs Outdated Show resolved Hide resolved
crates/uv-resolver/src/resolver/locals.rs Outdated Show resolved Hide resolved
crates/uv-resolver/src/resolver/urls.rs Show resolved Hide resolved
crates/distribution-types/src/uv_requirement.rs Outdated Show resolved Hide resolved
crates/distribution-types/src/uv_requirement.rs Outdated Show resolved Hide resolved
pub enum UvSource {
Registry {
version: VersionSpecifiers,
index: Option<String>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be IndexUrl?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still mostly to-be-designed and mainly a placeholder, but i'd expect this to become the name of an index, e.g. torch, which is then defined elsewhere, either in this file or in a global configuration. This doesn't have any usages yet to be easy to change once we have the design

crates/uv-requirements/src/pyproject.rs Show resolved Hide resolved
konstin added a commit that referenced this pull request Apr 30, 2024
Split out from #3263. No functional changes.
konstin added a commit that referenced this pull request Apr 30, 2024
Split out from #3263. No functional changes.
@konstin konstin force-pushed the konsti/tool-uv-sources-analysis branch from 33883ed to b8cb0db Compare April 30, 2024 11:22
konstin added a commit that referenced this pull request Apr 30, 2024
Split out from #3263
@konstin konstin mentioned this pull request Apr 30, 2024
@konstin konstin force-pushed the konsti/tool-uv-sources-analysis branch from b8cb0db to 8c11163 Compare April 30, 2024 14:32
konstin added a commit that referenced this pull request Apr 30, 2024
Split out from #3263
konstin added a commit that referenced this pull request Apr 30, 2024
Another split out from #3263. This abstracts the copy&pasted check whether an installed distribution satisfies a requirement used by both plan.rs and site_packages.rs into a shared module. It's less useful here than with the new requirement but helps with reducing #3263 diff size.
konstin added a commit that referenced this pull request Apr 30, 2024
Another split out from #3263. This abstracts the copy&pasted check whether an installed distribution satisfies a requirement used by both plan.rs and site_packages.rs into a shared module. It's less useful here than with the new requirement but helps with reducing #3263 diff size.
@konstin konstin force-pushed the konsti/tool-uv-sources-analysis branch 2 times, most recently from 348d11d to 1fae847 Compare April 30, 2024 16:29
konstin added a commit that referenced this pull request Apr 30, 2024
Another split out from #3263. This
abstracts the copy&pasted check whether an installed distribution
satisfies a requirement used by both plan.rs and site_packages.rs into a
shared module. It's less useful here than with the new requirement but
helps with reducing #3263 diff size.
konstin added a commit that referenced this pull request Apr 30, 2024
Another split out from #3263. This
abstracts the copy&pasted check whether an installed distribution
satisfies a requirement used by both plan.rs and site_packages.rs into a
shared module. It's less useful here than with the new requirement but
helps with reducing #3263 diff size.
@konstin konstin force-pushed the konsti/tool-uv-sources-analysis branch 2 times, most recently from 868ef3d to 2431020 Compare April 30, 2024 17:19
RequirementsTxtRequirement::Uv(requirement) => requirement.name.to_string(),
RequirementsTxtRequirement::Unnamed(unnamed) => unnamed.to_string(),
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not implementing display?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part? There's a impl Display for RequirementsTxtRequirement

@konstin konstin force-pushed the konsti/tool-uv-sources-analysis branch from b83969b to 83b6fe0 Compare May 3, 2024 15:25
@@ -5224,7 +5224,7 @@ fn unsupported_scheme() -> Result<()> {
----- stdout -----

----- stderr -----
error: Unsupported scheme `bzr+https` on URL: bzr+https://example.com/anyio (Bazaar is not supported)
error: Unsupported URL prefix `bzr` in URL: `bzr+https://example.com/anyio`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this error message got worse, can we restore it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least the part at the end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3403

I'd also like to track the origin of the invalid url, but that's a larger task.

));
};

if !Urls::is_allowed(expected, url) {
return Err(ResolveError::ConflictingUrlsTransitive(
requirement.name.clone(),
expected.verbatim().to_string(),
url.verbatim().to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why lose the verbatim here?

@charliermarsh charliermarsh enabled auto-merge (squash) May 3, 2024 21:06
@charliermarsh charliermarsh merged commit 4f87edb into main May 3, 2024
43 checks passed
@charliermarsh charliermarsh deleted the konsti/tool-uv-sources-analysis branch May 3, 2024 21:10
};

let mut url = Url::parse(&format!("git+{git}"))?;
let mut given = git.to_string();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect. Don't we need the actual string here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@konstin - I guess we need to find a way to preserve the exact string from the TOML, right? Otherwise, how will we preserve environment variables?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho toml shouldn't support env vars

_ => return Err(LoweringError::MoreThanOneGitRef),
};

let mut url = Url::parse(&format!("git+{git}"))?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs on RequirementSource::Git say it's the URL without the git+ prefix. Is this correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here: #3365

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

return Ok(Self::Mismatch);
}

if &requested_url.to_string() != installed_url
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems off. Shouldn't we instead parse the URL, and use CanonicalUrl to compare these? Why are we comparing as strings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Fixed in #3373)

konstin added a commit that referenced this pull request May 6, 2024
konstin added a commit that referenced this pull request May 8, 2024
When using `tool.uv.sources`, we enforce that requirements have a bound, i.e. at least a lower version constraint.

When using a library, the symbols you import were introduced in different versions, creating an implicit lower bound. This forces to make that bound explicit. This is crucial to prevent backtracking resolvers from selecting an ancient versions that is not compatible (or worse, doesn't build), and a performance optimization on top.

This feature is gated to `tool.uv.sources` (as it should have been for #3263/#3443) to not unnecessarily break legacy workflows. It is also helpful specifically when using a `tool.uv.sources` section that contains constraints that are not published to pypi, e.g. for workspace dependencies. We can adjust those later to e.g. not constrain workspace dependencies with `publish = false`, but i think it's the right setting to start with.
konstin added a commit that referenced this pull request May 8, 2024
When using `tool.uv.sources`, we enforce that requirements have a bound, i.e. at least a lower version constraint.

When using a library, the symbols you import were introduced in different versions, creating an implicit lower bound. This forces to make that bound explicit. This is crucial to prevent backtracking resolvers from selecting an ancient versions that is not compatible (or worse, doesn't build), and a performance optimization on top.

This feature is gated to `tool.uv.sources` (as it should have been for #3263/#3443) to not unnecessarily break legacy workflows. It is also helpful specifically when using a `tool.uv.sources` section that contains constraints that are not published to pypi, e.g. for workspace dependencies. We can adjust those later to e.g. not constrain workspace dependencies with `publish = false`, but i think it's the right setting to start with.
konstin added a commit that referenced this pull request May 8, 2024
When using `tool.uv.sources`, we warn that requirements have a bound, i.e. at least a lower version constraint.

When using a library, the symbols you import were introduced in different versions, creating an implicit lower bound. This forces to make that bound explicit. This is crucial to prevent backtracking resolvers from selecting an ancient versions that is not compatible (or worse, doesn't build), and a performance optimization on top.

This feature is gated to `tool.uv.sources` (as it should have been for #3263/#3443) to not unnecessarily break legacy workflows. It is also helpful specifically when using a `tool.uv.sources` section that contains constraints that are not published to pypi, e.g. for workspace dependencies. We can adjust those later to e.g. not constrain workspace dependencies with `publish = false`, but i think it's the right setting to start with.
charliermarsh pushed a commit that referenced this pull request May 8, 2024
When using `tool.uv.sources`, we warn that requirements have a bound,
i.e. at least a lower version constraint.

When using a library, the symbols you import were introduced in
different versions, creating an implicit lower bound. This warning makes
this explicit. This is crucial to prevent backtracking resolvers from
selecting an ancient versions that is not compatible (or worse, doesn't
build), and a performance optimization on top.

This feature is gated to `tool.uv.sources` (as it should have been to
begin with for #3263/#3443) to not unnecessarily break legacy workflows.
It is also helpful specifically when using a `tool.uv.sources` section
that contains constraints that are not published to pypi, e.g. for
workspace dependencies. We can adjust those later to e.g. not constrain
workspace dependencies with `publish = false`, but i think it's the
right setting to start with.
@zanieb zanieb mentioned this pull request Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request preview Experimental behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants