Implementation of constraints for the new resolver #8136
Conversation
A thought. Now that a version of the new resolver is released, should we start actually including news items in PRs? Or would we still be cluttering the changelog with too many bits about the resolver work? (I'm inclined to keep marking PRs as trivial, but I worry that just reflects the fact that I don't like writing docs 😉 )
Yea... The changelog is not the right place to do that. We'd want to communicate about the new resolver via a separate channel -- since there are likely specific things we want to call out, things we want users to do, alternative communication channels to set up, etc. #8099 would serve as that communication channel, and we'd want to say "we have a beta of pip's next-gen resolver" in the 20.2 changelog. Or, if we're happy enough to flip the defaults, just say "we're shipping the new resolver as the default". :)

(Fun fact: since #8099 is locked, we don't see cross-references to that issue.)
```python
def __init__(self):
    # type: () -> None
    self._name = "<Dummy>"
    self._version = Version("0")
```
😮 Would it work to raise in the version property instead?
Er, maybe? Does it matter, though? I might be missing your point here...
It seems awkward to me that this class has a property that's never called, but returns an arbitrary placeholder value. It would be more obvious to raise in `version` instead, to make it clear that we don't plan to call it, and so it's a clear error if we ever do.

I think an even better way to do this is to re-work the class hierarchy, and let type checking handle this. Maybe `DummyCandidate` should inherit from something else that does not have `version`, or maybe we should remove `version` from `Candidate`. I'm not sure yet, so it's likely a better idea to merge this first, and work on refactoring later. (I do want to try some of this.)
Ah, I get what you mean. We need a dummy name, because the resolver uses the name to identify things, but the version is unused.
I'll switch to a property that asserts (assuming that mypy doesn't whine at me).
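A minimal sketch of the raising-property idea being discussed (hypothetical stand-in class, not the PR's actual code):

```python
class DummyCandidate:
    """Hypothetical placeholder candidate: identified only by its name;
    its version is never meant to be consulted by the resolver."""

    def __init__(self):
        # type: () -> None
        self._name = "<Dummy>"

    @property
    def name(self):
        # type: () -> str
        return self._name

    @property
    def version(self):
        # Raise instead of returning an arbitrary placeholder version,
        # so any accidental access fails loudly and immediately.
        raise NotImplementedError("DummyCandidate.version is never used")
```

This trades the silent `Version("0")` placeholder for an immediate, descriptive failure if the assumption ("version is unused") is ever violated.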
```python
if ireq.constraint:
    return ConstraintRequirement(specifier, factory=self)

return specifier
```
I got confused for a while by this name (and the attribute `ConstraintRequirement._specifier`). The name yells `packaging.specifiers.SpecifierSet`, but they are totally not 😖
Basically it felt like `specifier_requirement` is too long. But I can't think of a reasonable abbreviation (I'd prefer not to use `req`, as we use that for an `InstallRequirement` all over the place).

I've changed it to `specifier_requirement`.
LGTM. One minor comment.
```diff
@@ -117,3 +117,36 @@ def is_satisfied_by(self, candidate):
         # already implements the prerelease logic, and would have filtered out
         # prerelease candidates if the user does not expect them.
         return self.specifier.contains(candidate.version, prereleases=True)
+
+
+class ConstraintRequirement(Requirement):
```
We should add a docstring here, detailing how this works / "flows" through the resolver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this?

```python
"""A requirement that represents a constraint.

This requirement acts just like a SpecifierRequirement, except that
it doesn't generate any real matches of its own. This is to ensure that
if the user doesn't actually request that package X gets installed, a
constraint on X doesn't trigger installation.

ConstraintRequirement objects generate a singleton "dummy" candidate,
so that they satisfy the condition that any requirement has at least
one match - this is a resolvelib condition.
"""
```
Sounds great, though we're going in a different direction now. :)
I don't think we are. But I suspect you've just not caught up on my various ramblings below yet, as you're doing release management at the moment ;-)
There's a further complication here still to be addressed. If resolvelib picks the "constraint" requirement to try first, the dummy candidate will be what goes into the candidate list. When we then check the actual requirement, it complains because the dummy candidate doesn't have the right project name (and we can't just let it through anyway, as it'll break as soon as we try to get its version). I think the fix here is to modify the provider's

This was exposed by merging this PR with #8061.
Alternative plan. Just have

No, that doesn't work. We try the constraint, then try the real requirement, then bomb out because there are no candidates left. I'm not 100% sure why we don't backtrack and try the real requirement first. Backtracking is probably discarding it as the thing that caused the problem.

I feel like there's an unstated constraint on providers in resolvelib, that there has to be a "pool" of candidates, where a requirement must return from `find_matches` all of the candidates in that pool that satisfy the requirement (and nothing else). At the very least, this constraint needs to be explicitly stated. And I'd very strongly prefer if it wasn't a constraint, but resolvelib did support more flexible requirements like I'm implementing here, so that my "alternative plan" worked. I'll try to turn this case into a bug report for resolvelib.

[It's actually really hard to even describe the issue I'm hitting here, as resolvelib explicitly doesn't have a concept of "a set of candidates that we're considering"...]

@uranusjr @pradyunsg thoughts?
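The suspected unstated invariant can be phrased as a toy model (hypothetical names; versions are plain strings): all requirements for a project draw from one shared pool, and `find_matches` must return exactly the pool members that satisfy the requirement.

```python
class VersionRequirement:
    """Toy requirement: accepts whatever versions the predicate allows."""

    def __init__(self, is_allowed):
        self._is_allowed = is_allowed

    def is_satisfied_by(self, version):
        return self._is_allowed(version)


def find_matches(pool, requirement):
    # The suspected resolvelib contract: return every candidate in the
    # shared pool that satisfies the requirement - nothing more, and
    # nothing from outside the pool (no "extra" dummy candidates).
    return [v for v in pool if requirement.is_satisfied_by(v)]
```

A dummy-only `find_matches`, as in this PR, violates that contract, which would explain why backtracking never recovers the real candidates.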
It's fixable by ordering the root requirements so that non-constraints come first and constraints come last. But this feels really nasty, as it implies that we'll still have (at least some form of) order-dependency in how the resolver sees the requirements.

I'm going to go ahead and fix it by ordering the root reqs for now, but I'll add a comment explaining the issue.
Hmm... Semantically, I feel like we should add constraints first. My mental model for them is that they are for constraining what we end up with, and eliminating them earlier would result in avoiding exploration of that part of the search space.

That said, I'm also fine with doing this for now, and leaving this as a "we'll do this later if there's trouble" task.
If we could do that I'd be more than happy. My mental model says that

That's why I raised the resolvelib issue as a bug. The fix of putting constraints at the end relies on an undocumented implementation detail of resolvelib. I'd like it to be fixed so that order didn't matter, but I'd be OK with it just ending up documented (someone needs to work out how to explain the behaviour, though, and I don't know how to do that without locking down resolvelib's behaviour).

BTW, internally, the error is coming from resolvelib's very first pass, which effectively says "are the root requirements consistent?". It's definitely arguable that this implementation of constraints is broken, because a requirement "A" and a constraint "A" are only consistent if you do

(Note: this is all only an issue because getting project metadata is costly - we could avoid all of the issues here if we just treated constraints exactly like any other requirement. But that would mean we potentially had to download and prepare a whole load of projects that were only mentioned in constraints, and never in actual requirements.)
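The consistency point can be made concrete with a tiny predicate model (versions as strings, specifiers as predicates; purely illustrative): a requirement and a constraint on the same project are jointly consistent only when the intersection of their allowed version sets is non-empty.

```python
def consistent(available, *specifiers):
    # A requirement and a constraint on the same project are jointly
    # satisfiable iff at least one available version passes every
    # specifier - i.e. the intersection of allowed sets is non-empty.
    return any(all(spec(v) for spec in specifiers) for v in available)
```

For example, a requirement `A==1.0` together with a constraint `A<3.0` is consistent (version `1.0` passes both), while `A==1.0` with `A>2.0` is not.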
OK, there's still a potential problem even if we put the constraints at the end. Imagine:

- Constraint: `B<3.0`
- Requirement: `A==1.0`, where A depends on B

Then we feed requirements to the resolver in the order `A==1.0` (non-constraint root), `B<3.0` (root, constraint), `B` (dependency). We still get the constraint first, and that "poisons" the candidate set for B. I just added a test that confirms this issue 🙁

So I think we have to abandon this approach, and accept that we can't avoid treating constraints as "normal" requirements as far as the resolver is concerned. That means we hit the index to get the available candidates, even for constraints that we're never going to install. The best I think we can do is to minimise the impact by returning an empty list for the dependencies - so we only ever hit the index page to get name and version data, but never download an actual file.

OK, my brain is fried for the day. I've written this here so that I don't forget where I've got to, but I'm going to leave it now and work on a replacement implementation tomorrow.
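The impact-minimising idea - treat constraints as normal requirements, but never prepare their distributions - could be sketched roughly like this (hypothetical class, not the PR's code):

```python
class ConstraintOnlyCandidate:
    """Hypothetical candidate built from index metadata alone (name and
    version), so choosing it never downloads or builds a distribution."""

    def __init__(self, name, version):
        self.name = name
        self.version = version

    def get_dependencies(self):
        # A constraint contributes no dependencies of its own; returning
        # an empty list means the resolver never needs to fetch the real
        # metadata (which would require downloading the actual file).
        return []
```

The cost that remains is one index-page lookup per constrained project, to enumerate its available versions.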
```python
            self.factory.make_requirement_from_install_req(r)
            for r in root_reqs
        ]
        # Build the list of root requirements, making sure that
```
Does this imply that constraints are only used for the root requirements, and not secondary requirements?
No, constraints only come into pip as root requirements (i.e. from a constraints file). They never appear as dependencies - dependencies are always "full" requirements, which will be installed.
Constraints are applied along with normal requirements to determine what are the allowable versions for any project.
OK, I think that makes sense.
This aligns with my understanding of how this needs to work, and with how I initially implemented resolution in the iteration of the resolver that predates

I'm not totally sure about how this impacts resolution overall in terms of whether that will need to trigger a restart or not, but this is how I have modeled resolution since I've been working on it.
Hi @techalchemy, that fits with my understanding too. Essentially a constraint acts exactly like a requirement as far as the resolver is concerned. The only difference is when it comes to the install phase, when we don't want to install selected candidates that are only defined by constraints and not by "real" requirements.

Ideally the resolver itself would have a first-class "constraint" concept - which restricts selection but doesn't contribute on its own accord to the candidates in the final solution - but it doesn't, so most of the games I am playing are designed to capture that information so that we can strip these unneeded candidates out of the final solve.
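The "strip unneeded candidates out of the final solve" step could be sketched as a reachability filter (hypothetical helper and data shape, for illustration only):

```python
def installable(resolved, user_requested):
    """Keep only the projects the user actually asked for, plus
    everything reachable from them via dependencies; pins that exist
    only because of constraints drop out.

    ``resolved`` maps project name -> (candidate, dependency_names).
    """
    keep = set()
    stack = list(user_requested)
    while stack:
        name = stack.pop()
        if name in keep or name not in resolved:
            continue
        keep.add(name)
        # Walk this candidate's dependencies so transitive requirements
        # survive the filter.
        stack.extend(resolved[name][1])
    return {name: resolved[name][0] for name in keep}
```

With this shape, a project that appears in the solve only via a constraints-file entry is never reachable from a user-requested root, so it is silently dropped from the install set.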
I don't think it's possible to rescue this approach. So I'm going to abandon this PR in favour of a new one (#8170) which takes a different approach. Further discussion should probably be on that PR. I'll leave this open for now, and close it once a working approach goes in.
Closing this as an alternative approach has been implemented.