Implementation of constraints for the new resolver #8136

pfmoore · 2020-04-25T13:11:44Z

No description provided.

pfmoore · 2020-04-25T13:28:52Z

A thought. Now that a version of the new resolver is released, should we start actually including news items in PRs? Or would we still be cluttering the changelog with too many bits about the resolver work? (I'm inclined to keep marking PRs as trivial, but I worry that just reflects the fact that I don't like writing docs 😉 )

pradyunsg · 2020-04-25T20:56:50Z

Yea... The changelog is not the right place to do that. We'd want to communicate about the new resolver via a separate communication channel -- since there's likely specific things we want to call out, things we want users to do, setup alternative communication channels etc.

#8099 would serve as that communication channel, and we'd want to say "we have a beta of pip's next-gen resolver" in the 20.2 changelog. Or if we're happy enough to flip the defaults, just say "we're shipping the new resolver as a default". :)

(fun fact, since #8099 is locked, we don't see cross-references to that issue)

uranusjr · 2020-04-27T13:46:06Z

src/pip/_internal/resolution/resolvelib/candidates.py

+    def __init__(self):
+        # type: () -> None
+        self._name = "<Dummy>"
+        self._version = Version("0")


😮 Would it work to raise in the version property instead?

Er, maybe? Does it matter, though? I might be missing your point here...

It seems awkward to me this class is having a property that’s never called, but returns an arbitrary placeholder value. It would be more obvious to raise in version instead to make it clear that we don’t plan to call it, and it’s a clear error if we ever do.

I think an even better way to do this is to re-work the class hierarchy, and let type checking handle this. Maybe DummyCandidate should inherit from something else that does not have version instead, or maybe we should remove version from Candidate. I’m not sure yet, so it’s likely a better idea to merge this first, and work on refactoring later. (I do want to try some of this.)

Ah, I get what you mean. We need a dummy name, because the resolver uses name to identify things, but version is unused.

I'll switch to a propery that asserts (assuming that mypy doesn't whine at me).

uranusjr · 2020-04-27T20:56:51Z

src/pip/_internal/resolution/resolvelib/factory.py

+        if ireq.constraint:
+            return ConstraintRequirement(specifier, factory=self)
+
+        return specifier


I got confused for a while by this name (and the attribute ConstraintRequirement._specifier). The name yells packaging.specifiers.SpecifierSet but they are totally not 😖

Basically it felt like specifier_requirement is too long. But I can't think of a reasonable abbreviation (I'd prefer not to use req as we use that for an InstallRequirement all over the place).

I've changed it to specifier_requirement.

pradyunsg

LGTM. One minor comment.

pradyunsg · 2020-04-28T11:39:51Z

src/pip/_internal/resolution/resolvelib/requirements.py

@@ -117,3 +117,36 @@ def is_satisfied_by(self, candidate):
        # already implements the prerelease logic, and would have filtered out
        # prerelease candidates if the user does not expect them.
        return self.specifier.contains(candidate.version, prereleases=True)
+
+
+class ConstraintRequirement(Requirement):


We should add a docstring here, detailing how this works / "flows" through the resolver.

https://github.com/pypa/pip/projects/5#card-37162340

How about this?

"""A requirement that represents a constraint. This requirement acts just like a SpecifierRequirement, except that it doesn't generate any real matches of its own. This is to ensure that if the user doesn't actually request that package X gets installed, a constraint on X doesn't trigger installation. ConstraintRequirement objects generate a singleton "dummy" candidate, so that they satisfy the condition that any requirement has at least one match - this is a resolvelib condition. """

Sounds great, though we're going in a different direction now. :)

I don't think we are. But I suspect you've just not caught up on my various ramblings below yet, as you're doing release management at the moment ;-)

pfmoore · 2020-04-28T12:26:43Z

There's a further complication here still to be addressed.

If resolvelib picks the "constraint" requirement to try first, the dummy candidate will be what goes into the candidate list. When we then check the actual requirement, it complains because the dummy candidate doesn't have the right project name (and we can't just say let it through anyway, as it'll break as soon as we try to get its version).

I think the fix here is to modify the provider's get_preference to never choose a constraint when there's a real requirement to pick first. But I need to think about that, as I don't really understand get_preference yet.

This was exposed by merging this PR with #8061.

pfmoore · 2020-04-28T12:47:46Z

Alternative plan. Just have SpecifierRequirement not match dummy candidates, before we test the name. I may be overthinking things above 🤷

No, that doesn't work. We try the constraint, then try the real requirement, then bomb out because there are no candidates left. I'm not 100% sure why we don't backtrack and try the real requirement first. Backtracking is probably discarding it as the thing that caused the problem.

I feel like there's an unstated constraint on providers in resolvelib, that there has to be a "pool" of candidates, where a requirement must return from find_matches all of the candidates in that pool that satisfy the requirement (and nothing else).

At the very least, this constraint needs to be explicitly stated. And I'd very strongly prefer if it wasn't a constraint, but resolvelib did support more flexible requirements like I'm implementing here, so that my "alternative plan" worked.

I'll try to turn this case into a bug report for resolvelib.

[It's actually really hard to even describe the issue I'm hitting here, as resolvelib explicitly doesn't have a concept of "a set of candidates that we're considering"...]

@uranusjr @pradyunsg thoughts?

pfmoore · 2020-04-28T13:10:29Z

sarugaku/resolvelib#48

pfmoore · 2020-04-28T13:22:31Z

It's fixable by providing the root requirements so that non-constraints come first and constraints come last. But this feels really nasty, as it implies that we'll still have (at least some form of) order-dependency in how the resolver sees the requirements.

I'm going to go ahead and fix it by ordering the root reqs for now, but I'll add a comment explaining the issue.

pradyunsg · 2020-04-28T14:00:27Z

Hmm... Semantically, I feel like we should add constraints first. My mental model for them is that they are for constraining what we end up with, and eliminating them earlier would result in avoiding exploration of that part of the search space.

That said, I'm also fine with doing this for now, and leaving this as a "we'll do this later if there's trouble" task.

pfmoore · 2020-04-28T15:06:44Z

Semantically, I feel like we should add constraints first. My mental model for them is that they are for constraining what we end up with, and eliminating them earlier would result in avoiding exploration of that part of the search space.

If we could do that I'd be more than happy. My mental model says that resolve(reqs) shouldn't depend on the order of reqs at all, although certain orderings might help the resolver be more efficient (in theory that should be the job of get_preference, but no-one understands that 😉).

That's why I raised the resolvelib issue as a bug. The fix of putting constraints at the end relies on an undocumented implementation detail of resolvelib. I'd like it to be fixed so order didn't matter, but I'd be OK with it just ending up documented (someone needs to work out how to explain the behaviour, though, and I don't know how to do that without locking down resolvelib's behaviour).

BTW, internally, the error is coming from resolvelib's very first pass that effectively says "are the root requirements consistent?". It's definitely arguable that this implementation of constraints is broken, because a requirement "A" and a constraint "A" are only consistent if you do find_matches on the requirement, and merge in the constraint, but not if you do find_matches on the constraint and then merge the requirement. But the idea of finding matches and merging is an internal detail of resolvelib, and I'm not sure how to express this without discussing the internals.

(Note: This is all only an issue because getting project metadata is costly - we could avoid all of the issues here if we just treated constraints exactly like any other requirement. But that would mean we potentially had to download and prepare a whole load of projects that were only mentioned in constraints, and never in actual requirements.)

pfmoore · 2020-04-28T15:54:47Z

OK, there's still a potential problem even if we put the candidates at the end. Imagine:

Constraint B<3.0
Requirement A==1.0
A 1.0 depends on B

Then we feed requirements to the resolver in the order A==1.0 (non-constraint root), B<3.0 (root, constraint), B (dependency). We still get the constraint first, and that "poisons" the candidate set for B. I just added a test that confirms this issue 🙁

So I think we have to abandon this approach, and accept that we can't avoid treating constraints as "normal" requirements as far as the resolver is concerned. That means we hit the index to get the available candidates, even for constraints that we're never going to install. The best I think we can do is to minimise the impact by returning an empty list for the dependencies - so we only ever hit the index page to get name and version data, but never download an actual file.

OK, my brain is fried for the day. I've written this here so that I don't forget where I've got to, but I'm going to leave it now and work on a replacement implementation tomorrow.

dhellmann · 2020-04-28T17:24:19Z

src/pip/_internal/resolution/resolvelib/resolver.py

-            self.factory.make_requirement_from_install_req(r)
-            for r in root_reqs
-        ]
+        # Build the list of root requirements, making sure that


Does this imply that constraints are only used for the root requirements, and not secondary requirements?

No, constraints only come into pip as root requirements (i.e. from a constraints file). They never appear as dependencies - dependencies are always "full" requirements, which will be installed.

Constraints are applied along with normal requirements to determine what are the allowable versions for any project.

OK, I think that makes sense.

techalchemy · 2020-04-28T19:23:45Z

The best I think we can do is to minimise the impact by returning an empty list for the dependencies - so we only ever hit the index page to get name and version data, but never download an actual file.

This aligns with my understanding of how this needs to work and with how I initially implemented resolution in the iteration of the resolver that predates resolvelib. AS far as I understood, constraints need to be permitted at any time during resolution and should result in one of the following:

An existing pool of matching candidates is narrowed, due to the constraint having a more strict specifier;
An existing pool of candidates is left untouched due to a less narrow constraint;
A resolution error (i.e. the constraint is not satisfied by the available candidate pool) leads to backtracking in attempt to find an alternative satisfiable path; or
No candidate pool for the specified constraint exists, so one must be created (i.e. by retrieving it)

I'm not totally sure about how this impacts resolution overall in terms of whether that will need to trigger a restart or not, but this is how I have modeled resolution since I've been working on it

pfmoore · 2020-04-28T20:08:56Z

Hi @techalchemy that fits with my understanding too. Essentially a constraint acts exactly like a requirement as far as the resolver is concerned. The only difference is when it comes to the install phase, when we don't want to install selected candidates that are only defined by constraints and not by "real" requirements.

Ideally the resolver itself would have a first-class "constraint" concept - which restricts selection but doesn't contribute on its own accord to the candidates in the final solution - but it doesn't, so most of the games I am playing are designed to capture that information so that we can strip these unneeded candidates out of the final solve.

pfmoore · 2020-04-29T12:32:06Z

I don't think it's possible to rescue this approach. So I'm going to abandon this PR in favour of a new one (#8170) which takes a different approach. Further discussion should probably be on that PR.

I'll leave this open for now, and close it once a working approach goes in.

BrownTruck · 2020-05-03T15:00:04Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

pfmoore · 2020-05-15T13:06:33Z

Closing this as an alternative approach has been implemented.

pfmoore added the skip news Does not need a NEWS file entry (eg: trivial changes) label Apr 25, 2020

pfmoore added 5 commits April 27, 2020 13:45

Add a test for constraints

7b79124

Implement constraints

d2f37a5

Fix name of attribute in ConstraintRequirement

37f89f4

Switch off the check that fails if we have constraints

9b1802c

Tidying up

291e049

pfmoore force-pushed the nr_constraints branch from 36bcb9a to 291e049 Compare April 27, 2020 12:47

pfmoore marked this pull request as ready for review April 27, 2020 12:48

pfmoore requested review from uranusjr and pradyunsg April 27, 2020 12:48

uranusjr reviewed Apr 27, 2020

View reviewed changes

pfmoore added 2 commits April 28, 2020 11:16

Make version property of DummyCandidate assert

8f3efb7

Change specifier -> specifier_requirement to avoid confusion

2598450

pradyunsg reviewed Apr 28, 2020

View reviewed changes

pfmoore mentioned this pull request Apr 28, 2020

Resolving depends on the order of the root requirements in some cases sarugaku/resolvelib#48

Closed

pfmoore added 3 commits April 28, 2020 14:35

Avoid checking dummy candidate name in is_satisfied_by

fe8f212

Add docstring for ConstraintRequirement

3ada784

Put constraints at the end of the list of root requirements

0c6fdea

dhellmann reviewed Apr 28, 2020

View reviewed changes

BrownTruck added the needs rebase or merge PR has conflicts with current master label May 3, 2020

pfmoore closed this May 15, 2020

pfmoore deleted the nr_constraints branch May 15, 2020 13:06

github-actions bot locked as resolved and limited conversation to collaborators Oct 15, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementation of constraints for the new resolver #8136

Implementation of constraints for the new resolver #8136

pfmoore commented Apr 25, 2020

pfmoore commented Apr 25, 2020

pradyunsg commented Apr 25, 2020 •

edited

Loading

uranusjr Apr 27, 2020

pfmoore Apr 27, 2020

uranusjr Apr 27, 2020

pfmoore Apr 28, 2020

uranusjr Apr 27, 2020

pfmoore Apr 28, 2020

pradyunsg left a comment

pradyunsg Apr 28, 2020

pfmoore Apr 28, 2020

pradyunsg Apr 28, 2020

pfmoore Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pradyunsg commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

dhellmann Apr 28, 2020

pfmoore Apr 28, 2020

dhellmann Apr 28, 2020

techalchemy commented Apr 28, 2020 •

edited

Loading

pfmoore commented Apr 28, 2020

pfmoore commented Apr 29, 2020

BrownTruck commented May 3, 2020

pfmoore commented May 15, 2020

Implementation of constraints for the new resolver #8136

Implementation of constraints for the new resolver #8136

Conversation

pfmoore commented Apr 25, 2020

pfmoore commented Apr 25, 2020

pradyunsg commented Apr 25, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pradyunsg left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pradyunsg commented Apr 28, 2020

pfmoore commented Apr 28, 2020

pfmoore commented Apr 28, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

techalchemy commented Apr 28, 2020 • edited Loading

pfmoore commented Apr 28, 2020

pfmoore commented Apr 29, 2020

BrownTruck commented May 3, 2020

pfmoore commented May 15, 2020

pradyunsg commented Apr 25, 2020 •

edited

Loading

techalchemy commented Apr 28, 2020 •

edited

Loading