Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify and define install/upgrade behaviour for the new resolver #8115

Open
pfmoore opened this issue Apr 23, 2020 · 17 comments
Open

Clarify and define install/upgrade behaviour for the new resolver #8115

pfmoore opened this issue Apr 23, 2020 · 17 comments
Labels
C: dependency resolution About choosing which dependencies to install type: docs Documentation related UX: functionality research epic Temporary label to link tickets to #8516

Comments

@pfmoore
Copy link
Member

pfmoore commented Apr 23, 2020

What's the problem this feature will solve?
Currently, pip doesn't really define what it will install, beyond this section in the docs. And the documentation for --upgrade says basically the same. Nowhere is the fact that "we prefer to leave the installed version alone unless --upgrade is specified" documented, that I could find.

Describe the solution you'd like
Clearly document pip's behaviour when deciding what to install, and implement that behaviour for the new resolver. Note that I specifically do not propose trying to either make the old resolver foillow the new definition, or make the definition match every quirk of the old resolver (although obviously in broad terms, the behaviour will be the same). The old resolver's behaviour grew over time, and is heavily influenced by what was feasible given the way it was implemented (that's a polite way of saying "it's broken" 🙂). We should focus on getting well-specified and understandable behaviour in the new resolver.

Proposed Behaviour
I propose the following rules for what to install based on a given set of requirements:

  1. Normal behaviour for pip is to look for solutions to the resolution problem defined by the set of requirements given on the command line. That means that all packages named in the command line, plus packages named in their dependencies, recursively, will be considered.
  2. We explicitly note that installed packages that are not part of the set identified above are not considered. This is debatable - it is possible to add an extra set of requirements that any installed packages must remain unchanged, and their dependencies must also be considered. See below in "Alternative Solutions".
  3. When identifying solutions, pip will normally prioritise the installed version of a given package over any other. However, note that this does not mean that pip will certainly retain that version - other constraints may influence this. Also, "prioritise" is a slightly subtle term here - it is about influencing the order the resolver searches for solutions, not about "scoring" possible solutions and choosing the "best". I'd prefer to just use the somewhat-vague term "prioritise" here to avoid getting into technical details that users won't care about.
  4. The --upgrade flag changes pip's behaviour to not prioritise installed versions for packages named on the command line. By default installed versions are still prioritised for packages which are only identified as dependencies. That's the "only-if-needed" upgrade strategy. The "eager" upgrade strategy extends this to all packages in the dependency tree (the installed version is never prioritised).
  5. Apart from the special case handling of installed versions, newer versions will be preferred over older versions.

Alternative Solutions
One key alternative here is that we might want to also include dependency information for packages that are installed but are not part of the dependency graph otherwise. This is the question noted in (2) above. The benefit of this is that pip would avoid installing a set of packages that conflicts with something that is already installed. The downside is that the behaviour is more complex (both to implement and to describe) and may have odd corner cases.

I'm mildly inclined to prefer not adding that complexity, because I suspect the benefits will be marginal, but I can easily be persuaded otherwise. See #7744 for extra context.

Additional context
There's further information that we don't even have access to when looking at the bare fact that "A 1.0 is installed" alongside the fact that A 2.0 is now available. Did the user do pip install A and A 2.0 got released since? Did the user install some package that depended on A<2.0, which they have now uninstalled? Did the user explicitly do pip install A==1.0? This sort of links into the suggestion made elsewhere about using/extending REQUESTED, but I think it goes beyond that. Ultimately, it's about managing an environment definition and that's not pip's role - it's what pipenv and poetry do, and we don't want to get into that area.

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Apr 23, 2020
@pfmoore
Copy link
Member Author

pfmoore commented Apr 23, 2020

The docs should also include an explicit note that pip will happily reinstall packages, upgrading or downgrading, without needing any additional command line options, if necessary to satisfy constraints.

That's implied by the given rules, but IMO important enough to call out explicitly, because the existence of the --upgrade and --force-reinstall options could make people think that pip won't reinstall in normal usage.

@pfmoore pfmoore self-assigned this Apr 23, 2020
@pfmoore pfmoore added C: dependency resolution About choosing which dependencies to install C: new resolver type: docs Documentation related labels Apr 23, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Apr 23, 2020
@uranusjr
Copy link
Member

The strategy sounds good to me. I think it is exactly what I have in mind (if I understand it correctly), but you put them to words better than I ever can.

We explicitly note that installed packages that are not part of the set identified above are not considered.

So what does pip do after the resolution if this happens? Does it go on to install things anyway (the current behaviour), or abort immediately with an error? I think we should commit to one of these for now, and work on adding additional choices with flags when we finish the upgrade functionality as a whole.

Also, "prioritise" is a slightly subtle term here - it is about influencing the order the resolver searches for solutions, not about "scoring" possible solutions and choosing the "best". I'd prefer to just use the somewhat-vague term "prioritise" here to avoid getting into technical details that users won't care about.

I think it is a good idea to use a vague term here. But to be technical here (because that’s what I do best), does this mean the resolver would always be working on the same set of candidates regardless of th mode, only order them differently? “Scoring” here does not ring a bell to me here, and I suspect this is due to English not being my first language.

@pradyunsg
Copy link
Member

pradyunsg commented Apr 23, 2020

Sounds great to me!

Nowhere is the fact that "we prefer to leave the installed version alone unless --upgrade is specified" documented, that I could find.

OOoo! This is true and I think I know why: We never added docs related to #3972. :)

The old resolver's behaviour grew over time, and is heavily influenced by what was feasible given the way it was implemented (that's a polite way of saying "it's broken" 🙂). We should focus on getting well-specified and understandable behaviour in the new resolver.

Well said and I am in strong agreement on the "broken" point.

2. This is debatable

I wanna debate about this. :)

But we've gone round-and-round in circles on this enough that I think we should first implement and document what you've proposed and then, I'll bring this up. Maybe it'll also seem like less work than it does currently by then, which I'd say is a good thing. :P

I still think start-strict-and-relax is a better approach for us than start-loose-and-tighten in this entire area TBH.

does this mean the resolver would always be working on the same set of candidates regardless of the mode, only order them differently?

IIUC, yes.

@pfmoore
Copy link
Member Author

pfmoore commented Apr 23, 2020

Does it go on to install things anyway (the current behaviour), or abort immediately with an error?

My feeling is install anyway is what will happen unless we deliberately choose otherwise. But aborting with an error would need us to check all of the installed packages that we've just said we won't be including, so it gains us nothing over just deciding to take those into account "properly".

So I'd go with "install anyway" (and that might mean stuff gets broken). If we don't like that, we should go for the full "add X==installed_version" dependencies for all of the rest of the installed stuff" approach (which as I say might not be worth the cost, I'm not sure).

does this mean the resolver would always be working on the same set of candidates regardless of th mode, only order them differently

Precisely.

Sorry, I made up the term "scoring". What I meant by it was that I don't want people to assume that we generate all possible solutions, calculate a value for "how good they are" (the "score") and then pick the globally "best" answer. That's not how things work - we "prefer" the latest version on a very "local" basis while we're deciding what to process next. It's very unlikely to make a difference in practice though (hence my point about being a bit vague).

@pfmoore
Copy link
Member Author

pfmoore commented Apr 23, 2020

I wanna debate about this. :)

But we've gone round-and-round in circles on this enough that I think we should first implement and document what you've proposed and then, I'll bring this up. Maybe it'll also seem like less work than it does currently by then, which I'd say is a good thing. :P

+1 on that. My gut feel is that it will be reasonably easy to add this behaviour, and at a very high level it's better ("pip doesn't ever introduce broken dependencies"). But I worry it might have weird implications and/or cause a much higher rate of failed installations1 in ways that will end up being more confusing for users.

Let's discuss later, and/or make it a UX problem and leave it to @ei8fdb 🙂

1 Although any failures would presumably have been broken in the alternative approach, so maybe I'm being over-paranoid here...

@pfmoore
Copy link
Member Author

pfmoore commented Apr 23, 2020

Sounds like we have a plan. I'll work on a doc patch first, and then follow up with implementation.

@uranusjr
Copy link
Member

uranusjr commented Apr 23, 2020

So I'd go with "install anyway" (and that might mean stuff gets broken).

And fail with a non-zero code, I persume. I’m fine with that.


Edit: We probably want to open a separate issue for this? I believe the implementation of this is not related to the resolver.

@pfmoore
Copy link
Member Author

pfmoore commented Apr 23, 2020

And fail with a non-zero code, I persume. I’m fine with that.

Huh? If we install anyway, we're not failing. Let's defer discussing this until we have an implementation to relate it back to. I'm getting much more inclined to go for the whole "include all installed stuff" option.

@pfmoore
Copy link
Member Author

pfmoore commented Apr 23, 2020

I'm really struggling to clearly describe what --upgrade does (in the help text for the option). It really doesn't mean "Upgrade all specified packages to the newest available version" (and actually never has...). It more means "permit upgrades when deciding what to upgrade" but even that's not accurate, because "pip install foo>1.0" will upgrade foo 1.0 without needing any "permission"...

I'm going to ignore the issue for now and just not change the docs for --upgrade. But we need to go back to it - there's a fundamental problem here in that --upgrade isn't actually what people think of as an "upgrade"...

@pradyunsg
Copy link
Member

pradyunsg commented Apr 23, 2020

Do we want to throw this flag away and add an upgrade command instead (ALA #59)?

@uranusjr
Copy link
Member

I do think this is a good idea, but we’d want to wait until the resolver is at least in beta state.

@brainwane
Copy link
Member

@pfmoore Now that your contract is ending, should we un-assign this from you?

@brainwane brainwane added this to Needs triage in New Resolver Implementation Jun 30, 2020
@pfmoore pfmoore removed their assignment Jun 30, 2020
@pfmoore
Copy link
Member Author

pfmoore commented Jun 30, 2020

Done. I may still take a look at it if no-one else does, but probably not in a timeframe that will be much use if this is on the critical path for the release :-)

@brainwane
Copy link
Member

Clarifying and defining this will help us establish consistency, so let's do this before the 20.3 release. Also related to some conversations on how to clarify pip behavior in dependency interaction/conflict.

@brainwane brainwane added the UX: functionality research epic Temporary label to link tickets to #8516 label Sep 2, 2020
@brainwane brainwane moved this from Needs triage to Post-release work in New Resolver Implementation Sep 2, 2020
@pfmoore
Copy link
Member Author

pfmoore commented Sep 2, 2020

@brainwane What action do you expect here? As far as I'm aware, my "Proposed behaviour" is still our best description of what we'd all like to happen. I don't know whether it matches what pip actually does, though, and I'm not clear whether you are suggesting should happen before 20.3:

  1. We change pip to match the proposed behaviour. That may or may not be a lot of work, depending on how close the current behaviour is.
  2. We document what pip currently does and how it differs from the proposed behaviour, which is what I failed to manage in the PR I referenced when I said "I'm really struggling to clearly describe what --upgrade does".
  3. We should check that pip matches the proposed behaviour and think again if we find that it doesn't.

IMO, our ultimate goal remains (1). Documenting the current behaviour prior to us achieving (1) is something I'm ambivalent about, because I doubt it would be helpful to end users (all it did was confuse me, and I'm supposed to be an expert!) Maybe having it in a "Developers only" section of the docs would be OK, though?

@brainwane brainwane moved this from Post-release work to Needs triage in New Resolver Implementation Oct 7, 2020
@bersbersbers
Copy link

bersbersbers commented Oct 8, 2020

Since I haven't found any issue including the phrase

This may or may not be true of the new resolver.

I think this is a fitting place to remind that this phrase is still in the current documentation, here:
https://github.com/pypa/pip/blob/0f513ddbea078f2a03ef1dee9f1cae62254c92ba/docs/html/development/architecture/upgrade-options.rst

(It seems to me the statement is still true for the new resolver.)

@brainwane brainwane moved this from Needs triage to Post-release work in New Resolver Implementation Oct 26, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 26, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 27, 2020
@brainwane
Copy link
Member

In a meeting earlier this month we decided to try to further document the quirks and edge cases people are likely to run into with the new resolver in our migration guide. #9056 is that PR. Hope to get it finished and merged in the next day or so.

brainwane added a commit to brainwane/pip that referenced this issue Oct 27, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 27, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 27, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 27, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 27, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 28, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 28, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 29, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 29, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 29, 2020
brainwane added a commit to brainwane/pip that referenced this issue Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: dependency resolution About choosing which dependencies to install type: docs Documentation related UX: functionality research epic Temporary label to link tickets to #8516
Projects
No open projects
New Resolver Implementation
  
Post-release work
Development

No branches or pull requests

5 participants