Add a resolver option to use the specified minimum version for a dependency #8085

Open
dhellmann opened this issue Apr 19, 2020 · 76 comments
Labels
C: dependency resolution About choosing which dependencies to install resolution: deferred till PR Further discussion will happen when a PR is made type: feature request Request for a new feature

Comments

@dhellmann

What's the problem this feature will solve?

I would like to be able to install my project using its "lower bounds" requirements and run the test suite to ensure that (a) I have those lower bounds specified properly and (b) the tests pass.

Describe the solution you'd like

A new command line option --prefer-minimum-versions would change the resolver behavior to choose the earliest version supported by a requirement specification. For example if versions 1.0 and 2.0 of package foo are available and the specification is foo>=1.0 then when the flag is used version 1.0 would be installed and when the flag is not used version 2.0 would be installed.
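A minimal sketch of the selection rule described above, not pip's implementation. The helper `pick_version` and its `prefer_minimum` parameter are hypothetical names, and the sketch only handles a single `>=` bound with purely numeric dotted versions.

```python
def parse(version):
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_version(available, minimum, prefer_minimum=False):
    """Pick a version satisfying '>= minimum' from the available ones.

    Hypothetical helper: with prefer_minimum the earliest match wins,
    otherwise the latest match wins.
    """
    matches = [v for v in available if parse(v) >= parse(minimum)]
    if not matches:
        raise LookupError(f"no version satisfies >={minimum}")
    chooser = min if prefer_minimum else max
    return chooser(matches, key=parse)


available = ["1.0", "2.0"]
print(pick_version(available, "1.0", prefer_minimum=True))   # 1.0
print(pick_version(available, "1.0", prefer_minimum=False))  # 2.0
```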

Large applications such as OpenStack have a lot of dependencies and verifying the accuracy of the complete set is complex. Providing a way to install the earliest set of packages expected to work would make this easier.

Alternative Solutions

The existing constraints file option does help, but building a valid constraints file is complicated.

Additional context

There is a recent discussion about this need on the openstack-discuss mailing list.

I will work on the implementation.

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Apr 19, 2020
@uranusjr
Member

uranusjr commented Apr 20, 2020

I like this idea. Preferring a minimum version is a valid tactic in many scenarios, and is even the default in some package managers. I don’t think it’s a good idea to default to the lowest possible version, but it’s reasonable to have it as a configurable option.

The tricky part is how to expose the functionality to the user though. Having a separate --prefer-minimum-versions feels wrong to me, since the flag does not make sense in some cases. Maybe this should be included as a part of the --upgrade-strategy redesign process. For example, introduce a new --strategy flag as the replacement, and have this as one of the possible values.

@dhellmann
Author

I don't see the upgrade_strategy argument to the new resolver being used at all. Is that part of other work that someone else is doing?

It does seem to make sense to fold the behavior change into the strategy, as long as it isn't something that we would want to combine with other strategies. It looks like other strategies include only-if-needed, eager, and to-satisfy-only. I'm not sure what the distinction is between only-if-needed and to-satisfy-only. During an upgrade, I could see someone wanting to say the equivalent of "update if you have to, but move to the oldest possible version you can". How would someone express that if "prefer-minimum" is a separate strategy from those other options?

@pfmoore
Member

pfmoore commented Apr 22, 2020

I don't see the upgrade_strategy argument to the new resolver being used at all.

It isn't yet. But we expect to add that soon (subject to some questions over how well the existing strategies fit with how the new resolver works).

In terms of the new resolver, "eager" means "don't prioritise already-installed versions over other versions". And "only-if-needed" would prioritise already-installed versions. The "to-satisfy-only" option isn't really relevant as it's more of an "internal" state (its behaviour is a bit weird, so I won't confuse things by explaining here).

Minimum version would be easy enough to specify by preferring older versions over newer ones.
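As a rough illustration of how these preferences could combine into a candidate order (a sketch under assumptions, not the new resolver's code): `order_candidates` and its parameters are hypothetical, and versions are plain tuples.

```python
def order_candidates(versions, installed=None, strategy="eager",
                     prefer_minimum=False):
    """Return versions in the order a resolver would try them.

    versions: list of version tuples, e.g. (1, 0)
    installed: the currently installed version tuple, or None
    strategy: "eager" ignores the installed version;
              "only-if-needed" tries the installed version first
    prefer_minimum: try older versions before newer ones
    """
    ordered = sorted(versions, reverse=not prefer_minimum)
    if strategy == "only-if-needed" and installed in ordered:
        # Move the installed version to the front so it is kept if it fits.
        ordered.remove(installed)
        ordered.insert(0, installed)
    return ordered
```

With `strategy="only-if-needed"` the installed version is tried first whether or not `prefer_minimum` is set, which is one way to read the "don't upgrade" observation in this comment.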

The big question, as I see it, is how to let the user specify their intent correctly. Suppose there's some dependency in the tree that doesn't specify a minimum version. Would you want to install version 0.0.1 (or whatever ancient version) in that case? And surely "upgrade to minimum version possible" is just "don't upgrade" - the currently installed version is pretty much by definition the minimum version allowed...

So I think that technically, this is relatively straightforward to implement, but we'd need help in designing a user interface, in terms of command line options, to allow the user to make meaningful requests, while not turning things into a complex mess that no-one can understand :-)

@dhellmann
Author

I don't see the upgrade_strategy argument to the new resolver being used at all.

It isn't yet. But we expect to add that soon (subject to some questions over how well the existing strategies fit with how the new resolver works).

OK. I was asking because I wasn't sure how to fit this in. I can add a --strategy option to replace the --upgrade-strategy option as @uranusjr suggested, and ensure the strategy is passed to the resolver as upgrade_strategy. After that, I'm less sure what to do. :-)

Is the plan to define some classes to represent the behaviors of the strategy so that the code can call methods instead of checking string literals in different places? Or do you think the strategies would be completely encompassed by the resolver itself, so the string literals would be fine? Either way, I expect we would need some changes in resolvelib, too. How much of the definition of the strategies should be owned by the library instead of pip itself?

If there's anything written down that I can look at to come up to speed, feel free to respond just with links. I have a little time this week so I'd like to help, if I can.

In terms of the new resolver, "eager" means "don't prioritise already-installed versions over other versions". And "only-if-needed" would prioritise already-installed versions. The "to-satisfy-only" option isn't really relevant as it's more of an "internal" state (its behaviour is a bit weird, so I won't confuse things by explaining here).

Minimum version would be easy enough to specify by preferring older versions over newer ones.

The big question, as I see it, is how to let the user specify their intent correctly. Suppose there's some dependency in the tree that doesn't specify a minimum version. Would you want to install version 0.0.1 (or whatever ancient version) in that case? And surely "upgrade to minimum version possible" is just "don't upgrade" - the currently installed version is pretty much by definition the minimum version allowed...

I would say, yes, install 0.0.1. I consider not specifying a minimum version a bug in the packaging specs, and if 0.0.1 doesn't work then the tests run with this new flag set would expose the bug. I realize other folks may not have quite that strict an interpretation, though. :-) I guess saying that the strategy would install the "earliest version that can be found" in that case would at least be clear and easy to understand. Maybe that means a better name for the strategy is something like "earliest-compatible"?

So I think that technically, this is relatively straightforward to implement, but we'd need help in designing a user interface, in terms of command line options, to allow the user to make meaningful requests, while not turning things into a complex mess that no-one can understand :-)

I agree, the implementation in #8086 was quite straightforward, and the harder part will be the UI and internal API changes.

@dhellmann
Author

I've joined #pypa-dev on freenode as dhellmann, in case anyone wants to chat about this with less latency. I can summarize anything said there here in the ticket for easier reference later.

@dhellmann
Author

dhellmann commented Apr 22, 2020

Is the plan to define some classes to represent the behaviors of the strategy so that the code can call methods instead of checking string literals in different places? Or do you think the strategies would be completely encompassed by the resolver itself, so the string literals would be fine? Either way, I expect we would need some changes in resolvelib, too. How much of the definition of the strategies should be owned by the library instead of pip itself?

As an example of what I mean here, I could see a Strategy class hierarchy defining a method get_preferred_candidate() to implement the PipProvider method get_preference() so the provider doesn't have to be aware of all of the strategies. The Strategy would also need to define a method like sort_candidates() to be used by resolvelib.Resolution._attempt_to_pin_criterion().

I'm sure other strategies would cause the API for Strategy to need to expand in other ways.
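A rough sketch of the class hierarchy idea, not code from pip or resolvelib. Only the `sort_candidates()` method named above is shown; the class names and the `STRATEGIES` table are hypothetical, and versions are plain tuples.

```python
from abc import ABC, abstractmethod


class Strategy(ABC):
    """Hypothetical base class: decides the order candidates are tried in."""

    @abstractmethod
    def sort_candidates(self, versions):
        """Return the versions in the order they should be attempted."""


class LatestFirst(Strategy):
    def sort_candidates(self, versions):
        return sorted(versions, reverse=True)


class EarliestFirst(Strategy):
    def sort_candidates(self, versions):
        return sorted(versions)


# Map an option string to a strategy object once, so calling code
# does not compare string literals in several places.
STRATEGIES = {
    "latest": LatestFirst(),
    "earliest-compatible": EarliestFirst(),
}
```

Calling code would look up the strategy once, for example `STRATEGIES["earliest-compatible"].sort_candidates(versions)`, and stay unaware of which concrete class it got.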

@pfmoore
Member

pfmoore commented Apr 22, 2020

I've joined #pypa-dev on freenode as dhellmann, in case anyone wants to chat about this with less latency.

We're discussing resolver things on Zulip rather than IRC.

As an example of what I mean here, I could see a Strategy class hierarchy defining a method get_preferred_candidate() to implement the PipProvider method get_preference() so the provider doesn't have to be aware of all of the strategies.

The get_preference method isn't related to this. It's a "which thing should we check next" tuning knob to control the internal progress of the resolver. The method that matters here is find_matches (and specifically the order of the candidates it returns).
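To make the distinction concrete, here is a toy loop (a sketch only; it is not resolvelib, the method signatures are heavily simplified, and no conflicts or backtracking are modelled). `ToyProvider` and `toy_resolve` are hypothetical names.

```python
class ToyProvider:
    """Hypothetical stand-in for a provider; not resolvelib's real interface."""

    def __init__(self, index, prefer_minimum=False):
        self.index = index  # package name -> list of version tuples
        self.prefer_minimum = prefer_minimum

    def get_preference(self, name):
        # Tuning knob: packages with fewer candidates are examined first.
        return len(self.index[name])

    def find_matches(self, name):
        # Candidate order decides which version is tried (and pinned) first.
        return sorted(self.index[name], reverse=not self.prefer_minimum)


def toy_resolve(provider):
    """Pin the first candidate of each package, in get_preference order."""
    pinned = {}
    for name in sorted(provider.index, key=provider.get_preference):
        pinned[name] = provider.find_matches(name)[0]
    return pinned
```

In this toy, changing `get_preference` only changes the order in which packages are visited, while changing the order returned by `find_matches` changes which versions end up pinned.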

I'm planning on looking at this myself tomorrow, as I've had upgrade strategies on my task list for a week or so now :-) At the moment, I'm a fairly strong -1 on strategy classes - I feel that they'd likely just be over-engineering at the moment. IMO we've already got probably more classes in the new resolver code than we really need...

But I've shut down my "working on pip" PC for the day now, so I'll refrain from going into any further detail just from memory.

@dhellmann
Author

I've joined #pypa-dev on freenode as dhellmann, in case anyone wants to chat about this with less latency.

We're discussing resolver things on Zulip rather than IRC.

Ah. I don't know what that is. The docs pointed me to IRC. How do I get to the right place in Zulip?

As an example of what I mean here, I could see a Strategy class hierarchy defining a method get_preferred_candidate() to implement the PipProvider method get_preference() so the provider doesn't have to be aware of all of the strategies.

The get_preference method isn't related to this. It's a "which thing should we check next" tuning knob to control the internal progress of the resolver. The method that matters here is find_matches (and specifically the order of the candidates it returns).

OK. That wasn't what I found when looking at the implementation I've already done, but I'll take a look at find_matches().

I'm planning on looking at this myself tomorrow, as I've had upgrade strategies on my task list for a week or so now :-) At the moment, I'm a fairly strong -1 on strategy classes - I feel that they'd likely just be over-engineering at the moment. IMO we've already got probably more classes in the new resolver code than we really need...

OK, I can understand that. There do seem to be a lot of different parts working together and dhellmann@de6e70d didn't come out particularly clean. :-)

@dhellmann
Author

I've joined #pypa-dev on freenode as dhellmann, in case anyone wants to chat about this with less latency.

We're discussing resolver things on Zulip rather than IRC.

Ah. I don't know what that is. The docs pointed me to IRC. How do I get to the right place in Zulip?

Never mind, found it.

@pradyunsg pradyunsg added the type: feature request Request for a new feature label Apr 22, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Apr 22, 2020
@pradyunsg pradyunsg added the state: needs discussion This needs some more discussion label Apr 22, 2020
@pradyunsg
Member

@dhellmann Glad to hear that you're willing to help out with the implementation. ^>^

To set expectations early, this is a feature request for adding new functionality to pip. As per pip's release cadence, the next release with new features would be in July (pip 20.2) so arguably, IMO there's no hurry toward implementing this.

Further, as you've discovered, implementing this feature will be significantly easier to do with the new resolver's architecture than with the old resolver. However, our priority currently is to get the new resolver to feature parity with the existing resolver and roll it out to become the default this year. IMO implementing new features related to dependency resolution is going to be significantly lower priority for us, in the short term, while we work on replacing a core component of pip.

@pradyunsg pradyunsg added the C: dependency resolution About choosing which dependencies to install label Apr 22, 2020
@dhellmann
Author

I understand the priorities, and am not in a particular hurry for a release. That said, I have more time to work on pip in the next few days than I’m likely to have later. So let’s see where we get with things as you have time, too.

Are any of the higher priority tasks things I might be able to help with?

@pradyunsg
Member

This is pointing in a different direction from dependency resolution, but #4625 would be great to solve and would be a significant usability improvement for users (especially ones that are on Linux and using the system Python with sudo).

@pohlt

pohlt commented Oct 5, 2021

I would be very interested in this feature. Any updates on its implementation status?

@pfmoore
Member

pfmoore commented Oct 5, 2021

Basically, no-one is currently working on it, and the discussion in this thread is all there is. The biggest questions remain how to design a user-friendly interface for this, and make the behaviour intuitive for people (for example, I'm still not at all sure that if no lower bound is specified somewhere in the dependency tree, getting a version from 15 years ago and progressively working forward through the versions until you reach something that works, is a good user experience).

But nothing will happen unless someone is willing to do the design and implementation work, so any such discussions are pointless at the moment.

@pohlt

pohlt commented Oct 5, 2021

Thanks for the update. There was an initial PR #8086 from @dhellmann which didn't get a lot of attention or feedback from the package owners. Busy times, I know. 😉

So without at least some commitment from the package owners, nobody will go down the same road and starve again, I guess. Well, at least I wouldn't.

If no lower bound is given, just try the 15 years old version and watch how everything goes up in flames. I don't think that's a likely scenario. Anyone who knows about the "minimum version" option and activates it, will be clever enough to know that a missing lower bound is bound to break. Or you could simply stop and tell the user to add a minimum version.

@davegaeddert

I made a couple more changes to my branch and opened a PR here: #11336

Regardless of the details of a possible third strategy, it sounds like there's potential for more so I left it as a string option. Seems like those could be discussed down the road if someone wants to flesh them out, and won't make for an annoying rename / backwards compatibility issue if they do. I did rename it to --version-selection though because I do agree that some of these other "strategies" are veering (orthogonal-ly) outside of "version selection" and I think some more precise terminology would be helpful (whether that's the term you want to go with, I don't know).

I also removed the interactive prompt when there's a lower bound missing — it will throw an exception now, so you'll have to do something to address it. I can see how constraints could apply to both the application and package use cases, and it's simpler to leave the prompt out for now anyway.
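The "throw an exception when a lower bound is missing" behaviour could look roughly like the sketch below. This is an illustration under assumptions, not the code in the PR: `MissingLowerBound` and `require_lower_bound` are hypothetical names, and the check is a plain text match on simple requirement strings (extras and direct URLs are not handled).

```python
import re


class MissingLowerBound(ValueError):
    """Hypothetical error for a requirement with no lower bound."""


# Operators that pin a version or bound it from below.
_LOWER_OPS = re.compile(r">=|==|~=|>")


def require_lower_bound(requirement):
    """Raise if a requirement string such as 'foo<3' has no lower bound."""
    # Ignore environment markers such as '; python_version >= "3.8"'.
    specifier = requirement.split(";", 1)[0]
    if not _LOWER_OPS.search(specifier):
        raise MissingLowerBound(
            f"{requirement!r} has no lower bound; add one such as '>=X.Y'"
        )
    return requirement
```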

@davegaeddert

davegaeddert commented Aug 2, 2022

FWIW, I did try using this in CI roughly how I'd plan to use it for the packaging use case. Found some dependency ranges that I needed to modify... The workflow for figuring out the minimum versions was a little rough (for a first time anyway) but so long as pip could do the version selection, I'd guess that you could probably make some tooling around/outside of pip to help identify minimum requirements (for some I read changelogs, for others I did a git bisect-style approach of bumping versions + running tests).

Here's my GitHub workflow example:
https://github.com/dropseed/combine/pull/89/files#diff-faff1af3d8ff408964a57b2e475f69a6b7c7b71c9978cccc8f471798caac2c88
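The bisect-style approach mentioned above can be sketched as a binary search. This is only an outline with a hypothetical helper name: `works` stands in for "install that version and run the test suite", and the search assumes that once a version works, every newer version works too.

```python
def find_minimum_working(versions, works):
    """Binary-search the oldest version for which works(version) is True.

    versions must be non-empty and sorted oldest to newest. Assumes that
    once a version works, every newer version works as well.
    """
    low, high = 0, len(versions) - 1
    if not works(versions[high]):
        raise ValueError("newest version does not pass")
    while low < high:
        mid = (low + high) // 2
        if works(versions[mid]):
            high = mid       # mid works, so the answer is mid or older
        else:
            low = mid + 1    # mid fails, so the answer is newer than mid
    return versions[low]
```

If the assumption does not hold (for example, a release in the middle of the range was broken), the result is a working version but not necessarily the oldest one.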

@thomasf

thomasf commented Aug 2, 2022

I moved a few smaller (non public) projects over from pipenv or pure requirements.txt on experimental branches. It's going pretty well so far.

It should probably be allowed to use --version-selection in a requirements.txt file.

@Asday

Asday commented Sep 28, 2022

It should probably be allowed to use --version-selection in a requirements.txt file.

I'd love to back up my 👍 on this by pointing out the insane things PHP developers have had to do to deal with the fact their applications run differently based on their hosting server's PHP interpreter configuration. Obviously slightly different here as the normality is that whoever writes the requirements file is likely the one running pip, and this is declarative configuration rather than code, but the analogue refused to leave my brain.

@jayaddison

Realistically, what version selection algorithms besides minimum or maximum is even viable? Sounds to me that a --strategy=... might be an over preparedness for something that likely won't be needed?

@thomasf I've arrived on this issue thread after encountering a use case where a different dependency installation version selection mechanism could have been useful.

Here's an attempt to explain:

Scenario

After checking out an old Python project's release tag, I attempted to pip install dependencies into a local venv and then to run its unit tests. However: some of the dependencies have undergone breaking changes, and so the resulting installation set didn't produce a working result.

I don't want to install minimum versions of everything, because I expect that some/many of the components involved have had useful improvements (performance, security, bugfix, ...) since the requirements/constraints were defined.

Desire

What I'd like to handle that use case would be a combination of: install the maximum compatible version within an upper-bounding timestamp (which I might select as, for example, the time the release tag was created - or a few months later). That should install a set of dependencies that worked (and was considered freshest) at that moment-in-time.

@pfmoore
Member

pfmoore commented Sep 29, 2022

What I'd like to handle that use case would be a combination of: install the maximum compatible version within an upper-bounding timestamp

Time-based version bounds seem like a relatively common and understandable requirement, but it's not really one that pip (or indeed the Python package versioning scheme) is designed to handle particularly well as things stand.

One thing that could work for this sort of scenario would be a tool that took a date, and read the PyPI metadata for a series of packages and wrote out a constraints file based on the upload times of the files, to constrain the package to only versions released before the given date. That should be a relatively easy tool to write, and would give the effect of upper-bounded timestamp constraints without needing any changes to existing infrastructure.
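A minimal sketch of the core of such a tool, with the metadata fetching left out. The function name and the input shape are assumptions made for illustration: `releases` is a simplified mapping that real index metadata would first have to be reduced to, and upload times must be ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime, timezone


def constraints_before(releases, cutoff):
    """Build constraints-file lines excluding versions uploaded after cutoff.

    releases: {package: {version: ISO 8601 upload time with UTC offset}},
    a simplified, made-up shape rather than a real index response.
    cutoff: a timezone-aware datetime.
    """
    lines = []
    for package in sorted(releases):
        too_new = sorted(
            version
            for version, uploaded in releases[package].items()
            if datetime.fromisoformat(uploaded) > cutoff
        )
        if too_new:
            # e.g. "foo!=2.0,!=2.1" rules out each release after the cutoff.
            lines.append(package + ",".join(f"!={v}" for v in too_new))
    return lines


# Illustrative data only; these are not real upload times.
sample = {
    "foo": {
        "1.0": "2019-01-01T00:00:00+00:00",
        "2.0": "2021-01-01T00:00:00+00:00",
    },
}
cutoff = datetime(2020, 4, 19, tzinfo=timezone.utc)
print("\n".join(constraints_before(sample, cutoff)))
```

The resulting lines could be written to a file and passed to pip with its existing `-c` / `--constraint` option.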

I know that people are typically resistant to building solutions from co-operating tools like this, preferring a "one tool does everything" approach, but if anyone is interested in creating a short-term solution (and possibly a generally useful tool) then maybe it would be worth looking into this possibility.

@pradyunsg
Member

https://github.com/astrofrog/pypi-timemachine exists!

@henryiii
Contributor

It's a bit more work than the proposed tool (it creates a PyPI proxy), but https://pypi.org/project/pypi-timemachine/ does this.

@pradyunsg
Member

pradyunsg commented Sep 29, 2022

It's coupled to PyPI, but someone could take that forward and extend it to support Artifactory and custom variants similar to how https://github.com/uranusjr/simpleindex allows you to have custom routing strategies.

@notatallshaw
Member

Small update for anyone who needs this feature: uv has a pip-like install interface and has the option --resolution with possible values "highest", "lowest", and "lowest-direct".

@pohlt

pohlt commented Feb 17, 2024

PDM also offers this functionality.

robin-nitrokey added a commit to Nitrokey/nitrokey-sdk-py that referenced this issue Jul 22, 2024
This patch adds the initial project structure, including licenses,
linters and their configuration, and a basic test case that ensures that
the module can be loaded.

Most tooling is taken over from pynitrokey.  One notable exception is
that flit is replaced by poetry.  This only affects developers, not end
users.  We recently made the same change in nitrokey-app2 [0], so I
think it makes sense to use the same tooling here.

[0] Nitrokey/nitrokey-app2#172

In the CI builds, we check that both the lockfile and the latest
dependency versions work.  It would be good to also run it for the minimum
versions specified in pyproject.toml, but this is currently not
supported by poetry or pip:

python-poetry/poetry#3527
pypa/pip#8085