
Add per-requirement --no-deps option support in requirements.txt #10837

Open · wants to merge 17 commits into base: main
Conversation

@q0w (Contributor) commented Jan 28, 2022

Closes #9948
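For context, the feature would let a requirements file flag individual entries, roughly like this (illustrative sketch only; the exact syntax is defined by the PR itself, modeled here on pip's existing per-requirement options such as --hash, and `somepkg` is an invented name):

```text
# requirements.txt
requests==2.27.1
somepkg==1.0 --no-deps   # install somepkg itself but skip its declared dependencies
```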

@q0w q0w marked this pull request as draft January 28, 2022 13:19
@q0w q0w marked this pull request as ready for review January 28, 2022 14:08
@DiddiLeija DiddiLeija closed this Feb 22, 2022
@DiddiLeija DiddiLeija reopened this Feb 22, 2022
@DiddiLeija (Member)

Closed and re-opened to enable CI runs.

@q0w q0w marked this pull request as ready for review March 23, 2022 08:46
src/pip/_internal/operations/check.py (review thread, outdated, resolved)
src/pip/_internal/req/req_install.py (review thread, outdated, resolved)
src/pip/_internal/resolution/resolvelib/candidates.py (review thread, outdated, resolved)
@q0w (comment marked as resolved)

news/9948.feature.rst (review thread, outdated, resolved)
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
@q0w q0w closed this Mar 26, 2022
@q0w q0w reopened this Mar 26, 2022
@q0w q0w changed the title Add a per-requirement --no-deps option. Add per-requirement --no-deps option support in requirements.txt. Mar 26, 2022
@groodt (Contributor) commented Aug 5, 2023

We also have to split up into at least two pip installs, one for the normal dependencies and one for the problematic package. Being able to combine those into one will also reduce useless CPU time in CI runs and thus ultimately be good for the environment as well.

This argument seems unconvincing to me. Do you have data to support this? I can't see how a handful of repeated pip install --no-deps invocations would have any measurable CPU or environmental saving. Pip startup is very fast so I doubt you would see any difference beyond a few milliseconds at best. Is there a reproducible example?

@groodt (Contributor) commented Aug 5, 2023

I tend to align with @pfmoore's point of view here.

My team and I struggle to install tricky or poorly packaged dependencies on a regular basis, and we often need to patch or work around things to get them installed. We prefer pip being stricter by default (we wish it were stricter in some cases) and view the required friction positively, because it makes us stop to pause and reflect. We either push fixes to upstream packaging, look for better-packaged alternatives, or put in the effort to vendor or patch dependencies to make them work in our environment.

Don't get me wrong. It is sometimes very frustrating, but I'm not sure making it easy to ignore guard-rails is the right approach.

If it were deemed valuable to reduce the friction, I would suggest adding a flag similar to "--break-system-packages", so that it's clear the user is intentionally installing a possibly broken environment and is willing to deal with the consequences. I would expect most bugs raised where a user has this flag enabled to be politely closed as "not a bug with pip".

@adam-urbanczyk

@groodt the guard rails are already removed - you can run pip install --no-deps manually a few times. The request is about making it more ergonomic for the people who need this functionality.

Additionally, it is not always the case that some upstream packages are broken. I need this functionality for testing with part of the deps installed by different means (e.g. conda).
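To make the ergonomics point concrete, here is a sketch of the kind of wrapper script teams write today to emulate the feature (the function, the `# no-deps` trailing-comment convention, and the package names are all invented for this illustration; they are not pip syntax):

```python
def split_requirements(text: str) -> tuple[list[str], list[str]]:
    """Split requirement lines into (normal, no_deps) groups based on a
    '# no-deps' trailing comment -- an invented convention for this sketch."""
    normal, no_deps = [], []
    for raw in text.splitlines():
        spec, _, comment = raw.partition("#")
        spec = spec.strip()
        if not spec:
            continue  # blank line or pure comment
        (no_deps if "no-deps" in comment else normal).append(spec)
    return normal, no_deps

reqs = """\
requests==2.27.1
torch==2.1.0  # no-deps
numpy
"""
normal, nd = split_requirements(reqs)
print(normal)  # ['requests==2.27.1', 'numpy']
print(nd)      # ['torch==2.1.0']
# A CI script would then run two separate invocations:
#   pip install <normal...>
#   pip install --no-deps <nd...>
```

With per-requirement --no-deps in requirements.txt, this wrapper and the second pip invocation would disappear.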

@pfmoore (Member) commented Aug 5, 2023

I need this functionality for testing with part of the deps installed by different means (e.g. conda).

If conda is installing your dependencies without the necessary standard metadata to allow pip to recognise that they are installed, then that is a bug in conda which needs to be fixed.

@notatallshaw (Contributor) commented Aug 5, 2023

then that is a bug in conda which needs to be fixed.

FYI Conda is a generic package installer, not a Python package installer, e.g. you can install Rust, or Nodejs, or OpenSSL.

This means it cannot enforce that packages implement standard Python metadata; it is up to the package author to follow best practices when building a Python package. Conda's documentation correctly lays out how to do this, but obviously mistakes can be made.

That is to say, it will almost certainly not be a bug in conda; rather, the individual package needs to fix it.

@adam-urbanczyk

I need this functionality for testing with part of the deps installed by different means (e.g. conda).

If conda is installing your dependencies without the necessary standard metadata to allow pip to recognise that they are installed, then that is a bug in conda which needs to be fixed.

No, there is no bug. The package ecosystems are simply not aligned (certain artifacts are provided by different packages).

@pfmoore (Member) commented Aug 5, 2023

Nevertheless, pip works with Python's standard metadata. If you want pip to recognise a package that is installed on your system, it needs to have that metadata, and we don't consider "so that we can use pip with packages that don't follow the Python standards" as a compelling argument in favour of a pip feature.

@adam-urbanczyk

Nevertheless, pip works with Python's standard metadata. If you want pip to recognise a package that is installed on your system, it needs to have that metadata, and we don't consider "so that we can use pip with packages that don't follow the Python standards" as a compelling argument in favour of a pip feature.

Let's say I have the following situation:

package a depends on artifacts b and c. In the pip world those artifacts (native libraries, BTW) are provided by package d. In another package manager's world (e.g. conda) they are provided by packages bb and cc. I want to be able to install package a without d via pip using requirements.txt, since I already installed bb and cc. What do the "Python standards" advise me to do?

@dereksz commented Aug 6, 2023

I've ended up here following a thread from StackExchange about this, and just wanted to contribute a concrete example.

I have a team that has been using catboost to develop ML models. When they use it, they need to validate and visualise their model builds in Jupyter notebooks, so catboost includes dependencies on matplotlib and plotly. These are large-ish libs that are then not needed at run-time, when the model is only being used to calculate a score for input data. In this instance, I really wanted to add a --no-deps in the requirements file for the production server.

I suppose it could be argued (by purists?) that the people packaging catboost "should be doing it another way", and they might be right. But while striving for "the right way", we also live in an imperfect world, and I see this PR as a useful addition for navigating that world.

@groodt (Contributor) commented Aug 7, 2023

I've encountered similar issues mentioned above and more.

If I were to categorize the scenarios, they would fall into these buckets:

A) Indicating that certain direct dependencies (and their transitives) are expected to be already "provided" - that is, a situation where one wants to build and test code locally with everything installed, but at runtime elsewhere, some of the dependencies may already be provided by the environment. This occurs in situations such as pyspark, notebook kernels, or AWS Lambda layers, where some dependencies are already installed. In this scenario, I would want an installer to warn me or fail if dependencies that were expected to be provided were not present when the installer was run.
B) Indicating that certain transitive or indirect dependencies should be "excluded" because they are known to be unnecessary by the person doing the installation. In this scenario, I am taking responsibility and am confident that at runtime the code-paths will avoid hitting the dependency that I am excluding.
C) Working around metadata issues of a package. It can happen that a package has overly restrictive constraints or bugs in the constraints that need workarounds on particular platforms or environments.
D) Install all dependencies from a "lockfile" (produced via something like pip-tools) where I already know that I don't want the resolver to run or discover additional dependencies. I just want the installer to install things.

My understanding of this PR is that most are talking about B? I'm not sure that --no-deps was originally intended for this purpose, but I could be wrong.

Scenario A

For this scenario, I still want the resolver to "solve" the puzzle of the dependencies and to be able to read the metadata of whatever environment it will install into. I would consider it a bug if, at the end of the installation process, my environment was inconsistent with the instructions that I provided to the installer. Here I still want the resolver to run fully, but the installer to run partially and error if the resulting environment is deemed to be broken.

Scenario B

This is intentionally installing a possibly broken environment. For this, I think it's important to make it clear to the user that they are removing guard-rails. Here, I think the intention is for the resolver to run fully, but the installer should only run partially. The difference here is that it shouldn't be considered an error if the resulting environment is deemed to be broken?

Scenario C

This seems different to --no-deps, I presume? Here, the intention is to provide guidance to the resolver (not the installer) that the constraints provided by a dependency are wrong. So this seems close to B, but it's also different, because it's not only asking pip to accept or be unaware of installation into a broken environment, it's also asking pip to create a "broken" installation plan from the resolver.

Scenario D

Here, I am saying that I've previously created a valid installation plan according to a resolver and "frozen" or "locked" it. I now want to run the installation plan elsewhere and just run installation. I would like an error at the end if the resulting environment is deemed to be broken.
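Scenario D is already expressible today at whole-file granularity, e.g. with pip-tools (a hedged workflow sketch; the filenames are illustrative):

```shell
pip-compile requirements.in -o requirements.lock  # resolve once, up front
pip install --no-deps -r requirements.lock        # install only, no re-resolution
pip check                                         # reports any resulting inconsistency
```

What it cannot express is mixing resolved and unresolved entries in one file, which is where the per-requirement option in this PR comes in.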

Question

--no-deps can play a role in all of these different scenarios. It feels like this PR and most recent comments relate to Scenario B. Is that correct? Does it then impact the other scenarios and should the CLI options be overloaded or are they indeed solving similar, but different problems? I'm not claiming the list of Scenarios is exhaustive or even mutually exclusive, so there may even be other impacts that I'm not thinking about. So I understand why the pip maintainers may want more time to think on the UX of these scenarios.

@adam-urbanczyk

--no-deps can play a role in all of these different scenarios. It feels like this PR and most recent comments relate to Scenario B. Is that correct? Does it then impact the other scenarios and should the CLI options be overloaded or are they indeed solving similar, but different problems? I'm not claiming the list of Scenarios is exhaustive or even mutually exclusive, so there may even be other impacts that I'm not thinking about. So I understand why the pip maintainers may want more time to think on the UX of these scenarios.

My scenario is more like A, though your description has many details that do not match.

@mfansler

I need this functionality for testing with part of the deps installed by different means (e.g. conda).

If conda is installing your dependencies without the necessary standard metadata to allow pip to recognise that they are installed, then that is a bug in conda which needs to be fixed.

Rather than a bug, I'd assess that Conda is being consistent with Pip by deferring to Pip's behavior. Conda similarly provides a YAML format that includes an optional pip: section. Users have correspondingly clamored for having --no-deps per specification there, but Conda treats this section as a requirements.txt, so it has whatever behavior Pip has implemented.

Moreover, I think the argument being made here for not merging this PR would be equally valid for Conda not changing its implementation (i.e., should Conda undertake reimplementing its pip install strategy just to enable users to create invalid environments?). Hence, if Pip doesn't want to do it, I wouldn't expect Conda would either.

@dereksz commented Jan 31, 2024

Seems like this has stalled, so trying to add some new thoughts.

I can see the tension between those desiring the "quick fix" for easy Dockerfiles & co., and others concerned about making it too easy for the unwary to fall off the edge of the proverbial cliff. If "Scenario B" is indeed the dominant use case now, we could perhaps be a lot more specific, with a slightly different option, to make it less likely to do something unexpected. So maybe we use --exclude-deps=matplotlib,plotly (to go back to my earlier example). This would make it much more obvious and explicit that the "guard rails are coming off".

The other part of the conversation (which I don't fully understand) is the difference between the installer and the resolver. Taking a guess, the resolver might penetrate underneath matplotlib and plotly and find other dependencies that would still be resolved (and installed?). I don't really like this option, but it would be workable (there may just be a few more I need to add to my explicit --exclude-deps list). AND it would be in keeping with maintaining "tight" guard rails. I'd then expect pip to behave as if the package tagged with --exclude-deps simply hadn't mentioned these packages as dependencies. If a different package has a dependency on them, they should still be included (unless that package also uses an --exclude-deps option). In terms of a dependency DAG, we make a cut, but the whole branch may not fall off, because of other dependencies.
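That DAG cut could behave like the following sketch (a toy graph traversal with invented package names; not pip's actual resolver):

```python
def install_set(graph: dict[str, set[str]], roots: set[str],
                exclude_deps_of: set[str]) -> set[str]:
    """Walk the dependency DAG from roots, but do not follow the outgoing
    edges of packages in exclude_deps_of. A package cut off on one path is
    still installed if it is reachable via another dependent."""
    seen: set[str] = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        if pkg in exclude_deps_of:
            continue  # the cut: this package's own deps are skipped
        stack.extend(graph.get(pkg, ()))
    return seen

# Invented graph: myapp depends on catboost and numpy; catboost declares
# matplotlib, plotly and numpy.
graph = {
    "myapp": {"catboost", "numpy"},
    "catboost": {"matplotlib", "plotly", "numpy"},
    "numpy": set(),
}
# Excluding catboost's deps drops matplotlib/plotly, but numpy survives
# because myapp also depends on it directly.
print(sorted(install_set(graph, {"myapp"}, {"catboost"})))
# ['catboost', 'myapp', 'numpy']
```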

Might this help resolve the tension?

@philipqueen

Hi, I would like to add a use case this PR would fix that I believe has not been mentioned above or in #9948.

OpenCV maintains 4 different versions of their software (see 3 here: https://github.com/opencv/opencv-python?tab=readme-ov-file#installation-and-usage). Unfortunately, they do not officially conflict: I can install multiple versions (opencv-python and opencv-contrib-python, for example) in the same environment, and pip check will return No broken requirements found. However, importing cv2 in a program will throw an error. For this reason, I do not believe #8076 describes the issue - I would like to be able to manage a conflict that pip is not able to see as a conflict.
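One way to see why pip check passes here: it only verifies declared requirements and version constraints, not whether two installed distributions ship the same import package. A sketch of that missing check, over fabricated file lists (the function and data are invented for illustration, not pip internals):

```python
def top_level_collisions(dist_files: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each top-level import name to the distributions providing it,
    keeping only names provided by more than one distribution."""
    owners: dict[str, set[str]] = {}
    for dist, files in dist_files.items():
        for path in files:
            top = path.split("/", 1)[0]
            owners.setdefault(top, set()).add(dist)
    return {name: dists for name, dists in owners.items() if len(dists) > 1}

# Fabricated file lists mirroring the OpenCV situation: both wheels install
# into the cv2/ package, silently overwriting each other.
dists = {
    "opencv-python": ["cv2/__init__.py", "cv2/data/haarcascades.xml"],
    "opencv-contrib-python": ["cv2/__init__.py", "cv2/aruco.py"],
}
print(sorted(top_level_collisions(dists)["cv2"]))
# ['opencv-contrib-python', 'opencv-python']
```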

This issue arose for me while working on Freemocap. We use multiple ML libraries (for example, MediaPipe and YOLO), and there is no standard for which opencv distribution is used (nor is it feasible to ask every ML library to agree on one). Our goal would be to install the superset opencv-contrib-python, which contains everything needed to satisfy every other opencv distribution. Adding this option to requirements.txt would allow us to keep simple install instructions for our users, and would remove this conflict as a barrier to distributing our software as an application through PyApp.

The OpenCV discussion around this hasn't pointed towards any work on fixing the issue, for example:
opencv/opencv-python#896
opencv/opencv-python#467
opencv/opencv-python#388
There are also other packages with a similar problem mentioned in the pip issues linked above.

@adam-urbanczyk

Nevertheless, pip works with Python's standard metadata. If you want pip to recognise a package that is installed on your system, it needs to have that metadata, and we don't consider "so that we can use pip with packages that don't follow the Python standards" as a compelling argument in favour of a pip feature.

Let's say I have the following situation:

package a depends on artifacts b and c. In the pip world those artifacts (native libraries, BTW) are provided by package d. In another package manager's world (e.g. conda) they are provided by packages bb and cc. I want to be able to install package a without d via pip using requirements.txt, since I already installed bb and cc. What do the "Python standards" advise me to do?

@pfmoore @groodt you seem to be against merging this PR, but you have never proposed alternative advice for the described situation.

@pfmoore (Member) commented Feb 1, 2024

@pfmoore @groodt you seem to be against merging this PR, but you never proposed an alternative advice for the described situation.

Don't mix package managers, basically. Pip and conda use different metadata, so this sort of situation can arise. I'm sorry if that's not the answer you hoped for, but it's the best I can offer. You could try asking conda to include standard Python packaging metadata for their native library packages, but I suspect they would say (quite reasonably) that Python metadata isn't designed for native libraries (correct, it isn't) and so they don't plan on doing so.

@adam-urbanczyk

Don't mix package managers, basically. Pip and conda use different metadata, so this sort of situation can arise. I'm sorry if that's not the answer you hoped for, but it's the best I can offer. You could try asking conda to include standard Python packaging metadata for their native library packages, but I suspect they would say (quite reasonably) that Python metadata isn't designed for native libraries (correct, it isn't) and so they don't plan on doing so.

To be clear, it is not about conda per se. The artifacts could just as well be provided by pacman, scoop, spack, ... But I guess the answer is clear: there is currently no solution.

@jonmatthis commented Feb 21, 2024

This PR would really help us over in https://github.com/freemocap/freemocap

We have two dependencies, one of which includes opencv-python as a dependency, the other of which includes opencv-contrib-python

The end result is that my users get BOTH versions of opencv in their environment when they run pip install freemocap, which causes crashes (because of known problems with cv2 when multiple versions are installed)

There is a quick fix, which is to uninstall both versions after the initial freemocap install and then manually install opencv-contrib, but my whole project is education/student focused, so asking users to do any additional environment wrangling is a pretty heavy lift for them. We follow a 'Universal Design' principle aimed at maximizing accessibility for low-XP folks, so a lot of our users are very young and new to any kind of CLI

Here's a link to the hacky pop-up nonsense we implemented to handle this issue - https://github.com/freemocap/freemocap/blob/256f8d89ea332b255ff6f41e96e4892595f8319b/freemocap/gui/qt/widgets/opencv_conflict_dialog.py lol

Please merge this, would really appreciate it, thaaanks!

(see @philipqueen's comment above for details - #10837 (comment) )

@robertofalk

Is there a blocker for this PR? This would also really help us.

@rsxdalv commented Mar 16, 2024

Since there's a massive discussion around this, FYI: where this really hurts is pytorch. Each project has to include it as a dependency if it wants to be plug and play. However, in about 70% of cases this results in a different version being demanded. So at this point there must be tens of thousands of people who have wrecked their environment's pytorch version by installing something that didn't make pip happy, leading to a gigabyte or two of downloading the wrong version and minutes of installing (and reinstalling).

Successfully merging this pull request may close these issues: --no-deps flag inside requirements.txt