
pip is completely ignoring build-system requirements when build isolation is disabled #9314

Closed
mmerickel opened this issue Dec 18, 2020 · 18 comments
Labels: C: build logic (Stuff related to metadata generation / wheel generation), type: support (User Support)

Comments

@mmerickel commented Dec 18, 2020

Environment

  • pip version: 20.3.3
  • Python version: 3.6.8 and 3.9.1
  • OS: macOS 11.1 and Ubuntu 20.04

Description

When build isolation is disabled via pip install --no-build-isolation, the build requirements defined in pyproject.toml under [build-system] requires = are completely ignored by pip and are not treated as dependencies that need to be installed.

Expected behavior

When build isolation is disabled, pip should still try to install missing build requirements defined in [build-system] requires = and sort them appropriately so that they are installed prior to the package that needs them. If the dependencies are already installed then it should treat them as resolved and skip them.

How to Reproduce

  1. Define a setup.py that requires a build dependency (pyramid in this case):
from setuptools import setup

import pyramid
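# build-time import: setup.py cannot even run unless pyramid is already installed in the build environment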

setup(
    name='foo',
    version='0.0',
)
  2. Define a pyproject.toml declaring that build dependency:
[build-system]
requires = [
    "pyramid",
    "setuptools>=42",
    "wheel",
]
build-backend = "setuptools.build_meta"
  3. Test this out in normal circumstances with build isolation:
$ python3 -m venv env
$ env/bin/pip install -U pip
$ env/bin/pip install -e .
  4. Note no problems; it worked great.

  5. Test this out without build isolation:

$ python3 -m venv env
$ env/bin/pip install -U pip
$ env/bin/pip install -e . --no-build-isolation
Obtaining file:///Users/michael/work/oss/pip-bug
    Preparing wheel metadata ... error
    ERROR: Command errored out with exit status 1:
     command: /Users/michael/work/oss/pip-bug/env/bin/python /Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /var/folders/49/7s4l1ljx5rbgy_m8fzn26bh40000gn/T/tmpp9mdoxf4
         cwd: /Users/michael/work/oss/pip-bug
    Complete output (14 lines):
    Traceback (most recent call last):
      File "/Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
        main()
      File "/Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
        json_out['return_val'] = hook(**hook_input['kwargs'])
      File "/Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/pip/_vendor/pep517/_in_process.py", line 133, in prepare_metadata_for_build_wheel
        return hook(metadata_directory, config_settings)
      File "/Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/setuptools/build_meta.py", line 157, in prepare_metadata_for_build_wheel
        self.run_setup()
      File "/Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/setuptools/build_meta.py", line 142, in run_setup
        exec(compile(code, __file__, 'exec'), locals())
      File "setup.py", line 3, in <module>
        import pyramid
    ModuleNotFoundError: No module named 'pyramid'
    ----------------------------------------
ERROR: Command errored out with exit status 1: /Users/michael/work/oss/pip-bug/env/bin/python /Users/michael/work/oss/pip-bug/env/lib/python3.9/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /var/folders/49/7s4l1ljx5rbgy_m8fzn26bh40000gn/T/tmpp9mdoxf4 Check the logs for full command output.

Of course, if I install pyramid manually in a separate pip invocation before running pip install -e ., it works... the problem is that when build isolation is disabled, pip ignores build dependencies completely and you're on your own. Finally, it's worth noting that if I define a requirements.txt that installs both pyramid and -e ., pip does not sort them so that pyramid is installed first to respect the build dependencies. Pip is completely ignoring them.
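Concretely, the manual workaround for the reproduction above is just:

$ env/bin/pip install pyramid
$ env/bin/pip install -e . --no-build-isolation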

@pfmoore (Member) commented Dec 18, 2020

This is the way --no-build-isolation works. The documentation explains it as follows: "Build dependencies specified by PEP 518 must be already installed if this option is used." I can understand that you might prefer otherwise, but in all but the most basic cases it would be very hard to ensure we didn't disrupt the user's system environment if we started installing build dependencies into it.

@pradyunsg added labels C: build logic, S: awaiting response, and type: support on Dec 18, 2020
@mmerickel (Author) commented Dec 18, 2020

Well, the problem is that it causes some regressions when combined with the new default resolver.

In other examples (I tried to distill this down to the simplest possible example), I have one package (PackageA) with pyramid in build-requires and another (PackageB) with it in install-requires. Before the new resolver was introduced this wasn't a big deal: in requirements.txt I would simply put PackageB before PackageA and pip would install them in that order, so that by the time it got to PackageA, pyramid was already installed.

With the new resolver, combined with build-requires being ignored entirely, pip now re-orders things and ignores the order defined in requirements.txt, which causes PackageB to be installed after PackageA with no easy way to sort that out. This forces me to use two invocations of pip where I was able to use one before, with the same requirements.txt and package metadata.

Currently I can fix things by putting pyramid into install_requires for PackageA, which ensures it is installed first, but that obviously expresses the wrong intent. Pip should know that I want pyramid as a build requirement, and ensure it's installed before both PackageA and PackageB.

What is the goal in not installing them by default if they are not already installed? How is an error preferable here? I thought the purpose of this nice static metadata in the toml was that it would be totally trivial for pip to treat it as a "thing that needs to be available before installing the package", just like install_requires; with --no-build-isolation the only difference would be that those things get installed into the same virtualenv as everything else. Then we'd be good to go, with everything sorted out in a single pip invocation, an awesome new resolver, and no need for strict ordering in requirements.txt.

The no-response bot removed the S: awaiting response label on Dec 18, 2020
@uranusjr (Member) commented:

Why not do this in two steps? Something like pip install -r build-requirements.txt && pip install -r requirements.txt would always work, and is the recommended way even with the legacy resolver. There was never a guarantee that a package listed first would be installed first; that was only an implementation detail.
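For the reproduction above, that could look like the following sketch (build-requirements.txt is a hypothetical file mirroring [build-system] requires):

# build-requirements.txt
pyramid
setuptools>=42
wheel

$ env/bin/pip install -r build-requirements.txt
$ env/bin/pip install --no-build-isolation -e .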

@mmerickel (Author) commented:

  1. pip is being given enough information to do it in one step.
  2. pip has done it in one step for many, many years.

I can't make anyone fix it, but it is a regression that feels unnecessary to me, so I'm pointing it out here; I fail to see how moving to two steps is a good thing. :-)

@mmerickel (Author) commented Dec 18, 2020

Finally, as someone who has been craving an improvement to setuptools' terrible setup_requires feature, I have placed all my hopes and dreams on pip using this static metadata to finally solve this chicken-and-egg problem in a nicely ordered way. I'm trying to avoid getting into all the problems with pip and why I'm disabling build isolation in the first place, because I think they are separate from what I'm identifying here!

@pfmoore (Member) commented Dec 18, 2020

It's not a regression because you were previously relying on behaviour that wasn't documented (and hence wasn't guaranteed). It's fair to say that it's an inconvenience for you, and we'd rather not inconvenience you if possible, but in this case the benefits of the change (to the new resolver) are sufficient that we're not planning on reverting it. We're not "moving to two steps" as you describe it. You are forcing the need for two steps by disabling build isolation - if you were using build isolation, pip would manage the build automatically.

We've never aimed to make everything possible in a single invocation of pip (even though it often is possible). That isn't, and won't ever be, a design principle we will try to achieve. But conversely, if you use the default mechanism, pip will build and install in one step. It's not actually one step, but pip manages the various steps for you. If you tell pip not to do some of the management (the build isolation) then you're taking responsibility for doing that management yourself. That means installing build dependencies, specifically.

The reason pip doesn't try to install build requirements if you're using --no-build-isolation is that we can't safely do so. If you were building a project that needed an old version of numpy to build, and you had the latest version of numpy installed, would you be OK with pip downgrading your system, just to build your package (which you may not even intend to install locally)? What if you were building two packages, one that needs an old numpy and another which needs the latest numpy? Would you expect pip to repeatedly reinstall numpy, leaving your system in a random state depending on what order pip builds your packages? These, and similar questions, are why we leave the handling of them to the user (who knows what they want).

Basically, build isolation is fundamental to how we expect pip to build packages from source going forward. If you're disabling build isolation, it should only be because for some reason, it's not a workable option for you. Disabling build isolation should never be anyone's first choice.

@pradyunsg (Member) commented Dec 18, 2020

Build isolation is required for pip to ensure that the build environment for a package based on pyproject.toml has the dependencies declared. From pip's "perspective", you're explicitly opting out of build isolation and hence taking up the responsibility of ensuring that the build environment has everything necessary for the build.

Your options for resolving this issue are either (a) enabling build isolation which means that pip will use the available information to run the builds or (b) manually ensuring that build dependencies are met when you run pip with --no-build-isolation.

@mmerickel (Author) commented:

> The reason pip doesn't try to install build requirements if you're using --no-build-isolation is that we can't safely do so. If you were building a project that needed an old version of numpy to build, and you had the latest version of numpy installed, would you be OK with pip downgrading your system, just to build your package (which you may not even intend to install locally)? What if you were building two packages, one that needs an old numpy and another which needs the latest numpy? Would you expect pip to repeatedly reinstall numpy, leaving your system in a random state depending on what order pip builds your packages? These, and similar questions, are why we leave the handling of them to the user (who knows what they want).
>
> Basically, build isolation is fundamental to how we expect pip to build packages from source going forward. If you're disabling build isolation, it should only be because for some reason, it's not a workable option for you. Disabling build isolation should never be anyone's first choice.

Yes, the risk in disabling build isolation is that you're saying only one version of the package is allowed. Pip has all that information up front and can yell if there are weird conflicts - obviously I wouldn't want it to install one numpy and then downgrade to another later in the same process, but the resolver should allow it to find a compatible version or error out based on the constraints it sees. The resolver should make this totally possible, no?

Trust me, I don't want to disable build isolation. I haven't done the proper research on this in the issue tracker, but the reason I'm doing it is that when it is enabled, pip has no way to install anything from anywhere but a package index (PyPI) into the isolated build environment. I have local packages in the source tree that are build requirements and they need to be installed into the build environment - pip simply does not have a way to do this that I'm aware of.

@dstufft (Member) commented Dec 18, 2020

> Trust me, I don't want to disable build isolation. I haven't done the proper research on this in the issue tracker, but the reason I'm doing it is that when it is enabled, pip has no way to install anything from anywhere but a package index (PyPI) into the isolated build environment. I have local packages in the source tree that are build requirements and they need to be installed into the build environment - pip simply does not have a way to do this that I'm aware of.

I could be wrong, but I think if you use --find-links, that's passed down into the build environment resolver as well.
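If that's right, a minimal sketch of the approach would be the following (the local package path is hypothetical, and this assumes --find-links really is forwarded to the build environment as described):

$ env/bin/pip wheel -w wheels ./path/to/local-build-dep
$ env/bin/pip install --find-links wheels -e .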

@mmerickel (Author) commented:

> I could be wrong, but I think if you use --find-links, that's passed down into the build environment resolver as well.

Ok, maybe that works. To hypothesize: I think the workflow would be to build wheels of my local source packages, then install the rest of the dependencies using that wheelhouse via --find-links. I've only ever used --find-links together with --no-index. Will it work to pull in locally built packages that don't exist in the package index if I leave off --no-index? Secondly, if I understand this correctly, the other packages in my environment using install_requires will not necessarily use the wheelhouse, depending on the versions they are pinning? This issue has made me fear undocumented behavior, so I'd rather ask now than test it out!!

@dstufft (Member) commented Dec 18, 2020

pip will look at all configured repositories and --find-links sources to locate packages; a package doesn't have to appear in all of them. So yes, you can do that, and pip won't necessarily use the wheelhouse if something "better" is available on PyPI that still matches the version constraints.

@dstufft (Member) commented Dec 18, 2020

To be clear, I didn't work on the feature so I might be misremembering, but I'm pretty sure all those options get passed down into the build-requirements resolver... and the behavior I mentioned about combining multiple sources of packages has been standard pip behavior for like a decade now :)

@mmerickel (Author) commented:

Ok, I'll give this a try and maybe I can get back to enabling build isolation. Obviously, since the packages in question are in my source tree, I don't actually care whether the build is isolated, but it's clear from this issue that there isn't much interest in improving the situation around --no-build-isolation. The other thing that surprised me was that I couldn't disable build isolation on a per-package basis the way I can with --no-binary in requirements.txt. Thank you all for your responses!

@archoversight commented:

> It's not a regression because you were previously relying on behaviour that wasn't documented (and hence wasn't guaranteed).

pip has historically been wholly under-documented, and solutions are found all over the internet, from blog posts to Stack Overflow. Solutions such as "place it first in requirements.txt" are very much the norm. I know it sucks, but sadly that is a legacy that pip has to live with, and users have come to expect it.

I am all for improvements, and the new resolver certainly is one, but it should be able to take this additional information into account and add it to the dependency tree, so that even if it re-ordered requirements.txt everything would install correctly, even with --no-build-isolation.

With the new resolver, pip should be able to squawk if the required installation would fail because two different packages require two different versions of numpy (even as a build dependency going into the same virtualenv).

Breaking a pre-existing workflow, whether you like it or not, is a regression, especially when it is one so thoroughly used by the community.

> Disabling build isolation should never be anyone's first choice.

Having had to disable build isolation plenty of times in the past for a variety of reasons, I can say it is not anyone's first choice. But if you have your own internal packages that you are attempting to build locally, or source checkouts you want to install as editable (especially where package B depends on package A and you want package A to be an editable install because you are debugging something), sometimes pip doesn't do the right thing, or attempts to download from PyPI with build isolation instead of using the thing you have locally.

Having to first build a wheel and then pass it through --find-links or some other mechanism means you spend a lot more time and effort doing the thing.

@pfmoore (Member) commented Dec 18, 2020

I'm going to stop participating in this discussion, because I don't think anything further that I can add will be helpful. But before I do that, I will make one comment on "documented vs undocumented behaviour".

> pip has historically been wholly under-documented, and solutions are found all over the internet, from blog posts to Stack Overflow. Solutions such as "place it first in requirements.txt" are very much the norm. I know it sucks, but sadly that is a legacy that pip has to live with, and users have come to expect it.

I understand the point here, and I have a lot of sympathy. It's certainly true that a lot of pip's features aren't sufficiently well documented, even things that are intended to be relied on. So you're absolutely right that just because something isn't documented, doesn't mean that we can arbitrarily change it.

However, the converse is also untrue - just because pip has historically behaved in a certain way doesn't mean we can't change it. Treating all historical behaviour as guaranteed is a very dangerous precedent, and is basically what resulted in distutils becoming an unmaintainable mess: it became impossible to change anything, because someone, somewhere, might be depending on it. We don't want pip to fall into that trap.

So we have to be careful here what we accept as guaranteed behaviour. The best way I know of doing that is to start with the principle that "if it's not documented, it's not guaranteed". However, that doesn't mean we have carte blanche to mess with everything else - it's entirely reasonable for someone to submit a request that a certain behaviour gets documented, so that it becomes a guarantee. But clearly, people aren't going to check everything they do, so that's at best a partial solution. So we do, in general, try not to break backward compatibility unnecessarily - and if we can update the documentation as we go along, so much the better. But it's very much a judgement call, and one that we take seriously.

In this particular case, the change in behaviour was part of a much bigger change to fix a long-standing problem, that pip's old resolver was badly broken in many ways. We publicised the rollout of the new resolver widely, but there's no way we could reach everyone - a lot of people simply wouldn't have realised that the behaviour they rely on is linked to the resolver at all. So we're balancing the breakage here against a massive improvement for the whole of the Python ecosystem. If we can get the best of both worlds, we'll try to - but ultimately, having the new resolver is going to be the priority.

Hopefully that's useful background. As I say, I'm going to drop out of this conversation now, as I don't think I can add anything more that's helpful.

@mmerickel (Author) commented Dec 18, 2020

No one is upset about the new resolver - it's great in my experience. My complaints are with the regression (whether people want to call it that or not, the fact that requirements.txt is ordered is behavior that was documented in the community via blogs etc.) and with the fact that the new resolver, interacting with --no-build-isolation, causes an issue that didn't exist before.

Anyway, I have managed to switch away from build isolation, and it is not particularly simple or straightforward imo. Pip is just not very nice when it comes to unpublished build dependencies. My general workflow right now is this, for anyone who cares:

$ python3 -m venv env
$ env/bin/pip install -U pip wheel
$ env/bin/pip wheel -w wheels -r requirements/setup.txt
$ env/bin/pip install --find-links wheels -r requirements/src.txt -r requirements/dev.txt
$ # I now have packages installed in editable mode in my environment.

With respect to updating those packages: pip is forcing me to use multiple requirements files, which it provides no features for updating, so I'm back to using pip-tools to manage those:

$ env/bin/pip-compile --find-links wheels --no-emit-find-links requirements/setup.in
$ env/bin/pip-compile --find-links wheels --no-emit-find-links requirements/src.in
$ env/bin/pip-compile --find-links wheels --no-emit-find-links requirements/dev.in

Takeaways:

  • Yay no build isolation.
  • Sad I still have to use pip-tools even more now than I did before.
  • Ugh a wheels folder in development.
  • Ugh find-links all over the place.

I hope this is helpful to someone; I will close the issue, as it doesn't sound like it's going to directly result in any changes to pip.

@bocklund commented Jun 5, 2021

I know this is closed, but I think @mmerickel was hitting on something that would be useful.

The "Expected Behavior" form @mmerickel in the OP:

> When build isolation is disabled, pip should still try to install missing build requirements defined in [build-system] requires = and sort them appropriately so that they are installed prior to the package that needs them. If the dependencies are already installed then it should treat them as resolved and skip them.

and from @pfmoore:

> it would be very hard to ensure we didn't disrupt the user's system environment if we started installing build dependencies into it.

It seems like the mismatch between what @mmerickel was saying and what some of the maintainers were saying is due to a miscommunication about the purpose of --no-build-isolation. --no-build-isolation is intended for the use case where the build environment is managed by the user. That's a good reason for --no-build-isolation to operate the way it does.

I think @mmerickel's point is that a user who doesn't understand this expects --no-build-isolation to disrupt the environment. From the naming of the option, it seems like --no-build-isolation is opting you in to that; i.e. build isolation promises that a user's environment won't be affected by a build, so opting out of build isolation suggests it would affect the user's environment.

Therefore, I think the question isn't really about changing the semantics of --no-build-isolation, but rather: is there room for another option that would install the build dependencies from pyproject.toml into the user's environment? Having the build requirements duplicated in requirements-build.txt and pyproject.toml is error-prone, because contributors have to remember to change the dependencies in multiple files.
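As a rough sketch of the idea (not an existing pip feature; assumes Python 3.11+ for the stdlib tomllib module), the duplication could be avoided by reading [build-system] requires directly:

$ python -c "import tomllib; print('\n'.join(tomllib.load(open('pyproject.toml', 'rb'))['build-system']['requires']))" | xargs env/bin/pip install
$ env/bin/pip install -e . --no-build-isolation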

@bocklund commented Jun 5, 2021

It seems to me like the main request in this issue is duplicated by these two open issues: #6718 and #9794 (which proposes just checking and not actually installing). Sorry again for adding more noise to a closed issue.

The github-actions bot locked this conversation as resolved and limited it to collaborators on Sep 27, 2021.