
pyproject.toml (PEP518), build isolation (PEP517) and optional build dependencies #6144

Closed
paugier opened this issue Jan 17, 2019 · 50 comments
Labels
C: PEP 517 impact Affected by PEP 517 processing type: feature request Request for a new feature

Comments

@paugier
Copy link

paugier commented Jan 17, 2019

What's the problem this feature will solve?

Projects that have build requirements and optional build requirements cannot use pyproject.toml because of build isolation.

Example (https://bitbucket.org/fluiddyn/fluidfft): depending on whether mpi4py is installed, the MPI extensions are built or not. It is very important to be able to install/use the package without MPI, so mpi4py cannot be added to build-system.requires.

If a pyproject.toml is used, we get ImportError when importing mpi4py in the setup.py even when mpi4py is installed, because of the isolation! So the MPI extensions are never built.

Describe the solution you'd like

Something like this could work:

[build-system]
# Minimum requirements for the build system to execute.
requires = [
    "setuptools", "wheel", "numpy", "cython", "jinja2", "fluidpythran>=0.1.7"
]
# packages that are used during the build only if they are installed
optional = ["mpi4py", "pythran"]

Then mpi4py would be available during the build only if it is installed. We keep the advantages of the isolation (discussed in https://www.python.org/dev/peps/pep-0517/) but optional build dependencies are allowed.

Alternative Solutions

[build-system]
# Minimum requirements for the build system to execute.
requires = [
    "setuptools", "wheel", "numpy", "cython", "jinja2", "fluidpythran>=0.1.7"
]
isolation = false
@pradyunsg
Copy link
Member

You can use the --no-build-isolation flag.

@pradyunsg pradyunsg added the type: support User Support label Jan 17, 2019
@paugier
Copy link
Author

paugier commented Jan 17, 2019

From the help message ("--no-build-isolation Disable isolation when building a modern source distribution. Build dependencies specified by PEP 518 must be already installed if this option is used."), I don't think it is a solution!

The easy solution is to get rid of the pyproject.toml file. But it would actually be good to be able to use it (because we do have real build dependencies).

More generally, the solution cannot be a pip option. There are many situations (for example installing with tox) where you cannot use a pip option.

@pradyunsg
Copy link
Member

pradyunsg commented Jan 17, 2019

Here's a workaround I suggest: manually install all the build-system.requires items that you want available (e.g. via pip install setuptools wheel numpy cython jinja2 fluidpythran>=0.1.7) prior to running pip install --no-build-isolation ....
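
For concreteness, a minimal sketch of that two-step workaround from a clean environment (the package list is just the example above; fluidfft is the project from this issue):

python -m pip install setuptools wheel numpy cython jinja2 "fluidpythran>=0.1.7"
python -m pip install --no-build-isolation fluidfft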


Please include the pip version you are using and other details. They're in the issue template for good reasons. (Ah, I see they're not in this particular template!)

What version of pip are you using? You are mentioning PEP 517, but PEP 517 isn't out in a public release yet.

I don't think it is a solution!

If you disable isolation, the user has to manually install the build-time dependencies that they want to be available. The help message is written assuming that all PEP 518 build-system.requires items are indeed required, as defined in the PEP. Perhaps it could use improvement.

There are many situations (for example install with tox) where you cannot use a pip option.

tox can: https://tox.readthedocs.io/en/latest/example/basic.html#further-customizing-installation
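
For example (a sketch based on the tox documentation linked above), the install command tox uses can be overridden to pass the flag:

# tox.ini
[testenv]
install_command = python -m pip install --no-build-isolation {opts} {packages}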

Honestly, if your CLI tool doesn't allow passing custom options to pip, I don't think that would be pip's problem.

@pradyunsg pradyunsg added the PEP implementation Involves some PEP label Jan 17, 2019
@paugier
Copy link
Author

paugier commented Jan 17, 2019

I get this (correct) behavior with the latest release of pip (18.1). That is why I didn't mention it.

@dstufft
Copy link
Member

dstufft commented Jan 17, 2019

Adding more general features to pyproject.toml requires a PEP to extend the standard.

@paugier
Copy link
Author

paugier commented Jan 17, 2019

Really, this --no-build-isolation option is not a solution for us.

  1. I'd like to use the build-system.requires option of pyproject.toml. It would allow users to just do pip install fluidfft. Of course I don't want to use --no-build-isolation because, with this option, build-system.requires is not taken into account and we lose all the benefit of pyproject.toml.

  2. We need to be able to detect in setup.py whether some packages are installed and, if they are, to use them during the build.

@paugier
Copy link
Author

paugier commented Jan 17, 2019

tox can: https://tox.readthedocs.io/en/latest/example/basic.html#further-customizing-installation

Good to know! But once again, it is not a solution to this problem.

@paugier
Copy link
Author

paugier commented Jan 17, 2019

Adding more general features to pyproject.toml requires a PEP to extend the standard.

PEP 518 is still "Provisional". Could it be modified to add a way to declare optional dependencies?

@dstufft
Copy link
Member

dstufft commented Jan 17, 2019

Not likely. Provisional in this case pretty much just means it's accepted but hasn't been implemented/released yet, so we might still need to make small tweaks in order to make it functional in the real world.

Adding new features is generally out of scope.

@paugier
Copy link
Author

paugier commented Jan 17, 2019

Then packages with optional build dependencies cannot (and won't be able to) use pyproject.toml?

It is a problem! It is not something completely crazy to have a try: import ... except ImportError: in a setup.py.
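
For readers unfamiliar with the pattern, a minimal sketch of such a setup.py (the extension names and source files are made up for illustration, not taken from fluidfft):

# setup.py (sketch)
from setuptools import Extension, setup

ext_modules = [Extension("fluidfft.fft2d", ["src/fft2d.c"])]

try:
    import mpi4py  # optional build dependency: only used if already installed
except ImportError:
    mpi4py = None

if mpi4py is not None:
    # build the MPI extensions only when mpi4py is importable
    ext_modules.append(
        Extension(
            "fluidfft.fft2d_mpi",
            ["src/fft2d_mpi.c"],
            include_dirs=[mpi4py.get_include()],
        )
    )

setup(name="fluidfft", ext_modules=ext_modules)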

Note that the solution with build-system.optional seems very simple to implement.

@dstufft
Copy link
Member

dstufft commented Jan 17, 2019

I mean, then the question becomes: what if there are two things listed in optional? Do we need to install both or neither? Or can we install just one of them? Should we be trying to opportunistically install them and fail silently if we can't, or should they be opt-in or opt-out?

What if we want to have groups of optional dependencies? Should that be supported? All of the same questions asked above apply to that too.

None of these questions are, on the surface, super hard. But they all require sitting down and working through the options and making a decision about how they should function. That's something that deserves wider discussion.

By all means, you're free to try and raise a discussion on the discuss.python.org Packaging category or on distutils-sig to see if other people are open to amending PEP 518 for this feature. I just think it's more likely going to be the case that we want an additional PEP for that feature.

@pfmoore
Copy link
Member

pfmoore commented Jan 17, 2019

I agree with @dstufft. There's nothing fundamentally difficult to decide here, but it needs to be driven by use cases and a reasonable sense of what's important and what is generalisation for its own sake.

I see this as being a follow-up PEP that extends PEP 518, and not suitable as an amendment to PEP 518. If you want to propose and champion such a PEP, then as @dstufft says, you need to start the discussion on distutils-sig or the Discourse category.

@dstufft
Copy link
Member

dstufft commented Jan 17, 2019

That being said, PEP 518 support is released in pip now, I think, yeah? We should probably move it out of provisional anyway.

@paugier
Copy link
Author

paugier commented Jan 17, 2019

It seems very complicated whereas the need is really simple.

It could be that the name "optional" is not appropriate. Maybe something like "used_if_importable" would be better. pip does not have to try to install packages listed in used_if_importable. For each package, if it is importable, it has to be importable during the build. Otherwise, there is nothing to do. It's both very simple and very useful.

I hope build-system.requires is implemented in such a way that packages already installed are reused (with symbolic links in the isolation virtual environment or something like this)? Is it?

If yes, an implementation of build-system.used_if_importable could be as simple as adding, at line 55 of pyproject.py:

if "used_if_importable" in build_system:
    requires = build_system.setdefault("requires", [])
    for package_name in build_system["used_if_importable"]:
        try:
            import_module(package_name)
        except ImportError:
            continue
        requires.append(package_name)

It is sad that pyproject.toml, which has been presented as the new clean universal way to declare build dependencies, cannot be used by all the packages that have optional build dependencies.

For example, I think it's bad news for mpi4py. There is the function mpi4py.get_include(), which is mainly used during the build. Without a proper fix, projects using mpi4py.get_include() won't be able to use pyproject.toml.

@pradyunsg
Copy link
Member

@dstufft Yeah.

@paugier Fair enough. I understand why the current functionality isn't enough for you. I agree with @dstufft here - then this'll need more discussion in distutils-sig or on discuss.python.org, since it affects more tools than pip.

@pfmoore
Copy link
Member

pfmoore commented Jan 17, 2019

@paugier But... if a project is "used if importable", then why isn't --no-build-isolation sufficient? You install all the mandatory requirements, and then you decide which of the "used if importable" ones you want to install. Once that's done, build your wheel.

I guess I don't follow how you expect pip to choose whether to install "used if importable" build dependencies? What criteria should we use? Your sample code sets requires to include things that are already present, which seems pointless because we then simply won't install them (because they are already there...). It's not like the value of requires is available outside of pip for further use.

Apologies if I'm being dumb here somehow. (Note that "to explain the use case clearly to people who don't understand the requirement" is another benefit of writing this stuff up in a PEP, by the way 😉)

@pganssle
Copy link
Member

pganssle commented Jan 17, 2019

But... if a project is "used if importable", then why isn't --no-build-isolation sufficient?

This was my thought as well, but I see where @paugier is coming from. They want the "hard dependencies" installed automatically, and the optional dependencies used opportunistically. Installing the "hard dependencies" separately requires either parsing pyproject.toml or specifying build dependencies separately.

Basically what they are asking for (which I think is reasonable) is a way for the "isolated" build environment to still be created, but as an environment similar to a virtualenv created with --system-site-packages.

I don't think the PEP needs to be modified in order for pip to support this use case.
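
For illustration, roughly what that would look like if approximated by hand today (a sketch, not an existing pip feature): create an environment that can see the already-installed optional packages, install the hard build dependencies into it, and build the wheel without pip's own isolation:

python -m venv --system-site-packages build-env
build-env/bin/pip install setuptools wheel numpy cython jinja2 "fluidpythran>=0.1.7"
build-env/bin/pip wheel --no-build-isolation fluidfft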

@paugier
Copy link
Author

paugier commented Jan 17, 2019

Let's take an example. Fluidsim is a package to run numerical simulations. It is used for research with very big simulations on clusters. It is also used for education, by students running small simulations on laptops.

It would be good if users could install with just pip install fluidsim from a clean environment. build-system.requires of pyproject.toml would be very useful for that because we have build requirements.

Without pyproject.toml, the minimal installation with pip is something like

pip install numpy cython fluidpythran
pip install fluidsim

Then, we have 2 optional build requirements that are used only if they have been installed before by the user.

  • mpi4py: just used for MPI simulations (on clusters).
  • Pythran: a Python-Numpy compiler.

So, from a clean environment, I would like to have:

  1. pip install fluidsim -> no MPI support and no Pythran compilation
  2. pip install mpi4py; pip install fluidsim -> MPI support and no Pythran compilation
  3. pip install pythran; pip install fluidsim -> no MPI support and Pythran compilation
  4. pip install mpi4py pythran; pip install fluidsim -> MPI support and Pythran compilation

If we want to use pyproject.toml (to get 1.), we would need, for case 3., to tell users to install with:

pip install numpy cython fluidpythran pythran
pip install fluidsim --no-build-isolation

which is worse than what we have now without pyproject.toml. Users will run the first line and forget --no-build-isolation, which will lead to no Pythran compilation even though Pythran has just been installed by the user.

To summarize,

  1. I'd like to use pyproject.toml because I don't want the user to care about cython or fluidpythran even though they are required by setup.py.
  2. but a build isolated from everything other than the strict build requirements is not an option.

We need to be able to tell pip not to isolate from some packages, in my case mpi4py and pythran, maybe with something like:

[build-system]
...
# packages that are used during the build only if they are installed
no_isolation_for = ["mpi4py", "pythran"]

@pfmoore
Copy link
Member

pfmoore commented Jan 17, 2019

Basically what they are asking for (which I think is reasonable) is a way for the "isolated" build environment still be created, but as an environment similar to a virtualenv created with --system-site-packages.

OK, thanks, I see what is being requested now.

That sounds like a new pip flag that enables behaviour somewhere between the default build isolation and --no-build-isolation: it creates a build environment and installs the dependencies from pyproject.toml, but doesn't isolate that environment.

That's not what the original request said, though (it was asking for a new key in pyproject.toml).

I don't think the PEP needs to be modified in order for pip to support this use case.

The relevant PEP for build isolation is PEP 517, specifically here. For a pyproject.toml change, it would be PEP 518.

Agreed, the isolation change does not require a change to the PEP. The originally requested new key in pyproject.toml would, though. What pip currently provides for build isolation is PEP-compliant, but it's certainly reasonable that pip provide additional features above the minimum required by the PEP.

It's not necessarily simple to provide the new feature (pip doesn't use virtualenv, so there's no existing means to provide --system-site-packages), but it could be done. It would need coding (along with tests and documentation), so it's basically a case of "PRs welcome".

@paugier
Copy link
Author

paugier commented Jan 17, 2019

"hard dependencies" installed automatically, and the optional dependencies used opportunistically

Exactly!

It seems to me that it would be much better to write this information (that some packages need to be accessible to setup.py opportunistically) in the repository (in practice in the pyproject.toml file) rather than to ask all users to add an option to the pip command line.

It's a property of the package, not of one installation. And we have a file to contain this type of information, called pyproject.toml... Moreover, it would be nearly useless to install packages that need this feature without it.

The maintainers would get installation issues just because users forget to use this little-known pip option, especially if we end up with both --no-build-isolation (no build isolation and no automatic installation of hard dependencies) and another option ??? for just no build isolation (with automatic installation of hard dependencies)...

How would it work for dependencies? Would all dependencies be installed in this mode even if they don't need it? If not, how would one apply this option to a single dependency?

@pganssle
Copy link
Member

It seems to me that it would be much better to write this information (that some packages need to be accessible to setup.py opportunistically) in the repository (in practice in the pyproject.toml file) rather than to ask all users to add an option to the pip command line.

This is not something I would expect you to want end users to do at all. It seems like it would be a terribly bad idea to have the capabilities of your deployed application depend on the build-time environment of the users, except in very rare cases.

I don't like the idea of opportunistic dependencies at build time - that means what gets installed depends on what happened to be installed when you were doing the build, which is something I think should be opt-in for end users. What may be useful would be a build-time equivalent of extras, so that you could maybe have something like pip install mypkg and pip install mypkg[mpi], which would also install the "mpi" dependencies into the build environment when doing a build.

I think the traditional way to do this is to break out the "optional" parts of your package into separate packages and make them extras dependencies. I don't think that would work in the case of something like pyyaml, though, which for some time (maybe still) had a pure-Python version that was installed if you didn't have libyaml available at build time and a C extension that was built otherwise (that particular case wouldn't matter here because libyaml is not pip-installable anyway, but you get the idea: an enhancement that can't easily be broken out into a separate package).

@pfmoore
Copy link
Member

pfmoore commented Jan 17, 2019

It seems to me that it would be much better to write this information (that some packages need to be accessible to setup.py opportunistically) in the repository (in practice in the pyproject.toml file) rather than to ask all users to add an option to the pip command line.

OK, then if you want that you definitely need a PEP revision/update. Some things that will need to be considered:

  1. PEP 517 doesn't require build tools to offer an isolation mode like this, so the proposal needs to cover how tools that don't implement this sort of build isolation should behave. Either that, or PEP 517 needs to make significantly greater demands on the forms of isolation PEP compliant tools must provide. As an example of this, the pep517 project provides wrappers to implement build_wheel with build isolation, so a standard update will mean changes to that as well as to pip.
  2. PEP 518 needs an update to add the new standard key, explaining its semantics in a way that doesn't refer to any pip-specific behaviours. It also needs to explain it in a way that doesn't rely on PEP 517 (tools can implement PEP 518 and not PEP 517).

As a pip option, this is simply a quality of implementation option for pip. As a standard pyproject.toml key, it has significant implications for the whole ecosystem (in theory).

As another possibility, maybe there's an option to make it a pip option, but add a feature to pip to read project-specific options from a tool-specific [tool.pip] section in pyproject.toml? That would still be a lot of work, but it would be limited to affecting pip, and it could be handled within the existing standards.
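
Purely to illustrate that idea (no such [tool.pip] section exists today; the key name below is hypothetical), it might look something like:

[build-system]
requires = ["setuptools", "wheel", "numpy", "cython"]

[tool.pip]
# hypothetical: packages visible to the build only if already installed
build-isolation-exceptions = ["mpi4py", "pythran"]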

How would it work for dependencies?

Good question.

Conversely, for your option, how would something like pip install mpi4py fluidsim work - given that pip makes no guarantees that it will install the 2 listed items in any particular order? I don't think we want the installed result to be different depending on an arbitrary decision that pip makes.

@xavfernandez
Copy link
Member

Optional build dependencies seem to me like a bad idea.
If A is an optional build dependency of B, then pip install A; pip install B and pip install B; pip install A would both end up with A & B installed, but somehow the installed code of B would be different?
This seems like a perfect recipe for painful debugging.

This would also mean that the wheel cache would have to be deactivated for such packages. Or we would need to have different caches depending on the currently installed packages 🙅‍♂️

@pradyunsg pradyunsg added state: needs discussion This needs some more discussion type: feature request Request for a new feature and removed PEP implementation Involves some PEP type: support User Support labels Jan 18, 2019
@paugier
Copy link
Author

paugier commented Jan 18, 2019

Optional build dependencies seem to me like a bad idea.

Interesting that you think it's bad. Can you please propose a better approach?

I haven't experienced much "painful debugging" related to this aspect of the packages I work with.

If someone sees that a feature is not present (for example no MPI support), it's pretty simple to rebuild the package (pip install fluidsim --force-reinstall --no-cache-dir) and it's solved. Of course, it is less simple than for packages without optional build dependencies, but it is not so bad. I think it is not the solution which is bad or too complicated for the use case; it is the use case which is complicated.

Note that optional build dependencies are quite common for non-Python dependencies. See for example how mpi4py chooses which MPI library should be used, or how projects choose which BLAS implementation should be used.

I use optional build dependencies for (1) MPI support and (2) Python-Numpy compilers (here, only Pythran).

  1. MPI and mpi4py: note that it is not possible to add mpi4py as a build dependency in a pyproject.toml file. There is no wheel available on PyPI for mpi4py and it's a deliberate choice of the maintainer (https://bitbucket.org/mpi4py/mpi4py/issues/96/linux-wheels-using-manylinux#comment-44905699).

Also, pip is of course not able to install the MPI library itself, so pip cannot install mpi4py on machines on which the user has not previously installed OpenMPI or another implementation.

Then, we can think about alternatives to my "bad idea" (build depending on installed packages):

  • build depending on environment variables (same bugs, more complicated for the users; see the sketch after this list)
  • build configuration files (same problems)
  • pip install fluidsim[mpi]. Could be OK but not so simple (because there is no wheel for mpi4py and pip can't install OpenMPI).
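
As mentioned in the first alternative above, a sketch of what an environment-variable-driven build could look like (the variable name is invented for illustration):

# setup.py fragment (sketch)
import os

from setuptools import Extension

ext_modules = []
if os.environ.get("FLUIDSIM_USE_MPI", "0") == "1":
    # hypothetical opt-in: the user exports FLUIDSIM_USE_MPI=1 before building
    import mpi4py

    ext_modules.append(
        Extension(
            "fluidsim._mpi",
            ["src/_mpi.c"],
            include_dirs=[mpi4py.get_include()],
        )
    )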

@xavfernandez How would you solve this problem then?

  2. Python-Numpy ahead-of-time compilers (for example Pythran)

I think it's going to be more and more important to be able to compile the same Python-Numpy code with different tools, for example to target different hardware.

So the package should be able to use such tools opportunistically. Note that such tools are quite new (not 100% sure everything works everywhere) and can lead to long and memory-consuming builds. We really don't want to add pythran as a hard build dependency (except if we provide a wheel, but here I can't).

I agree that in that case we could use pip install fluidsim[pythran], but we would have the same problems as with my solution:

pip install pythran
pip install fluidsim

(which is reasonable from the user's point of view) would lead to no Pythran compilation. "This seems like a perfect recipe for painful debugging", as you say 🙂

@webknjaz
Copy link
Member

@paugier FTR I've hacked around this via another PEP 517 backend which, under the hood, tries doing pip install on the optional deps and ignores any failures: https://github.com/aio-libs/aiohttp/pull/3589/files#diff-522adf759addbd3b193c74ca85243f7dR21
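
Roughly, the shape of that kind of hack (a simplified sketch, not the actual aiohttp code): an in-tree backend that wraps setuptools.build_meta and tries to pip install the optional dependencies, swallowing failures:

# _build_backend.py (sketch)
import subprocess
import sys

from setuptools import build_meta as _orig
from setuptools.build_meta import *  # noqa: F401,F403  (re-export the standard hooks)

OPTIONAL_BUILD_DEPS = ["mpi4py", "pythran"]  # illustrative

def _install_optional():
    for dep in OPTIONAL_BUILD_DEPS:
        try:
            subprocess.check_call([sys.executable, "-m", "pip", "install", dep])
        except subprocess.CalledProcessError:
            # the dependency stays optional: ignore installation failures
            pass

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    _install_optional()
    return _orig.build_wheel(wheel_directory, config_settings, metadata_directory)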

@Ckarles
Copy link

Ckarles commented Dec 17, 2020

Hello,

I'm not sure about the current state of this subject, and although it's over my head and I'm unsure whether it's relevant, I cannot read about optional dependencies without mentioning how Gentoo handles them (start by looking at "USE-Conditional Dependencies"): https://devmanual.gentoo.org/general-concepts/dependencies/

@webknjaz
Copy link
Member

@Ckarles Yeah, when we started the discussion back at PyCon 2019, I mentioned those to folks. The most dynamic discussions are happening at https://discuss.python.org, so I recommend subscribing there and maybe helping to push the efforts forward :)

@uranusjr
Copy link
Member

uranusjr commented Dec 17, 2020

I’m sure build-time feature detection and conditional compilation have their use cases, but I believe it is, in the vast majority of cases, a suboptimal design, and I would encourage people wanting this behaviour to look into alternatives if possible.

Modern Python packages are always installed via wheels, which is also a distribution format. With conditional compilation, two wheels with the same name may contain logically very different things, which makes me feel quite uncomfortable. But maybe I’m just misunderstanding the wheel format here.

But the more important issue to me is that the choice of whether to include a functionality in a library should always be an explicit flag exposed to the user, instead of being implicitly set based on the build-time environment. This is kind of a problem with pip, in fact; CPython’s configure script by default disables parts of the stdlib if the underlying C library is not available, e.g. OpenSSL for ssl. Eventually some user mis-configures the system and compiles CPython (probably via pyenv) without ssl, and pip is stuck in a place where we cannot distinguish whether people are disabling ssl intentionally or by accident, and has to deal with confused users who cannot understand our error messages. This kind of support issue could be largely eliminated if the choice to disable OpenSSL were made explicit instead.

Following this line of thought, I feel the best approach to the issue, if viable, is to split the features into separate packages, and manage them with extras.

  • foo-core holds essential things that must compile.
  • foo-feature-bar holds things that only compile if libbar is available. foo-core declares foo-feature-bar as a dependency of the bar extra, so people can pip install foo-core[bar] (see the sketch after this list).
  • A foo package that depends on the "recommended" feature packages, so people can use that by default instead of listing many extras.
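
To make the split concrete, a sketch of how foo-core could declare that extra with today's metadata (package names are the placeholders from the list above):

# setup.cfg of foo-core (sketch)
[options.extras_require]
bar = foo-feature-bar

Users then opt in explicitly with pip install foo-core[bar], and the foo meta-package simply lists foo-core plus the recommended feature packages in its install_requires.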

Now this does not work too well with pip's current environment isolation approach, since foo-feature-* packages would likely all need to declare a build-time dependency on foo-core and cause it to be compiled multiple times. But this is more of a technical issue that can be improved purely with a better implementation, which is much easier to achieve than working on a specification that opens up the problems described above and may require even more specifications to patch up.

This got a bit longer than I initially anticipated. Sorry for the rambling.

@webknjaz
Copy link
Member

@uranusjr I think the idea is that we need a better naming convention for wheels so that feature-flagged builds could be distinguishable.

@pfmoore
Copy link
Member

pfmoore commented Dec 17, 2020

I think the idea is that we need a better naming convention for wheels so that feature-flagged builds could be distinguishable.

That's indeed the key thing here, but no-one has yet come up with a suitable naming/tagging convention. That's the first step, and only once that's been done can we really start looking at adding tool support. Such a convention needs to be proposed as a PEP, and discussed with the whole packaging community, not just on the pip tracker.

@webknjaz
Copy link
Member

I'd like to share that I have now solved the case we had in @aio-libs (and some other places) by using an in-tree backend that wraps the setuptools one (which is documented at https://setuptools.pypa.io/en/latest/build_meta.html#dynamic-build-dependencies-and-other-build-meta-tweaks). My implementation is somewhat more invasive, though.

The feature-toggling interface is exposed via PEP 517 config_settings (pure-python=true / pure-python=false), which auto-injects Cython through hooks so it's only added for wheels/editables: https://yarl.aio-libs.org/en/latest/changes/#released-versions. Check out the Git repo if you're curious.

This case doesn't really need special tagging support for wheels since it's either a pure-Python wheel or one of the platform-specific ones. And the Cython line tracing is only useful in the context of development/testing/contributing to the project itself, so it's not like people would be using that when installing from PyPI, especially since they would almost certainly hit the published wheels.
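
For reference, the documented wrapping pattern linked above boils down to something like this (a condensed sketch; the condition and file layout are illustrative, not the actual aio-libs code):

# pyproject.toml
[build-system]
requires = ["setuptools"]
build-backend = "backend"
backend-path = ["_custom_build"]

# _custom_build/backend.py
from setuptools import build_meta as _orig
from setuptools.build_meta import *  # noqa: F401,F403  (re-export the standard hooks)

def get_requires_for_build_wheel(config_settings=None):
    extra = []
    if (config_settings or {}).get("pure-python") == "false":
        extra.append("Cython")  # only needed when building the compiled variant
    return _orig.get_requires_for_build_wheel(config_settings) + extra

The toggle can then be passed via pip's --config-settings option (e.g. pip install . --config-settings pure-python=false), assuming a pip version new enough to forward config settings to the backend.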

@pradyunsg pradyunsg removed the state: needs discussion This needs some more discussion label Dec 20, 2023
@pradyunsg
Copy link
Member

I think I'm going to go ahead and close this out on the basis that the proper solution here is to have some mechanism to convey "features" in a generated wheel such that it is possible to indicate whether or not it is compatible with a more complex requirement specification than what exists already for Python packages.

As Paul said in #6144 (comment), this is something that needs wider design discussion than pip's issue tracker.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 20, 2024