How to bootstrap setuptools from source #980

Closed
adamjstewart opened this Issue Feb 20, 2017 · 66 comments

@adamjstewart

adamjstewart commented Feb 20, 2017

Hi, I'm a developer for the Spack package manager. We've had a setuptools package that has worked fine for a while now, and the vast majority of our Python packages depend on it. However, I went to update to the latest version of setuptools and noticed that it now depends on six, packaging, and appdirs. Unfortunately all 3 of these packages also depend on setuptools. six and packaging can fall back on distutils.core if necessary, but the setup.py for appdirs has a hard setuptools import.

It seems to me that it is no longer possible to build setuptools from source. Is this correct?
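
The cycle described above can be sketched as a tiny dependency graph; the package lists below just restate the relationships from this comment, and the traversal is an illustrative helper, not Spack's actual resolver:

```python
# The circular dependency at issue: setuptools now requires six,
# packaging, and appdirs, but each of those needs setuptools to build.
deps = {
    "setuptools": ["six", "packaging", "appdirs"],
    "six": ["setuptools"],
    "packaging": ["setuptools"],
    "appdirs": ["setuptools"],
}

def has_cycle(graph, node, seen=()):
    """Depth-first search for a cycle reachable from `node`."""
    if node in seen:
        return True
    return any(has_cycle(graph, dep, seen + (node,))
               for dep in graph.get(node, []))

print(has_cycle(deps, "setuptools"))  # -> True
```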

@adamjstewart

adamjstewart commented Feb 20, 2017

An easy solution to this problem would be to replace:

from setuptools import setup

with:

try:
    from setuptools import setup
except ImportError:
    from distutils.core import setup

in the appdirs setup.py. That's what six and packaging do. If you agree, I can reach out to the appdirs devs and see what they think.

@adamjstewart

adamjstewart commented Feb 21, 2017

Ok, I hacked our package manager so that it can still install setuptools. spack/spack#3198 contains all of the changes necessary to build setuptools from source now that the dependencies are no longer vendored. I understand the frustration that comes with having to maintain vendors of all of your dependencies; I just hope that you don't add any dependencies that cannot be built with distutils.core instead of setuptools.

@jaraco

Member

jaraco commented Feb 21, 2017

Can you use pip to install the dependencies from wheels (which don't require setuptools to install)? That's the recommended procedure. Or can your package manager install the wheels?

@adamjstewart

adamjstewart commented Feb 21, 2017

Unfortunately, a package manager that requires another package manager kind of defeats the purpose, don't you think? It looks like other developers have already contributed patches to fix appdirs and pyparsing so that they can be built without setuptools. Assuming the developers approve those patches, we should be good for now.

@jaraco

Member

jaraco commented Feb 21, 2017

Yes, I sort of see that. I worry this approach may conflict with the long-term strategy for setuptools, which is that it might require arbitrary dependencies, some of which require setuptools. In fact, an early goal is to move pkg_resources into its own packages, and that package would probably require setuptools to install. Is there a reason your package manager can't have pre-built versions of these packages available or vendor them itself (into the setuptools build recipe)?

@adamjstewart

adamjstewart commented Feb 21, 2017

Is there a reason your package manager can't have pre-built versions of these packages available or vendor them itself (into the setuptools build recipe)?

In general, yes, there is a reason. Our package manager is designed to install software in the world of HPC, where you may need to use exotic compilers like xlc to install something on a bluegene-q or cray supercomputer. Since we need to support hundreds of operating systems and dozens of compilers, we haven't spent much time on supporting binary package installation like Python's wheels or Homebrew's taps. We have plans to do so in the future, but it has been low priority. Obviously, the compiler/OS doesn't matter much for non-compiled packages like setuptools, but the mechanism would have to be the same.

We have dealt with circular dependencies before. For example, pkg-config depends on glib and vice versa. Luckily the developers realized this, and pkg-config comes with its own internal copy of glib, much like setuptools used to come with its dependencies. This can be annoying since we end up having to add the same patches for glib to both packages, but it prevents a circular dependency, which is nice.

We could theoretically vendor the dependencies ourselves. That seems like the easiest solution to me if we ever run into a setuptools dependency where distutils.core is insufficient to handle the installation. I see a lot of packages that use distutils.core as a fallback. Out of curiosity, how common is it to find a package that uses features of setuptools that are not available in distutils.core?
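
For context, the kind of setup.py that cannot fall back on distutils looks roughly like this illustrative fragment (the project name and entry point are made up); plain distutils.core.setup would merely warn about and ignore the setuptools-only keywords:

```python
from setuptools import setup, find_packages

setup(
    name="example",          # hypothetical project
    version="0.1",
    packages=find_packages(),
    # setuptools-only features; distutils.core.setup does not
    # understand any of the keywords below:
    install_requires=["six"],
    entry_points={"console_scripts": ["example = example:main"]},
    python_requires=">=2.7",
)
```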

@jaraco

Member

jaraco commented Feb 21, 2017

I'd say it's fairly common. And some packagers might be depending on those features and not realizing it, because pip will pre-import setuptools and setuptools monkey-patches distutils, so even a package that only imports distutils.core will be installed with all the setuptools hooks when installed under pip. Of course, this all should change when PEP 518 lands and makes it possible to declare build dependencies and thus enable dropping the implicit invocation of setuptools.
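
In the abstract, the monkey-patching jaraco describes is just rebinding names in another module at import time; this toy sketch (module and function names are stand-ins, not the real setuptools internals) shows why a distutils-only setup.py still runs with the enhanced behavior:

```python
import types

# Stand-in for distutils.core: a module exposing a plain setup().
plain = types.ModuleType("plain_core")
plain.setup = lambda **kw: "plain"

# Stand-in for what importing setuptools does: rebind the name so
# every later caller of plain.setup() gets the enhanced version,
# whether it asked for it or not.
def enhanced_setup(**kw):
    return "enhanced"

plain.setup = enhanced_setup

print(plain.setup())  # -> enhanced
```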

@adamjstewart

adamjstewart commented Feb 21, 2017

@tgamblin @svenevs may also want to get in on this discussion.

@svenevs

svenevs commented Feb 21, 2017

Muahahahaha! @adamjstewart I hope this discussion pushes spack over the edge and you ditch the efforts of replicating PyPI. Things like numpy and scipy -- totally, keep them; if people pip install them then they are at fault (spack has never been easy, you can make minimal assumptions).

For things like ipdb or sphinx, there's literally no reason for spack to be providing those.

The below may not be appropriate for this thread, and some of it gets convoluted quickly. Please disregard that which you feel is not appropriate here, apologies in advance!

@jaraco is it correct that the general expectation will be

  1. A user installs python (aka their OS, which came with python in most cases)
  2. setuptools will never be packaged directly into this (the whole distutils split -- different conversation), but with the changes in #581 users are now expected to install a pre-built version.
  3. This pre-built version should be independent of anything managed by pip in the ideal case.
    • AKA frown upon ye who pip install setuptools?

Is this correct, or should pip be used to install setuptools as the preferred approach on systems that have it available both in their native package manager and received pip out of the box? The conversation got a little confusing on the other thread; setuptools is fundamental to so many things on pip, so it seems reasonable to just assume it exists after a certain point.

Edit: I was browsing related issues, and things seem clearer now. When PEP 518 is standardized, pyproject.toml is the thing spack should ultimately be checking for setuptools? e.g. when pkg_resources splits, it will then be listed in the .toml for setuptools?

Do you happen to know if there is a "safe" way of configuring pip to not install / touch anything underneath the site-packages directory? E.g. only supporting virtualenv or ~/.pip? Adam and I had both looked into doing this, but it sounds like you are saying even if we found a way to prevent pip from behaving this way, that might cause problems (the "monkeypatching" business).

@tgamblin

tgamblin commented Feb 21, 2017

@jaraco: how do the plans for setuptools affect efforts by OS distro folks to create reproducible builds (podcast) of packages? It seems like these changes will make it very hard to validate a Python package entirely from source, and there are good reasons to make that possible.

Even if you have build dependencies and can install the build deps from wheels, that is still a binary in the chain that someone has to trust. So I worry a bit about this direction. I suppose that we could implement some bootstrapping logic that goes back to older versions of setuptools, if we need to, or we could rely on a baseline set of "trusted" wheels for setuptools (adamjstewart is right that we don't have binary installs yet, but they're not that hard to add). But I'd rather reproduce from source, and I suspect other distros would too.

@svenevs: other than reproducibility, the main issue with relying on pip in spack is that we need to be able to mirror builds on airgapped networks. Spack traverses a dependency DAG, downloads all the sources, and lets you build that on networks that aren't connected to the internet. If we rely on pip, we can't do that.

@adamjstewart

adamjstewart commented Feb 21, 2017

In general, requiring yourself as a dependency to bootstrap yourself leads to a lot of headaches. For example, we are having a lot of nightmares trying to package openjdk/icedtea right now. If you need a java compiler to build a java compiler, how do you install the first java compiler? C/C++ compilers generally come with the operating system, but I don't think java compilers do.

@jaraco

Member

jaraco commented Feb 23, 2017

suppose that we could implement some bootstrapping logic that goes back to older versions of setuptools, if we need to, or we could rely on a baseline set of "trusted" wheels for setuptools

I wouldn't recommend using older versions of setuptools. That's an unsustainable approach in the long run.

You could rely on trusted wheels, or you could even vendor your own bootstrapping logic. That is, you could write your own routine that resolves the setuptools build dependencies (or hard-codes them), builds them into a viable form for import, and injects them onto sys.path before building setuptools.
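
The "inject them onto sys.path" routine sketched above could look roughly like the following; the helper name and the throwaway module are made up for illustration, and real build dependencies would be unpacked sdists rather than a single file:

```python
import pathlib
import sys
import tempfile

def inject_sources(src_dirs):
    """Prepend unpacked source trees to sys.path so their top-level
    packages become importable without an install step."""
    for d in reversed(src_dirs):
        sys.path.insert(0, str(d))

# Demo: a throwaway module standing in for an unpacked build dependency.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "fakedep.py").write_text("VERSION = '1.0'\n")
inject_sources([tmp])

import fakedep
print(fakedep.VERSION)  # -> 1.0
```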

Hmm. This makes me wonder if setuptools should bundle its build dependencies. Rather than vendor its dependencies in general, it could bundle its build dependencies. I'll give that a go.

@jaraco

Member

jaraco commented Feb 23, 2017

I've drafted a new release in the feature/bundle-build-deps branch and released b1. Install it with pip install -i https://devpi.net/jaraco/dev/+simple/ setuptools==34.3.0b1 or grab the sdist and test it in your build environment.

I notice that this change won't affect the most common use-case, that of pip installing setuptools from source when dependencies aren't met. It fails because the setuptools shim is invoked before setuptools has a chance to inject its build dependencies.

@adamjstewart

adamjstewart commented Feb 23, 2017

This makes me wonder if setuptools should bundle its build dependencies. Rather than vendor its dependencies in general, it could bundle its build dependencies.

Can you elaborate on the distinction between the two options?

@jaraco

Member

jaraco commented Feb 23, 2017

Prior to Setuptools 34, setuptools would vendor its dependencies, including them as hacked-in copies of the dependencies. Thus import pkg_resources.extern.six would give the vendored version of six if present as pkg_resources._vendor.six or would fall back to the top-level six otherwise. It would do this even when installed. In this model, the dependencies were undeclared (nothing in install_requires).

With this new proposed technique, the dependencies are bundled into the sdist only to facilitate building and installing from source... but the dependencies are still declared as install_requires, thus relying on pip or easy_install or whatever installer to install those requirements, while providing a zip of those dependencies to use to build the package prior to installation. In the installed environment, there are no import hacks and the packages must have been installed properly and naturally.
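
The pre-34 lookup described above (prefer the bundled copy, fall back to the top-level package) amounts to roughly the following sketch; the vendor package path is illustrative, and the real mechanism is a custom import hook rather than a helper function:

```python
import importlib

def load_vendored(name, vendor_pkg="pkg_resources._vendor"):
    """Prefer the bundled copy of `name`; fall back to the top-level
    package, mirroring how pkg_resources.extern resolved imports."""
    try:
        return importlib.import_module(f"{vendor_pkg}.{name}")
    except ImportError:
        return importlib.import_module(name)

# With no vendored copy available, the top-level module is used.
mod = load_vendored("json", vendor_pkg="no_such_vendor")
print(mod.__name__)  # -> json
```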

@jakirkham

Contributor

jakirkham commented Feb 23, 2017

To summarize, this would allow us to build setuptools from an sdist on PyPI without other dependencies, correct?

@tgamblin

tgamblin commented Feb 23, 2017

With this new proposed technique, the dependencies are bundled into the sdist only to facilitate building and installing from source... but the dependencies are still declared as install_requires, thus relying on pip or easy_install or whatever installer to install those requirements, while providing a zip of those dependencies to use to build the package prior to installation. In the installed environment, there are no import hacks and the packages must have been installed properly and naturally.

@jaraco: Ok, so if I understand this correctly, we could then declare setuptools to have no (circular) build dependencies, and it would install fine from source. That is good. Then we could use the installed setuptools to install other packages as needed.

One question: are any of the build dependencies also run dependencies? If they are, it seems like we'll still need to make sure we install the bundled run dependencies along with setuptools when bootstrapping, or the installed setuptools won't run.

@jaraco

Member

jaraco commented Feb 23, 2017

are any of the build dependencies also run dependencies?

Yes. At the moment, the build dependencies and install dependencies are identical. So you would need those installed to invoke setuptools... which I can see is problematic if you then want to install the setuptools dependencies from source and they require setuptools.

I guess if you're using setup.py install, that will invoke easy_install, which will grab the installed dependencies and install them, but I don't recommend that.

I guess what it's coming down to is that whatever installs from source needs a way to install setuptools and its dependencies in one operation (as pip does with wheels, or as setuptools did when the packages were vendored).

@adamjstewart

adamjstewart commented Feb 23, 2017

I guess if you're using setup.py install, that will invoke easy_install, which will grab the installed dependencies and install them

Won't work on air-gapped clusters that have no connection to the outside world. We've been very careful about making sure that all of our packages can be installed without an internet connection.

@tgamblin

tgamblin commented Feb 23, 2017

I guess what it's coming down to is that whatever installs from source needs a way to install setuptools and its dependencies in one operation (as pip does with wheels, or as setuptools did when the packages were vendored).

Ok, so I guess I have two questions:

  1. Could we have an option to install the bundled dependencies along with setuptools? That would get us started.

  2. Is that not basically the same as vendoring the dependencies? If so, what was so bad about vendoring, especially since setuptools is effectively a leaf in every DAG? I'm trying to figure out what problem is being solved by un-vendoring (feel free to point me at some other discussion...)

@dstufft

Member

dstufft commented Feb 23, 2017

It's different from just vendoring because it only takes place at build time; thus it allows people to, say, upgrade the packaging that setuptools is using without having to also upgrade setuptools itself.

@tgamblin

tgamblin commented Feb 23, 2017

@dstufft: ok, so if we can get the bundled dependencies installed from source initially, so that we can use setuptools to install the other packages from source, does that fit the model, assuming the installed setuptools will use the newer versions when it finds them?

@FRidh

FRidh commented Mar 16, 2017

The Nix package manager is quite similar to Spack, and so in Nixpkgs we've encountered the exact same issue. We prefer to build from source/sdist, but we also have a function to build packages from wheels, so that gives us a bit more flexibility.

Some time ago I looked into how we could bootstrap setuptools, now that it doesn't vendor its dependencies anymore. The easiest solution for us would be to use the wheel for all its dependencies, but we prefer not to because we ideally refer to a "true" source and not a build product.

I agree that pip and setuptools should not be merged. They really have different functionality, and now that we're also getting more methods for building Python packages (e.g. flit), it becomes even more important to have them separated.

@jaraco

Member

jaraco commented Mar 25, 2017

Taking another stab at this issue, I have some thoughts, which ramble a bit, so I've broken it down into three segments.

The Two-Package Solution

I have another idea, almost a proposal. What if setuptools provided two packages: the current, natural one as is currently in place, and another, perhaps named setuptools_built or setuptools_bundle or setuptools_all, which is essentially a pre-built, dependency-free package (perhaps just setuptools with its dependencies vendored). This special package would be a "build dependency" of setuptools itself and any of its dependencies. As a build-only dependency, installers would be advised only to install this package in a transient manner for the duration of the build step, enabling packages like appdirs and requests and pyparsing to rely on setuptools for their builds (as they did before). This package, much like Setuptools 33 and earlier, would be buildable and installable with distutils only from source.

This approach is potentially messy, as it relies on installers not to install this package permanently, and could create weird behaviors that might prove difficult to troubleshoot (i.e. which setuptools was present when an error occurred). It does, however, nicely solve the bootstrapping problem (providing a bootstrapping package) and also elegantly solves the "which setuptools is required to install a given setuptools" problem, as setuptools would declare its own build dependency, allowing setuptools to eschew much of the logic in setuptools.sandbox which was created to support the self-install model.

To some extent, this idea is dependent on the implementation of PEP 518 and would require build systems like Spack and system packagers to honor that spec when building packages. That feels like a fairly small investment if it ultimately means that setuptools (and any other build tool) can declare its dependencies naturally, allow itself and its dependencies to depend on it (or other build tools), and also own the responsibility for providing bootstrapping.

Build Systems Should Special-Case Build Dependency Resolution

Alternatively, these build systems could be adapted to have special handling for build-time dependencies, such that if a package has a build-time dependency on setuptools (or other build tools), the resolution isn't to build/install those packages; instead there is a distinct mechanism for providing them. This approach would allow each build tool to have control over how these bootstrapping issues are addressed. Some might just vendor the known build dependencies (namely setuptools today, but sure to include others). Others might do some clever trick to resolve the dependency tree and put the sources on sys.path while building the build dependencies.

PyPA Provides Build Dependency Resolver

As I think about it more, perhaps this clever trick is exactly what we should develop - a tool for building and installing (transiently or permanently) a set of packages that are mutually dependent on each other. This tool would have special handling for working with source packages and making them available without installing them.

For the following example, I'll call this tool "source-resolver".

So for example, a system package builder wants to build 'appdirs'. 'appdirs' declares it has a build-time dependency on "setuptools" (any version), so the builder wraps the build step in a call to source_resolver('setuptools'). That routine will retrieve the setuptools source, extract it to a temporary directory, add it to sys.path, and then recurse on its dependencies. It will do this in a context and undo its changes at the close of the context. In this context, the builder invokes the build steps on appdirs, producing the package (and possibly installing it).
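
A minimal sketch of the source_resolver context described above, assuming the dependency source tree is already on disk and importable as-is (retrieval, extraction, and recursion on dependencies are omitted):

```python
import contextlib
import sys

@contextlib.contextmanager
def source_resolver(src_dir):
    """Make a source tree importable for the duration of a build step,
    then undo the sys.path change when the context closes."""
    sys.path.insert(0, str(src_dir))
    try:
        yield
    finally:
        sys.path.remove(str(src_dir))

# Usage sketch: wrap the build of 'appdirs' so its declared build
# dependency is importable only while building.
# with source_resolver("/tmp/setuptools-src"):
#     build_package("appdirs")   # hypothetical build step
```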

source_resolver, like pip, could take an argument of a local directory where the sources are already present.

This approach would require that all build dependencies are importable from source without an intermediate build or install step. Or alternatively, the spec could be expanded to allow packages to declare which of their build dependencies require this special handling. Or maybe this routine is used as a fallback when a build fails using the normal process.


There's a lot going on here, so I'm going to break for now and let these thoughts continue to simmer.

@jimfulton

Contributor

jimfulton commented Apr 2, 2017

Reposted from distutils-sig:

Today, I ran into trouble working with an old project that had six pinned to version 1.1.0. The install failed because buildout tried to install it as 1.10.0 and failed because 1.10.0 was already installed.

The problem arose because six's setup.py imports setuptools and then imports six to get the version. When Buildout runs a setup script, it puts its own path ahead of the distribution's, so the setup script would get whatever version buildout was running. IMO, this is a six bug, but wait, there's more.

I tried installing a pinned version with pip, using pip install -U six==1.9.0. This worked. I then tried with version 1.1.0, and this failed, because setuptools wouldn't work with 1.1.0. Pip puts the distribution ahead of its own path when running a setup script. setuptools requires six >= 1.6, so pip can't be used to install pinned versions (in requirements.txt) earlier than 1.6. Six is a wildly popular package and has been around for a long time. Earlier pins are likely.

I raise this here in the broader context of managing clashes between setuptools' requirements and the requirements of libraries (and applications using them) it's installing. I think Buildout's approach of putting its path first is better, although it was more painful in this instance.

I look forward to a time when we don't run scripts at install time (or are at least wildly less likely to).

Buildout is growing wheel support. It should have provided a workaround, but:

  • I happened to be trying to install a 1.1 pin and the earliest six wheel is for 1..
  • I tried installing six 1.8. Buildout's wheel extension depended on pip, which depends on setuptools and six. When buildout tries to load the extension, it tries to get the extension's dependencies, which includes six while honoring the version pin, which means it has to install six before it has wheel support. Obviously, this is Buildout's problem, but it illustrates the complexity that arises when packaging dependencies overlap dependencies of packages being managed.

IDK what the answer is. I'm just (re-)raising the issue and providing a data point. I suspect that packaging tools should manage their own dependencies independently. That's what was happening until recently IIUC for the pypa tools through vendoring. I didn't like vendoring, but I'm starting to see the wisdom of it. :)

@kennethreitz

kennethreitz commented May 29, 2017

This (depending on six, appdirs, pyparsing, etc) really complicates packaging on Heroku, FWIW.

@jaraco

Member

jaraco commented May 30, 2017

@kennethreitz: Is that because Heroku builds these packages from source? Why not install from wheels using pip? What is the complication?

@dstufft

Member

dstufft commented May 30, 2017

@jaraco Heroku installs setuptools (and pip) by default into those environments, but people tend to also depend on libraries like six, appdirs, etc in their own projects hosted on Heroku. The unvendoring means that setuptools and the project code are now competing over who gets to define what an acceptable version is for these libraries to be installed with.

@FRidh

FRidh commented May 30, 2017

This is an issue every other distribution is going to face eventually as well. Therefore, I recommend that each build tool vendor its dependencies. At least setuptools should, because to install a setuptools-based package one needs to import from setuptools.

@kennethreitz

kennethreitz commented May 30, 2017

@jaraco people do pip freeze > requirements.txt from older setuptools installations, which then conflicts with our installation.

Here's what I'm doing about it. heroku/heroku-buildpack-python#397

@jakirkham

Contributor

jakirkham commented May 30, 2017

Sorry if this has already been said; this discussion is quite long and I have unfortunately not had the time to read it fully. That said, I am still very much struggling with this issue on many fronts.

What if setuptools provided a mechanism to vendor in-place and then used that functionality for managing these dependencies? Given the discussion here it seems that vendoring needs to happen for setuptools, but this also needs to be done in a way where dependencies can be tracked and easily updated. This could also be a feature for those that need vendoring for other reasons. Admittedly vendoring is not a great thing to use, but some cases (like setuptools) seem to need it. Might as well make it less painful.

@jaraco

Member

jaraco commented May 31, 2017

With Setuptools 36, it once again vendors its dependencies.

@jaraco jaraco closed this May 31, 2017

@kennethreitz

kennethreitz commented May 31, 2017

\o/

@reinout

Contributor

reinout commented May 31, 2017

If I'm not too overenthusiastic... "when will it be released?" (I don't see it on pypi yet).

@jaraco

Member

jaraco commented May 31, 2017

Well, it should have been released automatically when I tagged the commit. I'll have to investigate why it wasn't.

@jaraco

Member

jaraco commented May 31, 2017

Aah, there was an error in the deploy step due to the removal of requirements.txt.

@jaraco

Member

jaraco commented May 31, 2017

It should now be on PyPI.

@jakirkham

Contributor

jakirkham commented May 31, 2017

Seems to not have been deployed as that change was not on a tag. Would it make sense to add a 36.0.1 tag?

ref: https://travis-ci.org/pypa/setuptools/jobs/237980870#L381

@nicoddemus

Contributor

nicoddemus commented May 31, 2017

IMHO it would make sense to just upload the release manually using twine, as it was a deployment problem and not a problem with the package.

@jaraco

Member

jaraco commented Jun 1, 2017

That's weird. I definitely ran the commands to cut a release. Oh, I ran setup.py release which built the dists but didn't upload them. :/

@jaraco

Member

jaraco commented Jun 1, 2017

Using twine, as recommended, the dists are now (verifiably) in PyPI.

@jakirkham

Contributor

jakirkham commented Jun 1, 2017

Great, thanks for doing this.
