Proposal: Pin only minimum versions of pip and setuptools #2
Comments
Related: StackStorm/st2#4750
Looks like master is failing: https://travis-ci.org/github/StackStorm/st2/jobs/680685353#L1059 We just noticed our internal CI started failing last night with the same error. Looks like not pinning `pip` is the culprit.
As the person who just had to fix this as a "hot fix", I've changed my mind and would prefer to keep this pinned in the future. Then we can upgrade and fix "on our own time".
There are many problems with unpinning, though, related to build reproducibility and consistency. And it's not only about pip. Even if we unpin, the https://github.com/stackstorm/st2packaging-dockerfiles are generated at build time. This will still pin a version during the Dockerfile build and make the pip version inconsistent across the repos. We started to pin pip and other Python build tools following the operational knowledge and fires we fought before. Looks like @nmaludy just caught one of them after pip unpinning in master. What makes sense to me today, keeping in mind previous incidents, is to keep pip pinned and upgrade it periodically on our own schedule.
This will give us more control and a cyclic approach to pip maintenance, rather than more chances of a sudden fire at any time with pip unpinning. I'd prefer us to keep the maintenance minimal, which would potentially win us more time to spend on other important activities.
TL;DR
If we use our tools properly, we might be able to relax their version requirements a bit, take advantage of their new features, and allow our transitive dependencies to take advantage of the new features as well, all without breaking the build all of the time.
Background
PR #4895
This came up in PR StackStorm/st2#4895.
In that PR, pip and setuptools were pinned to specific versions.
The setuptools version being pinned that tightly caused a bug when a transitive dependency (read: a dependency of a dependency) was updated to only support Python 3. This happened twice in recent history:
First case

`python-mistralclient` depended on `osc-lib >= 1.8.0` (resolved to `2.0.0`), which depended on `openstacksdk >= 0.15.0` (resolved to `0.44.0`), which depended on `futurist >= 2.1.0`, which does not support Python 2.

Second case
- `kombu` requires `importlib-metadata >= 0.18`
- the latest `importlib-metadata` is `1.5.2`
- `importlib-metadata` requires `configparser >= 3.5`
- the latest `configparser` is `5.0`
- the latest `configparser` that supports Python 2 is version `4.0.2`
As noted in this PR to importlib-metadata, all of these dependencies properly support the `Requires-Python` metadata field.

Example
It's easier to give an example than to describe it in prose. Supporting that field means that, in a Python project's `setup.py` file, they can specify the `python_requires` keyword argument to `setup()` (which produces the `Requires-Python` metadata field), like so:
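A minimal sketch of such a `setup.py` (the package name and version here are placeholders):

```python
from setuptools import setup

setup(
    name="example-package",   # placeholder
    version="1.0.0",          # placeholder
    # Declares which interpreter versions this release supports. pip >= 9
    # reads the resulting Requires-Python metadata and skips this release
    # on unsupported interpreters.
    python_requires=">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*",
)
```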
That means that pip version 9 or later, when installing packages in a Python 2.7 environment, will ignore any candidate packages that do not declare that they can run on Python 2.7.
Building ST2 in Travis
However, some of our "build" (e.g. CI) commands involve installing packages, but do not involve pip. We run `python setup.py develop` and `python setup.py install` in a few different parts of the `Makefile`:

- `st2common` drivers
- `st2common` metrics drivers
- `st2client` installation
- `st2common` metrics drivers again

All of these commands don't rely on pip to install packages; they rely on setuptools to do so itself. However, the problem comes in when...
The problem
setuptools < 42 doesn't honor `Requires-Python` when resolving dependencies

The setuptools project only started honoring the `Requires-Python` metadata field during dependency resolution in version 42.0.0, when it started using pip to fetch dependencies.

This means that when we pin setuptools to a specific version, we guarantee that when transitive dependencies of ST2 start relying on features of more modern versions of our tools, we will miss out on those changes, and our transitive dependencies will break our build.
Problems with other solutions
There are some potential workarounds.
Pin transitive dependency versions
We could pin `futurist` and `configparser` to versions that still support Python 2. This would ensure that both older versions of pip and setuptools do not break the Python 2.7 build.

However, this approach does not scale. We would have to track layers upon layers of dependencies, even though we do not directly depend on or import those packages. Furthermore, this problem is only going to get worse as more and more projects drop support for Python 2.7. Each one will cause a build failure for us, and we will have to troubleshoot, fix, test, review, and merge every time that happens. This is not a good use of anybody's time.
Use `pip-compile` (from pip-tools) to catch dependency conflicts

As implemented in StackStorm/st2#4895, we now use `pip-compile` to check for dependency version conflicts across our entire installation. The output of `pip-compile` is much more complete and explicit than the previous output of `pipconflictchecker`.

But this solution only checks that our dependencies don't conflict. Our setuptools commands will continue to try to install dependencies that are Python 3-only, even in our Python 2.7 tests.
The proposal
The only workable solution was to update setuptools to version `42.0.0` or later. With that change, both pip and setuptools respect a project's `Requires-Python` field and, when run in Python 2.7 environments, only try to install packages that declare compatibility with that version of Python.

However, pinning setuptools (or pip) to a specific version just recreates the same situation in the future. If some other feature is introduced to pip or setuptools, and any of our transitive dependencies starts using that feature before we can update our build to rely on up-to-date versions of pip and setuptools, those changes will again cause our builds to fail.
Instead of pinning pip and setuptools to specific versions during the build, I propose that we specify only their known minimum versions. Specifying `pip >= 9.0` and `setuptools >= 42.0.0` would ensure that whatever versions are installed at least support `Requires-Python`. As pip and setuptools release new versions that support new features, our builds would automatically incorporate those versions, and our transitive dependencies could use those features as soon as they were released without breaking our build.
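If we ever want a build step to fail fast when those minimums aren't met, here is one sketch (not part of the proposal itself; `pkg_resources` ships with setuptools):

```python
import pkg_resources

# Raises VersionConflict (or DistributionNotFound) at build time if the
# active environment's pip or setuptools predates Requires-Python support.
pkg_resources.require("pip>=9.0", "setuptools>=42.0.0")
```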
Potential drawbacks

So what if pip or setuptools release versions that break things?
That's a valid concern: it's something that has happened in the past, and it has - wait for it - broken our build.
But pip and setuptools are used by nearly every single Python project, so I think they have a pretty good test infrastructure, and excellent canaries if they do inadvertently release anything that isn't backwards compatible.
This isn't the only way for updates to break our stuff. We used to rely on the internals of pip being available at build time to parse `requirements.txt` files:

Previous workaround
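A sketch of roughly what that looked like, assuming the pip < 10 layout where `parse_requirements` lived at `pip.req` (the actual st2 helper may have differed):

```python
# pip < 10 exposed parse_requirements at pip.req; pip 10 moved it under
# pip._internal, which is what breaks imports like this one.
from pip.req import parse_requirements

def fetch_requirements(requirements_file_path):
    """Collect requirement strings (and any direct links) from a requirements file."""
    reqs = []
    links = []
    for req in parse_requirements(requirements_file_path, session=False):
        if req.link:
            links.append(str(req.link))
        reqs.append(str(req.req))
    return reqs, links
```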
However, that must have broken when pip updated; I assume that's why we now parse `requirements.txt` files ourselves:
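A minimal sketch of that kind of hand-rolled parser (hypothetical; the real helper in `dist_utils.py` is more involved):

```python
def fetch_requirements(requirements_file_path):
    """Naively collect requirement lines, skipping blanks and comments."""
    reqs = []
    with open(requirements_file_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # A PEP 508 environment marker such as
            #   futures ; python_version < "3.0"
            # is passed through verbatim here, marker and all, which is
            # exactly the kind of line this approach mishandles.
            reqs.append(line)
    return reqs
```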
Problems with the workaround

But note that parsing `requirements.txt` files ourselves also didn't work when I tried to use environment markers (PEP 508 lines such as `futures ; python_version < "3.0"`) in `requirements.txt` files.

I think that this is one reason why we think we have to pin pip to a specific version: because we import from the private module `pip._internal`.

A better way
But we can actually refactor `fetch_requirements` to use the `packaging` library to parse `requirements.txt` lines. Doing so would automatically parse environment markers for us instead of our doing it ourselves (which, even with my improvement, is still brittle code that probably still has bugs).
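A sketch of that refactor, assuming the `packaging` library is importable at build time (see the `setup_requires` idea below); editable and VCS lines would still need separate handling:

```python
from packaging.requirements import Requirement

def fetch_requirements(requirements_file_path):
    """Parse a requirements file with `packaging`, honoring environment markers."""
    reqs = []
    with open(requirements_file_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            req = Requirement(line)
            # Skip requirements whose marker does not match this interpreter,
            # e.g. futures ; python_version < "3.0" on a Python 3 build.
            if req.marker is not None and not req.marker.evaluate():
                continue
            reqs.append(str(req))
    return reqs
```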
However, the `dist_utils.py` file also comes with an ominous note:

```python
# NOTE: This script can't rely on any 3rd party dependency so we need to use this code here
```
There is no further definitive explanation for that note that I have been able to find. However, my current working theory is that when that code is being run, the virtualenv has been created and activated, but third party dependencies have not yet been installed via pip or setuptools, so all imports from third party packages will fail. We need a way to install dependencies during "build" time...
Using setuptools properly
After a little digging, I found that setuptools supports specifying build-time dependencies via the `setup_requires` keyword argument to `setup()`. There isn't an anchor to that section on the linked page, but paraphrasing it: `setup_requires` takes a string or list of strings naming the other distributions that need to be present for the setup script to run, and setuptools will attempt to obtain them before processing the rest of the setup script or commands.

So, instead of writing all of this brittle NIH code, and instead of pinning pip to a specific version so we can import its private internals (and if that sounds gross to you - good, that's very much the intent!), I believe all we need to do is specify the `setup_requires` argument to `setup()` in our subpackages, and then specify a minimum version constraint for both pip and setuptools:
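A hedged sketch of what that might look like in one subpackage's `setup.py` (names and versions are placeholders; the pip and setuptools minimums themselves would still be expressed wherever the environment is bootstrapped, e.g. the `Makefile`):

```python
from setuptools import setup, find_packages

setup(
    name="st2common",      # same pattern for st2client and the rest
    version="3.2.0",       # placeholder version
    packages=find_packages(),
    python_requires=">=2.7",
    # Build-time dependency: setuptools obtains this before running the
    # rest of the setup commands, so helpers like fetch_requirements can
    # import `packaging` instead of reaching into pip's private internals.
    setup_requires=["packaging"],
)
```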
Other Reasons
There are probably other reasons why we pin to specific versions, and I'd like a chance to investigate them fully and discuss them here. As demonstrated, some of our code is working around problems that have been solved in better ways by more recent versions of our tools, and we should use our tools as best we can instead of trying (and failing) to duplicate them.
I've realized that some of our inline comments and developer documentation are not detailed enough to fully understand some of these constraints, so at the very least we need to get better at fully explaining ourselves in pull requests and developer documentation, but ideally in comments in the code. We cannot continue to carry this information around in our heads; we need to get better at communicating our rationale for major decisions in the code itself.
Possible compromise
Now, some of this might be a "bridge too far" for some people. Reproducible builds are becoming more and more important, and I think we should support them too.
To that end, one possible compromise would be to pin to specific versions of pip and setuptools only when we release a version of StackStorm. For example, ST2 v3.1 was built with `pip == 19.3.1` and `setuptools == 41.0.1`. Then, during development, we would only specify a minimum version, so we continue to incorporate the most recent versions of our tools as they are released. So the git branch for ST2 `v3.1.0` (`git checkout v3.1.0`) would pin to specific versions of pip and setuptools in the `Makefile`, and the git `master` branch (`git checkout master`) would only specify minimum versions for pip and setuptools in the `Makefile`.
Updating and pinning the versions could be incorporated into the release process.
This would still allow us to achieve completely reproducible builds while preventing the development branch from breaking all the time.
Further Reading

- `pyproject.toml` - I need to read up more on this, but the Python ecosystem seems to be moving to use `pyproject.toml` instead of `setup.py`. This post goes through the problems with setuptools, the chicken-and-egg problem, and how `pyproject.toml` solves these problems. I'm not done reading it yet (tons of interrupts all day), but the first few paragraphs looked promising. If this is The Way™️ we should go, then we will need to update our dependency-updating scripts and `Makefile` around `pyproject.toml` files.