
Start pinning doc and test requirements? #1566

Open
pganssle opened this issue Oct 29, 2018 · 17 comments
Labels
Needs Discussion, proposal

Comments

@pganssle (Member)

While filing issue #1565, I realized that if we don't do exact-version pins of the documentation requirements, some future person who wants to build old versions of the docs may start running into errors. The same goes for people trying to run old versions of the tests.

I'm thinking maybe we should start pinning all development requirements, and set up a bot to monitor them for changes and automatically bump them. That will make it much easier for people in the future to reproduce old behavior of the repo.

pganssle added the proposal and Needs Discussion labels on Oct 29, 2018
@jaraco (Member)

jaraco commented Oct 29, 2018

My (often unpopular) opinion on pinning is that one should only pin if there's a known or expected reason to pin. You've given one:

some future person who wants to build old versions of the docs may start running into errors.

That use case doesn't seem a very likely or strong one (as one could manually install older, compatible versions)... or simply go back to Travis or RTD and find out what versions were used at that time.

In my experience, pinning versions is a much higher maintenance burden than addressing emergent concerns when they arise. But if you (or anyone) are willing to absorb that burden (or better, mitigate it), then I'm +0.

@pganssle (Member, Author)

My (often unpopular) opinion on pinning is that one should only pin if there's a known or expected reason to pin.

In general I agree with this, and I strongly agree with it for libraries, but there are other reasons to pin the test dependencies as well. For example, the first person making a PR after a test dependency breaks something in our build will start getting broken CI, whereas if we have all our test dependencies pinned, the CI breaks on the PR that upgrades the dependency. No need to write a separate PR to fix the build, then rebase all the old PRs against it just to get CI working again.

That use case doesn't seem a very likely or strong one (as one could manually install older, compatible versions)... or simply go back to Travis or RTD and find out what versions were used at that time.

I have recently revisited an old project that didn't pin its versions, and the tests were failing. It was very hard to figure out which dependency versions had been in use when the tests last passed (though admittedly I did not have CI set up for this project). I still don't like the idea of relying on RTD and Travis to store this information.

In my experience, pinning versions is a much higher maintenance burden than addressing emergent concerns when they arise.

I can see why this would be the case, which is why I only advocate pinning if there's an out-of-the-box bot that will do the upgrades for us, and I believe such a bot exists. At that point we'll probably end up merging a lot more PRs, but we can simply merge every "upgrade dependencies" PR where CI passes.

Ideally we'd be doing the same thing with our vendored dependencies, actually, so that we keep the vendored dependencies relatively up-to-date.

@benoit-pierre (Member)

No need to write a separate PR to fix the build, then rebase all the old PRs against it just to get CI working again.

Just a note: you don't need to rebase, just close and re-open the PR.

@webknjaz (Member)

webknjaz commented Apr 7, 2019

My $0.02:

Test env/CI/docs dependencies most definitely should be pinned. However, it's a good practice to keep them up-to-date.

Yet it's not a job for a human; we've got robots to do that dirty work. Two that I've seen other people use (and have used myself) are pyup and dependabot.

They just send PRs against master with bumped dependencies, which triggers CI. Once CI is green, a human can just hit merge. That's it!

Well, if it's too boring, we can also automate merging on green CI...

@venthur (Contributor)

venthur commented May 15, 2019

Assuming that you're not sourcing the requirements.txt in setup.py, the rule of thumb is:

  • no version pinning in setup.py, except for known incompatibilities
  • always version pinning in requirements.txt

The requirements.txt defines a "deployment" where you actually want to know exactly what is installed (see the currently failing tests), whereas setup.py is what gets parsed when your package is installed on its own or as a (dependency's) dependency. Pinning in setup.py would create a complete mess for packages that merely depend on yours.

https://packaging.python.org/discussions/install-requires-vs-requirements/#install-requires-vs-requirements-files
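
A minimal sketch of that split (the project name and all versions below are invented for illustration):

    # setup.py -- abstract requirements: bounds only where needed
    from setuptools import setup

    setup(
        name="example",
        install_requires=[
            "requests >= 2.0",   # genuine minimum, no exact pin
            "six != 1.11.0",     # exclude a known-bad release only
        ],
    )

    # requirements.txt -- concrete deployment: everything pinned exactly
    requests==2.21.0
    six==1.12.0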

If you don't mind, I'd try to create a PR with pinned versions.

venthur added a commit to venthur/setuptools that referenced this issue May 15, 2019
@pganssle (Member, Author)

Assuming that you're not sourcing the requirements.txt in setup.py, the rule of thumb is:

  • no version pinning in setup.py, except for known incompatibilities
  • always version pinning in requirements.txt

This is not quite what we're talking about here. In general we want two "requirements.txt" files: one that is abstract and follows the "setup.py" rules, and one that is a concrete "lock file" that pins the test and documentation requirements and their transitive dependencies.

In any case, pinning the requirements is not the difficult part of this ticket; the harder part is setting up some sort of automated mechanism for updating the requirements files, since we want both reproducible test and doc environments and reasonably recent versions of our dependencies.

@venthur (Contributor)

venthur commented May 15, 2019

I thought it was more about the bug we stumbled upon yesterday when trying to fix #1761, where running the tests on a pristine master branch suddenly failed because of incompatible changes in one of the unpinned dependencies.

I still believe that pinning everything in requirements.txt is a good, or even necessary, start, although it is not as good as a proper lock file, for which there is no established standard yet. So even if this PR doesn't fix the whole issue, it should probably still be merged, as it removes some moving parts.

@webknjaz (Member)

@pganssle may I suggest having constraints.txt with versions and hashes pinned while keeping only direct deps in requirements.txt files?

@pganssle (Member, Author)

@webknjaz Yes, I always forget about constraints.txt, but I think that is the right separation here.

I still believe that pinning everything in requirements.txt is a good, or even necessary, start, although it is not as good as a proper lock file, for which there is no established standard yet. So even if this PR doesn't fix the whole issue, it should probably still be merged, as it removes some moving parts.

The way you have implemented it actually destroys information, because some of these things have minimum or maximum dependency requirements for a reason, and others are pinned to a specific value for a reason. Pinning all of them doesn't allow us to know which ones are safe to update and which ones aren't.

I think the way forward is to go with Sviatoslav's suggestion of a constraints.txt living alongside the requirements.txt, but it really needs to come with a way to update them, otherwise we'll just be indefinitely pinning old versions, and we probably won't remember to update our dependencies (this is already a problem with our vendored dependencies).
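
A hypothetical illustration of that split (all names and versions invented): requirements.txt keeps the intent, while constraints.txt holds the full freeze that a bot can regenerate wholesale.

    # requirements.txt -- direct deps, with the *reasons* encoded as ranges
    pytest >= 3.7     # genuine minimum we rely on
    sphinx != 2.0.0   # known-bad release excluded

    # constraints.txt -- every direct and transitive dep, pinned exactly
    pytest==4.6.5
    six==1.12.0
    sphinx==1.8.5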

@venthur (Contributor)

venthur commented May 16, 2019

I don't think I agree, especially not with the constraints.txt part. It seems to me that you're trying to emulate something like a Pipfile (which is AFAIK not a standard yet).

But in the end it's your decision, of course. I'd just like to point out that you have some responsibility here. Packaging in Python (despite all the awesome effort of you guys) is still very messy. People look at the specs, and then maybe at some example code from the Packaging Authority, only to find out that even they are not following the specs.

@webknjaz (Member)

@pganssle FTR I usually include a constraints reference in the requirements files so that it's automatically enforced:

-c constraints.txt

req1
req2
...

As for updating things, pyup provides a way of constraining things using comments: https://pyup.io/docs/bot/filter/
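
For example, filter comments in a requirements file look roughly like this (package names and ranges are illustrative; see the linked docs for the exact syntax):

    django # pyup: >=1.11,<2.0
    requests # pyup: ignore

The first line tells the bot to only propose updates within the given range; the second tells it to leave that requirement alone.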

If that doesn't fit the needs, this project may need its own robot, and I may be willing to collaborate on that. But we'll need to come up with clear requirements first.

@pganssle (Member, Author)

I don't think I agree, especially not with the constraints.txt part. It seems to me that you're trying to emulate something like a Pipfile (which is AFAIK not a standard yet).

If we do not freeze all dependencies (direct and transitive), we have the same problems as if we froze nothing. I'm not sure what problems a true lock file solves that pip freeze > constraints.txt doesn't solve, but I'm fairly sure constraining all direct and transitive dependencies solves the problem I'm trying to solve.

But in the end it's your decision, of course. I'd just like to point out that you have some responsibility here. Packaging in Python (despite all the awesome effort of you guys) is still very messy. People look at the specs, and then maybe at some example code from the Packaging Authority, only to find out that even they are not following the specs.

What you linked to is not a spec; it just explains the difference between the kinds of things people use install_requires for and the kinds of things people use requirements.txt for. That document is actually making the same point I'm making, but it is aimed at a different use case, one where it makes sense to divide the responsibility for the "abstract requirements" and the "pinned requirements" between install_requires and requirements.txt.

In setuptools, it doesn't really make sense to do that, because we're not using the "extras" mechanism of specifying dev dependencies, we're actually using dedicated requirements.txt files for the different dev tasks you might do with setuptools, which means that for those specific applications, requirements.txt is playing the part that install_requires plays in the document that you linked to.

The constraints.txt file that @webknjaz mentions is another common way to use pip, where you specify both a requirements.txt file and a constraints.txt file, and the versions that are installed as a result of installing the requirements file are constrained by the versions pinned in constraints.txt. You can update the constraints file very easily by simply deleting it, running pip install -r requirements.txt in a clean virtualenv, then running pip freeze > constraints.txt.
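
Concretely, the refresh cycle could be as simple as (a POSIX-shell sketch):

    python -m venv fresh
    fresh/bin/pip install -r requirements.txt
    fresh/bin/pip freeze > constraints.txt

Checking the regenerated constraints.txt in via a PR then lets CI validate the whole upgraded set at once.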

@webknjaz (Member)

webknjaz commented May 25, 2019

FTR dependabot is now owned by GitHub, so it's most likely the way to go, but we'd need to convince them to improve Python support... Oh, it looks like they are integrating it directly: https://github.blog/2019-05-23-introducing-new-ways-to-keep-your-code-secure/.

@pganssle (Member, Author)

So I was thinking about and discussing this a bit at the EuroPython sprint, and I think we'll probably want to go with something custom, at least for a little while. One problem we might have is that our constraints will be different for each version of Python we test against (e.g. we may be using pytest >= 5.0 on Python 3, but not on Python 2). It is a bit of a pain to use separate constraints.txt files per version, so we will need to generate a constraints.txt file for each version, then merge them with ; python_version == environment markers for each one.

Ideally we would also do some simple collapsing of requirements just to prevent these things from ballooning (e.g. merge together all the constraints that are the same for all versions, or for all versions >= a specific version).
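
For illustration, the merged file could end up looking something like this (version numbers are invented):

    pytest==4.6.5 ; python_version < "3"
    pytest==5.1.2 ; python_version >= "3"
    wheel==0.33.6    # identical across versions, collapsed to one line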

@webknjaz (Member)

@pganssle it looks like dependabot now uses pip-tools. Maybe it's worth trying it out. The maintainers seem to be open to improvements too...

@pganssle (Member, Author)

pganssle commented Sep 7, 2019

@webknjaz From what I've seen, dependabot introduces an enormous amount of spam into the PR queue, and each update is independent of the others. It would probably take a lot of work for it to be something I'd be happy with.

@webknjaz (Member)

So I was thinking about and discussing this a bit at the EuroPython sprint, and I think we'll probably want to go with something custom, at least for a little while. One problem we might have is that our constraints will be different for each version of Python we test against (e.g. we may be using pytest >= 5.0 on Python 3, but not on Python 2). It is a bit of a pain to use separate constraints.txt files per version, so we will need to generate a constraints.txt file for each version, then merge them with ; python_version == environment markers for each one.

Ideally we would also do some simple collapsing of requirements just to prevent these things from ballooning (e.g. merge together all the constraints that are the same for all versions, or for all versions >= a specific version).

A few updates here. I've been experimenting with an approach that seems to be close to what you were suggesting (as I understand it now). Basically, it's a tox integration with a pip wrapper that automatically adds --constraint to pip install invocations if a corresponding constraints file exists. The name of that file is a combination of the tox env, Python tag, platform, and arch.
With this, it's possible to have constraints specific to particular platform+runtime combinations, which keeps the constraints for one env from influencing or conflicting with the constraints of others.
I've shown a partial demo to Pradyun and he hasn't pointed out any fatal flaws. The PoC I've got is not yet complete: I only have the installation side of the process; constraints generation/updates are not yet automated. The generation part needs access to multiple platforms that are not normally available on dev laptops, so I'm planning to explore producing the updates via GHA, under the various VMs it provides.
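
A rough sketch of what such a wrapper could look like (everything here, including the constraints-file naming scheme and the use of tox's TOX_ENV_NAME variable, is an assumption for illustration, not the actual PoC):

    # pip_wrapper.py -- hypothetical sketch, not the real PoC
    import os
    import platform
    import subprocess
    import sys

    def main():
        env = os.environ.get("TOX_ENV_NAME", "default")   # set by tox >= 3.4
        py_tag = "py{}{}".format(*sys.version_info[:2])   # e.g. py37
        fname = "constraints-{}-{}-{}-{}.txt".format(
            env, py_tag, sys.platform, platform.machine()
        )
        args = sys.argv[1:]
        # Inject --constraint only for "install" runs with a matching file
        if args[:1] == ["install"] and os.path.exists(fname):
            args += ["--constraint", fname]
        subprocess.check_call([sys.executable, "-m", "pip"] + args)

    if __name__ == "__main__":
        main()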

TBC.

@webknjaz From what I've seen, dependabot introduces an enormous amount of spam into the PR queue, and each update is independent of the others. It would probably take a lot of work for it to be something I'd be happy with.

Yeah, but that's automatable now, plus GH is now alpha-testing a Zuul-like merge queue, plus we could specify monthly or quarterly updates if needed. The main showstopper right now is managing multiple constraints files, which, once I get it right, will open new doors for us.
