Pin all application requirements in requirements.txt
The list of top-level dependencies is moved to `requirements-app.txt`, which is used by `make compile-requirements` to generate the full list of requirements in `requirements.txt`. `requirements_for_test` and the related make step are renamed to `requirements-dev` to reflect the fact that it should contain any dependencies that aren't used by the app itself, including dependencies for tests, utility scripts and dev tools.

Rationale
---------

We've had a number of issues caused by unpinned dependencies (eg #607). They can cause things to fail at different stages depending on the root cause, but generally highlight a few problems with our current approach:

* The latest versions of unpinned packages are installed, so whenever there's a breaking change in a new package release we're forced to either update our code or pin the package alongside our top-level application dependencies.
* `pip install` is run at different times. Eg Travis CI will install dependencies on each test run, but our image builds will only run it if `requirements.txt` has changed. This means that the libraries we run the tests with could be different from the libraries used by the deployed application.
* Finding out which library release broke the build is hard because there's no easily available list of "known-good" versions. These could be extracted from previous application container versions, but the process isn't documented.

The common solution to this is to ship a full list of pinned dependencies. There are currently 3 ways to do this in Python:

1. Generating a `requirements.txt` file with `pip freeze`. This stores the full list of currently installed packages.
2. Using [pip-tools] to generate a full list of required packages from a list of top-level dependencies. pip-tools also provides utilities to upgrade package versions without modifying the files manually.
3. Defining dependencies in a [Pipfile] and using [pipenv] to manage pinned versions and the virtualenv.

[Pipfile] seems to be the planned replacement for requirements files; however, the tools appear to have some issues when applied to our repos. For example, [pipenv] doesn't appear to lock dependencies of VCS packages (eg utils and apiclient). It also doesn't detect non-semver package versions correctly (eg `functools32==3.2.3.post2`). We could fix the VCS dependencies issue by publishing our packages on PyPI, but my feeling is the tools aren't ready for production just yet.

[pip-tools] appears to be more commonly used at the moment. It works by examining the setup.py files of packages listed in a `requirements.in` file and generating a full list of dependencies in `requirements.txt` (a sketch of the workflow follows below). The main issue with pip-tools when applied to our current applications is that it's stricter than pip in resolving version conflicts. Some of the packages we depend on pin the libraries we use directly in their own setup.py (eg `mandrill` requires `docopt==0.4`, while we generally use `0.6.2`). While `pip` would install the latest package version despite the conflict, pip-tools requires us to resolve it, which in this case means relying on a fork of the mandrill client. And pip's behaviour here isn't really a problem for us, since docopt isn't used when mandrill is loaded as a module.

`pip freeze` is the simplest approach and doesn't require any additional tools; however, it also isn't very usable without additional tooling.
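For reference, the pip-tools workflow mentioned above looks roughly like this. It's a sketch only; the package names in the example `requirements.in` are made up, not our actual dependencies:

```sh
# requirements.in lists only the top-level dependencies, eg:
#   Flask>=0.10
#   requests

pip install pip-tools

# Generate a fully pinned requirements.txt from requirements.in
pip-compile requirements.in

# Later, bump all pins (or a single package) to the latest allowed versions
pip-compile --upgrade requirements.in
pip-compile --upgrade-package requests requirements.in
```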
The main problems with using `pip freeze` directly are that it saves whatever is installed in the current virtualenv, including test packages and packages that have been installed manually, and that it loses VCS packages, storing only their names. This is solved by wrapping `pip freeze` in a `make compile-requirements` step that:

1. Creates a temporary virtualenv.
2. Installs application requirements.
3. Runs `pip freeze` to get a full list of installed packages.
4. Modifies the `pip freeze` output to maintain the list of application dependencies in the original format.

(A rough sketch of this target is included at the end of this message.)

The benefits of this approach are that it keeps the existing process of running `pip install -r requirements.txt` for application builds, and that only one command needs to be run to regenerate `requirements.txt` whenever application dependencies change.

The known downsides and unsolved problems:

* You need to remember to run `make compile-requirements` after changing `requirements-app.txt`.
* Since the virtualenv is rebuilt from the application requirements, it will still install newer versions of all unpinned packages. The difference is that the process is explicit and changes will be visible in the `requirements.txt` diff. This allows us to either accept the new package version or ignore it by discarding the change. This does mean that if there's a package we want to hold back we'll need to do this repeatedly after each `requirements-app.txt` change. The benefit of this approach is that it makes us aware of new package versions.

Another major point is that development requirements don't go through the same process, but now load the full list of application dependencies. This in turn means that:

* `requirements-dev.txt` can't list any of the packages present in `requirements.txt`, since pip will complain about a duplicate record. We can either rely on the record in `requirements.txt` in this case or move the dependency to `requirements-app.txt`.
* While it's not possible for one of the dev requirements to override the application dependencies (pip's resolver prioritises the top-level dependency version), they do install additional packages that aren't present in the application deployment container. This is unchanged from the existing process, but could potentially hide issues that are only discovered once the app has been deployed.

[pip-tools]: https://github.com/jazzband/pip-tools
[pipenv]: https://github.com/kennethreitz/pipenv
[Pipfile]: https://github.com/pypa/pipfile
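For illustration, the `compile-requirements` target could look roughly like the sketch below. The target and file names are the ones used in this change, but the exact commands (virtualenv path, the sed/grep post-processing) are assumptions; in particular, real filtering of the `pip freeze` output needs more care around mapping VCS URLs back to package names:

```make
SHELL := /bin/bash
VENV  := /tmp/compile-requirements-venv

compile-requirements:
	# 1. Create a throwaway virtualenv.
	rm -rf $(VENV) && virtualenv $(VENV)
	# 2. Install only the top-level application requirements.
	$(VENV)/bin/pip install -r requirements-app.txt
	# 3 + 4. Freeze the full list of installed packages, keeping the
	# top-level entries (including VCS URLs) in their original form and
	# appending only the transitive pins that pip freeze discovered.
	cp requirements-app.txt requirements.txt
	$(VENV)/bin/pip freeze \
		| grep -viFf <(sed -e 's/[=<> @].*//' -e '/^#/d' -e '/^$$/d' requirements-app.txt) \
		>> requirements.txt
```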
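And a sketch of how the development requirements relate to the application ones, assuming the usual `-r` include mechanism (the specific packages and pins below are made up for illustration):

```
# requirements-dev.txt
# Pull in the full pinned application requirements first...
-r requirements.txt

# ...then add dev-only tools. None of these may repeat a package
# already pinned in requirements.txt, or pip will complain about
# a duplicate record.
pytest==3.0.7
flake8==3.3.0
```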