This repository has been archived by the owner on Dec 20, 2022. It is now read-only.

Add step which automatically updates project dependencies #7

Closed
wants to merge 1 commit into from

Conversation

adiroiban
Member

Problem

When a new dependency is added to Twisted, or when a newer version of a dependency is required, there is no way to run the patch/branch with buildbot, because buildbot relies on an external way of updating the packages.

Also, if you end up with 2 branches (master and ticket-branch), each strictly depending on its own version, upgrading the dependency for the ticket-branch will break tests on master... and then if you want to re-run tests for an older commit, tests will break again.

Changes

A step was added for all builders which will call pip install --upgrade with a predefined buildbot-requirements.txt

A local cache is used to reduce network traffic.

This is just initial code to start the conversation.
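
As a rough illustration of the step described above, the command it runs could be assembled as follows. This is a minimal sketch, not the code from this pull request: the function name `pip_upgrade_command` and the cache path are hypothetical, and `--cache-dir` assumes a pip version that supports that option.

```python
def pip_upgrade_command(requirements="buildbot-requirements.txt",
                        cache_dir=None):
    """Return the argv for upgrading dependencies from a requirements file."""
    command = ["pip", "install", "--upgrade", "--requirement", requirements]
    if cache_dir is not None:
        # Reuse previously downloaded packages to reduce network traffic.
        command += ["--cache-dir", cache_dir]
    return command


# Hypothetical cache location, for illustration only.
print(" ".join(pip_upgrade_command(cache_dir="/tmp/pip-cache")))
```

Building the command as a list keeps it easy to hand to a buildbot shell step or to `subprocess` without quoting problems.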

Limitations

The current step uses pip directly and only supports installing source packages.

Instead of directly calling pip, we can also call python dependencies.py and then each repo can implement its own way of installing/updating the dependencies.

The current step requires all repositories to have a buildbot-requirements.txt file. It can be changed to skip the step if a requirements / dependencies.py file is not present in the repository.

I went for pip... as it is much easier... but maybe we should go for dependencies.py so that we can also install binary packages.

@exarkun
Contributor

exarkun commented Dec 4, 2014

The problem described here is real. Thank you very much for taking the time to write it up.

The proposed solution seems potentially workable but I'll note a couple drawbacks:

  • Introducing a buildbot-requirements.txt file will introduce a point of duplication/redundancy in our requirements declaration. We've been trying hard to avoid such redundancy.
  • A static requirements text file can't help with some other interesting aspects of this problem - for example, the fact that many of our requirements are given as ">= X" - which implies we're actually testing Twisted against several versions of that dependency. Currently, in some instances, we are - but in no formal way: we just test whatever happens to be installed on a buildslave and since buildslaves are manually configured it's quite easy for them to have different versions of libraries installed. 😉

Considering the work towards making setup.py correctly declare all of Twisted's dependencies, I wonder if a pip install twisted-checkout would solve the same basic issues you've outlined here but without introducing the duplication.

As far as testing against multiple versions of libraries goes, perhaps we can further extend the build factory (perhaps later) with additional build configurations that somehow introspect the dependency declarations and pick several (or ... all? or only the oldest and newest? or some other heuristic to save us from having a trillion trillion configurations) versions of dependencies to install instead of only the newest (as pip install ... will give us) versions to install and test against. We do need to be careful not to generate so many combinations that we can't actually afford to run the test suite anymore - but it would be nice to be able to at least pick a specific small number of configurations (as opposed to the current situation, where the configurations are random and sort of unknown).
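
To make the "only the oldest and newest" heuristic concrete, here is a minimal sketch under stated assumptions: the release list is hypothetical, the helper names `parse_version` and `versions_to_test` are invented for illustration, and version strings are assumed to be plain dotted integers (no pre-release tags).

```python
def parse_version(text):
    """Turn '0.14.1' into (0, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def versions_to_test(available, minimum):
    """Pick the oldest and newest available versions satisfying '>= minimum'."""
    floor = parse_version(minimum)
    supported = sorted(
        (v for v in available if parse_version(v) >= floor),
        key=parse_version,
    )
    if not supported:
        return []
    # Use a set so a single qualifying version is not listed twice.
    return sorted({supported[0], supported[-1]}, key=parse_version)


# Hypothetical release list for a dependency declared as ">= 0.11".
print(versions_to_test(["0.10", "0.11", "0.12", "0.13", "0.14"], "0.11"))
# → ['0.11', '0.14']
```

With this heuristic each dependency contributes at most two configurations, which keeps the number of builders bounded instead of growing with every release.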

@adiroiban
Member Author

Then let's go for python dependencies.py and inside it we can do all kinds of "advanced" actions.

I am aware of the in-progress ticket for keeping dependencies in setup.py.

I went for a dedicated buildbot-requirements.txt just to start the conversation.


There are more problems with a fixed buildbot-requirements.txt.

For example, when we use it with the sphinx-documentation / pylint or twistedchecker builders, do we still want to install the same dependencies as when we test the code?

For OSX or Windows slaves, do we want to install the same dependencies?

On my project (which is based on Twisted and runs on Win/Linux/OSX/Unix) I use something based on python dependencies.py, and the builder name is passed as an argument.

Builders are named based on a scheme which contains the OS and purpose in their name.
In this way, on Windows I can install one set of dependencies, on Linux another, and for Selenium tests yet another set.

So pip install twisted-checkout will not work... or it would need to be smart enough to detect the current OS and for example to install twisted[win32] when executed on Windows.

This is why I think that we should go for python dependencies.py

Then dependencies.py will read environment variables, detect which builder it is being called on, and install the requirements.
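
A dependencies.py along these lines might look like the sketch below. Everything specific in it is an assumption made for illustration: the `BUILDER_NAME` environment variable, the `<os>-<purpose>` builder naming scheme, and the package lists are hypothetical, and the script only prints the pip command instead of running it.

```python
import os
import sys

# Hypothetical requirement sets; the real lists would live in the repository.
COMMON = ["zope.interface"]
BY_KEYWORD = {
    "linter": ["pyflakes"],
    "win": ["pywin32"],
}


def requirements_for(builder_name):
    """Return the packages to install for a builder, based on its name."""
    packages = list(COMMON)
    # Assumed naming scheme: dash-separated tokens such as "win7-select".
    tokens = builder_name.split("-")
    for keyword, extra in sorted(BY_KEYWORD.items()):
        if any(token.startswith(keyword) for token in tokens):
            packages.extend(extra)
    return packages


def main(argv, environ):
    # Builder name comes from the first argument, else from the environment.
    builder = argv[1] if len(argv) > 1 else environ.get("BUILDER_NAME", "")
    command = [sys.executable, "-m", "pip", "install", "--upgrade"]
    command += requirements_for(builder)
    # A real script would run the command, e.g. with subprocess.check_call.
    print(" ".join(command))
    return command


if __name__ == "__main__":
    main(sys.argv, os.environ)
```

Matching on whole name tokens rather than substrings avoids accidental hits, such as "win" matching a builder whose name contains "darwin".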


As for running tests against various versions of a dependency, I think that this should have its own ticket/discussion... and should not block this step just because it does not support running the tests against a range of dependency versions.

Running tests against random versions is bad, because once a test fails it is hard to re-run it with the same version.

In case Twisted is required to run with PyOpenSSL v1, v2, v3 I think that the best way to do it is to create dedicated builders for each environment.

Just like we have epoll, select, select-gc we should add pyopenssl1 pyopenssl2 ... etc builders ....

It does not make much sense to declare that a dependency version is supported if tests against it are not always executed before merge.

To simplify configuration, for my project I define an environment in the following way, inspired by Travis CI.

I have the base slave, which is an Ubuntu/Windows/etc. minimal install.
Slaves are named bs-ubuntu1204, bs-ubuntu1404x86, bs-win7, bs-winxp.

Environments are named:

  • linter (runs twistedchecker / pyflakes / apidoc / sphinx) in a single builder on a single OS
  • epoll
  • select
  • no-modules

Besides the name, environments are just a list of environment variables injected when tests are executed.
Slaves are then combined with environments to create the final builders.

  • linter -> [bs-ubuntu1204]
  • epoll -> [bs-ubuntu1204, bs-ubuntu1404]
  • select -> [bs-ubuntu1204, bs-win7]
  • no-modules -> [bs-ubuntu1204]
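
The combination listed above can be sketched in a few lines of Python. This is an illustrative sketch, separate from the author's own sample code: the environment variable names and values are hypothetical placeholders, and the dictionary keys only loosely mirror what a buildbot builder configuration takes.

```python
# Environments are named sets of environment variables (placeholder values).
ENVIRONMENTS = {
    "linter": {"TEST_KIND": "linter"},
    "epoll": {"TEST_REACTOR": "epoll"},
    "select": {"TEST_REACTOR": "select"},
    "no-modules": {"TEST_KIND": "no-modules"},
}

# Which slaves each environment runs on, as in the list above.
SLAVES_FOR = {
    "linter": ["bs-ubuntu1204"],
    "epoll": ["bs-ubuntu1204", "bs-ubuntu1404"],
    "select": ["bs-ubuntu1204", "bs-win7"],
    "no-modules": ["bs-ubuntu1204"],
}


def make_builders(environments, slaves_for):
    """Combine environments with slaves into a flat list of builder configs."""
    builders = []
    for name in sorted(environments):
        for slave in slaves_for[name]:
            builders.append({
                "name": "%s-%s" % (slave, name),
                "slavename": slave,
                # Copy so builders never share a mutable env dict.
                "env": dict(environments[name]),
            })
    return builders


for builder in make_builders(ENVIRONMENTS, SLAVES_FOR):
    print(builder["name"])
```

Because the builder list is derived from two small tables, adding an environment or a slave is a one-line change and the total number of builders stays visible in one place.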

I don't know if this makes sense... sample code is here https://gist.github.com/adiroiban/04a4f7f10d9154e2fd78

To do this, we might need to simplify the steps required to run a test and move some logic outside of the buildbot configuration.
In my project, buildbot just runs a standard set of get code/do cleanup/get deps/run tests steps, and the logic for selecting the dependencies to install and the tests to run lives in the repo.

I have done something similar in the travis-ci integration branch https://github.com/chevah/twisted/pull/1/files

@tomprince
Contributor

Most of the slaves are using system packages, and don't have permissions to install or update packages, so this won't work as-is. If we are going to try and have buildbot install dependencies, then it would probably make sense to create and populate virtualenvs for each build.

I think there is value in testing against system packages, though. So the virtualenvs should be new builders? Or maybe the existing builders want to be system-site-packages virtualenvs, so that additional dependencies can be installed.

@exarkun
Contributor

exarkun commented Dec 4, 2014

> Then let's go for python dependencies.py and inside it we can do all kinds of "advanced" actions.

This means that half of the logic is inside Twisted and half of the logic is inside the BuildBot configuration. Does that make sense? Why not put all of the logic in one place or the other? (And BuildBot seems like the place since it's very difficult to define what builders exist in a file inside the project being built.)

@exarkun
Contributor

exarkun commented Dec 4, 2014

> Most of the slaves are using system packages, and don't have permissions to install or update packages, so this won't work as-is.

Won't pip see that the dependency is satisfied already and skip any attempts to install a new version? Or are there still problems where OS-supplied dependencies aren't recognized as satisfying some distribution requirement?

Regardless of that, though, I agree that using virtualenv makes much more sense than trying to make it possible for slaves to install packages system-wide.

@adiroiban
Member Author

I assume that the Linux slaves are using system packages... or does this also include OSX and Windows?

I think that we should start with system-site-packages virtualenvs... we can try one builder and see how it works. Should I create a ticket... err, pull request?


> Then let's go for python dependencies.py and inside it we can do all kinds of "advanced" actions.

> This means that half of the logic is inside Twisted and half of the logic is inside the BuildBot configuration.

Buildbot just executes a fixed set of steps for all builders (get code / clean code / get deps / run test) and the repo decides what to install and how to run tests based on builder environment variables.

... I am still struggling to create dynamic steps in buildbot.... so that the repo could also decide what steps to run.

@exarkun
Contributor

exarkun commented Dec 4, 2014

> Buildbot just executes a fixed set of steps for all builders (get code / clean code / get deps / run test) and the repo decides what to install and how to run tests based on builder environment variables.

The trick is "for all builders". How many environments are there to test? If you don't know until you run dependencies.py, how do you know how many builders to configure?

@glyph
Contributor

glyph commented Dec 4, 2014

Some of Twisted's dependencies are not Python packages, which makes the virtualenv solution incomplete. build-essential, libffi-dev, and libssl-dev all need to be installed at a basic level, but also, things like PyGTK are challenging to squeeze into a virtualenv setup.

That said, I think the vast majority of Twisted's users are doing virtualenv or something like it at this point, so having a good solution that requires less work for the 80% case would mean we could focus more careful attention on the edge-cases.

@glyph
Contributor

glyph commented Dec 4, 2014

I think my previous comment was more or less in support of the "half of the logic is inside Twisted and half of the logic is inside the BuildBot configuration" situation that @exarkun was previously arguing against. Right now, I think that about 20% of the logic is inside Twisted and about 80% of the logic is in various different parts of the BuildBot configuration.

The main thing I'd like to do is to eliminate the parts which are actually in both places. Many dependencies are specified within the Twisted setup.py and then "specified" again by somebody running some manual commands on a builder host. If we could just get rid of that part, and have the implicit dependencies (native libraries, for example) be manually provisioned but the explicit ones (python packages) be automatically fetched, I would be very happy.

@exarkun
Contributor

exarkun commented Dec 4, 2014

> Some of Twisted's dependencies are not Python packages, which makes the virtualenv solution incomplete. build-essential, libffi-dev, and libssl-dev all need to be installed at a basic level, but also, things like PyGTK are challenging to squeeze into a virtualenv setup.

I was avoiding saying anything about Docker to try to keep scope creep for this issue to a minimum. 😄 Ultimately I do think something like that is the right direction to go - but figuring out what our version support matrix actually is and figuring out how to express that to buildbot is a necessary first step whether we're talking about virtualenv or more fully-featured containers.

> The main thing I'd like to do is to eliminate the parts which are actually in both places. Many dependencies are specified within the Twisted setup.py and then "specified" again by somebody running some manual commands on a builder host. If we could just get rid of that part, and have the implicit dependencies (native libraries, for example) be manually provisioned but the explicit ones (python packages) be automatically fetched, I would be very happy.

So... As a first step... I think this means making the build slaves create a virtualenv for the test run and pip install <twisted checkout> to set it up. Perhaps the virtualenv is a use-system-site-packages virtualenv or perhaps we want some mix of configurations so we can test both system packages and other stuff. But ultimately this means getting setup.py into good shape wrt requirement declarations.
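
The first step described here could be sketched as two commands, built below as argument lists. This is a minimal sketch, not an agreed implementation: the function name `virtualenv_commands` and the paths are hypothetical, and it assumes a POSIX layout where the virtualenv keeps its executables under `bin`.

```python
def virtualenv_commands(env_dir, checkout, system_site_packages=True):
    """Return two commands: create the virtualenv, then install into it."""
    create = ["virtualenv"]
    if system_site_packages:
        # Let the virtualenv fall back to OS-provided packages.
        create.append("--system-site-packages")
    create.append(env_dir)
    # Assumes the POSIX layout; Windows virtualenvs use "Scripts" instead.
    pip = env_dir + "/bin/pip"
    # Installing the checkout pulls in whatever setup.py declares.
    install = [pip, "install", checkout]
    return [create, install]


# Hypothetical paths for one build.
for command in virtualenv_commands("build/venv", "."):
    print(" ".join(command))
```

Toggling `system_site_packages` gives the two flavours mentioned above: one configuration that can see the system packages and one that tests only what pip installs.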

@glyph
Contributor

glyph commented Dec 5, 2014

> I was avoiding saying anything about Docker to try to keep scope creep for this issue to a minimum. 😄 Ultimately I do think something like that is the right direction to go - but figuring out what our version support matrix actually is and figuring out how to express that to buildbot is a necessary first step whether we're talking about virtualenv or more fully-featured containers.

I think that in terms of our longer-term, broad-spectrum support matrix, we necessarily need to be talking about both.

> So... As a first step... I think this means making the build slaves create a virtualenv for the test run and pip install <twisted checkout> to set it up. Perhaps the virtualenv is a use-system-site-packages virtualenv or perhaps we want some mix of configurations so we can test both system packages and other stuff.

We have a lot of build slaves and their configuration is already pretty divergent. One nice incremental step would be to make at least one buildslave use a pip- and possibly virtualenv-based workflow.

There are two potential benefits here: one is making the process of contributing a change that adds a new dependency easier for contributors, and another is that this will make maintaining the buildbots easier for the administrators. While the former requires a complete conversion of the build farm over to an administrative regime with documented semantics – in other words converting over all of the builders – every additional builder that we move over to managing more dependencies with pip eases the administrative issues associated with keeping the fleet running, even if we do them one at a time.

So for the first pass I think we should focus on the latter and just get the requisite infrastructure in place to install things with pip and virtualenv.

> But ultimately this means getting setup.py into good shape wrt requirement declarations.

Luckily it seems to me like https://twistedmatrix.com/trac/ticket/3696 is making a lot of progress and may be quite close to merging. In fact I may be able to review it tonight.

@adiroiban
Member Author

Many thanks for the feedback.
I will start by creating a virtualenv builder for the pyflakes tests, as they should be much simpler than a full test run.

If all is OK we can extend it to other builders... twistedchecker/documentation... etc.

@adiroiban
Member Author

A virtualenv builder was created for twistedchecker and is now in production.

I will close this as it is no longer relevant.

I will follow up on twisted as the important infra part is done.

Thanks!

@adiroiban adiroiban closed this Jun 16, 2015
@glyph
Contributor

glyph commented Jun 17, 2015

Thanks a bunch @adiroiban !

4 participants