Experiment with switching to the fully virtualised Travis infrastructure #1317

Closed
edmorley opened this issue Sep 3, 2017 · 5 comments

@edmorley

edmorley commented Sep 3, 2017

Hi!

I noticed as part of looking at travis-ci/travis-ci#8315 that the current Travis job is using their EC2 container infrastructure (sudo: false) and takes ~12 minutes to run. Whilst jobs start much faster than the fully virtualised GCE infra (1s vs 30s), the container environment has half the RAM and often suffers from CPU contention. See the comparison table here:
https://docs.travis-ci.com/user/reference/overview/

For one of my projects I found that switching to the fully virtualised GCE infra (sudo: required) halved the job runtime, so it still resulted in a significant net time saving even after including the 30s bootup time penalty.

Anyway, thought I'd mention it just in case it helped you too :-)

@peternewman
Member

Hi Ed,

Thanks for the input. It's interesting: looking back at the blame list for our Travis config, we've switched back and forth between container and GCE a few times. Most recently we switched to container to get Trusty (which I see is now on both); before that we dropped sudo for reasons I can't remember (probably because Travis told us the other one would be faster).

The 12 minutes is actually misleading: I re-ran one of the checks because of a flaky test, which has confused the timing.

It turns out we actually have a semi-comparable state running at the moment:
Sudo required: https://travis-ci.org/OpenLightingProject/ola/builds/269918401
Sudo false: https://travis-ci.org/OpenLightingProject/ola/builds/266737662

Or a fairer test, without re-runs to confuse the numbers (the failures are near the end of each test run anyway):
Sudo required: https://travis-ci.org/OpenLightingProject/ola/builds/269320893
Sudo false: https://travis-ci.org/OpenLightingProject/ola/builds/266610252

There's caching to confuse things, and a slightly different selection of tests, but assuming the Travis total time is the sum of the individual active times, there's not much in it, given that sudo required is currently running an additional test.

The sudo false appears to be quicker though; I think we get more tests running in parallel in sudo false/container mode. I guess what we really want is a mix, with the short, simple jobs running in containers (or merge them into one single job, but then the failure of an individual job is less obvious), and the longer compile jobs running in GCE.

@edmorley
Author

edmorley commented Sep 4, 2017

Ah sorry, I'd misread and thought there was only one 12 minute job. If it helps, it's possible to mix and match container and GCE by putting the sudo: {false, required} inside each entry of the include directive (and for less duplication, common elements can be factored out to the top level, though be aware that this doesn't work for all properties).

We do something similar in the project I'm working on at the moment, where most jobs default to GCE but the quicker linters run on the container infra instead:
https://github.com/mozilla/treeherder/blob/b00031e419cfcb282273f31f84e585c9ca35c65c/.travis.yml#L1-L5
https://github.com/mozilla/treeherder/blob/b00031e419cfcb282273f31f84e585c9ca35c65c/.travis.yml#L17
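
For illustration, a minimal sketch of that kind of layout might look something like the following. The `TASK` values and script paths are hypothetical placeholders, not taken from OLA's or Treeherder's actual config, and it assumes the `sudo` key selects the infrastructure as described above.

```yaml
# Hypothetical .travis.yml sketch; TASK values and script paths are placeholders.
language: cpp
# Common settings factored out to the top level (assumed to apply to every job).
dist: trusty

matrix:
  include:
    # Longer compile job: fully virtualised GCE infra (slower boot, more RAM).
    - env: TASK=compile
      sudo: required
      script: ./ci/compile.sh

    # Short lint job: container infra (boots in about a second).
    - env: TASK=lint
      sudo: false
      script: ./ci/lint.sh
```

Setting `sudo` explicitly on each entry keeps it obvious which infra a job lands on, at the cost of a little duplication.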

@edmorley
Author

edmorley commented Sep 4, 2017

One other thing - when comparing the relative runtimes of container vs GCE, be aware of travis-ci/travis-ci#8138.

peternewman added a commit to peternewman/ola that referenced this issue Sep 20, 2017
…iners for the shorter jobs as recommended in OpenLightingProject#1317

This should give us the fastest overall run time.
@peternewman
Member

Thanks @edmorley, we've switched to using a mix of GCE and containers, depending on job run duration, in #1322. Closing this now.

@peternewman peternewman self-assigned this Oct 8, 2017
@edmorley
Author

edmorley commented Oct 8, 2017

You're welcome! Have a great rest of weekend :-)
