Speed up travis by caching the conda install #8

Closed
tovrstra opened this issue Aug 31, 2017 · 18 comments

Comments

@tovrstra
Member

The following should be safe/needed for caching:

  • miniconda/LICENSE.txt
  • miniconda/bin
  • miniconda/conda-meta (meta info on installed packages)
  • miniconda/envs
  • miniconda/etc
  • miniconda/gcc (not sure why it exists)
  • miniconda/include
  • miniconda/lib
  • miniconda/lib64
  • miniconda/libexec
  • miniconda/sbin
  • miniconda/share
  • miniconda/ssl
  • ...

Not good for caching:

  • miniconda/conda-bld
  • miniconda/locks
  • miniconda/pkgs
  • miniconda/var

There may be others.

Caching the conda install may not be a good idea, because some of its directories should not be cached. See also broadinstitute/viral-ngs#290

This is how to enable it: https://docs.travis-ci.com/user/caching/#Arbitrary-directories

Other things to consider for caching in Travis:

@matt-chan
Member

The pip cache is only for the downloads I think. I'm not sure it'll cache the install itself (it would be better if it didn't, since it's likely to interact with other packages).

I'm currently just caching all of the miniconda directory. I'll remove the offending directories in the before-cache section.
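A minimal sketch of that cleanup step, assuming it is written as a small Python helper called from the before-cache section. The function name `prune_conda_cache` and the idea of doing this in Python are illustrative assumptions, not the project's actual script. The directory names come from the "not good for caching" list above.

```python
import shutil
from pathlib import Path

# Subdirectories listed earlier in this thread as "not good for caching".
NOT_CACHED = ("conda-bld", "locks", "pkgs", "var")


def prune_conda_cache(root):
    """Delete the non-cacheable subdirectories of a conda install.

    `root` is the install directory (e.g. a miniconda folder).
    Returns the names of the subdirectories that were actually removed.
    """
    removed = []
    for name in NOT_CACHED:
        target = Path(root) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Calling `prune_conda_cache("miniconda")` just before the cache is stored would leave only the directories worth keeping. The list above ends with "there may be others", so the tuple may need extending.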

@tovrstra
Member Author

tovrstra commented Sep 1, 2017

Pip cache is only for downloads: https://pip.pypa.io/en/latest/reference/pip_install/#caching

@matt-chan
Member

Okay, caching is implemented! It sped things up from 4 min to 3 min in the python-cython-ci-example. Not great, but it's something I guess.

@tovrstra
Member Author

tovrstra commented Sep 1, 2017

Good. I'll take a quick look.

@tovrstra
Member Author

tovrstra commented Sep 1, 2017

I think the speedup is purely random. In the older runs, the download and install of miniconda just takes a few seconds. Just compare the following two and look at the variation in timing of the steps that remained the same:

With such variability, there is no point in profiling (and caching) I'm afraid.

@matt-chan
Member

matt-chan commented Sep 1, 2017 via email

@tovrstra
Member Author

tovrstra commented Sep 1, 2017

I'd guess CPU time versus wall time. Not sure.

@tovrstra
Member Author

tovrstra commented Sep 1, 2017

Can we just postpone this for a while? This is not such an issue and it quickly clobbers the CI scripts.

@matt-chan
Member

matt-chan commented Sep 1, 2017 via email

@tovrstra
Member Author

tovrstra commented Sep 2, 2017

@matt-chan It seems that caching is fully working! So we can close this issue I guess.
There is just one small thing: the .travis.yml script is not self-contained at the moment due to one of the caching steps. (This is a source of mistakes when copying over .travis.yml files to other projects.)

@matt-chan
Member

Okay, #13 should fix this?

I'm going to leave the issue open a bit more because I want to make sure our cache isn't being invalidated by spurious changes.

@tovrstra
Member Author

tovrstra commented Sep 2, 2017

The current .travis.yml installs pip packages into the conda env, which then gets cached. This is not ideal because the packages will not get updated once cached. We could do pip install --user --upgrade ... to avoid this issue. Pip caching can be made efficient as shown here: https://github.com/nickstenning/travis-pip-cache

That still has the disadvantage that wheels may accumulate over time. We have a wheel cleaning script in the HORTON repo to get rid of them: https://github.com/theochem/horton/blob/master/tools/qa/remove_old_wheels.py
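To illustrate the idea of wheel cleaning, here is a rough sketch. It is not the HORTON script linked above. The function name `remove_old_wheels` is hypothetical, and the code assumes that the most recently modified wheel in each group is the one to keep, which avoids having to compare version strings.

```python
from collections import defaultdict
from pathlib import Path


def remove_old_wheels(wheel_dir):
    """Keep only the most recently modified wheel per project and tag set.

    Returns the sorted file names of the wheels that were deleted.
    """
    groups = defaultdict(list)
    for path in Path(wheel_dir).glob("*.whl"):
        parts = path.stem.split("-")
        if len(parts) < 5:
            continue  # not a standard wheel file name; leave it alone
        # parts: name, version, [build], python tag, abi tag, platform tag
        key = (parts[0].lower(), *parts[-3:])
        groups[key].append(path)
    removed = []
    for paths in groups.values():
        # Oldest first, so everything except the last entry is stale.
        paths.sort(key=lambda p: p.stat().st_mtime)
        for stale in paths[:-1]:
            stale.unlink()
            removed.append(stale.name)
    return sorted(removed)
```

Grouping on the tags as well as the name means that wheels built for different Python versions or platforms are not treated as duplicates of each other.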

@matt-chan
Member

matt-chan commented Sep 3, 2017 via email

@tovrstra
Member Author

tovrstra commented Sep 5, 2017

This is fixed.

@tovrstra tovrstra closed this as completed Sep 5, 2017
@dhimmel

dhimmel commented Oct 12, 2017

I'm interested in speeding up our Travis builds that use conda environments. We specify our environment using an environment.yml, which is currently reinstalled on every build. The part of the install process that takes the longest is the "Solving package specifications" step triggered by conda env create.

I just wanted to confirm that your caching method discussed above alleviates this "Solving package specifications" step (it seems like it)? Also if so, is 157318c the crucial commit to enable conda caching on Travis? Sorry if this is a bit off topic... this is the most relevant issue I could find!

@matt-chan
Member

matt-chan commented Oct 13, 2017 via email

@dhimmel

dhimmel commented Oct 13, 2017

Thanks @matt-chan!

The lines of interest are: 46-54 and 61-85.

For posterity / convenience, I've pasted those snippets below:

https://github.com/theochem/qcgrids/blob/ac5d1e263b8f3fe0e659f908ab8bc849970ec24f/.travis.yml#L46-L54

https://github.com/theochem/qcgrids/blob/ac5d1e263b8f3fe0e659f908ab8bc849970ec24f/.travis.yml#L61-L85

If you remove anything from your meta.yml (or environment.yml), you must either invalidate the cache on Travis or uninstall the removed package in the Travis yml.

I wonder if there's a way to see if the changeset that's being built modified environment.yml, and if so wipe the cache. Referencing nest/nest-simulator#75.
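One possible way to check this, sketched under assumptions: Travis exposes the commits being built in the `TRAVIS_COMMIT_RANGE` environment variable, and `git diff --name-only` lists the files touched in that range. The helper names below are hypothetical, and the sketch only decides whether to wipe. Actually deleting the cached directory would be a separate step early in the build.

```python
import os
import subprocess


def changed_files(commit_range):
    """List the files touched in a git commit range such as 'abc123...def456'."""
    out = subprocess.check_output(
        ["git", "diff", "--name-only", commit_range], universal_newlines=True
    )
    return [line for line in out.splitlines() if line]


def environment_changed(files, env_file="environment.yml"):
    """Return True if env_file (path relative to the repo root) was changed."""
    return env_file in files


def should_wipe_cache():
    """Decide whether the cached conda install should be discarded."""
    commit_range = os.environ.get("TRAVIS_COMMIT_RANGE", "")
    if not commit_range:
        return True  # no range available: be conservative and rebuild
    try:
        return environment_changed(changed_files(commit_range))
    except subprocess.CalledProcessError:
        return True  # git could not resolve the range: rebuild to be safe
```

The sketch errs on the side of wiping whenever the range is missing or cannot be resolved, since a stale environment is harder to notice than a slow build.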

@tovrstra
Member Author

tovrstra commented Oct 13, 2017 via email
