invalidate travis cache when packrat.lock changes #7246

Closed
maxheld83 opened this issue Jan 31, 2017 · 10 comments

Comments

@maxheld83

@jimhester has suggested a really cool way to cache packrat compilations, which makes the travis builds a lot faster (see #6504).

There's just one (small) problem: when packrat.lock changes, the Travis cache is not automatically invalidated, and the build fails.
Once you delete the cache and restart the build, things work out.

This isn't a huge problem, but a bit annoying and a stumbling block for new users, with packrat already being a little complicated.

Any chance this could be added to the default travis setup?

@BanzaiMan
Contributor

I am unfamiliar with the tooling, so I need some background information.

How does the tool work with packrat.lock changes? Does it not detect the changes in it and invalidate the cache? For example, Bundler (for Ruby) would compare the cache directory content and the Gemfile (or Gemfile.lock) content and install gems that are missing. How does the R tool do it?

@jimhester

I think this can be remedied by caching only the packrat/src and packrat/lib directories, rather than the entire packrat/ directory. Caching the entire packrat/ directory would overwrite the existing packrat/packrat.lock file with the one in the cache, which is causing the errors.

@BanzaiMan
Contributor

OK. I guess that's an issue with user configuration and modifying the documentation, then?

@jimhester

Yep

@maxheld83
Author

thanks @jimhester for addressing this so quickly, amazing work as always.

I'm afraid this still does not work as I had expected; I probably did not phrase the issue correctly.


expected behavior:

I'd like Travis to cache packrat/lib/, because otherwise the builds take too long, and also to rebuild (parts of) packrat/lib/ when packages are added, removed or updated, as indicated by a change in packrat/packrat.lock.

So, I was hoping for something like this:

  1. Initial Commit: I set up a new project with packrat and the .travis.yml. Travis takes a long time to build all packages from packrat/src to packrat/lib, as expected.
  2. Non-Packrat Commit: I make some changes that do not affect packrat/packrat.lock, and the build is faster, because packrat/lib can be used from the cache.
  3. Packrat-Changing Commit: I add some new package, editing packrat/packrat.lock and packrat/src; Travis rebuilds either all (easier) or only the affected packages into packrat/lib.

observed behavior:

  1. Initial Commit: Same as above.
  2. Non-Packrat Commit: Same as above.
  3. Packrat-Changing Commit: The build fails with "there is no package ..." or something similar, apparently because Travis used the old packrat/lib, which does not contain the new package.

The only solution, for now, is to manually delete the cache whenever I change the dependencies.
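
For illustration, the "rebuild all (easier)" option from step 3 could in principle be automated by stamping the cached library with a checksum of the lockfile and clearing it when the checksum differs. This is only a minimal sketch, assuming a POSIX shell with `cksum`; the function name `invalidate_if_lock_changed` and the stamp file `.lock-checksum` are hypothetical and not part of packrat or Travis.

```shell
# Hypothetical helper (not part of packrat or Travis):
# clear the cached library when the lockfile content has changed.
# Usage: invalidate_if_lock_changed <lockfile> <cached-lib-dir>
invalidate_if_lock_changed() {
  lockfile=$1
  libdir=$2

  # Refuse to run without both arguments, so rm -rf never sees an empty path.
  [ -f "$lockfile" ] || { echo "no lockfile: $lockfile" >&2; return 1; }
  [ -n "$libdir" ] || { echo "no library directory given" >&2; return 1; }

  # Assumed stamp file, stored inside the cached directory so it is cached too.
  stamp="$libdir/.lock-checksum"

  # cksum is POSIX; reading from stdin keeps the file name out of the output.
  current=$(cksum < "$lockfile")

  if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
    echo "lockfile unchanged; keeping cached library"
  else
    echo "lockfile changed or no stamp found; clearing cached library"
    rm -rf "$libdir"
  fi

  # Record the checksum that the library now corresponds to.
  mkdir -p "$libdir"
  printf '%s\n' "$current" > "$stamp"
}
```

It would be called before the install step, for example as `invalidate_if_lock_changed packrat/packrat.lock packrat/lib`. The trade-off is that any lockfile change rebuilds every package, not just the affected ones.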


I apologise for nagging about this thing; not a lot of people appear to use packrat + Travis.
I'm also, admittedly, having a hard time wrapping my head around how packrat works, and then Travis caching in addition is a bit confusing.

This still seems like a fairly important piece of the reproducibility puzzle, and so far, I've failed to come up with a solution that works for me and colleagues.


PS: I have tried all of the .travis.yml files below, to no avail.

# Attempt: cache packrat/lib only
language: r

dist: trusty

sudo: false

install:
  - R -e "0" --args --bootstrap-packrat

warnings_are_errors: false

cache:
  directories:
    - $TRAVIS_BUILD_DIR/packrat/lib

before_script:
  - chmod +x ./_build.sh
  - chmod +x ./_deploy.sh

script:
  - ./_build.sh
  - ./_deploy.sh

notifications:
  email: false

# Attempt: cache packrat/lib and packrat/src
language: r

dist: trusty

sudo: false

install:
  - R -e "0" --args --bootstrap-packrat

warnings_are_errors: false

cache:
  directories:
    - $TRAVIS_BUILD_DIR/packrat/lib
    - $TRAVIS_BUILD_DIR/packrat/src

before_script:
  - chmod +x ./_build.sh
  - chmod +x ./_deploy.sh

script:
  - ./_build.sh
  - ./_deploy.sh

notifications:
  email: false

# Attempt: cache packrat/lib and packrat/src, with packages: true
language: r

dist: trusty

sudo: false

install:
  - R -e "0" --args --bootstrap-packrat

warnings_are_errors: false

cache:
  directories:
    - $TRAVIS_BUILD_DIR/packrat/lib
    - $TRAVIS_BUILD_DIR/packrat/src
  packages: true

before_script:
  - chmod +x ./_build.sh
  - chmod +x ./_deploy.sh

script:
  - ./_build.sh
  - ./_deploy.sh

notifications:
  email: false

@jimhester

I am not sure why the cache with only the packrat/lib directory would not work.

Personally, I am not sure this is a good idea to begin with, and I am not that familiar with packrat. I also cannot debug this further without a build log. If you want further assistance, I would suggest opening a packrat issue.

@maxheld83
Author

maxheld83 commented Feb 1, 2017

very sorry about the missing reproducible example @jimhester and thank you for taking the time.

I've now added a new minimal repo with the respective build failures.

This is the failing build which should have worked, because the commit did include a changed packrat.lock and the necessary new files in /packrat/src.
However, Travis seems to have ignored this, just downloaded the previous /packrat/lib and went with it.

As expected, test-script.R then fails, because it is missing the new package (babynames in this case).

Does that make the report more informative for you?

Or should I still open an issue over at packrat?

@jimhester

Please open an issue there, I don't know enough about packrat internals to know why it is not picking up the packrat.lock changes.

@kevinushey

kevinushey commented Feb 1, 2017

To restore the packrat library given a lockfile, you need to call packrat::restore(). If pre-existing installed packages are discovered in the packrat/lib directory and match the current lockfile, they will be used.

I assume the simplest fix is to just run:

R -e "packrat::restore(restart = FALSE)"

or something similar to hydrate the library, rather than the --bootstrap-packrat hack. (This, of course, assumes that an up-to-date packrat installation is available somewhere)
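
To make the idea of "installed packages that match the lockfile" concrete, here is a rough sketch that lists packages named in a lockfile but absent from a library directory. It is not how packrat::restore() itself decides what to install: it checks names only, and it assumes the lockfile lists each package on a `Package:` line and that the library directory passed in holds one subdirectory per installed package. The function name `missing_packages` is hypothetical.

```shell
# Hypothetical helper (not part of packrat): print every package named
# in the lockfile that has no directory of the same name in the library.
# Usage: missing_packages <lockfile> <library-dir>
missing_packages() {
  lockfile=$1
  libdir=$2

  # Assumption: each lockfile record carries a line of the form "Package: <name>".
  awk '$1 == "Package:" { print $2 }' "$lockfile" |
  while read -r pkg; do
    # Report the package only if no matching directory exists.
    [ -d "$libdir/$pkg" ] || printf '%s\n' "$pkg"
  done
}
```

Run against a stale cached library, a helper like this would name the package that the test script later fails to load.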

maxheld83 added a commit to maxheld83/travis-cache-test that referenced this issue Feb 1, 2017
@maxheld83
Author

Thanks @kevinushey, not sure why I didn't think of that myself.

It appears that we actually need both bootstrap and restore, because with only packrat::restore(), it won't work on the first run without any cache.

This now works:

language: r

sudo: false

install:
  - R -e "0" --args --bootstrap-packrat
  - R -e "packrat::restore(restart = FALSE)"

warnings_are_errors: false

cache:
  packages: true
  directories:
    - $TRAVIS_BUILD_DIR/_bookdown_files
    - $TRAVIS_BUILD_DIR/packrat/lib

script:
  - Rscript test-script.R

notifications:
  email: false

As suggested by @jimhester I'll open up an issue over at packrat to make sure this makes sense from their end, and will then add a PR on the docs.

In the meantime, I'll close this issue, because it's done from the Travis end of things.

Thanks so much.

4 participants