when using bundler-cache: true, no gem versions in log #105

Closed
jrochkind opened this issue Nov 9, 2020 · 21 comments

@jrochkind

bundler-cache: true is a great feature!

But when I'm using it, the job logs seem to lack the list of gems that were actually installed, even if expanded.

This is for a project that is a gem so does not have Gemfile.lock checked into repo.

When using similar functionality on Travis (which I'm moving from), even when caching the bundle, you could still expand the logs to see the actual bundle install output with individual gems. It might just have "Using" lines, since they were all already installed, something like:

...
Using rspec-support 3.9.3
Using rspec-core 3.9.2
Using rspec-expectations 3.9.2
Using rspec-mocks 3.9.1
Using rspec 3.9.0
Using rspec-rails 4.0.1
Using rubyzip 2.3.0
...

I have found this can be invaluable for debugging a build, or comparing why one build might be behaving differently to another -- you can look at the logs to see exactly which versions of all dependencies are installed. It's not common for me to expand logs of the build at all, but when I do this is the most common thing I'm looking at.

However, with setup-ruby with bundler-cache: true, it looks like:

[Screenshot: Screen Shot 2020-11-09 at 5 18 46 PM]

I can't see the actual list of gems installed. I'm not sure if this was intentional or a side effect of how you're doing things, don't totally understand what's going on.

But I find it a loss!

I suppose a workaround could maybe be uploading the produced Gemfile.lock as an artifact, since this could also be used to see exactly what version of all dependencies is installed.

Actually... as I think about how this bundle cache appears to be working... if it's actually skipping the entire bundle install and just using cached install if cache key matches... now I realize without checking Gemfile.lock into repo, does that mean I will not get new versions of dependencies that have become available on rubygems unless my Gemfile/gemspec itself changes? Oh no, that may mean this feature is unsuitable for me entirely.

@jrochkind
Author

Aha, wait, I think I get everything I want if I include bundler-cache: true AND ALSO (contrary to instructions) include my own run: bundle install step somewhere.

I think I get:

  • really fast installs, it is using cached gem install
  • a log of the actual gems/versions under my manual run: bundle install step, normally listed as Using
  • And I think it's gonna get new versions of dependencies even if I don't have a Gemfile.lock checked in, and the cache key isn't busted.

So if I understand things right everything is good... but I wonder if the README instructions should be changed to not discourage a manual bundle install quite so hard?
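
A minimal sketch of the setup this comment describes: bundler-cache: true followed by a manual bundle install step. The job name, Ruby version, action versions and test command are placeholders, not taken from the thread. Whether the extra step really picks up newer dependency versions is exactly what the following replies examine.

```yaml
# Hypothetical workflow; names and versions are illustrative only.
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true   # restores or installs the gems and caches them
      # Extra step, contrary to the README advice: with the gems already
      # installed, this should mostly print "Using <gem> <version>" lines.
      - run: bundle install
      - run: bundle exec rspec
```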

@eregon
Member

eregon commented Nov 9, 2020

bundler-cache: true uses bundle check instead of bundle install when the cache already exists.
That verifies all the gems are up-to-date, so semantically it should be exactly the same as bundle install.

bundle check seems to be quite a bit faster than bundle install when only checking, and I need a way to know if there was any gem not already installed (the exit status of bundle check gives that).

I think a good solution is to run bundle list, which will list the gem versions, in the case that all gems are already installed.
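
As a rough illustration of that idea in a user-level workflow step (a sketch only, not the action's actual implementation): bundle check exits non-zero when a gem is missing, so it can gate bundle install, and bundle list then prints the versions either way.

```yaml
steps:
  - name: Install gems only if needed, then list versions
    run: |
      # bundle check exits non-zero when some gem is not installed
      bundle check || bundle install
      # print every gem in the bundle together with its version
      bundle list
```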

@eregon
Member

eregon commented Nov 9, 2020

if it's actually skipping the entire bundle install and just using cached install if cache key matches...

It uses bundle check.
But from a quick test, when there is no Gemfile.lock, that seems to use the old gem versions (as long as they are compatible with the Gemfile) if they are installed :/ I thought I had tested that and that it would fetch the latest versions, but apparently not.
So probably we should always (or at least when there is no Gemfile.lock) run bundle install, and somehow detect if that installed extra gems that we need to persist in the cache.

@jrochkind
Author

Thanks for looking into it! Yeah, I guess bundle check is so quick exactly because it doesn't actually go to rubygems to resolve dependency tree based on latest releases!

Hm, it does sound like different logic might be needed if no Gemfile.lock is checked into repo. If Gemfile.lock is checked into repo, it won't be a problem.

With logic as it is, I think it's okay to add a run: bundle install manually -- it will always wind up with the "right" Gemfile.lock (getting new dependencies), it just won't really be caching everything it could be. For a gem where you don't change the Gemfile itself much... the cache will usually just keep getting more and more out of date, and providing fewer and fewer "hits".

I guess if no Gemfile.lock is in repo... you still need to create cache-key out of the generated Gemfile.lock somehow? Not sure if that's too late to create the cache key, but if you could do that, it'd work?

@eregon
Member

eregon commented Nov 10, 2020

With logic as it is, I think it's okay to add a run: bundle install manually -- it will always wind up with the "right" Gemfile.lock (getting new dependencies), it just won't really be caching everything it could be.

bundle check will generate the Gemfile.lock, so actually a bundle install won't do anything (except listing gems).

For a gem where you don't change the Gemfile itself much... the cache will usually just keep getting more and more out of date, and providing fewer and fewer "hits".

Note that GHA caches are removed after 7 days if they are not accessed:
https://docs.github.com/en/free-pro-team@latest/actions/guides/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy

I guess if no Gemfile.lock is in repo... you still need to create cache-key out of the generated Gemfile.lock somehow? Not sure if that's too late to create the cache key, but if you could do that, it'd work?

Yeah, I thought about that too, it seems bundle lock should do that.

This is actually very important because it's not possible to replace or remove a cache on GitHub Actions.
So making the hash of the Gemfile.lock part of the key, even when there was no Gemfile.lock, seems the best.

I'll try that approach.
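
For readers who want to see the idea outside the action, here is a hedged sketch using actions/cache directly: generate Gemfile.lock with bundle lock, then hash it into the cache key. This is not the fix on the branch mentioned below; the install path, key prefix, Ruby version and action versions are assumptions.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      # Assumed location: install gems inside the workspace so they can be cached.
      BUNDLE_PATH: ${{ github.workspace }}/vendor/bundle
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
      # No Gemfile.lock in the repo: resolve dependencies now to generate one.
      - run: bundle lock
      - uses: actions/cache@v4
        with:
          path: vendor/bundle
          # hashFiles is evaluated when this step starts, after bundle lock ran.
          key: bundle-${{ runner.os }}-ruby-3.3-${{ hashFiles('Gemfile.lock') }}
      - run: bundle install
```

Because the key changes whenever the resolved versions change, a stale cache is simply never hit, which sidesteps the fact that an existing cache entry cannot be replaced.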

@eregon
Member

eregon commented Nov 10, 2020

I have a fix for this on branch fix-caching-no-lockfile, but it unfortunately revealed a bug of jruby-head on Windows (jruby/jruby#6458).

@jrochkind
Author

bundle check will generate the Gemfile.lock, so actually a bundle install won't do anything (except listing gems).

Oh, I see. I think this makes bundler-cache: true unsuitable for use with projects without a Gemfile.lock in repository. So long as a cache exists, it will never get updated versions of dependencies, unless you change the requirements in Gemfile/gemspec.

Note that GHA caches are removed after 7 days if they are not accessed:

The use case I am concerned about is not a project that seldom runs tests. Rather, we can imagine a gem project -- with no Gemfile.lock in repo. We can imagine that it has PRs and commits made to it fairly often, at least once a week, so the 7 day freshness is maintained. However, changes to the gemspec or Gemfile happen infrequently.

This is I think a common development pattern for gems. I am used to expecting that every time CI is run, it re-resolves the dependency tree, and possibly gets more recent versions of dependencies than in previous run. In this way you can find build failures caused by new releases of dependencies.

In fact, sometimes I schedule a build once every 7 days even if no changes have been made, exactly for the purpose of catching failures caused by updated dependencies. This would likely keep the cache from ever expiring -- but it would just be running the same build over and over again because of the cache; it would not be getting newly released dependency versions.
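
A weekly scheduled build of the kind mentioned above could be declared roughly like this; the day and time are arbitrary placeholders.

```yaml
on:
  push:
  pull_request:
  schedule:
    # Every Monday at 06:00 UTC (arbitrary choice for illustration).
    - cron: '0 6 * * 1'
```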

So that's a problem.

bundle lock

I was not previously familiar with bundle lock but looking at docs and experimenting with it, I'm not completely following. Ah, in case of no Gemfile.lock present, run bundle lock before creating a cache key to restore from cache? Then create a cache key based on resulting Gemfile.lock, and proceed as normal: restore from cache if cache key hits; if not bundle install and store to cache under cache key, etc. Hmm. That could work?

I think the alternate approach, which would be more like the Travis bundle cache, is to always use bundle install instead of bundle check, and always write to the cache at the end of the process (the installed gems may or may not have changed as a result of bundle install; without worrying about it, write them all to the cache). This would I think be somewhat slower, but maybe not as much as expected. In most cases the bundle install will find little new to install, so will be as fast as bundle check. The slowdown might be always writing to cache, depending on how long that takes. But this might be a simpler reliable approach, which could also perhaps be used only in case of no checked-in Gemfile.lock, so cases with a checked-in Gemfile.lock keep working as-is.
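
One way to approximate this "always install, always save" behaviour with actions/cache is a key that is unique per run plus a restore-keys prefix. This is a sketch under assumptions (gems installed to vendor/bundle via BUNDLE_PATH, hypothetical key prefix), not something proposed verbatim in the thread.

```yaml
steps:
  - uses: actions/cache@v4
    with:
      path: vendor/bundle
      # Unique key per run, so a new cache entry is saved at the end of every run.
      key: bundle-${{ runner.os }}-${{ github.run_id }}
      # Restore the most recent entry with this prefix as a starting point.
      restore-keys: |
        bundle-${{ runner.os }}-
  # Always re-resolve; with a warm cache most gems are already installed.
  - run: bundle install
```

The trade-off is the one described above: every run pays the cost of uploading the cache, whether or not anything changed.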

@jrochkind
Author

(It also occurs to me that some of this logic, including perhaps current implementation may require bundler 2, and not work with bundler 1? I'm not sure, if some features only work with bundler 2 that should be doc'd though. I do still have some projects using bundler 1).

@eregon
Member

eregon commented Nov 10, 2020

In short: yes I understood your concern, and indeed it was not intended that the cache would not use the latest gem versions when there is no Gemfile.lock. Everything should work with Bundler 1 and 2, of course.
The fix is here if you want to take a look at the logic:
https://github.com/ruby/setup-ruby/compare/fix-caching-no-lockfile

@jrochkind
Author

Thanks!

I'm also now really confused how bundler-cache works if I have a matrix with different gemfiles to test... but I'll just leave you to it for now, thanks!

@eregon
Member

eregon commented Nov 12, 2020

Fixed in https://github.com/ruby/setup-ruby/releases/tag/v1.50.4

@eregon closed this as completed Nov 12, 2020

@jrochkind
Author

jrochkind commented Nov 12, 2020

Nice! For anyone curious, looks like this is the commit: bc0f274

I'm still trying to figure out the right way to get the right Gemfile/Gemfile.lock used when I am testing different gemfiles in a matrix (say to test with different Rails versions). Looking at the source... looks like I have to get BUNDLE_GEMFILE set before this action... instead of using bundle config set gemfile as I was currently doing, which was working very nicely for me otherwise but obviously can only be done once ruby/bundler are installed. :( But that's not about this ticket (which became not what its title was anyway!).

@eregon
Member

eregon commented Nov 12, 2020

Yes, if you want to use a Gemfile not at the root, there are 2 ways:

  • Use the working-directory input https://github.com/ruby/setup-ruby#working-directory if the gemfile is named Gemfile and just in a different directory
  • Set BUNDLE_GEMFILE before running this action via e.g. - run: echo BUNDLE_GEMFILE=foo/Gemfile >> $GITHUB_ENV.
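
The two options above might look like this in a workflow. The job names, the foo directory, the Ruby version and the action versions are placeholders.

```yaml
jobs:
  # Option 1: the file is named Gemfile but lives in a subdirectory.
  subdir-gemfile:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true
          working-directory: foo
  # Option 2: set BUNDLE_GEMFILE for all later steps before the action runs.
  custom-gemfile:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "BUNDLE_GEMFILE=foo/Gemfile" >> "$GITHUB_ENV"
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true
```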

@jrochkind
Author

Thanks! I'm using appraisal, so my gemfiles are in ./gemfiles/ruby_50.gemfile, ./gemfiles/whatever_you_want.gemfile, etc.

I've been having trouble with BUNDLE_GEMFILE, with setting it from the matrix while making sure it stays set not just for bundle install but also for when I actually bundle exec things, in possibly various different steps. But I was just doing export BUNDLE_GEMFILE=, I didn't know about (and don't entirely understand) that >> $GITHUB_ENV thing, that might be the answer.

I thought I was so clever using the newer bundle config set gemfile instead of the ENV variable, but I guess not!

I wonder if it would make sense to have a gemfile or gemfile-path input that lets you specify the actual complete path not just the directory assuming Gemfile?

@eregon
Member

eregon commented Nov 12, 2020

Yes, you need - run: echo BUNDLE_GEMFILE=foo/Gemfile >> $GITHUB_ENV, because export BUNDLE_GEMFILE= only sets it for the current step (unlike TravisCI).
https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#setting-an-environment-variable

It might also be possible to set it from matrix values in the job env: key.
https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions#jobsjob_idenv

An extra input wouldn't be convenient because the variable needs to be set for bundle exec too.
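
A sketch of the job-level env: variant with a matrix of gemfiles, assuming the gemfiles live under gemfiles/ as with appraisal. The matrix values, Ruby version and test command are hypothetical.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        gemfile: [rails_6_0, rails_6_1]   # hypothetical file names under gemfiles/
    env:
      # Set at the job level so it applies to every step, including bundle exec.
      BUNDLE_GEMFILE: ${{ github.workspace }}/gemfiles/${{ matrix.gemfile }}.gemfile
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true
      - run: bundle exec rake
```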

@jrochkind
Author

What if it picked it up from bundle config set gemfile as usually supported by bundler? But I guess >> $GITHUB_ENV works, now that I know about it. So many little details to figure out and get right on GH actions.

@jrochkind
Author

(I wonder if just putting the echo BUNDLE_GEMFILE=foo/Gemfile >> $GITHUB_ENV in the README as an example might be helpful? Perhaps even an example using a matrix context).

@eregon
Member

eregon commented Nov 13, 2020

What if it picked it up from bundle config set gemfile as usually supported by bundler?

Doesn't seem realistic, we would need to guess where the config file is and parse it, which is a non-starter.
Bundler supports environment variables, and it's also faster than the config commands.

(I wonder if just putting the echo BUNDLE_GEMFILE=foo/Gemfile >> $GITHUB_ENV in the README as an example might be helpful? Perhaps even an example using a matrix context).

Feel free to make a PR to improve the docs. Having a Gemfile not at the root is rather the exception, so I would show that in this section:
https://github.com/ruby/setup-ruby#caching-bundle-install-automatically

@jrochkind
Author

Oh, I thought you could just execute bundle config get gemfile same as I was just using bundle config set gemfile to set it, but you know more than me.

It's definitely the exception for an app which may be the majority of use. For any gem that interacts with Rails, I find it pretty common to test under multiple versions of Rails, which generally requires multiple Gemfiles.

Thanks for info. I'll see about a README PR. I'm wondering if maybe in the 'matrix' section, since the main use case I am aware of for non-standard Gemfiles is that one, a matrix of gemfiles.

@khiav223577

Yes, if you want to use a Gemfile not at the root, there are 2 ways:

  • Use the working-directory input https://github.com/ruby/setup-ruby#working-directory if the gemfile is named Gemfile and just in a different directory
  • Set BUNDLE_GEMFILE before running this action via e.g. - run: echo BUNDLE_GEMFILE=foo/Gemfile >> $GITHUB_ENV.

@eregon

Should I use working-directory input even when I have set working-directory to the defaults?

Ex:

defaults:
  run:
    working-directory: my_folder

I don't quite understand what's really going on behind them.

@eregon
Member

eregon commented Feb 9, 2021

@khiav223577 https://github.community/c/code-to-cloud/github-actions/41 would be a better place to ask such questions.
Yes, you need it, because those defaults only apply to - run: steps, not to - uses: steps.
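
Put together, a workflow with both settings might look like the sketch below. The job name, Ruby version and rake command are placeholders; my_folder comes from the question above.

```yaml
defaults:
  run:
    working-directory: my_folder   # applies to run: steps only
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true
          working-directory: my_folder   # must be repeated for this uses: step
      - run: bundle exec rake   # runs in my_folder because of defaults
```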
