Skip to content
This repository has been archived by the owner on Apr 14, 2021. It is now read-only.

[WIP] Experimental Gemfile checksum feature #5584

Closed
wants to merge 1 commit into from
Closed

[WIP] Experimental Gemfile checksum feature #5584

wants to merge 1 commit into from

Conversation

hmistry
Copy link
Contributor

@hmistry hmistry commented Apr 13, 2017

Generate checksum for Gemfile and save to lock file. Use the previous…ly saved checksum in the lock file to compare against the current Gemfile checksum to determine if there were any changes made. If no changes, then skip resolution.

This is a quick hack to evaluate if there is any performance benefit of this feature. For a single rails gem Gemfile, it saved 130ms on my local machine (0.430s to 0.3s) to run bundle install.

Submitting PR for feedback (@indirect , @segiddins) and other possible performance tests to evaluate further.

Issue #4871

…ly saved checksum in the lock file to compare against the current Gemfile checksum to determine if there were any changes made. If no changes, then skip resolution.
@indirect
Copy link
Member

@hmistry thanks! I'm looking forward to seeing how this turns out.

@jules2689 I'm interested to see if this works for you, and if it saves any time on exec or install.

@colby-swandale
Copy link
Member

colby-swandale commented Apr 13, 2017

@indirect IIRC we were talking about this a while back and there was talk that this could be an issue for Gemfiles that has ruby that is evaluated? Does this need to have an opt-out/opt-in flag?

@@ -75,6 +75,10 @@ def run(options)
return
end


# EXPERIMENT: Evaluate benefit if we skip resolution by comparing Gemfile checksum against the checksum saved in the lockfile.
return if Definition.build(Bundler.default_gemfile, Bundler.default_lockfile, nil).checksum_matches?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will also skip installation even if none of the gems are actually installed.... I think maybe you might've meant something like the following?

diff --git a/lib/bundler/installer.rb b/lib/bundler/installer.rb
index cbfdddeaf..6474d182f 100644
--- a/lib/bundler/installer.rb
+++ b/lib/bundler/installer.rb
@@ -212,6 +212,8 @@ module Bundler
     end
 
     def resolve_if_need(options)
+      return if @definition.nothing_changed?
+
       if !options[“update”] && !options[:inline] && Bundler.default_lockfile.file?
         local = Bundler.ui.silence do
           begin

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already mentioned to @segiddins but for documentation purposes, this did not result in any time improvement.

@jules2689
Copy link
Contributor

Gemfiles that has ruby that is evaluated

That would be a problem for us, yea. We do some juggling magic to support 2 Rails versions at a time.

Could this be used for the bundle check command maybe?


I'm interested to see if this works for you, and if it saves any time on exec or install.

@indirect I'm happy to test this out, but it needs a rebase on master. It's missing the work that we've put in. I rebased locally to get the numbers below.

1.14.6 master this branch
bundle install with all deps met 3.67s 3.4s 0.67s
bundle install with a missing dep 15-16s 15-16s ERROR

The big difference I noticed is that there is less output, which is likely saving a lot of time so 💯

ERROR

~/src/github.com/Shopify/shopify(master*) ➜ time bundle install
Could not find json_schema-0.14.3 in any of the sources

real	0m0.639s
user	0m0.542s
sys	0m0.057s

@jules2689
Copy link
Contributor

Gemfiles that has ruby that is evaluated

Actually, looking at the way we generate the checksum we can likely hit most use cases by supporting a separate gemfile lockfile (shopify's use case) via an env var*, combine that with an opt-out and we should be good.

* We have a single Gemfile so that people don't forget to add stuff, and resolve to separate lockfiles.

@@ -449,6 +451,10 @@ def warn_deprecated_git_source(name, repo_string)
EOS
end

def generate_checksum(gemfile_source)
Digest::MD5.hexdigest(gemfile_source)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't use MD5 because of FIPS :( #4989

@jules2689
Copy link
Contributor

Another thing to note here is that gemfiles can source other gemfiles, which has been suggested in many instances such as #183 (comment)

@hmistry
Copy link
Contributor Author

hmistry commented Apr 13, 2017

@jules2689 and @colby-swandale thanks for all your feedback. This quick hack does not account for all the edge cases and was simply done to evaluate what performance benefit we could get from this approach.
Looking at the initial data, thanks to @jules2689, there is a good gain so now the challenge will be to see how to address each of the edge cases and trade off with the time improvement.

@bundlerbot
Copy link
Collaborator

☔ The latest upstream changes (presumably #5718) made this pull request unmergeable. Please resolve the merge conflicts.

@colby-swandale
Copy link
Member

ping @hmistry is still being worked on?

@hmistry
Copy link
Contributor Author

hmistry commented Oct 3, 2017

@colby-swandale I haven't had a chance to work on this since then. It's more of an exploration to see if there's any improvement. Is this something needed soon?

@colby-swandale
Copy link
Member

No but let us know if you're not working on this anymore.

@agrim123
Copy link
Contributor

@hmistry Are you still working on this? If not, I would be happy to continue the work!

@segiddins
Copy link
Member

I had something a bit more advanced in https://github.com/bundler/bundler/compare/seg-definition-from-lockfile, you might want to take a look at that as well

@hmistry
Copy link
Contributor Author

hmistry commented Mar 14, 2018

@agrim123 No, I haven't had time to continue working on this. Please go ahead.

@agrim123
Copy link
Contributor

agrim123 commented Mar 16, 2018

@segiddins Can you explain what optional_groups here means exactly?
Why do we need static_gemfile option?
Is there a need to store as key-value pair for Gemfile, when the only checksum of the corresponding gemfile matters, as did here? I didn't quite understand this.

@segiddins
Copy link
Member

It’s needed because some people use dynamic ruby code in their gemfile, so simply checking the checksum matches is insufficient to guarantee a semantic match

@agrim123
Copy link
Contributor

@segiddins I had one more doubt.
If I add a gem manually to the gems.rb and then run bundle install, then the variable static_gemfile does not gets updated and thus no GEMFILE CHECKSUMS section is created. When do we want to save the checksum?

@segiddins
Copy link
Member

As things are on my branch, you need to manually call_static_gemfile!() in your Gemfile to opt-in to the behavior

@hsbt hsbt closed this Apr 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants