New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize defining methods on deeply included modules #2752
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
rb_clear_method_cache_by_class calls rb_class_clear_method_cache recursively on subclasses, where it will bump the class serial and clear some other data (callable_m_tbl, and some mjit data). Previously this could end up taking a long time to clear all the classes if the module was included a few levels deep and especially if there were multiple paths to it in the dependency tree (ie. a class includes two modules which both include the same other module) as we end up revisiting class/iclass/module objects multiple times. This commit avoids revisiting the same object, by short circuiting when revisit the same object. We can check this efficiently by comparing the class serial of each object we visit with the next class serial at the start. We know that any objects with a higher class serial have already been visited.
Avoids genereating a "throwaway" sentinel class serial. There wasn't any read harm in doing so (we're at no risk of exhaustion and there'd be no measurable performance impact), but if feels cleaner that all class serials actually end up assigned and used (especially now that we won't overwrite them in a single method definition).
We know that this is a heap-allocated object (a CLASS, MODULE, or ICLASS) so we don't need to check if it is an immediate value. This should be very slightly faster.
It's a Ruby bug if this ever happens check it as an assertion instead of paying the cost of the check every time.
Previously every time a method was defined on a module, we would recursively walk all subclasses to see if the module was included in a class which the VM optimizes for (such as Integer#+). For most method definitions we can tell immediately that this won't be the case based on the method's name. To do this we just keep a hash with method IDs of optimized methods and if our new method isn't in that list we don't need to check subclasses at all.
This also seems to (very slightly) improve performance of Ruby startup (even with nothing loaded). We can see the methods in question from
Before
After
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In Rails we discovered that a significant amount of time was spent defining named routes, methods on a module included by a lot of other classes. Checking profiling this time was mostly spent in
rb_class_foreach_subclass
andrb_class_clear_method_cache
. @tenderlove helped me investigate what was happening inside Ruby and we found that it was revisiting classes and clearing their caches multiple times since there could be multiple paths to them down the subclass tree. This could also be measured by a large increase inRubyVM.stat(:class_serial)
.This added several seconds to the bootup time of large applications like GitHub's and Shopify's.
Though we've improved this in Rails by avoiding some duplicate modules, defining these methods is still a little slow and Ruby should avoid doing duplicate work clearing caches.
This PR improves the speed of defining methods on modules which have already been included in other classes.
rb_clear_method_cache_by_class
rb_clear_method_cache_by_class
callsrb_class_clear_method_cache
recursively on subclasses, where it will bump the class serial and clear some other data (callable_m_tbl
, and some mjit data).Previously this could end up taking a long time to clear all the classes if the module was included a few levels deep and especially if there were multiple paths to it in the dependency tree (ie. a class includes two modules which both include the same other module) as we end up revisiting class/iclass/module objects multiple times.
This commit avoids revisiting the same object, by short circuiting when revisit the same object. We can check this efficiently by comparing the class serial of each object we visit with the next class serial at the start. We know that any objects with a higher class serial have already been visited.
(This would probably have even more of an improvement if mjit was enabled, since it will avoid clearing that cache when possible, but I didn't test that)
check_override_opt_method
This was another place we were calling
rb_class_foreach_subclass
recursively.Previously every time a method was defined on a module, we would recursively walk all subclasses to see if the module was included in a class which the VM optimizes for (such as
Integer#+
).For most method definitions we can tell immediately that this won't be the case based on the method's name. To do this we just keep a hash with method IDs of optimized methods and if our new method isn't in that list we don't need to check subclasses at all.
Benchmarks
Synthetic
This is probably an unrealistic scenario but best measures that we are avoiding duplicate work revisiting the same trees.
Before:
After:
Rails without duplicate module fix
This is test is a good approximation of this part of a very large real Rails 6.0 application's boot.
Benchmark script
Before:
After:
Rails with duplicate module fix
This is test is a good approximation of this part of a very large real Rails master application's boot. There's less impact because it's been patched to avoid the worst of the cache clearing (most of the runtime is now outside of that) but the improvement is still significant.
Before:
After: