-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert AttributeMethodMatcher to Module Builder #30895
Conversation
Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @eileencodes (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review. Please see the contribution instructions for more information. |
@@ -38,6 +38,7 @@ def self.columns_hash | |||
end | |||
|
|||
def test_define_attribute_methods | |||
skip "implementation changed, need to update test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am skipping the two tests in this file because they are no longer relevant given the changes made here. I intend to replace them but decided to first get the entire change looked over once before continuing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not skip these tests just to get this green. Otherwise someone might accidentally merge and then we're not testing correctly. A failing build is a good indication that this PR is not ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to fix them right now, just don't skip them either please.
e5ce883
to
7342c88
Compare
I think you're substantially underestimating the outcry that would follow such an apparent "bloating" of the inheritance chain: the number of modules we include is already pretty controversial, and they're currently carrying a lot more functionality per include. A little further from straight "people would complain", I'm legitimately concerned about supering through so many Perhaps it's worth comparing a less textbook-pure version, where we still use the module instance to contain things, but handle all the matchers in a single instance? As for only needing one instance per inheritance chain, it sounds like we might need more tests around behaviour when |
That may be true, but personally I don't see the issue with inheritance "bloat" if there are no performance or code complexity implications. In this case I don't see those implications (I don't think the supering through But another point is that with this change,
Yes, this is a legitimate concern, but I don't see any performance implications at least. I'm making this PR to identify if there are show-stopper issues with this idea, though, so if you have a concrete example of where this would cause problems I'd like to discuss it. The direction that I'm pushing here is (for one) separating AR from AM more cleanly, as well as making the attribute methods code more modular. I personally think this outweighs any concerns about a longer inheritance chain, but that's my personal view. |
The other point to be made about the inheritance chain is that although having an extra dozen or more modules may trigger outcry among some, for others (like myself), the fact of actually being able to see all the different patterns of methods defined on the model class is really really illuminating. When I first made the change here and looked at the ancestor chain it was like a light came on, showing very clearly what all these different modules were actually doing to the model. Of course, you can see those same matchers by looking at |
@eileencodes Ok removed those skips. |
One thing that might make sense to investigate would be to reduce the number of modules by grouping them logically, e.g. dirty matchers go together, etc. But the problem is that the matcher class would then no longer neatly wrap around a single method missing target, method name, etc. - instead you'd have more complexity there again. I'm eager to hear other suggestions, but also I'd like to know concretely what the problem is with having more modules in the inheritance chain. The one thing I notice is that a |
I think this is on the right track. I don’t think the number of modules included is quite as controversial as the kitchen sink of methods defined on a Are there benchmarks around this? I suspect perf is similar, but it can be hard to intuit. Whichever way this PR goes, this was a neat idea. |
I just ran the performance benchmarks in Here is the result from master (commit hash 5668dc6), skipping the warming up:
Here is the result from this branch:
So you can see there are some very large differences, e.g. I'm curious to know what would be causing the slowdown, so I'll dig a bit more and report back. You're absolutely right that these things can be hard to intuit. |
So I thought about this a bit and I've got an idea which I think may solve many of the same issues this PR solves, without the module "bloat". Since there also seem to be performance issues related (I suppose?) to the number of modules, I'm going to close this and in a few days open a new PR with a "lite" version. What I'm basically envisioning is:
I believe this should work and would potentially actually decrease the number of modules in the inheritance chain, while also removing some private methods that are currently added by AttributeMethods. It would also decouple AR::AttributeMethods from AM::AttributeMethods in the same way that this PR does, since everything woudl be wrapped up in a module (so you would be able to use AM::AttributeMethods alongside AR::AttributeMethods with just a little extra work). Does this sound like a good plan? I'd like to at least get a thumbs up on the idea before proceeding. It won't be any more complicated than this PR so shouldn't take too much time to implement. |
Wow, I was concerned that missed methods going through so many Most of your proposed plan sounds like a much better explanation of my half-thought; I agree that does seem to have merit. It'd also be an interesting middle-ground from which to revisit the more radical multi-module approach in future, and possibly better understand what's causing the performance differential. As for reducing the number of modules in an inheritance scenario... I agree a subclass needn't have its own module by default. But I do believe we expect a subclass to be able to define its own matchers and attributes, and have those properly interact with those defined by the superclass -- so it seems necessary for it to be possible. (It's also worth noting that AR::Base isn't necessarily the only abstract class in the hierarchy.) |
I haven't dug into the details of what's slowing things down, but just removing the
So I'd like to clarify this point, because it confuses me a bit. Ignoring this PR, in the code in the master branch, a subclass creates a new module instance stored in Each subclass defines (and includes) this instance variable, including But actually, I don't see any reason why subclasses beyond the first subclass of In terms of the generated methods, you only really have to possibilities (AFAICT):
In either case, the fact that the generated attribute methods module is included in the subclass has no impact on the result. In the former case the method is the same as the parent method, both dispatching to Am I missing something? This is why I want to post an issue for this: I think AR is needlessly defining a module and defining attribute methods on this module for no reason other than that the current code depends on using an instance variable on the subclass, which (I think) can be fixed. |
@matthewd So I figured out at least half the reason for the slowdown, and it's pretty trivial. (results below updated, there was a typo in the The fix is in shioyama/rails@a414920. Basically the problem is that when Here are the results with that change (just showing relevant stuff):
So you can see now that initialization is much closer to the results from I suspect the remainder of the difference is accounted for by similar cases where rather than returning immediately, a procedure is looping over 18 modules before returning the same result. |
This is quite a large and substantial refactor of two core modules,
ActiveModel::AttributeMethods
andActiveRecord::AttributeMethods
. The inspiration for this PR is this blog post on the module builder pattern which I wrote a few months ago (the end of the post discusses a change to AM::AttributeMethods which is almost the same to the one here). I had intended only to change ActiveModel, but the two modules are so heavily coupled that it was impossible to change one without the other.I have tried to make the absolute minimal changes to achieve the goal of de-coupling described below, and wherever possible I have moved methods without actually altering their logic.
There are still some points to discuss and more tests to write, so this is not a finished PR; for now I'd just like to get some feedback on the general idea and current implementation.
Summary
ActiveModel and ActiveRecord provide a simple mechanism for defining prefixed/suffixed/affixed attribute methods, whereby you call a class method like
attribute_method_prefix
(etc) with a prefix (or suffix/affix), then calldefine_attribute_methods
to actually define the attribute methods.Under the hood, calling one of these class methods creates an instance of a class
AttributeMethodMatcher
which stores the prefix and/or suffix, and appends the instance onto a class attributeattribute_method_matchers
. (Similar for defining aliases.)https://github.com/shioyama/rails/blob/5668dc6b1863ef43be8f8ef0fb1d5db913085fb3/activemodel/lib/active_model/attribute_methods.rb#L387-L412
When you actually call an attribute method (i.e. a method with a prefix/suffix), AR/AM dispatches to a handler using one of two mechanisms:
method_missing
to dispatch calls for a method (say)reset_foo_to_default!
toreset_attribute_to_default!("foo")
, iffoo
is a key on the hash stored in the hash returned by theattributes
method. (This one is actually not very well documented.)define_attribute_methods
, you can convert these prefixes/suffixes/affixes into actual methods, which are defined on instances of an anonymous module (in ActiveModel) or instances of a named moduleGeneratedAttributeMethods
(in ActiveRecord).While it works, there are a few problems with the current implementation, the major one being that the components (attribute method matchers on the class, instance methods on generated attribute methods, etc.) are heavily coupled to each other, making it hard to read the code and even harder to extend it. This is even more clear when you look at how heavily coupled AR::AttributeMethods is to the internals of AM::AttributeMethods.
I recently wrote a blog post where (toward the end of the post) I discuss the problems with this approach. This PR is an attempt to implement what I describe in the post.
The basic concept here is to convert the matcher class
ActiveModel::ClassMethods::AttributeMethodMatcher
into a subclass ofModule
, which I've promoted in namespace toActiveModel::AttributeMethodMatcher
. Since the matcher has the prefix and/or suffix, you can define themethod_missing
/respond_to?
overrides, and also define attribute methods (and aliases) themselves on the matcher and include it into the class. Coupling is vastly reduced since everything is encapsulated in one place.The first commit in this PR does this in AM, which is quite a simple change and passes tests easily. The hard part is to make this work in ActiveRecord, since AR currently overrides private methods like
generated_attribute_methods
.However, what is pleasantly surprising is that making a similar change in AR also yields cleaner code. Whereas currently AR::AttributeMethods has to override many methods in AM::AttributeMethods, here the customization happens instead by subclassing
ActiveModel::AttributeMethodMatcher
(the class which generates the matcher modules). By doing this, a number of the private methods added to the model class are instead moved to the matcher module, avoiding namespace pollution and keeping the implementation more encapsulated.Concretely, the following AR methods are moved out of the model class:
generated_attribute_methods
define_method_attribute
define_proxy_call
attribute_method_matchers_cache
attribute_method_matchers_matching
In addition,
initialize_generated_modules
is no longer used (but still required for generated association methods), and thedefine_method_#{matcher.method_missing_target}
pattern of methods are all moved to the matcher, so this method namespace is now free in the model class.The biggest change is when you look at the ancestors of an AR class. Since the modules are now each associated with their prefix/suffix, and I've overridden
inspect
to show you the corresponding regex, you can actually see all the matchers applied to the class.So instead of this:
where the cryptic
ActiveRecord::AttributeMethods::GeneratedAttributeMethods
holds generated methods, and themethod_missing
andrespond_to?
are defined on the class itself, you get this:So you can clearly see all the matchers in the ancestor chain. When you call
define_attribute_methods
on the class, this actually delegates to calling the same method on each of these modules, which then defines the attribute methods on the module. So rather than having one big collection of attribute methods, you have methods each paired to the respective matcher that defines them.e.g.:
Here you can see that methods are defined for the
to_be_saved
suffix; other modules are similar.Other Information
Another thing that I noticed while working on this change is that currently, each AR::Base subclass includes its own
@generated_attribute_methods
instance, where it defines its attribute methods. This means that if I have a classSuperPost
which inherits fromPost
, then I will actually include three modules, one defined in ActiveRecord (which has no actual attribute methods defined on it), one onPost
which will have attribute methods defined on it, and another one defined inSuperPost
which has (generally the same, or a subset of) attribute methods defined on it. There is no need to have these duplicate modules AFAICT, but the current code doesn't work without doing it this way.With the change here, although there are many more modules (one per prefix/suffix), they are only included once in the first subclass of
ActiveRecord::Base
. This avoids defining a potentially large number of methods multiple times if you have a deep inheritance tree.One point that I feel is still problematic is the method
instance_method_already_implemented?
. Currently the AR::AttributeMethodMatcher` class just calls through to the model class method, which is simple but creates coupling I was hoping to avoid. For now I've left this since I want to avoid making any more changes before discussion, but this is one point I'm still not really happy with.Thanks for reading! I'm happy and ready to explain any and all changes here.