Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New approach to finding attached class #1098

Merged
merged 1 commit into from
Aug 12, 2022
Merged

Conversation

egiurleo
Copy link
Contributor

@egiurleo egiurleo commented Aug 5, 2022

Closes #1097.

Motivation

Historically, to identify the attached class of a singleton class, we've parsed the name (result of the#inspect method) of the singleton class and then constantized the result. However, this doesn't work when the #inspect method of a singleton class is overridden because the return value of #inspect no longer conforms to the structure we expected.

Implementation

We can fix this problem by searching ObjectSpace for the class that has the correct singleton class. This is less performant but more correct.

Tests

Added a test that fails without the changes and passes with them. Existing tests continue to pass.

@egiurleo egiurleo requested a review from a team as a code owner August 5, 2022 18:00
Copy link
Member

@paracycle paracycle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small comment, otherwise this looks good, thanks!


constant = T.cast(constantize(name), Module)
attached_class = attached_class_of(constant)
constant = attached_class if attached_class
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we were skipping the rest of the processing if the attached class wasn't found (if the name wasn't available). In the new version we continue processing if attached_class is nil, which might lead to changes in behaviour in the future if more code is added after this block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch. Thanks for pointing that out.

Historically, to identify the attached class of a singleton class, we've
parsed the name (result of the #inspect method) of the singleton class
and then constantized the result.

However, this doesn't work when the #inspect method of a singleton class
is overridden because the return value of #inspect no longer conforms to
the structure we expected.

Instead, we can search ObjectSpace for the class that has the correct
singleton class. This is less performant but more correct.
@egiurleo egiurleo force-pushed the emily/singleton-inspect-bug branch from da38f30 to 282786f Compare August 5, 2022 18:08
@rafaelfranca
Copy link
Member

What is the performance of this in a big app like Shopify/shopify? There are a lot of objects in the object space in an application of that size, so looping though it doesn't seems to be as fast as the previous implementation.

@vinistock
Copy link
Member

Question: would an approach similar to this one work? I mean invoking the original inspect and then keeping the original implementation. And if not, what are the differences?

@paracycle
Copy link
Member

Question: would an approach similar to this one work? I mean invoking the original inspect and then keeping the original implementation. And if not, what are the differences?

This was discussed in Slack, unfortunately that trick doesn't work, since we would be calling inspect on the singleton class, but that internally does an inspect call on the attached class to get its name, so that it can build the string #<Class:AttachedClassName>. Since we have no control over the internal call to inspect on the attached class, we can't avoid the problem, it still ends up calling the user defined method.

@egiurleo
Copy link
Contributor Author

egiurleo commented Aug 5, 2022

What is the performance of this in a big app like Shopify/shopify? There are a lot of objects in the object space in an application of that size, so looping though it doesn't seems to be as fast as the previous implementation.

Good question! Because this is run during the tapioca gem command, the only objects that should be iterated in the object space are the ones loaded by the gem, not by the application. So this code should never be run while Shopify/shopify is loaded.

Also, this code is only run when monkeypatching a singleton class defined in another gem (e.g. if you were to call ActiveRecord::SchemaMigration.singleton_class.include(Foo) in the foo gem), which I'm assuming most gems don't do at all and some gems do sparingly.

This is definitely a less performant implementation but it should never iterate through an object space as large as Shopify's and should only be run on a very specific edge case.

Comment on lines +171 to +176
sig { params(singleton_class: Module).returns(T.nilable(Module)) }
def attached_class_of(singleton_class)
# https://stackoverflow.com/a/36622320/98634
result = ObjectSpace.each_object(singleton_class).find do |klass|
singleton_class_of(T.cast(klass, Module)) == singleton_class
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to maintain a cache of these, so we can at least cap the look-ups to once per singleton_class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could we add a fast path?

If inspect is not overridden, then we can use the old technique

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amomchilov both interesting thoughts!

Re: caching, I wonder if it would really be worth the amount of memory used, considering it's probably pretty rare for the same singleton class to have multiple dynamic mixins added to it. How would you think of the tradeoffs there?

Re: fast path, I really like that! What do you think of something like this?

def attached_class_of(singleton_class)
  if singleton_class.method(:inspect).owner == Module
    # old way here
  else
    # new way here
  end
end

cc @paracycle

Copy link
Contributor

@amomchilov amomchilov Aug 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you think of the trade-offs there?

No clue, I'm just thinking out loud :)

What do you think of something like this?

Yep, that's pretty much what I had in mind!

Worth noting:

  • caching and fast-pathing are orthogonal ideas.
  • If the fast path is common enough, a catch-all cache would be less useful
  • We could add caching to only the slow path.
    • It would be smaller (few classes hit the slow path)
    • It would only be applied where it would actually help (i.e. not on the fast path)
    • OFC it's still only worthwhile if we find we're hitting attached_class_of for the same class multiple times. Needs research.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that pseudocode would work since the problem is that the inspect method is overriden not on the singleton class but on the original class, i.e. #<Class:Foo> has the default inspect, but Foo has an overriden inspect method. Since we don't have a handle on Foo to begin with, we can't check if its inspect method is overriden either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optimization we can do here is to do the object space walk just once over all Module descendants and create a lookup table from each singleton class to its attached class in that whole hierarchy. After that, each call to this method would be an O(1) operation (a simple lookup).

However, it would be critical that we do that after all of the classes that expect to be loaded are actually loaded and any mistake there might be costly. I'd rather we keep the less performant version for now and optimize later if we see it being a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: comment 1: hahaha good point.

Re: any mistake there might be costly. I'd rather we keep the less performant version for now and optimize later if we see it being a problem.

Agree.

@@ -35,10 +35,10 @@ def on_scope(event)
# base constant. Then, generate RBIs as if the base constant is extending the mixin,
# which is functionally equivalent to including or prepending to the singleton class.
if !name && constant.singleton_class?
name = constant_name_from_singleton_class(constant)
next unless name
attached_class = attached_class_of(constant)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a thought after reading the conversation in the other thread. Would it perhaps make sense to do something like:

MODULE_INSPECT_METHOD = T.let(Module.method(:inspect).unbind, UnboundMethod)

name = MODULE_INSPECT_METHOD.bind_call(constant)

Since the problem is that inspect may be overridden, what if we ... just didn't use it?

The only case where I could see this being an issue is for anonymous classes, but that behavior wouldn't apply to foreign constants in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another remark:

Looks like singleton classes already have a pointer to their attached classes in underlying implementation (at least in MRI). Which is like, obvious when I think about it (d'uh, otherwise how could they generate the ::inspect?).

It's currently stored in an ivar called id__attached (the constant in the C code is id__attached__):

https://github.com/ruby/ruby/blob/4325e90205aa4cd0ea031df1b5e6334bfd9c7e51/class.c#L33

Usage:

https://github.com/ruby/ruby/blob/ff07e5c264c82f73b0368dd0bc2ae39f78678519/object.c#L1542

Perhaps we could pitch making a public API that exposes this, instead of using this Stringy hack to begin with? (long term, this PR can still proceed as normal)

Copy link
Member

@paracycle paracycle Aug 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelherold That won't work for the reason that @amomchilov describes above and I describe here.

@amomchilov That already exists, and I'd chimed in on that thread too: https://bugs.ruby-lang.org/issues/12084

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paracycle attached_object makes perfect sense to me. Perhaps we can link this pr in a comment on that thread as a concrete use case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try to make another case on the issue with reflection/introspection as the main use case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paracycle paracycle merged commit e043537 into main Aug 12, 2022
@paracycle paracycle deleted the emily/singleton-inspect-bug branch August 12, 2022 20:42
egiurleo pushed a commit that referenced this pull request Aug 19, 2022
@paracycle paracycle added the backported Backported to stable branch label Aug 22, 2022
paracycle added a commit that referenced this pull request Aug 22, 2022
paracycle added a commit that referenced this pull request Aug 22, 2022
@shopify-shipit shopify-shipit bot temporarily deployed to production August 31, 2022 14:40 Inactive
@shopify-shipit shopify-shipit bot temporarily deployed to 0-10-stable September 14, 2022 02:35 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backported Backported to stable branch bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Foreign constant reflection breaks with non-standard singleton class inspect
6 participants