Unify shapes of ActiveModel::Attributes #47804
I don't remember, but is it possible to define custom Marshal loading to keep backwards compatibility?
> Currently the PR is only enabling the optimization if you switched to the new AR Marshal format, but:
> - You can switch to that format and still deserialize old payloads, this PR would break that.
> - Active Record may not be the only way these are serialized. We may break more stuff.
What about using separate boolean ivars to indicate undefined-ness? Then I think we could unify the implementations with something like:

    attr_reader :has_value, :has_value_for_database

    def initialize(name, value_before_type_cast, type, original_attribute = nil, value = nil)
      @name = name
      @value_before_type_cast = value_before_type_cast
      @type = type
      @original_attribute = original_attribute
      @value = value
      @has_value = !value.nil?
      @value_for_database = nil
      @has_value_for_database = false
    end

    def has_value?
      # `has_value` is nil only for instances deserialized from old payloads.
      has_value || (has_value.nil? && defined?(@value))
      # Or possibly:
      # has_value || (@has_value = !!defined?(@value) if has_value.nil?)
    end
    alias :has_been_read? :has_value?

    def has_value_for_database?
      has_value_for_database || (has_value_for_database.nil? && defined?(@value_for_database))
    end

Since old serializations will not have the boolean ivars, we can detect them with `has_value.nil?` and `has_value_for_database.nil?`, and handle them by falling back to `defined?`.
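To make the fallback concrete, here is a minimal sketch. `SketchAttribute` and `legacy_payload` are hypothetical names invented for this illustration, not the real `ActiveModel::Attribute`, and the upcasing merely stands in for type casting. Old payloads are simulated by building objects that never had the flag ivar.

```ruby
# Hypothetical stand-in for ActiveModel::Attribute; not the real class.
class SketchAttribute
  attr_reader :has_value

  def initialize(raw)
    @raw = raw
    @value = nil
    @has_value = false
  end

  def value
    unless has_value?
      @value = @raw.to_s.upcase # stand-in for type casting
      @has_value = true
    end
    @value
  end

  def has_value?
    # `has_value` is nil only when the object predates the flag ivar,
    # in which case we fall back to the old `defined?` check.
    has_value || (has_value.nil? && !!defined?(@value))
  end
end

# Simulates a payload dumped by an older version: no @has_value ivar,
# and @value is present only if the attribute had been read.
def legacy_payload(raw, value = :unset)
  obj = SketchAttribute.allocate
  obj.instance_variable_set(:@raw, raw)
  obj.instance_variable_set(:@value, value) unless value == :unset
  Marshal.dump(obj)
end

unread = Marshal.load(legacy_payload("b"))
read   = Marshal.load(legacy_payload("a", "A"))

p unread.has_value?
p read.has_value?
p unread.value
```

Note that a restored legacy object only gains the flag ivar once it is read, so it still ends up with a different ivar layout than a freshly built instance.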
I don't think it is. Additionally, it doesn't seem like there is a way to upgrade Marshal formats. According to this comment, Marshal contains its version number, but AFAICT there is no way to deserialize one version and then upgrade it to a newer version. If I understand this PR correctly, we basically want to initialize

    class MyObj
      def initialize name
        @value = name
        @has_value_for_database = :undef
      end

      # Called after object has been deserialized
      # (hypothetical hook: Ruby's Marshal does not call a method with this name)
      def marshal_loaded
        @has_value_for_database ||= :undef
      end
    end

    obj = MyObj.new :a
    p Marshal.load Marshal.dump(obj)

I think this could work. We'll still end up with more shapes than we'd like, but over time (as old Marshal data gets replaced) they should go away. Additionally, those extra shapes would only exist in systems that deserialized old data.
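One way to approximate a post-load hook with what Ruby ships today is the optional proc argument of `Marshal.load`, which is called for each deserialized object. The sketch below assumes the caller controls the `Marshal.load` call site; `backfill_ivars` and `BACKFILL` are hypothetical names, and nothing here is what the PR actually does.

```ruby
# Hypothetical class standing in for an object with old serialized copies.
class MyObj
  def initialize(name)
    @value = name
    @has_value_for_database = :undef
  end

  # Adds ivars that payloads dumped by an older version are missing.
  def backfill_ivars
    @has_value_for_database = :undef unless defined?(@has_value_for_database)
    self
  end
end

# Marshal.load invokes this proc on deserialized objects; we patch MyObj
# instances in place and hand every other object back untouched.
BACKFILL = ->(obj) { obj.is_a?(MyObj) ? obj.backfill_ivars : obj }

# Simulate an old payload: an instance that never had the newer ivar.
old = MyObj.allocate
old.instance_variable_set(:@value, :a)

loaded = Marshal.load(Marshal.dump(old), BACKFILL)
p loaded.instance_variables
```

The backfill runs after the stored ivars are restored, so objects whose old payloads carried ivars in a different order can still end up with extra shapes, as noted above.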
Yes and no. Payloads that were serialized from versions that didn't have a So if we want to keep compat, we need to handle instances that were serialized in 7.0 :/
Yes, except to avoid warnings in 2.7, we'll also have to check for
Also it's not so much forward compatibility that is my problem, but backward, as in Ideally when you first ship 7.1 without the new defaults enabled, it should generate exactly the same payloads as 7.0 used to.
Right, the tricky part is that some of those serialized objects can be in the database, with records not being accessed or even updated for years, so in theory we can never delete the compatibility code, unless we generate some kind of helper to help people migrate data.
Hum, in my mind the contract was that we do keep YAML compat for a long time, but Marshal is only across one version to the next. Did I imagine this?
Oh yeah. I forgot Marshal is not used in the database, so I wouldn't worry about backwards compatibility. If the new behavior is enabled, we can just tell people to bump the cache version.
In which cases? I was thinking we could replace all occurrences of
Could that be handled by a
If you load an object that was serialized by
Hum, that's a good point worth testing.
That's already the case in this PR. I piggyback on the config introduced in #47747
One other thing that may (or may not) be important. Obviously we should reduce the number of shapes in the system, but maybe the performance impact isn't as great in Ruby 3.3?
Right, but it's still basically a year away, so I'd like to reap some benefits now if possible.
07fc929 to d9713d6
Ok, so one solution could be to keep that code unchanged, and to use another class entirely e.g. But that's quite ugly, so I'm also tempted to just delay this until after the 7.1 release, and just remove the old format then.
7321f14 to cec9f5b
`if defined?(@ivar)` is a performance anti-pattern in Ruby 3.2+, even more so with YJIT.

This pattern causes the object shape to be inconsistent, which slows down instance variable access.

```
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]
#value (orig): 1811240.2 i/s
#value (opt):  2045238.9 i/s - 1.13x faster
```

```
ruby 3.2.1 (2023-02-08 revision 31819e82c8) +YJIT [arm64-darwin22]
#value (orig): 4379180.3 i/s
#value (opt):  6347280.7 i/s - 1.45x faster
```

Benchmark: https://gist.github.com/casperisfine/872f0a486b5ccdf90d9feb830c76d9ad
cec9f5b to 8f2c17d
The `UNDEF.equal?` overhead is significant enough that what is gained by the higher cache hit rate is lost doing the check. That's because we have to both look up the constant and then call `equal?`, which surprisingly doesn't have an optimized opcode in the interpreter. It does in YJIT, which is why the sentinel version is still faster with YJIT enabled.

So instead we can use a secondary instance variable to keep a flag. Prior to this, Attribute instances had 6 ivars; now they have 8, so they still fit in the same 80-byte slots, meaning the memory usage is unchanged.
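The two approaches compared in this commit message can be sketched side by side. Both classes below are hypothetical illustrations (the names and the `to_i` cast are made up, and this is not the Rails code). The point is only that each one assigns every ivar in `initialize`, so the ivar list never changes after construction.

```ruby
# Sentinel approach: an UNDEF constant marks "not computed yet".
class SentinelAttribute
  UNDEF = Object.new.freeze

  def initialize(raw)
    @raw = raw
    @value = UNDEF
  end

  def value
    # Costs a constant lookup plus an `equal?` call on every read.
    @value = @raw.to_i if UNDEF.equal?(@value)
    @value
  end
end

# Flag approach: a secondary boolean ivar records whether @value is set.
class FlagAttribute
  def initialize(raw)
    @raw = raw
    @value = nil
    @has_value = false
  end

  def value
    unless @has_value
      @value = @raw.to_i
      @has_value = true
    end
    @value
  end
end

sentinel = SentinelAttribute.new("42")
flag = FlagAttribute.new("42")
p sentinel.value
p flag.value
```

The flag version costs one extra ivar per lazily computed value, which is the trade-off the commit message describes.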
`if defined?(@ivar)` is a performance anti-pattern in Ruby 3.2+, even more so with YJIT. This pattern causes the object shape to be inconsistent, which slows down instance variable access.
Benchmark: https://gist.github.com/casperisfine/872f0a486b5ccdf90d9feb830c76d9ad
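As a rough illustration of why the pattern leads to inconsistent shapes, consider the hypothetical `LazyPair` class below (not Rails code). Its ivars are created on first read, so two instances of the same class can end up with their ivars defined in different orders, and Ruby 3.2+ tracks those as distinct shapes.

```ruby
# Hypothetical class using lazy `defined?` memoization.
class LazyPair
  def initialize(raw)
    @raw = raw
  end

  def value
    return @value if defined?(@value)
    @value = @raw.to_i
  end

  def value_for_database
    return @value_for_database if defined?(@value_for_database)
    @value_for_database = @raw.to_s
  end
end

a = LazyPair.new("1")
a.value
a.value_for_database

b = LazyPair.new("1")
b.value_for_database
b.value

# Same class, same ivars, but defined in a different order.
p a.instance_variables
p b.instance_variables
```

Assigning every ivar in `initialize` gives all instances the same definition order regardless of which reader is called first.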
Backward compatibility
There is a big backward compatibility concern here, and I'm not sure we can actually make this change safely.
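Until #47747, when you serialized an `ActiveRecord::Base` instance, lots of `ActiveModel::Attribute` instances would be serialized with it, which means this optimization would break any instance serialized with an older Rails.

Currently the PR is only enabling the optimization if you switched to the new AR Marshal format, but:
- You can switch to that format and still deserialize old payloads, this PR would break that.
- Active Record may not be the only way these are serialized. We may break more stuff.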
I can't think of any decent way to do this optimization while still retaining perfect backward/forward compatibility...
cc @tenderlove, I wonder if you have opinions here.