Skip to content
This repository

ActiveRecord doesn't recognize in-place-edited attributes as being dirty. #2665

Closed
YenTheFirst opened this Issue · 17 comments

7 participants

T S Michael Koziarski Aaron Patterson Erik Michaels-Ober Kurt Stephens Alexey Muranov dariusmc
T S

An ActiveRecord attribute modified using an in-place-edit method won't be marked as dirty, and won't have updates persisted when #save is called

Working example:

pirate = Pirate.first
puts pirate.catchphrase # "arrr"
pirate.catchphrase = pirate.catchphrase.upcase
pirate.save
pirate.reload
puts pirate.catchphrase #"ARRR"

Non-working example:

pirate = Pirate.first
puts pirate.catchphrase # "arrr"
pirate.catchphrase.upcase!
pirate.save
pirate.reload
puts pirate.catchphrase #"arrr"

I wrote a test to demonstrate here: YenTheFirst@b3e7a74

Aaron Patterson
Owner

SMITH: Doctor, it hurts when I do this.
DALE: Don't do that.

But seriously. Supporting this would require either monkey patching string, or keeping internal copies of the state of the AR object. Either solution will require a significant regression in speed, memory usage, or both. I think we should leave this as a "known issue".

Erik Michaels-Ober

This issue is very confusing for Rails beginner-to-intermediate users, who don't know how ActiveRecord::Dirty works. Yes, there exists the workaround of calling ActiveRecord::Dirty#attribute_will_change (in this example, pirate.catchphrase_will_change!) but that's a dirty hack (pun intended). IMHO, users should never need to manually manage the dirty state (or be aware that it even exists).

In short, users expect to be able to do the following:

  1. Retrieve an object from a database
  2. Modify that object
  3. Save that object to the database, including their modifications

I'm optimistic we can come up with a solution that isn't too memory intensive.

Erik Michaels-Ober

@tenderlove I agree that monkey patching String would be rough on performance, but I'm optimistic that keeping an internal copy of the state could be done efficiently (e.g. possibly using hashes of values instead of a full copy). Such an approach would break the ability to call ActiveRecord::Dirty#attribute_was, but I think it's worth considering deprecating that API in favor of solving this incongruity. I'd love to see an implementation that includes benchmarks, so we're not merely speculating.

T S

On the subject of benchmarks, what would we want to measure this against? It would seem the main metrics would be - memory usage, object read-from-database time, and object write-to-database time. Anything else? I'm not aware of a performance test suite for Rails.

Aaron Patterson
Owner

@sferik hashes of values is exactly what I'm talking about. If you don't store two full copies of the string, you can't compare them later. Destructive methods will destroy both references.

Possibly we could lazily dup the strings on method call, but that means added overhead to the method call and added memory usage. Until we fully understand the performance and memory impact of this, I am very :-1: on this feature.

Learn what destructive methods in ruby do.

Aaron Patterson
Owner
>> x = 'mystring'
=> "mystring"
>> internal = {}
=> {}
>> internal['column'] = x
=> "mystring"
>> x.upcase!
=> "MYSTRING"
>> internal
=> {"column"=>"MYSTRING"}
>>
T S

One can compare equality without keeping a full dup of the original, using a hash.

>> x='mystring'
=> "mystring"
>> x.object_id
=> -608512528
>> old_hash = x.hash
=> -728113656
>> x.upcase!
=> "MYSTRING"
>> x.object_id
=> -608512528
>> new_hash = x.hash
=> 114720008
>> old_hash == new_hash
=> false

as @sferik mentioned, this removes the ability to use attribute_was, as you're storing a hash, not a dup, of a changed attribute.

I agree, though, the most straightforward fix (storing a complete copy of all the attributes on read, and using that for comparison) has the potential for a major performance hit.

I'll write up a naive implementation and see how that looks, performance-wise.

Erik Michaels-Ober

@tenderlove Sorry if that was confusing, I was talking about hashing the value and storing result instead of a copy of the value. This has the side-effect of being irreversible, but it might be a worthwhile tradeoff. For example, given an ActiveRecord object o:

internal = {}
o.attributes.each{|k, v| internal[k] = v.hash}
Kurt Stephens

@sferik, @YenTheFirst: String#hash is not guaranteed to produce a distinct value for all String that are not String#==, therefore it cannot be substituted for String#==.

Alexey Muranov

This is off topic, but similar from the user-devellopper side: save does not always save, i reported it as Issue #993.

Michael Koziarski
Owner

I don't believe that the costs and complexity of trying to support this are worth the benefits.

If you want to modify those values in place you'll need to turn off partial updates and not rely on the change tracking.

Michael Koziarski NZKoz closed this
Erik Michaels-Ober

I think there's a case to be made that Rails has the wrong default behavior here. I would argue Rails should behave consistently by default (i.e. always update). Partial updates could be turned on by those who need to optimize for the performance gains it brings. Then, when someone goes to turn it on, they could be warned (in the surrounding comments and documentation) about how it would break in-place updates. This would more closely follow the principle of least astonishment, a core tenant of Ruby and Rails. Why, in this case, is it correct to optimize for performance over a consistent API?

What do you think?

Erik Michaels-Ober

@kstephens nor is validates_uniqueness_of guaranteed to validate the uniqueness of an attribute, but the number circumstances in which it won't work are sufficiently few that it's an accepted solution. That said, if we really want to follow POLA, we could dup every object after it's selected. That seems like the obvious, correct solution that we could later optimize, if necessary.

Michael Koziarski
Owner

duping every object is probably the only reliable way to do this, and that involves using a significant amount of memory to solve what is a peculiar edge case.

Feel free to implement this level of change tracking yourself in the application, but for now the tradeoff of "don't do that" seems perfectly acceptable to me.

Alexey Muranov

I agree with sferik. Is it possible now to turn off partial updates? A similar solutions would work in my opinion for the Issue #993.

T S

While going through the code to implement a 'dup on select' solution, I realize that quite a few modules have some tight couplings with the change tracking - some of them change the backing instance variables directly, and others manipulate the changed_attributes state directly (though accessing it through an accessor method, at least)

it's probably a separate issue, but I think there's a bit to be gained by abstracting the interfaces between these modules, so that it's not an impossible task to switch out change tracking for other methods (like dup on select) of implementing ActiveModel::Dirty

dariusmc

For more fun: try in-place updating a FactoryGirl object. Watch the same data come back with your next FactoryGirl.create! I guess it re-uses non-dirty objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.