Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect in-place changes on mutable AR attributes #15674

Merged
merged 1 commit into from
Jun 13, 2014

Conversation

sgrif
Copy link
Contributor

@sgrif sgrif commented Jun 12, 2014

We have several mutable types on Active Record now. (Serialized, JSON,
HStore). We need to be able to detect if these have been modified in
place.

Serialized attributes now "just work" and no longer need to be saved 100% of the time.

Fixes #8328

@chancancode
Copy link
Member

❤️ 👍

@chancancode
Copy link
Member

Does this handle (previous_)changes and attribute_was? Would it be appropriate to return the type casted values there?

Just thinking that it would be useful to branch on things like if user.settings_was[:some_flag]. We can probably freeze and cache the type casted old value?

end

def store_original_raw_attributes
attribute_names.each do |attr, _|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use attribute_names.each_key?

@sgrif
Copy link
Contributor Author

sgrif commented Jun 13, 2014

It does not. I can definitely add that, but it's complex enough that I feel like it should be a separate PR. We definitely can't freeze it, because some types (by which I mean String return the same object as _before_type_cast. I don't want to dup because of performance reasons.

@chancancode
Copy link
Member

Sounds reasonable. Would it be okay to do the freeze version in just Mutable? I'm just thinking whether we can avoid deserializing the value every time you call settings_was. But ultimately it's not a big deal.

Big 👍 😁

@sgrif
Copy link
Contributor Author

sgrif commented Jun 13, 2014

Yes, will spend a lot of time optimizing this after #15593 is merged in to reduce the legwork. 😄

result = super(attr, value)
save_changed_attribute(attr, old_value)
result
super.tap do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid to use tap The say it were before was fine

We have several mutable types on Active Record now. (Serialized, JSON,
HStore). We need to be able to detect if these have been modified in
place.
@rafaelfranca
Copy link
Member

:shipit:

@brndnblck
Copy link

Hey guys, are we planning to stick this into a rails 3.2.x release as well? Would greatly appreciate it if that was the case.

@matthewd
Copy link
Member

It's (a substantial conglomeration of code leading to) new functionality; if we back-ported that sort of thing, there's be no distinction between release series in the first place. This is destined for 4.2.

But even if it were a bug fix, it still wouldn't make it to 3.2: that release is only updated for severe security issues.

@brndnblck
Copy link

@matthewd thanks. I missed that with Rails 4.1 we put 3.2 in maintenance mode. GTK.

@felixbuenemann
Copy link
Contributor

I think this is not entirely fixed. Now instead of having an empty changes hash, I always have a changes hash with the same old and new value, so the record still gets saved every time.

class Asset < ActiveRecord::Base
  serialize :attachment_digest, BinaryHexCoder
end

class BinaryHexCoder
  class << self
    def load(value)
      value && Digest.hexencode(value)
    end

    def dump(value)
      value && Array(value).pack('H*')
    end
  end
end

Rails 4.1.5:

asset = Asset.last
asset.changes
=> {}
asset.save
# saves the record because of serialized attribute attachment_digest

Rails 4.2.0.beta1:

asset = Asset.last
asset.changes
=> {"attachment_digest"=>["779bbc9fbd34f881b862e3c43deb835d", "779bbc9fbd34f881b862e3c43deb835d"]}
asset.save
# saves the record because changes are not empty

So while the behavior is different, the outcome is the same.

In Rails 4.1 I could work around this by overriding the setter and save/save! but now I would have to do something dirty like check if the old and new values in the changes hash match and abort the save.

Rails 4.1 workaround:

  def attachment_digest=(value)
    attachment_digest_will_change! if value != attachment_digest
    super
  end

  def save(*)
    changes.empty? ? true : super
  end

  def save!(*)
    changes.empty? ? true : super
  end

@sgrif
Copy link
Contributor Author

sgrif commented Aug 26, 2014

@felixbuenemann Please open a new issue, preferably with an executable test case using this as a template.

@felixbuenemann
Copy link
Contributor

@sgrif OK, minimal testcase attached to #16701.

@robertomiranda
Copy link
Contributor

Guys any plan to back-port this to 4.1?

@sgrif
Copy link
Contributor Author

sgrif commented Nov 20, 2014

No

On Thu, Nov 20, 2014, 2:12 PM Roberto Miranda notifications@github.com
wrote:

Guys any plan to back-port this to 4.1?


Reply to this email directly or view it on GitHub
#15674 (comment).

@lawso017
Copy link
Contributor

Hi there -- wondering if this enhancement would have resulted in the following issue that I'm trying to track down:

https://gist.github.com/lawso017/1095d2e8ba868fdd8a6c

In brief, attempting to set a JSON attribute (stored in PG as a native JSON or JSONB type) to an invalid value results in a JSON::ParserError.

While that's OK, what's proving problematic is that the object is terminally broken at that point. I apparently cannot reset the value of the attribute, as subsequent attempts to reset the value result in again trying to parse the previous invalid input thus triggering the JSON::ParserError again.

The minimal test case against illustrates the problem -- requires Postgresql at least 9.2 (9.4 if you want to use jsonb).

I'm encountering this problem in Rails 4.2 although this test is against edge.

@rafaelfranca
Copy link
Member

@lawso017 could open a new issue for this? Comments in closed PRs are usually ignored.

@lawso017
Copy link
Contributor

Thanks, just did here:

#18629

On Wed, Jan 21, 2015 at 1:18 PM, Rafael Mendonça França <
notifications@github.com> wrote:

@lawso017 https://github.com/lawso017 could open a new issue for this?
Comments in closed PRs are usually ignored.


Reply to this email directly or view it on GitHub
#15674 (comment).

William (B.J.) Lawson, MD
919.335.3107 (direct)

ramontayag pushed a commit to G5/storext that referenced this pull request Apr 9, 2015
If Rails < 4.2, then handle it ourselves by duping the hash. If Rails >= 4.2,
then rely on the Rails fix:

- rails/rails#15674
- rails/rails#6127
sgrif added a commit that referenced this pull request Oct 15, 2015
This type adds an escape hatch to apps for which string duping causes
unacceptable memory growth. The reason we are duping them is in order to
detect mutation, which was a feature added to 4.2 in #15674. The string
type was modified to support this behavior in #15788.

Memory growth is really only a concern for string types, as it's the
only mutable type where the act of coersion does not create a new object
regardless (as we're usually returning an object of a different class).

I do feel strongly that if we are going to support detecting mutation,
we should do it universally for any type which is mutable. While it is
less common and ideomatic to mutate strings than arrays or hashes, there
shouldn't be rules or gotchas to understanding our behavior.

However, I also appreciate that for apps which are using a lot of string
columns, this would increase the number of allocations by a large
factor. To ensure that we keep our contract, if you'd like to opt out of
mutation detection on strings, you'll also be option out of mutation of
those strings.

I'm not completely married to the thought that strings coming out of
this actually need to be frozen -- and I think the name is correct
either way, as the purpose of this is to provide a string type which
does not detect mutation.

In the new implementation, I'm only overriding `cast_value`. I did not
port over the duping in `serialize`. I cannot think of a reason we'd
need to dup the string there, and the tests pass without it.
Unfortunately that line was introduced at a time where I was not nearly
as good about writing my commit messages, so I have no context as to
why I added it. Thanks past Sean. You are a jerk.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-dirty serialized columns are being saved anyway
9 participants