Reduced memory leak problem in transactions by lazily updating AR objects #9068

Merged
merged 1 commit into from

6 participants

@wangjohn

This pull request concerns Issue #776.

I handle memory bloat by having the transaction hold only the AR objects which it absolutely needs to know about. These are the AR objects with callbacks (they need to be updated as soon as something in the transaction occurs).

All other AR objects can be updated lazily by keeping a reference to a TransactionState object. If an AR object is inside a transaction, then the transaction will add its TransactionState to the AR object. When the user makes a call to some attribute on an AR object (which has no callbacks) associated with a transaction, the AR object will call the sync_with_transaction_state method and make sure it is up to date with the transaction. After it has synced with the transaction state, the AR object will return the attribute that was requested.

Most of the logic in the changes is used to handle multiple transactions, in which case the AR object has to recursively follow parent pointers of TransactionState objects.
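
A minimal, self-contained sketch of the lazy-update idea described above. `LazyState` and `LazyRecord` are hypothetical stand-ins, not the Rails classes; only the method name `sync_with_transaction_state` comes from this PR, and the restore-on-rollback behaviour is a simplifying assumption.

```ruby
# Hypothetical stand-in for TransactionState: a small shared object that
# remembers how the transaction ended.
class LazyState
  def initialize
    @state = nil
  end

  def set_state(state)
    @state = state
  end

  def committed?
    @state == :committed
  end

  def rolledback?
    @state == :rolledback
  end
end

# Hypothetical record without callbacks. The record points at the state
# object, so the transaction does not need to hold on to the record.
class LazyRecord
  def initialize(transaction_state)
    @transaction_state = transaction_state
    @new_record = true
    @synced = false
  end

  def save
    @new_record = false
  end

  # Readers sync with the transaction first, then answer.
  def new_record?
    sync_with_transaction_state
    @new_record
  end

  private

  def sync_with_transaction_state
    return if @synced || @transaction_state.nil?

    if @transaction_state.rolledback?
      @new_record = true # undo what save did inside the transaction
      @synced = true
    elsif @transaction_state.committed?
      @synced = true     # nothing to undo
    end
  end
end

state = LazyState.new
record = LazyRecord.new(state)
record.save
during_transaction = record.new_record?
state.set_state(:rolledback)
after_rollback = record.new_record?
```

The record only learns about the rollback when `new_record?` is next called, which is what lets the transaction drop its reference to callback-free records.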

...ve_record/connection_adapters/abstract/transaction.rb
((23 lines not shown))
- @state == :rolledback
+ def num_states
+ @states.length
+ end
+
+ def committed?(counter)
+ if @states[counter]
+ return @states[counter] == :committed
+ end
+ false
+ end
+
+ def rolledback?(counter)
+ if @states[counter]
+ return @states[counter] == :rolledback
+ end
@tenderlove Owner

I think you can simplify this function to just :rolledback == @states[counter]
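
A quick check of that simplification, using a plain array as a hypothetical stand-in for `@states`. Indexing past the end of an array returns `nil`, and `:rolledback == nil` is `false`, so the explicit guard is not needed.

```ruby
# Verbose form, as in the diff above.
def rolledback_verbose?(states, counter)
  if states[counter]
    return states[counter] == :rolledback
  end
  false
end

# Simplified form suggested in the review.
def rolledback_short?(states, counter)
  :rolledback == states[counter]
end

states = [:committed, :rolledback]

# Index 5 is out of range and yields nil in both versions.
results = [0, 1, 5].map do |counter|
  [rolledback_verbose?(states, counter), rolledback_short?(states, counter)]
end
```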

activerecord/lib/active_record/core.rb
((26 lines not shown))
+
+ def update_attributes_from_transaction_state(transaction_state, state_counter, depth)
+ if transaction_state.nil? || !_rollback_callbacks.empty? || !_commit_callbacks.empty? || !_create_callbacks.empty?
+ return
+ end
+
+ last_counter = transaction_state.num_states - 1
+ if state_counter < last_counter
+ (state_counter..last_counter).each do |count|
+ begin
+ if transaction_state.committed?(count)
+ committed!
+ elsif transaction_state.rolledback?(count)
+ rolledback!
+ end
+ rescue => e
@tenderlove Owner

What happens if we just let this exception raise? Are there tests that depend on it?

activerecord/lib/active_record/core.rb
((19 lines not shown))
+ #
+ # Since ActiveRecord objects can be inside multiple transactions, this
+ # method recursively goes through the parent of the TransactionState and
+ # checks if the ActiveRecord object reflects the state of the object.
+ def sync_with_transaction_state
+ update_attributes_from_transaction_state(@transaction_state, @state_counters[0], 0)
+ end
+
+ def update_attributes_from_transaction_state(transaction_state, state_counter, depth)
+ if transaction_state.nil? || !_rollback_callbacks.empty? || !_commit_callbacks.empty? || !_create_callbacks.empty?
+ return
+ end
+
+ last_counter = transaction_state.num_states - 1
+ if state_counter < last_counter
+ (state_counter..last_counter).each do |count|
@tenderlove Owner

This creates a Range object that we don't need. I think you can change to something like this:

state_counter.upto(last_counter) do |count|
   ...
end

It may be off by one though, so you need to check.
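
On the off-by-one question: `Integer#upto` includes its limit, just as a two-dot range includes its end, so the two loops visit the same values. A small check with made-up counter values:

```ruby
state_counter = 2
last_counter = 5

# Original form: builds a Range, then iterates it.
from_range = []
(state_counter..last_counter).each { |count| from_range << count }

# Suggested form: iterates without building a Range.
from_upto = []
state_counter.upto(last_counter) { |count| from_upto << count }
```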

...ve_record/connection_adapters/abstract/transaction.rb
@@ -116,7 +134,11 @@ def commit
end
def add_record(record)
- records << record
+ if !record._rollback_callbacks.empty? || !record._commit_callbacks.empty? || !record._create_callbacks.empty?
@tenderlove Owner

The opposite of xxx.empty? is xxx.any?. So you can change all of these empty?s to any? and remove the !.

@wangjohn

I corrected the code based on your suggestions. Also, it seems that all transactions, once they are rolledback or committed, are never changed so that the list of states in the TransactionState class is unnecessary. I've run a pretty simple test where I do the following:

Topic.transaction do
while true
Topic.new
end
end

On my machine (ruby 1.9.3), the amount of memory seems to stabilize pretty quickly and I was able to create millions of objects without any marked increase in memory. Note that the Topic class does not have callbacks.

...ve_record/connection_adapters/abstract/transaction.rb
@@ -14,11 +14,17 @@ def state
end
class TransactionState
+ attr_reader :parent
@tenderlove Owner

Change this to attr_accessor.

...ve_record/connection_adapters/abstract/transaction.rb
((5 lines not shown))
VALID_STATES = Set.new([:committed, :rolledback, nil])
def initialize(state = nil)
@state = state
+ @parent = nil
+ end
+
+ def set_parent_state(parent)
+ @parent = parent
end
@tenderlove Owner

Is this supposed to set the parent, or the parent state? The method name is confusing.

activerecord/lib/active_record/core.rb
@@ -347,8 +347,52 @@ def slice(*methods)
Hash[methods.map { |method| [method, public_send(method)] }].with_indifferent_access
end
+ def add_transaction_state(state)
@tenderlove Owner

Change this to set_transaction_state

activerecord/lib/active_record/core.rb
((21 lines not shown))
+ # the TransactionState, and rolls back or commits the ActiveRecord object
+ # as appropriate.
+ #
+ # Since ActiveRecord objects can be inside multiple transactions, this
+ # method recursively goes through the parent of the TransactionState and
+ # checks if the ActiveRecord object reflects the state of the object.
+ def sync_with_transaction_state
+ update_attributes_from_transaction_state(@transaction_state, 0)
+ end
+
+ def update_attributes_from_transaction_state(transaction_state, depth)
+ if transaction_state.nil? || _rollback_callbacks.any? || _commit_callbacks.any? || _create_callbacks.any?
+ return
+ end
+
+ if !@reflects_state[depth]
@tenderlove Owner

Change to unless @reflects_state[depth]

activerecord/lib/active_record/core.rb
((30 lines not shown))
+
+ def update_attributes_from_transaction_state(transaction_state, depth)
+ if transaction_state.nil? || _rollback_callbacks.any? || _commit_callbacks.any? || _create_callbacks.any?
+ return
+ end
+
+ if !@reflects_state[depth]
+ if transaction_state.committed?
+ committed!
+ elsif transaction_state.rolledback?
+ rolledback!
+ end
+ @reflects_state[depth] = true
+ end
+
+ if !transaction_state.parent.nil? && !@reflects_state[depth+1]
@rafaelfranca Owner
if transaction_state.parent && !@reflects_state[depth+1]
activerecord/lib/active_record/core.rb
((17 lines not shown))
+ # Each AR object inside of a transaction carries that transaction's
+ # TransactionState.
+ #
+ # This method checks to see if the ActiveRecord object's state reflects
+ # the TransactionState, and rolls back or commits the ActiveRecord object
+ # as appropriate.
+ #
+ # Since ActiveRecord objects can be inside multiple transactions, this
+ # method recursively goes through the parent of the TransactionState and
+ # checks if the ActiveRecord object reflects the state of the object.
+ def sync_with_transaction_state
+ update_attributes_from_transaction_state(@transaction_state, 0)
+ end
+
+ def update_attributes_from_transaction_state(transaction_state, depth)
+ if transaction_state.nil? || _rollback_callbacks.any? || _commit_callbacks.any? || _create_callbacks.any?
@rafaelfranca Owner

What about inverting the conditional to avoid the short-circuit return?

if transaction_state && _rollback_callbacks.empty? && _commit_callbacks.empty? && _create_callbacks.empty?
  unless @reflects_state[depth]
    if transaction_state.committed?
      committed!
    elsif transaction_state.rolledback?
      rolledback!
    end

    @reflects_state[depth] = true
  end

  if transaction_state.parent && !@reflects_state[depth + 1]
    update_attributes_from_transaction_state(transaction_state.parent, depth + 1)
  end
end
@rafaelfranca Owner

Also would be great if you extracted this conditional to a method with a meaningful name.
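
When a guard clause of the shape `a.nil? || b.any? || c.any?` is turned into a positive condition, De Morgan's law flips every test and turns each `||` into `&&`. The sketch below checks this exhaustively, using plain arrays as hypothetical stand-ins for the callback chains (all elements are truthy, so `any?` and `!empty?` agree here). The helper names `skip?` and `proceed?` are made up for illustration.

```ruby
# Early-return guard: skip the body when this is true.
def skip?(state, rollback_cbs, commit_cbs, create_cbs)
  state.nil? || rollback_cbs.any? || commit_cbs.any? || create_cbs.any?
end

# Inverted guard: run the body when this is true.
def proceed?(state, rollback_cbs, commit_cbs, create_cbs)
  !state.nil? && rollback_cbs.empty? && commit_cbs.empty? && create_cbs.empty?
end

# Every combination of "state present or nil" and "callbacks present or not".
cases = [nil, :state].product([[], [:cb]], [[], [:cb]], [[], [:cb]])

# The two predicates must disagree on every input.
always_opposite = cases.all? { |args| skip?(*args) != proceed?(*args) }
```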

...ve_record/connection_adapters/abstract/transaction.rb
@@ -116,7 +122,11 @@ def commit
end
def add_record(record)
- records << record
+ if record._rollback_callbacks.any? || record._commit_callbacks.any? || record._create_callbacks.any?
@rafaelfranca Owner

Maybe we can extract this conditional to a method like

record.has_transactional_callbacks?

Doing this we can reuse it on update_attributes_from_transaction_state
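
A rough sketch of how such a predicate could serve both call sites. `SketchRecord` and `SketchTransaction` are hypothetical classes, and the callback chains are modelled as plain arrays, which is an assumption rather than how ActiveSupport stores them.

```ruby
# Hypothetical record whose callback chains are plain arrays.
class SketchRecord
  attr_reader :transaction_state

  def initialize(rollback_callbacks: [], commit_callbacks: [], create_callbacks: [])
    @rollback_callbacks = rollback_callbacks
    @commit_callbacks = commit_callbacks
    @create_callbacks = create_callbacks
    @transaction_state = nil
  end

  # One named predicate instead of repeating the three-way check.
  def has_transactional_callbacks?
    !@rollback_callbacks.empty? || !@commit_callbacks.empty? || !@create_callbacks.empty?
  end

  def set_transaction_state(state)
    @transaction_state = state
  end
end

# Hypothetical transaction: keeps only the records it must notify eagerly.
class SketchTransaction
  attr_reader :records, :state

  def initialize
    @records = []
    @state = Object.new # placeholder for a TransactionState
  end

  def add_record(record)
    if record.has_transactional_callbacks?
      records << record
    else
      record.set_transaction_state(@state)
    end
  end
end

txn = SketchTransaction.new
with_callback = SketchRecord.new(commit_callbacks: [:log_commit])
plain = SketchRecord.new
txn.add_record(with_callback)
txn.add_record(plain)
```

Only `with_callback` ends up in `txn.records`; `plain` just receives a pointer to the shared state object.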

@wangjohn

Thanks for all the comments! I've made changes so that my code reflects all of these. The tests still pass.

activerecord/lib/active_record/core.rb
@@ -347,8 +347,54 @@ def slice(*methods)
Hash[methods.map { |method| [method, public_send(method)] }].with_indifferent_access
end
+ def set_transaction_state(state)
+ @transaction_state = state
+ end
+
+ def has_transactional_callbacks?
@rafaelfranca Owner

Put # :nodoc: since this method should not be part of the public API

@rafaelfranca

Very good! :+1:

@wangjohn

Ok, nodocs have been added.

activerecord/lib/active_record/core.rb
@@ -347,8 +347,54 @@ def slice(*methods)
Hash[methods.map { |method| [method, public_send(method)] }].with_indifferent_access
end
+ def set_transaction_state(state) # :nodoc:
+ @transaction_state = state
+ end
+
+ def has_transactional_callbacks? # :nodoc:
+ _rollback_callbacks.any? || _commit_callbacks.any? || _create_callbacks.any?
@lexmag
lexmag added a note

any? is definitely not the same as ! + empty?

@wangjohn

While it's true that any? is not the same as !empty?, it doesn't matter because we are searching for non-nil callbacks from the _#{transactional}_callbacks stack. Technically, using any? is actually better than using !empty? because a nil callback should not set off the has_transactional_callbacks? method.

@lexmag

_#{transactional}_callbacks cannot contain nil objects, right? And ! + empty? has better performance.
There is an example in 2ff47c4.
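
The difference under discussion, shown on plain arrays (hypothetical stand-ins for a callback chain): without a block, `any?` asks whether some element is truthy, while `empty?` only looks at the size.

```ruby
# A collection that holds elements, none of them truthy.
falsy_only = [nil, false]
falsy_any = falsy_only.any?          # checks each element for truthiness
falsy_not_empty = !falsy_only.empty? # checks only that elements exist

# A collection of ordinary (truthy) entries: the two forms agree.
callbacks = [:after_commit]
callbacks_any = callbacks.any?
callbacks_not_empty = !callbacks.empty?
```

The two forms only diverge when every element is `nil` or `false`, and `!empty?` does not have to visit the elements at all.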

@wangjohn

I've added a CHANGELOG entry for this PR, and have rebased it with master.

@rafaelfranca

It still can't be merged automatically.

activerecord/lib/active_record/core.rb
((27 lines not shown))
+ #
+ # Since ActiveRecord objects can be inside multiple transactions, this
+ # method recursively goes through the parent of the TransactionState and
+ # checks if the ActiveRecord object reflects the state of the object.
+ def sync_with_transaction_state
+ update_attributes_from_transaction_state(@transaction_state, 0)
+ end
+
+ def update_attributes_from_transaction_state(transaction_state, depth)
+ if transaction_state && !has_transactional_callbacks?
+ unless @reflects_state[depth]
+ if transaction_state.committed?
+ committed!
+ elsif transaction_state.rolledback?
+ rolledback!
+ end

Wrong indent.

@wangjohn wangjohn Reduced memory leak problem in transactions by lazily updating AR objects with new transaction state. If AR object has a callback, the callback will be performed immediately (non-lazily) so the transaction still has to keep records with callbacks.
67d8bb9
@rafaelfranca rafaelfranca merged commit 82a5432 into rails:master
@jonleighton

@tenderlove @wangjohn Why was this line removed? It introduced a regression. On 3.2 the following script runs the callback; on 4.0 it does not.

require "active_record"

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Schema.define do
  create_table :posts
end

class Post < ActiveRecord::Base
  after_commit { puts "after commit" }
end

Post.transaction(joinable: false) do
  Post.create
end

@jonleighton Hmm, seems like this is actually a bug. This line should not have been removed, since the transactional state is actually set in add_record.

I think we can add it back: I'll do that in a PR.

Sorry, I take that back. The original intention of removing this line was that records that weren't already alive before the transaction was created wouldn't have a callback run on them.

It seems like that was a bad assumption to have. I was trying to make sure not all of the objects created in a transaction had strong references to them so that they could potentially get garbage collected.

I'm going to look into this more, because it seems like just adding this line back in doesn't actually solve the problem.

Collaborator

@wangjohn Thanks for looking into it. Your patch achieves that for records for which has_transactional_callbacks? returns false. For records that do have transactional callbacks, it's not going to be possible without using weak references in Ruby >= 2. But either way I don't think adding this line back in will affect that, unless I am missing something?

@jonleighton No you're right, adding that line back in won't fix it.

I think the original reason this architecture was chosen was because weak refs in ruby < 1.9 used a very slow _id2ref implementation. Now that Rails 4.0 has higher Ruby version requirement, maybe we should look into weak refs?

Do you have any idea of how prevalent this use case is?

Collaborator

@wangjohn the main thing is that transactions now don't leak for objects which don't use transactional callbacks. that's definitely going to be the most prevalent use case. When I chatted about this with @tenderlove last he didn't mind that it leaks for records that do use transactional callbacks. Using weakrefs where supported could be a nice enhancement but I don't think it's essential. Either way it would be good to fix this regression.
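
For readers unfamiliar with the weak references mentioned in this thread, here is a small sketch using Ruby's standard `weakref` library. The `Post` struct and the `title_or_nil` helper are hypothetical, and this is not how Rails stores records; it only shows the basic API.

```ruby
require "weakref"

# Hypothetical record type for illustration only.
Post = Struct.new(:title)

record = Post.new("hello")

# A WeakRef delegates method calls to its target but does not keep the
# target alive on its own.
ref = WeakRef.new(record)

# Safe access: the target may have been garbage collected by now.
def title_or_nil(ref)
  ref.weakref_alive? ? ref.title : nil
rescue WeakRef::RefError
  nil
end

alive_while_referenced = ref.weakref_alive?
title = title_or_nil(ref)
```

While `record` is still strongly referenced the weak reference stays alive; once nothing else references the object, the garbage collector is free to reclaim it and calls through `ref` raise `WeakRef::RefError`.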

@chancancode chancancode referenced this pull request from a commit in chancancode/rails
@chancancode chancancode `TransactionState` is internal API, so added :nodoc:
This was introduced in 26853e8 / #9011. Its main purpose is to flip the
reference from `Transaction` -> AR objects to AR objects -> `TransactionState`.

The method this was originally extracted from was a private API, and there
are no other public APIs to make this accessible to the user, so there is no
reason for this class to be public.

See also 67d8bb9 / #9068.
b382087
@arthurnn arthurnn referenced this pull request from a commit in arthurnn/rails
@arthurnn arthurnn Use WeakRef to store records on transactions.
In order to restore state on records we need to store all records
touched in the transaction. However if we store all records, we will be
holding a hard reference to them, not allowing them to be garbage
collected. #9068 kinda solved the GC issue by inverting the dependency.
As we are on ruby 2.2+ we can use WeakRef, and won't need to invert that
dependency to restore state.

[fixes #15549] - Because records with a 'create' callback were still
being stored, and memory growth was still a problem for those.
822135e
Commits on Feb 20, 2013
  1. @wangjohn

    Reduced memory leak problem in transactions by lazily updating AR objects with new transaction state. If AR object has a callback, the callback will be performed immediately (non-lazily) so the transaction still has to keep records with callbacks.

    wangjohn authored
22 activerecord/CHANGELOG.md
@@ -1,5 +1,27 @@
## Rails 4.0.0 (unreleased) ##
+* Fixing issue #776.
+
+ Memory bloat in transactions is handled by having the transaction hold only
+ the AR objects which it absolutely needs to know about. These are the AR
+ objects with callbacks (they need to be updated as soon as something in the
+ transaction occurs).
+
+ All other AR objects can be updated lazily by keeping a reference to a
+ TransactionState object. If an AR object gets inside a transaction, then
+ the transaction will add its TransactionState to the AR object. When the
+ user makes a call to some attribute on an AR object (which has no
+ callbacks) associated with a transaction, the AR object will call the
+ sync_with_transaction_state method and make sure it is up to date with the
+ transaction. After it has synced with the transaction state, the AR object
+ will return the attribute that was requested.
+
+ Most of the logic in the changes are used to handle multiple transactions,
+ in which case the AR object has to recursively follow parent pointers of
+ TransactionState objects.
+
+ *John Wang*
+
* Descriptive error message when the necessary AR adapter gem was not found.
Fix #7313
5 activerecord/lib/active_record/attribute_methods/primary_key.rb
@@ -8,27 +8,32 @@ module PrimaryKey
# Returns this record's primary key value wrapped in an Array if one is
# available.
def to_key
+ sync_with_transaction_state
key = self.id
[key] if key
end
# Returns the primary key value.
def id
+ sync_with_transaction_state
read_attribute(self.class.primary_key)
end
# Sets the primary key value.
def id=(value)
+ sync_with_transaction_state
write_attribute(self.class.primary_key, value) if self.class.primary_key
end
# Queries the primary key value.
def id?
+ sync_with_transaction_state
query_attribute(self.class.primary_key)
end
# Returns the primary key value before type cast.
def id_before_type_cast
+ sync_with_transaction_state
read_attribute_before_type_cast(self.class.primary_key)
end
13 activerecord/lib/active_record/connection_adapters/abstract/transaction.rb
@@ -5,7 +5,7 @@ class Transaction #:nodoc:
def initialize(connection)
@connection = connection
- @state = TransactionState.new
+ @state = TransactionState.new
end
def state
@@ -14,11 +14,13 @@ def state
end
class TransactionState
+ attr_accessor :parent
VALID_STATES = Set.new([:committed, :rolledback, nil])
def initialize(state = nil)
@state = state
+ @parent = nil
end
def committed?
@@ -116,7 +118,11 @@ def commit
end
def add_record(record)
- records << record
+ if record.has_transactional_callbacks?
+ records << record
+ else
+ record.set_transaction_state(@state)
+ end
end
def rollback_records
@@ -188,8 +194,9 @@ def perform_rollback
end
def perform_commit
+ @state.set_state(:committed)
+ @state.parent = parent.state
connection.release_savepoint
- records.each { |r| parent.add_record(r) }
end
end
end
49 activerecord/lib/active_record/core.rb
@@ -347,8 +347,54 @@ def slice(*methods)
Hash[methods.map { |method| [method, public_send(method)] }].with_indifferent_access
end
+ def set_transaction_state(state) # :nodoc:
+ @transaction_state = state
+ end
+
+ def has_transactional_callbacks? # :nodoc:
+ !_rollback_callbacks.empty? || !_commit_callbacks.empty? || !_create_callbacks.empty?
+ end
+
private
+ # Updates the attributes on this particular ActiveRecord object so that
+ # if it is associated with a transaction, then the state of the AR object
+ # will be updated to reflect the current state of the transaction
+ #
+ # The @transaction_state variable stores the states of the associated
+ # transaction. This relies on the fact that a transaction can only be in
+ # one rollback or commit (otherwise a list of states would be required)
+ # Each AR object inside of a transaction carries that transaction's
+ # TransactionState.
+ #
+ # This method checks to see if the ActiveRecord object's state reflects
+ # the TransactionState, and rolls back or commits the ActiveRecord object
+ # as appropriate.
+ #
+ # Since ActiveRecord objects can be inside multiple transactions, this
+ # method recursively goes through the parent of the TransactionState and
+ # checks if the ActiveRecord object reflects the state of the object.
+ def sync_with_transaction_state
+ update_attributes_from_transaction_state(@transaction_state, 0)
+ end
+
+ def update_attributes_from_transaction_state(transaction_state, depth)
+ if transaction_state && !has_transactional_callbacks?
+ unless @reflects_state[depth]
+ if transaction_state.committed?
+ committed!
+ elsif transaction_state.rolledback?
+ rolledback!
+ end
+ @reflects_state[depth] = true
+ end
+
+ if transaction_state.parent && !@reflects_state[depth+1]
+ update_attributes_from_transaction_state(transaction_state.parent, depth+1)
+ end
+ end
+ end
+
# Under Ruby 1.9, Array#flatten will call #to_ary (recursively) on each of the elements
# of the array, and then rescues from the possible NoMethodError. If those elements are
# ActiveRecord::Base's, then this triggers the various method_missing's that we have,
@@ -376,7 +422,8 @@ def init_internals
@new_record = true
@txn = nil
@_start_transaction_state = {}
- @transaction = nil
+ @transaction_state = nil
+ @reflects_state = [false]
end
end
end
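
The parent-pointer walk in `update_attributes_from_transaction_state` can be traced with a self-contained sketch. `ChainState` and `ChainRecord` are hypothetical classes that mirror the structure of the diff above; instead of calling `committed!` or `rolledback!`, the record simply logs which state it saw at which depth. The scenario (a committed savepoint inside a rolled-back outer transaction) is made up for illustration, and the record is synced only after both transactions have finished.

```ruby
# Hypothetical state object with a parent pointer, as in the diff.
class ChainState
  attr_accessor :parent

  def initialize
    @state = nil
    @parent = nil
  end

  def set_state(state)
    @state = state
  end

  def committed?
    @state == :committed
  end

  def rolledback?
    @state == :rolledback
  end
end

# Hypothetical record that logs what it learns while walking the chain.
class ChainRecord
  attr_reader :events

  def initialize(transaction_state)
    @transaction_state = transaction_state
    @reflects_state = [false]
    @events = []
  end

  def sync
    update_from(@transaction_state, 0)
  end

  private

  def update_from(state, depth)
    return unless state

    unless @reflects_state[depth]
      if state.committed?
        @events << [:committed, depth]
      elsif state.rolledback?
        @events << [:rolledback, depth]
      end
      @reflects_state[depth] = true
    end

    if state.parent && !@reflects_state[depth + 1]
      update_from(state.parent, depth + 1)
    end
  end
end

outer = ChainState.new
inner = ChainState.new
record = ChainRecord.new(inner)

# The savepoint commits and links to its parent, as perform_commit does.
inner.set_state(:committed)
inner.parent = outer

# The outer transaction then rolls back.
outer.set_state(:rolledback)

record.sync
record.sync # a second sync adds nothing; both depths are already reflected
events = record.events
```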
2  activerecord/lib/active_record/persistence.rb
@@ -69,11 +69,13 @@ def discriminate_class_for_record(record)
# Returns true if this object hasn't been saved yet -- that is, a record
# for the object doesn't exist in the data store yet; otherwise, returns false.
def new_record?
+ sync_with_transaction_state
@new_record
end
# Returns true if this object has been destroyed, otherwise returns false.
def destroyed?
+ sync_with_transaction_state
@destroyed
end
4 activerecord/test/cases/transactions_test.rb
@@ -460,7 +460,7 @@ def test_transactions_state_from_rollback
assert !transaction.state.committed?
transaction.perform_rollback
-
+
assert transaction.state.rolledback?
assert !transaction.state.committed?
end
@@ -474,7 +474,7 @@ def test_transactions_state_from_commit
assert !transaction.state.committed?
transaction.perform_commit
-
+
assert !transaction.state.rolledback?
assert transaction.state.committed?
end