Permalink
Browse files

Reduced memory leak problem in transactions by lazily updating AR obj…

…ects with new transaction state. If AR object has a callback, the callback will be performed immediately (non-lazily) so the transaction still has to keep records with callbacks.
  • Loading branch information...
1 parent 3a0b6c8 commit 67d8bb963d5d51fc644d6b1ca20164efb4cee6d7 @wangjohn wangjohn committed Jan 25, 2013
View
@@ -1,5 +1,27 @@
## Rails 4.0.0 (unreleased) ##
+* Fixing issue #776.
+
+ Memory bloat in transactions is handled by having the transaction hold only
+ the AR objects which it absolutely needs to know about. These are the AR
+ objects with callbacks (they need to be updated as soon as something in the
+ transaction occurs).
+
+ All other AR objects can be updated lazily by keeping a reference to a
+ TransactionState object. If an AR object gets inside a transaction, then
+ the transaction will add its TransactionState to the AR object. When the
+ user makes a call to some attribute on an AR object (which has no
+ callbacks) associated with a transaction, the AR object will call the
+ sync_with_transaction_state method and make sure it is up to date with the
+ transaction. After it has synced with the transaction state, the AR object
+ will return the attribute that was requested.
+
+ Most of the logic in the changes are used to handle multiple transactions,
+ in which case the AR object has to recursively follow parent pointers of
+ TransactionState objects.
+
+ *John Wang*
+
* Descriptive error message when the necessary AR adapter gem was not found.
Fix #7313
@@ -8,27 +8,32 @@ module PrimaryKey
# Returns this record's primary key value wrapped in an Array if one is
# available.
def to_key
+ sync_with_transaction_state
key = self.id
[key] if key
end
# Returns the primary key value.
def id
+ sync_with_transaction_state
read_attribute(self.class.primary_key)
end
# Sets the primary key value.
def id=(value)
+ sync_with_transaction_state
write_attribute(self.class.primary_key, value) if self.class.primary_key
end
# Queries the primary key value.
def id?
+ sync_with_transaction_state
query_attribute(self.class.primary_key)
end
# Returns the primary key value before type cast.
def id_before_type_cast
+ sync_with_transaction_state
read_attribute_before_type_cast(self.class.primary_key)
end
@@ -5,7 +5,7 @@ class Transaction #:nodoc:
def initialize(connection)
@connection = connection
- @state = TransactionState.new
+ @state = TransactionState.new
end
def state
@@ -14,11 +14,13 @@ def state
end
class TransactionState
+ attr_accessor :parent
VALID_STATES = Set.new([:committed, :rolledback, nil])
def initialize(state = nil)
@state = state
+ @parent = nil
end
def committed?
@@ -116,7 +118,11 @@ def commit
end
def add_record(record)
- records << record
+ if record.has_transactional_callbacks?
+ records << record
+ else
+ record.set_transaction_state(@state)
+ end
end
def rollback_records
@@ -188,8 +194,9 @@ def perform_rollback
end
def perform_commit
+ @state.set_state(:committed)
+ @state.parent = parent.state
connection.release_savepoint
- records.each { |r| parent.add_record(r) }
@jonleighton

jonleighton Nov 18, 2013

Member

@tenderlove @wangjohn why was this line removed? it introduced a regression. On 3.2 the follow script runs the callback, on 4.0 it does not.

require "active_record"

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Schema.define do
  create_table :posts
end

class Post < ActiveRecord::Base
  after_commit { puts "after commit" }
end

Post.transaction(joinable: false) do
  Post.create
end
@wangjohn

wangjohn Nov 19, 2013

Contributor

@jonleighton Hmm, seems like this is actually a bug. This line should not have been removed, since the the transactional state is actually set in add_record.

I think we can add it back: I'll do that in a PR.

@wangjohn

wangjohn Nov 19, 2013

Contributor

Sorry I take that back. The original intention of removing this line was that records that weren't already alive before the transaction were created wouldn't have a callback run on them.

It seems like that was a bad assumption to have. I was trying to make sure not all of the objects created in a transaction had strong references to them so that they could potentially get garbage collected.

I'm going to look into this more, because it seems like just adding this line back in doesn't actually solve the problem.

@jonleighton

jonleighton Nov 20, 2013

Member

@wangjohn Thanks for looking into it. Your patch achieves that for records for which has_transactional_callbacks? returns false. For records that do have transactional callbacks, it's not going to be possible without using weak references in Ruby >= 2. But either way I don't think adding this line back in will affect that, unless I am missing something?

@wangjohn

wangjohn Nov 20, 2013

Contributor

@jonleighton No you're right, adding that line back in won't fix it.

I think the original reason this architecture was chosen was because weak refs in ruby < 1.9 used a very slow _id2ref implementation. Now that Rails 4.0 has higher Ruby version requirement, maybe we should look into weak refs?

Do you have any idea of how prevalent this use case is?

@jonleighton

jonleighton Nov 20, 2013

Member

@wangjohn the main thing is that transactions now don't leak for objects which don't use transactional callbacks. that's definitely going to be the most prevalent use case. When I chatted about this with @tenderlove last he didn't mind that it leaks for records that do use transactional callbacks. Using weakrefs where supported could be a nice enhancement but I don't think it's essential. Either way it would be good to fix this regression.

end
end
end
@@ -347,8 +347,54 @@ def slice(*methods)
Hash[methods.map { |method| [method, public_send(method)] }].with_indifferent_access
end
+ def set_transaction_state(state) # :nodoc:
+ @transaction_state = state
+ end
+
+ def has_transactional_callbacks? # :nodoc:
+ !_rollback_callbacks.empty? || !_commit_callbacks.empty? || !_create_callbacks.empty?
+ end
+
private
+ # Updates the attributes on this particular ActiveRecord object so that
+ # if it is associated with a transaction, then the state of the AR object
+ # will be updated to reflect the current state of the transaction
+ #
+ # The @transaction_state variable stores the states of the associated
+ # transaction. This relies on the fact that a transaction can only be in
+ # one rollback or commit (otherwise a list of states would be required)
+ # Each AR object inside of a transaction carries that transaction's
+ # TransactionState.
+ #
+ # This method checks to see if the ActiveRecord object's state reflects
+ # the TransactionState, and rolls back or commits the ActiveRecord object
+ # as appropriate.
+ #
+ # Since ActiveRecord objects can be inside multiple transactions, this
+ # method recursively goes through the parent of the TransactionState and
+ # checks if the ActiveRecord object reflects the state of the object.
+ def sync_with_transaction_state
+ update_attributes_from_transaction_state(@transaction_state, 0)
+ end
+
+ def update_attributes_from_transaction_state(transaction_state, depth)
+ if transaction_state && !has_transactional_callbacks?
+ unless @reflects_state[depth]
+ if transaction_state.committed?
+ committed!
+ elsif transaction_state.rolledback?
+ rolledback!
+ end
+ @reflects_state[depth] = true
+ end
+
+ if transaction_state.parent && !@reflects_state[depth+1]
+ update_attributes_from_transaction_state(transaction_state.parent, depth+1)
+ end
+ end
+ end
+
# Under Ruby 1.9, Array#flatten will call #to_ary (recursively) on each of the elements
# of the array, and then rescues from the possible NoMethodError. If those elements are
# ActiveRecord::Base's, then this triggers the various method_missing's that we have,
@@ -376,7 +422,8 @@ def init_internals
@new_record = true
@txn = nil
@_start_transaction_state = {}
- @transaction = nil
+ @transaction_state = nil
+ @reflects_state = [false]
end
end
end
@@ -69,11 +69,13 @@ def discriminate_class_for_record(record)
# Returns true if this object hasn't been saved yet -- that is, a record
# for the object doesn't exist in the data store yet; otherwise, returns false.
def new_record?
+ sync_with_transaction_state
@new_record
end
# Returns true if this object has been destroyed, otherwise returns false.
def destroyed?
+ sync_with_transaction_state
@destroyed
end
@@ -460,7 +460,7 @@ def test_transactions_state_from_rollback
assert !transaction.state.committed?
transaction.perform_rollback
-
+
assert transaction.state.rolledback?
assert !transaction.state.committed?
end
@@ -474,7 +474,7 @@ def test_transactions_state_from_commit
assert !transaction.state.committed?
transaction.perform_commit
-
+
assert !transaction.state.rolledback?
assert transaction.state.committed?
end

0 comments on commit 67d8bb9

Please sign in to comment.