Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Bugfix by stopping find_in_batches using the records after yielding. #1807

Merged
merged 1 commit into from

3 participants

Caius Durling Tekin Suleyman Santiago Pastorino
Caius Durling

I've submitted this patch before (pull request 1435) but decided redefining String#id in the test was a bad idea, so I've refactored it slightly to achieve the same thing using a mocha stub instead.

Currently if the code that calls find_in_batches modifies the yielded array in place then find_in_batches can enter an infinite loop searching with ruby object ids in the database instead of the primary key of records in the database. This happens because it naively assumes the yielded array hasn't been modified before calling id on the last object in the array. And ruby (1.8 at least) alias' id to object_id so an integer is still returned no matter what the last object is.

By moving finding the id of the last object before yielding the array of records it means the calling code can do whatever it wants to the array in terms of modifying it in place, and find_in_batches doesn't care. Also moved the count of the array records before yielding in case elements are removed or added when yielded.

Caius Durling caius Stop find_in_batches using the records after yielding.
Currently if the code that calls .find_in_batches modifies the yielded array in place then .find_in_batches can enter an infinite loop searching with ruby object ids in the database instead of the primary key of records in the database. This happens because it naively assumes the yielded array hasn't been modified before calling #id on the last object in the array. And ruby (1.8 at least) alias' #id to #object_id so an integer is still returned no matter what the last object is.

By moving finding the #id of the last object before yielding the array it means the calling code can do whatever it wants to the array in terms of modifying it in place, and .find_in_batches doesn't care.
96be08d
Tekin Suleyman

looks good to me.

Jake Varghese jake3030 referenced this pull request from a commit in jake3030/rails
Andy Stewart airblade Truncate helper accepts a :separator for a more legible result [#1807
…state:resolved]

Signed-off-by: Pratik Naik <pratiknaik@gmail.com>
5e190ef
Santiago Pastorino spastorino merged commit d632e92 into from
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Jun 21, 2011
  1. Caius Durling

    Stop find_in_batches using the records after yielding.

    caius authored
    Currently if the code that calls .find_in_batches modifies the yielded array in place then .find_in_batches can enter an infinite loop searching with ruby object ids in the database instead of the primary key of records in the database. This happens because it naively assumes the yielded array hasn't been modified before calling #id on the last object in the array. And ruby (1.8 at least) alias' #id to #object_id so an integer is still returned no matter what the last object is.
    
    By moving finding the #id of the last object before yielding the array it means the calling code can do whatever it wants to the array in terms of modifying it in place, and .find_in_batches doesn't care.
This page is out of date. Refresh to see the latest.
7 activerecord/lib/active_record/relation/batches.rb
View
@@ -68,11 +68,14 @@ def find_in_batches(options = {})
records = relation.where(table[primary_key].gteq(start)).all
while records.any?
+ records_size = records.size
+ primary_key_offset = records.last.id
+
yield records
- break if records.size < batch_size
+ break if records_size < batch_size
- if primary_key_offset = records.last.id
+ if primary_key_offset
records = relation.where(table[primary_key].gt(primary_key_offset)).to_a
else
raise "Primary key not included in the custom select clause"
16 activerecord/test/cases/batches_test.rb
View
@@ -93,4 +93,20 @@ def test_find_in_batches_should_quote_batch_order
end
end
end
+
+ def test_find_in_batches_should_not_use_records_after_yielding_them_in_case_original_array_is_modified
+ not_a_post = "not a post"
+ not_a_post.stubs(:id).raises(StandardError, "not_a_post had #id called on it")
+
+ assert_nothing_raised do
+ Post.find_in_batches(:batch_size => 1) do |batch|
+ assert_kind_of Array, batch
+ assert_kind_of Post, batch.first
+
+ batch.map! { not_a_post }
+ end
+ end
+
+ end
+
end
Something went wrong with that request. Please try again.