Profiling-based improvement to import#values_sql_for_columns_and_attributes - 80% speedup for large imports #45


5 participants

@saizai commented:

Three issues corrected:

  1. Calling ActiveRecord::Base#connection within the loop. This may well be an ActiveRecord WTF, but connection actually isn't memoized: every time you call it, it goes back to the connection pool and fetches things all over again. Yikes. So I memoized it in a local variable within the function.

  2. Calling sequence_name needlessly, for non-id columns. This is a result of the && conditions being in the wrong order. Conditions are evaluated left-to-right and short-circuit; this means you should put the stuff that's cheapest to fail on the left, and the expensive stuff (or the bits whose real return values you want preserved) on the right. So I just swapped the order.
    There's a deeper WTF here, in that sequence_name should be memoized within ActiveRecord, but something was breaking that. Not sure what, and I haven't fixed that, so it'll still be slow if you are using a primary key column. Serves you right for trying to insert explicit primary key values instead of leaving that up to the database at insertion time. :-P

  3. Double-escaping. connection.quote already does appropriate type conversion and escaping; calling column.type_cast first provides no extra security, but slows things down significantly by converting every value twice. So I just got rid of that.

These three changes resulted in an 80% performance improvement for the import call.
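To get a feel for how much work fixes 1 and 2 avoid, here's a minimal Ruby sketch. `fetch_connection` and `expensive_sequence_name` are hypothetical stand-ins for ActiveRecord's connection lookup and sequence_name query, not the gem's real code:

```ruby
# Count how often the "expensive" lookups run under each version of the loop.
$lookups = 0

def fetch_connection           # stand-in for ActiveRecord::Base#connection
  $lookups += 1
  :connection
end

def expensive_sequence_name    # stand-in for the costly sequence_name call
  $lookups += 1
  "users_id_seq"
end

rows = Array.new(1_000) { 42 } # non-nil values, like ordinary non-id columns

# Before: connection fetched and sequence_name queried on every iteration,
# because the expensive test sits on the left of the &&.
$lookups = 0
rows.each do |val|
  conn = fetch_connection
  expensive_sequence_name && val.nil?
end
before = $lookups              # 2 lookups per row

# After: connection memoized once, and the cheap nil test short-circuits
# the && before the expensive call can run.
$lookups = 0
conn = fetch_connection
rows.each do |val|
  val.nil? && expensive_sequence_name
end
after = $lookups               # a single lookup total

puts "before: #{before} lookups, after: #{after}"
```

For 1,000 rows the reordered, memoized loop does one expensive lookup instead of two thousand, which is the shape of the 80% win the patch reports.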


Excellent commit - using this, an import that previously took around 2 hours now takes about 40 minutes!


@poggs Damn dood, wtf are you importing? I'm doing 10k records in ~3s. I'm having difficulty imagining what could take that long that would even fit in memory.

Is it the import itself that's taking that long, or some other stuff?


Railway timetable data. It's this function here - - which reads ~500MB of data, chops it up, pushes it into a hash called 'pending', and then imports every 1,000 records.

I haven't really looked at speeding that up, or where it's hideously inefficient :)
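The batching pattern @poggs describes can be sketched like this. `Timetable` and its columns are made up for illustration, a stub stands in for the real ActiveRecord model (the actual gem issues one multi-row INSERT per import call), and an array is used instead of the hash his code keeps:

```ruby
# Stub model: counts rows instead of hitting a database.
class Timetable
  @@imported = 0
  def self.import(columns, rows)
    @@imported += rows.size  # the real gem builds one multi-row INSERT here
  end
  def self.imported
    @@imported
  end
end

COLUMNS = [:train_id, :departs_at]  # assumed columns, for illustration only

pending = []
2_500.times do |i|                  # pretend we parsed 2,500 records from the file
  pending << [i, "12:%02d" % (i % 60)]
  if pending.size >= 1_000          # flush a batch, keeping memory bounded
    Timetable.import(COLUMNS, pending)
    pending.clear
  end
end
Timetable.import(COLUMNS, pending) unless pending.empty?  # final partial batch

puts Timetable.imported
```

Flushing every 1,000 records keeps memory flat regardless of input size, while still amortizing the per-statement overhead across large INSERTs.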


FYI I added this change to my fork, published to rubygems as Empact-activerecord-import.


@Empact Would be nice if I got contributor credit. ;-)


@saizai, thank you for your pull request. I believe the first two issues have been resolved as a part of #71. I will look into the third issue.


@zdennis The third issue was also dealt with by #71 so you can close this.


Thanks @chewi

@zdennis closed this
Commits on Feb 10, 2012
  1. @saizai committed: Profiling-based improvements to import#values_sql_for_columns_and_attributes - 80% speedup for large imports
Showing 7 additions and 3 deletions.
  1. +7 −3 lib/activerecord-import/import.rb
@@ -294,13 +294,17 @@ def import_without_validations_or_callbacks( column_names, array_of_attributes,
   # Returns SQL the VALUES for an INSERT statement given the passed in +columns+
   # and +array_of_attributes+.
   def values_sql_for_columns_and_attributes(columns, array_of_attributes) # :nodoc:
+    # connection gets called a *lot* in this high intensity loop.
+    # Reuse the same one w/in the loop, otherwise it would keep being re-retreived (= lots of time for large imports)
+    connection_memo = connection
     array_of_attributes.map do |arr|
       my_values = arr.each_with_index.map do |val,j|
         column = columns[j]
-        if !sequence_name.blank? && column.name == primary_key && val.nil?
-          connection.next_value_for_sequence(sequence_name)
+        # be sure to query sequence_name *last*, only if cheaper tests fail, because it's costly
+        if val.nil? && column.name == primary_key && !sequence_name.blank?
+          connection_memo.next_value_for_sequence(sequence_name)
         else
-          connection.quote(column.type_cast(val), column)
+          connection_memo.quote(val, column) # no need for column.type_cast(val) - quote already does type casting
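The double-escaping fix (quote without a prior type_cast) can be illustrated with toy stand-ins. These `quote` and `type_cast` stubs are illustrative only, not ActiveRecord's real implementations:

```ruby
# Count how many value conversions each code path performs.
$conversions = 0

def type_cast(val)             # stand-in for column.type_cast
  $conversions += 1
  val.to_s
end

def quote(val)                 # stand-in for connection.quote, which
  $conversions += 1            # already casts and escapes on its own
  "'#{val.to_s.gsub("'", "''")}'"
end

val = "O'Brien"

$conversions = 0
double = quote(type_cast(val)) # old path: cast, then quote casts again
double_count = $conversions

$conversions = 0
single = quote(val)            # patched path: quote alone does the work
single_count = $conversions

puts double == single          # identical SQL literal either way
puts "#{double_count} vs #{single_count} conversions"
```

Both paths yield the same safely escaped literal; the pre-cast only adds a redundant pass over every value in the import.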