diff --git a/CHANGELOG b/CHANGELOG index a73db37..e009f4c 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1,3 +1,19 @@ +1.0.0 / 2012-11-07 + +* Breaking changes (well, not really) + + * Not using INSERT ... ON DUPLICATE KEY UPDATE for MySQL! + +* Enhancements + + * Replaced ON DUPLICATE KEY with a true merge function (procedure) + * Simplified code - buffering is no longer used anywhere + * Clarified documentation + +* Bug fixes + + * MySQL upserts won't fail if you have a multi-key selector and no multi-column UNIQUE index to cover them (https://github.com/seamusabshere/upsert/issues/6) + 0.5.0 / 2012-09-21 * Breaking changes (well, not really) diff --git a/README.md b/README.md index b0972c7..60cad5e 100644 --- a/README.md +++ b/README.md @@ -7,46 +7,75 @@ MySQL, PostgreSQL, and SQLite all have different SQL MERGE tricks that you can u You pass a selector that uniquely identifies a row, whether it exists or not. You pass a set of attributes that should be set on that row. Syntax inspired by [mongo-ruby-driver's update method](http://api.mongodb.org/ruby/1.6.4/Mongo/Collection.html#update-instance_method). ### Single record - - # if you have required 'upsert/active_record_upsert' - Pet.upsert({:name => 'Jerry'}, :breed => 'beagle') - - # if you're not using activerecord, that's ok - connection = Mysql2::Client.new([...]) - upsert = Upsert.new connection, 'pets' - upsert.row({:name => 'Jerry'}, :breed => 'beagle') + +```ruby +connection = Mysql2::Client.new([...]) +table_name = :pets +upsert = Upsert.new connection, table_name +upsert.row({:name => 'Jerry'}, :breed => 'beagle') +``` + +If you want to use an `ActiveRecord` helper method, try: + +```ruby +require 'upsert/active_record_upsert' +Pet.upsert({:name => 'Jerry'}, :breed => 'beagle') +``` + +So just to reiterate you've got a `selector` and a `setter`: + +```ruby +connection = Mysql2::Client.new([...]) +table_name = :pets +upsert = Upsert.new connection, table_name +selector = { :name => 'Jerry' } +setter = { :breed => 'beagle' } +upsert.row(selector, setter) +``` ### Multiple records (batch mode) -Rows are buffered in memory until it's efficient to send them to the database. +Slightly faster. - connection = Mysql2::Client.new([...]) - Upsert.batch(connection, 'pets') do |upsert| - upsert.row({:name => 'Jerry'}, :breed => 'beagle') - upsert.row({:name => 'Pierre'}, :breed => 'tabby') - end +```ruby +connection = Mysql2::Client.new([...]) +Upsert.batch(connection, :pets) do |upsert| + upsert.row({:name => 'Jerry'}, :breed => 'beagle') + upsert.row({:name => 'Pierre'}, :breed => 'tabby') +end +``` -Tested to be much about 60% faster on PostgreSQL and 60–90% faster on MySQL and SQLite3 than comparable methods (see the tests, which fail if they are not faster). +Tested to be much about 80% faster on PostgreSQL, MySQL, and SQLite3 than comparable methods (see the tests, which fail if they are not faster). ## Gotchas -### Undefined behavior without real UNIQUE indexes +### In PostgreSQL, trying to put strings into integer fields fails confusingly -Make sure you're upserting against either primary key columns or columns with UNIQUE indexes or both. +We currently don't have any logic to convert integers into strings, strings into integers, etc. in order to satisfy PostgreSQL's strictness on this issue. -### For MySQL, columns are set based on the first row you pass +Tracking [the issue on Github](https://github.com/seamusabshere/upsert/issues/7). -Currently, on MySQL, the first row you pass in determines the columns that will be used for all future upserts using the same Upsert object. That's useful for mass importing of many rows with the same columns, but is surprising if you're trying to use a single `Upsert` object to add arbitrary data. For example, this won't work: +### Within a batch, it's assumed that you're always passing the same columns - Upsert.batch(Pet.connection, Pet.table_name) do |upsert| - upsert.row({:name => 'Jerry'}, :breed => 'beagle') - upsert.row({:tag_number => 456}, :spiel => 'great cat') # won't work - doesn't use same columns - end +Currently, on MySQL, the first row you pass in determines the columns that will be used for all future upserts using the same Upsert object. That's useful for mass importing of many rows with the same columns, but is surprising if you're trying to use a single `Upsert` object to add arbitrary data. For example: + +```ruby +# won't work - doesn't use same columns +Upsert.batch(Pet.connection, Pet.table_name) do |upsert| + upsert.row({:name => 'Jerry'}, :breed => 'beagle') + upsert.row({:tag_number => 456}, :spiel => 'great cat') +end +``` You would need to use a new `Upsert` object. On the other hand, this is totally fine: - Pet.upsert({:name => 'Jerry'}, :breed => 'beagle') - Pet.upsert({:tag_number => 456}, :spiel => 'great cat') +```ruby +# totally fine +Pet.upsert({:name => 'Jerry'}, :breed => 'beagle') +Pet.upsert({:tag_number => 456}, :spiel => 'great cat') +``` + +Hopefully this surprising behavior won't exist in the future! ## Wishlist @@ -75,106 +104,129 @@ Originally written to speed up the [`data_miner`](https://github.com/seamusabshe Using the [mysql2](https://rubygems.org/gems/mysql2) driver. - upsert = Upsert.new(Mysql2::Connection.new(:username => 'root', :password => 'password', :database => 'upsert_test'), :pets) +```ruby +connection = Mysql2::Connection.new(:username => 'root', :password => 'password', :database => 'upsert_test') +table_name = :pets +upsert = Upsert.new(connection, table_name) +``` #### Speed -From the tests (updated 9/21/12): +From the tests (updated 11/7/12): - Upsert was 88% faster than find + new/set/save + Upsert was 82% faster than find + new/set/save + Upsert was 85% faster than find_or_create + update_attributes Upsert was 90% faster than create + rescue/find/update - Upsert was 90% faster than find_or_create + update_attributes - Upsert was 60% faster than faking upserts with activerecord-import + Upsert was 46% faster than faking upserts with activerecord-import #### SQL MERGE trick -"ON DUPLICATE KEY UPDATE" where we just set everything to the value of the insert. - - # http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html - INSERT INTO table (a,b,c) VALUES (1,2,3), (4,5,6) - ON DUPLICATE KEY UPDATE a=VALUES(a),b=VALUES(b),c=VALUES(c); - -If `a` only appeared in the selector, then we avoid updating it in case of a duplicate key: +Thanks to [Dennis Hennen's StackOverflow response!](http://stackoverflow.com/questions/11371479/how-to-translate-postgresql-merge-db-aka-upsert-function-into-mysql/)! - ON DUPLICATE KEY UPDATE a=a,b=VALUES(b),c=VALUES(c); - -Since this is an upsert helper library, not a general-use ON DUPLICATE KEY UPDATE wrapper, you **can't** do things like `c=c+1`. +```sql +CREATE PROCEDURE upsert_pets_SEL_name_A_tag_number_SET_name_A_tag_number(`name_sel` varchar(255), `tag_number_sel` int(11), `name_set` varchar(255), `tag_number_set` int(11)) +BEGIN + DECLARE done BOOLEAN; + REPEAT + BEGIN + -- If there is a unique key constraint error then + -- someone made a concurrent insert. Reset the sentinel + -- and try again. + DECLARE ER_DUP_UNIQUE CONDITION FOR 23000; + DECLARE ER_INTEG CONDITION FOR 1062; + DECLARE CONTINUE HANDLER FOR ER_DUP_UNIQUE BEGIN + SET done = FALSE; + END; + + DECLARE CONTINUE HANDLER FOR ER_INTEG BEGIN + SET done = TRUE; + END; + + SET done = TRUE; + SELECT COUNT(*) INTO @count FROM `pets` WHERE `name` = `name_sel` AND `tag_number` = `tag_number_sel`; + -- Race condition here. If a concurrent INSERT is made after + -- the SELECT but before the INSERT below we'll get a duplicate + -- key error. But the handler above will take care of that. + IF @count > 0 THEN + -- UPDATE table_name SET b = b_SET WHERE a = a_SEL; + UPDATE `pets` SET `name` = `name_set`, `tag_number` = `tag_number_set` WHERE `name` = `name_sel` AND `tag_number` = `tag_number_sel`; + ELSE + -- INSERT INTO table_name (a, b) VALUES (k, data); + INSERT INTO `pets` (`name`, `tag_number`) VALUES (`name_set`, `tag_number_set`); + END IF; + END; + UNTIL done END REPEAT; +END +``` ### PostgreSQL Using the [pg](https://rubygems.org/gems/pg) driver. - upsert = Upsert.new(PG.connect(:dbname => 'upsert_test'), :pets) +```ruby +connection = PG.connect(:dbname => 'upsert_test') +table_name = :pets +upsert = Upsert.new(connection, table_name) +``` #### Speed From the tests (updated 9/21/12): - Upsert was 65% faster than find + new/set/save + Upsert was 72% faster than find + new/set/save Upsert was 79% faster than find_or_create + update_attributes - Upsert was 76% faster than create + rescue/find/update + Upsert was 83% faster than create + rescue/find/update # (can't compare to activerecord-import because you can't fake it on pg) #### SQL MERGE trick - # http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING - CREATE TABLE db (a INT PRIMARY KEY, b TEXT); - CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS - $$ +Adapted from the [canonical PostgreSQL upsert example](http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING): + +```sql +CREATE OR REPLACE FUNCTION upsert_pets_SEL_name_A_tag_number_SET_name_A_tag_number("name_sel" character varying(255), "tag_number_sel" integer, "name_set" character varying(255), "tag_number_set" integer) RETURNS VOID AS +$$ +DECLARE + first_try INTEGER := 1; +BEGIN + LOOP + -- first try to update the key + UPDATE "pets" SET "name" = "name_set", "tag_number" = "tag_number_set" + WHERE "name" = "name_sel" AND "tag_number" = "tag_number_sel"; + IF found THEN + RETURN; + END IF; + -- not there, so try to insert the key + -- if someone else inserts the same key concurrently, + -- we could get a unique-key failure BEGIN - LOOP - -- first try to update the key - UPDATE db SET b = data WHERE a = key; - IF found THEN - RETURN; - END IF; - -- not there, so try to insert the key - -- if someone else inserts the same key concurrently, - -- we could get a unique-key failure - BEGIN - INSERT INTO db(a,b) VALUES (key, data); - RETURN; - EXCEPTION WHEN unique_violation THEN - -- Do nothing, and loop to try the UPDATE again. - END; - END LOOP; + INSERT INTO "pets"("name", "tag_number") VALUES ("name_set", "tag_number_set"); + RETURN; + EXCEPTION WHEN unique_violation THEN + -- seamusabshere 9/20/12 only retry once + IF (first_try = 1) THEN + first_try := 0; + ELSE + RETURN; + END IF; + -- Do nothing, and loop to try the UPDATE again. END; - $$ - LANGUAGE plpgsql; - SELECT merge_db(1, 'david'); - SELECT merge_db(1, 'dennis'); + END LOOP; +END; +$$ +LANGUAGE plpgsql; +``` I slightly modified it so that it only retries once - don't want infinite loops. -The decision was made **not** to use the following because it's not straight from the manual: - - # http://stackoverflow.com/questions/1109061/insert-on-duplicate-update-postgresql - UPDATE table SET field='C', field2='Z' WHERE id=3; - INSERT INTO table (id, field, field2) - SELECT 3, 'C', 'Z' - WHERE NOT EXISTS (SELECT 1 FROM table WHERE id=3); - -This was also rejected because there's something we can use in the manual: - - # http://stackoverflow.com/questions/5269590/why-doesnt-this-rule-prevent-duplicate-key-violations - BEGIN; - CREATE TEMP TABLE stage_data(key_column, data_columns...) ON COMMIT DROP; - \copy stage_data from data.csv with csv header - -- prevent any other updates while we are merging input (omit this if you don't need it) - LOCK target_data IN SHARE ROW EXCLUSIVE MODE; - -- insert into target table - INSERT INTO target_data(key_column, data_columns...) - SELECT key_column, data_columns... - FROM stage_data - WHERE NOT EXISTS (SELECT 1 FROM target_data - WHERE target_data.key_column = stage_data.key_column) - END; - ### Sqlite Using the [sqlite3](https://rubygems.org/gems/sqlite3) driver. - upsert = Upsert.new(SQLite3::Database.open(':memory:'), :pets) +```ruby +connection = SQLite3::Database.open(':memory:') +table_name = :pets +upsert = Upsert.new(connection, table_name) +``` #### Speed @@ -187,16 +239,20 @@ From the tests (updated 9/21/12): #### SQL MERGE trick - # http://stackoverflow.com/questions/2717590/sqlite-upsert-on-duplicate-key-update - # bad example because we're not doing on-duplicate-key update - INSERT OR IGNORE INTO visits VALUES (127.0.0.1, 1); - UPDATE visits SET visits = 1 WHERE ip LIKE 127.0.0.1; +Thanks to [@dan04's answer on StackOverflow](http://stackoverflow.com/questions/2717590/sqlite-upsert-on-duplicate-key-update): + +```sql +INSERT OR IGNORE INTO visits VALUES (127.0.0.1, 1); +UPDATE visits SET visits = 1 WHERE ip LIKE 127.0.0.1; +``` ### Rails / ActiveRecord (assuming that one of the other three supported drivers is being used under the covers) - Upsert.new Pet.connection, Pet.table_name +```ruby +Upsert.new Pet.connection, Pet.table_name +``` #### Speed @@ -219,22 +275,26 @@ In addition to correctness, the library's tests check that it is As below, all you need is a raw database connection like a `Mysql2::Connection`, `PG::Connection` or a `SQLite3::Database`. These are equivalent: - # with activerecord - Upsert.new ActiveRecord::Base.connection, :pets - # with activerecord, prettier - Upsert.new Pet.connection, Pet.table_name - # without activerecord - Upsert.new Mysql2::Connection.new([...]), :pets +```ruby +# with activerecord +Upsert.new ActiveRecord::Base.connection, :pets +# with activerecord, prettier +Upsert.new Pet.connection, Pet.table_name +# without activerecord +Upsert.new Mysql2::Connection.new([...]), :pets +``` ### For a specific use case, faster and more portable than `activerecord-import` You could also use [activerecord-import](https://github.com/zdennis/activerecord-import) to upsert: - Pet.import columns, all_values, :timestamps => false, :on_duplicate_key_update => columns +```ruby +Pet.import columns, all_values, :timestamps => false, :on_duplicate_key_update => columns +``` This, however, only works on MySQL and requires ActiveRecord—and if all you are doing is upserts, `upsert` is tested to be 40% faster. And you don't have to put all of the rows to be upserted into a single huge array - you can batch them using `Upsert.batch`. ## Copyright -Copyright 2012 Brighter Planet, Inc. +Copyright 2012 Seamus Abshere diff --git a/lib/upsert.rb b/lib/upsert.rb index f0b8270..953167f 100644 --- a/lib/upsert.rb +++ b/lib/upsert.rb @@ -4,8 +4,9 @@ require 'upsert/version' require 'upsert/binary' -require 'upsert/buffer' require 'upsert/connection' +require 'upsert/merge_function' +require 'upsert/column_definition' require 'upsert/row' require 'upsert/cell' @@ -43,7 +44,7 @@ def logger # Currently only applies to PostgreSQL. def clear_database_functions(connection) dummy = new(connection, :dummy) - dummy.buffer.clear_database_functions + dummy.clear_database_functions end # @param [String] v A string containing binary data that should be inserted/escaped as such. @@ -53,9 +54,7 @@ def binary(v) Binary.new v end - # Guarantee that the most efficient way of buffering rows is used. - # - # Currently mostly helps for MySQL, but you should use it whenever possible in case future buffering-based optimizations become possible. + # More efficient way of upserting multiple rows at once. # # @param [Mysql2::Client,Sqlite3::Database,PG::Connection,#raw_connection] connection A supported database connection. # @param [String,Symbol] table_name The name of the table into which you will be upserting. @@ -71,9 +70,7 @@ def binary(v) # end def batch(connection, table_name) upsert = new connection, table_name - upsert.buffer.async! yield upsert - upsert.buffer.sync! end # @deprecated Use .batch instead. @@ -101,25 +98,32 @@ def batch(connection, table_name) # @return [String] attr_reader :table_name - # @private - attr_reader :buffer - # @private attr_reader :row_class # @private attr_reader :cell_class + # @private + attr_reader :column_definition_class + + # @private + attr_reader :merge_function_class + # @param [Mysql2::Client,Sqlite3::Database,PG::Connection,#raw_connection] connection A supported database connection. # @param [String,Symbol] table_name The name of the table into which you will be upserting. def initialize(connection, table_name) @table_name = table_name.to_s raw_connection = connection.respond_to?(:raw_connection) ? connection.raw_connection : connection connection_class_name = HANDLER[raw_connection.class.name] + Dir[File.expand_path("../upsert/**/#{connection_class_name}.rb", __FILE__)].each do |path| + require path + end @connection = Connection.const_get(connection_class_name).new self, raw_connection - @buffer = Buffer.const_get(connection_class_name).new self @row_class = Row.const_get connection_class_name @cell_class = Cell.const_get connection_class_name + @column_definition_class = ColumnDefinition.const_get connection_class_name + @merge_function_class = MergeFunction.const_get connection_class_name end # Upsert a row given a selector and a setter. @@ -138,12 +142,22 @@ def initialize(connection, table_name) # upsert.row({:name => 'Jerry'}, :breed => 'beagle') # upsert.row({:name => 'Pierre'}, :breed => 'tabby') def row(selector, setter = {}) - buffer << row_class.new(self, selector, setter) + merge_function_class.execute self, row_class.new(self, selector, setter) nil end + # @private + def clear_database_functions + merge_function_class.clear connection + end + # @private def quoted_table_name @quoted_table_name ||= connection.quote_ident table_name end + + # @private + def column_definitions + @column_definitions ||= column_definition_class.all connection, table_name + end end diff --git a/lib/upsert/buffer.rb b/lib/upsert/buffer.rb deleted file mode 100644 index c43345f..0000000 --- a/lib/upsert/buffer.rb +++ /dev/null @@ -1,36 +0,0 @@ -require 'upsert/buffer/mysql2_client' -require 'upsert/buffer/pg_connection' -require 'upsert/buffer/sqlite3_database' - -class Upsert - # @private - class Buffer - attr_reader :parent - attr_reader :rows - - def initialize(parent) - @parent = parent - @rows = [] - end - - def <<(row) - rows << row - ready - end - - def async? - !!@async - end - - def async! - @async = true - end - - def sync! - @async = false - until rows.empty? - ready - end - end - end -end diff --git a/lib/upsert/buffer/mysql2_client.rb b/lib/upsert/buffer/mysql2_client.rb deleted file mode 100644 index 433c32c..0000000 --- a/lib/upsert/buffer/mysql2_client.rb +++ /dev/null @@ -1,80 +0,0 @@ -class Upsert - # @private - class Buffer - class Mysql2_Client < Buffer - def ready - return if rows.empty? - connection = parent.connection - if not async? - connection.execute sql - rows.clear - return - end - @cumulative_sql_bytesize ||= static_sql_bytesize - new_row = rows.pop - d = new_row.values_sql_bytesize + 3 # ),( - if @cumulative_sql_bytesize + d > max_sql_bytesize - connection.execute sql - rows.clear - @cumulative_sql_bytesize = static_sql_bytesize + d - else - @cumulative_sql_bytesize += d - end - rows << new_row - nil - end - - def setter - @setter ||= rows.first.setter.keys - end - - def original_setter - @original_setter ||= rows.first.original_setter_keys - end - - def insert_part - @insert_part ||= begin - connection = parent.connection - column_names = setter.map { |k| connection.quote_ident(k) } - %{INSERT INTO #{parent.quoted_table_name} (#{column_names.join(',')}) VALUES } - end - end - - def update_part - @update_part ||= begin - connection = parent.connection - updaters = setter.map do |k| - quoted_name = connection.quote_ident(k) - if original_setter.include?(k) - "#{quoted_name}=VALUES(#{quoted_name})" - else - # NOOP - "#{quoted_name}=#{quoted_name}" - end - end.join(',') - %{ ON DUPLICATE KEY UPDATE #{updaters}} - end - end - - # where 2 is the parens - def static_sql_bytesize - @static_sql_bytesize ||= insert_part.bytesize + update_part.bytesize + 2 - end - - def sql - [ - insert_part, - '(', - rows.map { |row| row.quoted_setter_values.join(',') }.join('),('), - ')', - update_part - ].join - end - - # since setting an option like :as => :hash actually persists that option to the client, don't pass any options - def max_sql_bytesize - @max_sql_bytesize ||= parent.connection.database_variable_get(:MAX_ALLOWED_PACKET).to_i - end - end - end -end diff --git a/lib/upsert/buffer/pg_connection.rb b/lib/upsert/buffer/pg_connection.rb deleted file mode 100644 index 8005f2b..0000000 --- a/lib/upsert/buffer/pg_connection.rb +++ /dev/null @@ -1,19 +0,0 @@ -require 'upsert/buffer/pg_connection/column_definition' -require 'upsert/buffer/pg_connection/merge_function' - -class Upsert - class Buffer - # @private - class PG_Connection < Buffer - def ready - return if rows.empty? - row = rows.shift - MergeFunction.execute self, row - end - - def clear_database_functions - MergeFunction.clear self - end - end - end -end diff --git a/lib/upsert/buffer/pg_connection/column_definition.rb b/lib/upsert/buffer/pg_connection/column_definition.rb deleted file mode 100644 index 287d6f1..0000000 --- a/lib/upsert/buffer/pg_connection/column_definition.rb +++ /dev/null @@ -1,59 +0,0 @@ -class Upsert - class Buffer - class PG_Connection < Buffer - # @private - class ColumnDefinition - class << self - # activerecord-3.2.5/lib/active_record/connection_adapters/postgresql_adapter.rb#column_definitions - def all(buffer, table_name) - connection = buffer.parent.connection - res = connection.execute <<-EOS -SELECT a.attname AS name, format_type(a.atttypid, a.atttypmod) AS sql_type, d.adsrc AS default -FROM pg_attribute a LEFT JOIN pg_attrdef d - ON a.attrelid = d.adrelid AND a.attnum = d.adnum -WHERE a.attrelid = '#{connection.quote_ident(table_name)}'::regclass -AND a.attnum > 0 AND NOT a.attisdropped -EOS - res.map do |row| - new connection, row['name'], row['sql_type'], row['default'] - end.sort_by do |cd| - cd.name - end - end - end - - attr_reader :name - attr_reader :sql_type - attr_reader :default - attr_reader :quoted_name - attr_reader :quoted_selector_name - attr_reader :quoted_setter_name - - def initialize(connection, name, sql_type, default) - @name = name - @sql_type = sql_type - @default = default - @quoted_name = connection.quote_ident name - @quoted_selector_name = connection.quote_ident "#{name}_selector" - @quoted_setter_name = connection.quote_ident "#{name}_setter" - end - - def to_selector_arg - "#{quoted_selector_name} #{sql_type}" - end - - def to_setter_arg - "#{quoted_setter_name} #{sql_type}" - end - - def to_setter - "#{quoted_name} = #{quoted_setter_name}" - end - - def to_selector - "#{quoted_name} = #{quoted_selector_name}" - end - end - end - end -end diff --git a/lib/upsert/buffer/pg_connection/merge_function.rb b/lib/upsert/buffer/pg_connection/merge_function.rb deleted file mode 100644 index eeba5fc..0000000 --- a/lib/upsert/buffer/pg_connection/merge_function.rb +++ /dev/null @@ -1,179 +0,0 @@ -require 'digest/md5' - -class Upsert - class Buffer - class PG_Connection < Buffer - # @private - class MergeFunction - class << self - def execute(buffer, row) - merge_function = lookup buffer, row - merge_function.execute row - end - - def unique_name(table_name, selector, setter) - parts = [ - 'upsert', - table_name, - 'SEL', - selector.join('_A_'), - 'SET', - setter.join('_A_') - ].join('_') - # maybe i should md5 instead - crc32 = Zlib.crc32(parts).to_s - [ parts.first(MAX_NAME_LENGTH-11), crc32 ].join - end - - def lookup(buffer, row) - @lookup ||= {} - selector = row.selector.keys - setter = row.setter.keys - key = [buffer.parent.table_name, selector, setter] - @lookup[key] ||= new(buffer, selector, setter) - end - - def clear(buffer) - connection = buffer.parent.connection - # http://stackoverflow.com/questions/7622908/postgresql-drop-function-without-knowing-the-number-type-of-parameters - connection.execute <<-EOS -CREATE OR REPLACE FUNCTION pg_temp.upsert_delfunc(text) - RETURNS void AS -$BODY$ -DECLARE - _sql text; -BEGIN - -FOR _sql IN - SELECT 'DROP FUNCTION ' || quote_ident(n.nspname) - || '.' || quote_ident(p.proname) - || '(' || pg_catalog.pg_get_function_identity_arguments(p.oid) || ');' - FROM pg_catalog.pg_proc p - LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace - WHERE p.proname = $1 - AND pg_catalog.pg_function_is_visible(p.oid) -- you may or may not want this -LOOP - EXECUTE _sql; -END LOOP; - -END; -$BODY$ - LANGUAGE plpgsql; -EOS - res = connection.execute(%{SELECT proname FROM pg_proc WHERE proname LIKE 'upsert_%'}) - res.each do |row| - k = row['proname'] - next if k == 'upsert_delfunc' - Upsert.logger.info %{[upsert] Dropping function #{k.inspect}} - connection.execute %{SELECT pg_temp.upsert_delfunc('#{k}')} - end - end - end - - MAX_NAME_LENGTH = 63 - - attr_reader :buffer - attr_reader :selector - attr_reader :setter - - def initialize(buffer, selector, setter) - @buffer = buffer - @selector = selector - @setter = setter - create! - end - - def name - @name ||= MergeFunction.unique_name table_name, selector, setter - end - - def execute(row) - first_try = true - bind_selector_values = row.selector.values.map(&:bind_value) - bind_setter_values = row.setter.values.map(&:bind_value) - begin - connection.execute sql, (bind_selector_values + bind_setter_values) - rescue PG::Error => pg_error - if pg_error.message =~ /function #{name}.* does not exist/i - if first_try - Upsert.logger.info %{[upsert] Function #{name.inspect} went missing, trying to recreate} - first_try = false - create! - retry - else - Upsert.logger.info %{[upsert] Failed to create function #{name.inspect} for some reason} - raise pg_error - end - else - raise pg_error - end - end - end - - private - - def sql - @sql ||= begin - bind_params = [] - 1.upto(selector.length + setter.length) { |i| bind_params << "$#{i}" } - %{SELECT #{name}(#{bind_params.join(', ')})} - end - end - - def connection - buffer.parent.connection - end - - def table_name - buffer.parent.table_name - end - - def quoted_table_name - buffer.parent.quoted_table_name - end - - # the "canonical example" from http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE - # differentiate between selector and setter - def create! - Upsert.logger.info "[upsert] Creating or replacing database function #{name.inspect} on table #{table_name.inspect} for selector #{selector.map(&:inspect).join(', ')} and setter #{setter.map(&:inspect).join(', ')}" - column_definitions = ColumnDefinition.all buffer, table_name - selector_column_definitions = column_definitions.select { |cd| selector.include?(cd.name) } - setter_column_definitions = column_definitions.select { |cd| setter.include?(cd.name) } - connection.execute <<-EOS -CREATE OR REPLACE FUNCTION #{name}(#{(selector_column_definitions.map(&:to_selector_arg) + setter_column_definitions.map(&:to_setter_arg)).join(', ')}) RETURNS VOID AS -$$ -DECLARE - first_try INTEGER := 1; -BEGIN - LOOP - -- first try to update the key - UPDATE #{quoted_table_name} SET #{setter_column_definitions.map(&:to_setter).join(', ')} - WHERE #{selector_column_definitions.map(&:to_selector).join(' AND ') }; - IF found THEN - RETURN; - END IF; - -- not there, so try to insert the key - -- if someone else inserts the same key concurrently, - -- we could get a unique-key failure - BEGIN - INSERT INTO #{quoted_table_name}(#{setter_column_definitions.map(&:quoted_name).join(', ')}) VALUES (#{setter_column_definitions.map(&:quoted_setter_name).join(', ')}); - RETURN; - EXCEPTION WHEN unique_violation THEN - -- seamusabshere 9/20/12 only retry once - IF (first_try = 1) THEN - first_try := 0; - ELSE - RETURN; - END IF; - -- Do nothing, and loop to try the UPDATE again. - END; - END LOOP; -END; -$$ -LANGUAGE plpgsql; -EOS - end - end - end - end -end diff --git a/lib/upsert/buffer/sqlite3_database.rb b/lib/upsert/buffer/sqlite3_database.rb deleted file mode 100644 index 4fd9331..0000000 --- a/lib/upsert/buffer/sqlite3_database.rb +++ /dev/null @@ -1,21 +0,0 @@ -class Upsert - class Buffer - # @private - class SQLite3_Database < Buffer - def ready - return if rows.empty? - row = rows.shift - connection = parent.connection - bind_setter_values = row.setter.values.map(&:bind_value) - quoted_setter_names = row.setter.values.map(&:quoted_name) - quoted_selector_names = row.selector.values.map(&:quoted_name) - - insert_or_ignore_sql = %{INSERT OR IGNORE INTO #{parent.quoted_table_name} (#{quoted_setter_names.join(',')}) VALUES (#{Array.new(bind_setter_values.length, '?').join(',')})} - connection.execute insert_or_ignore_sql, bind_setter_values - - update_sql = %{UPDATE #{parent.quoted_table_name} SET #{quoted_setter_names.map { |qk| "#{qk}=?" }.join(',')} WHERE #{quoted_selector_names.map { |qk| "#{qk}=?" }.join(' AND ')}} - connection.execute update_sql, (bind_setter_values + row.selector.values.map(&:bind_value)) - end - end - end -end diff --git a/lib/upsert/cell.rb b/lib/upsert/cell.rb index 0c53e01..1d997eb 100644 --- a/lib/upsert/cell.rb +++ b/lib/upsert/cell.rb @@ -1,7 +1,3 @@ -require 'upsert/cell/mysql2_client' -require 'upsert/cell/pg_connection' -require 'upsert/cell/sqlite3_database' - class Upsert # @private class Cell diff --git a/lib/upsert/column_definition.rb b/lib/upsert/column_definition.rb new file mode 100644 index 0000000..41a4ab8 --- /dev/null +++ b/lib/upsert/column_definition.rb @@ -0,0 +1,43 @@ +class Upsert + # @private + class ColumnDefinition + class << self + # activerecord-3.2.X/lib/active_record/connection_adapters/XXXXXXXXX_adapter.rb#column_definitions + def all(connection, table_name) + raise "not impl" + end + end + + attr_reader :name + attr_reader :sql_type + attr_reader :default + attr_reader :quoted_name + attr_reader :quoted_selector_name + attr_reader :quoted_setter_name + + def initialize(connection, name, sql_type, default) + @name = name + @sql_type = sql_type + @default = default + @quoted_name = connection.quote_ident name + @quoted_selector_name = connection.quote_ident "#{name}_sel" + @quoted_setter_name = connection.quote_ident "#{name}_set" + end + + def to_selector_arg + "#{quoted_selector_name} #{sql_type}" + end + + def to_setter_arg + "#{quoted_setter_name} #{sql_type}" + end + + def to_setter + "#{quoted_name} = #{quoted_setter_name}" + end + + def to_selector + "#{quoted_name} = #{quoted_selector_name}" + end + end +end diff --git a/lib/upsert/column_definition/Mysql2_Client.rb b/lib/upsert/column_definition/Mysql2_Client.rb new file mode 100644 index 0000000..5536917 --- /dev/null +++ b/lib/upsert/column_definition/Mysql2_Client.rb @@ -0,0 +1,24 @@ +class Upsert + class ColumnDefinition + # @private + class Mysql2_Client < ColumnDefinition + class << self + def all(connection, table_name) + connection.execute("SHOW COLUMNS FROM #{connection.quote_ident(table_name)}").map do |row| + name, type, default = if row.is_a?(Array) + # you don't know if mysql2 is going to give you an array or a hash... and you shouldn't specify, because it's sticky + # ["name", "varchar(255)", "YES", "UNI", nil, ""] + row.values_at(0,1,4) + else + # {"Field"=>"name", "Type"=>"varchar(255)", "Null"=>"NO", "Key"=>"PRI", "Default"=>nil, "Extra"=>""} + [row['Field'], row['Type'], row['Default']] + end + new connection, name, type, default + end.sort_by do |cd| + cd.name + end + end + end + end + end +end diff --git a/lib/upsert/column_definition/PG_Connection.rb b/lib/upsert/column_definition/PG_Connection.rb new file mode 100644 index 0000000..8924a8b --- /dev/null +++ b/lib/upsert/column_definition/PG_Connection.rb @@ -0,0 +1,24 @@ +class Upsert + class ColumnDefinition + # @private + class PG_Connection < ColumnDefinition + class << self + # activerecord-3.2.5/lib/active_record/connection_adapters/postgresql_adapter.rb#column_definitions + def all(connection, table_name) + res = connection.execute <<-EOS +SELECT a.attname AS name, format_type(a.atttypid, a.atttypmod) AS sql_type, d.adsrc AS default +FROM pg_attribute a LEFT JOIN pg_attrdef d + ON a.attrelid = d.adrelid AND a.attnum = d.adnum +WHERE a.attrelid = '#{connection.quote_ident(table_name)}'::regclass +AND a.attnum > 0 AND NOT a.attisdropped +EOS + res.map do |row| + new connection, row['name'], row['sql_type'], row['default'] + end.sort_by do |cd| + cd.name + end + end + end + end + end +end diff --git a/lib/upsert/column_definition/SQLite3_Database.rb b/lib/upsert/column_definition/SQLite3_Database.rb new file mode 100644 index 0000000..9b9e3e5 --- /dev/null +++ b/lib/upsert/column_definition/SQLite3_Database.rb @@ -0,0 +1,7 @@ +class Upsert + class ColumnDefinition + # @private + class SQLite3_Database < ColumnDefinition + end + end +end diff --git a/lib/upsert/connection.rb b/lib/upsert/connection.rb index 094b307..54225c7 100644 --- a/lib/upsert/connection.rb +++ b/lib/upsert/connection.rb @@ -1,15 +1,11 @@ -require 'upsert/connection/mysql2_client' -require 'upsert/connection/pg_connection' -require 'upsert/connection/sqlite3_database' - class Upsert # @private class Connection - attr_reader :parent + attr_reader :controller attr_reader :raw_connection - def initialize(parent, raw_connection) - @parent = parent + def initialize(controller, raw_connection) + @controller = controller @raw_connection = raw_connection end diff --git a/lib/upsert/merge_function.rb b/lib/upsert/merge_function.rb new file mode 100644 index 0000000..ea84864 --- /dev/null +++ b/lib/upsert/merge_function.rb @@ -0,0 +1,72 @@ +require 'zlib' + +class Upsert + # @private + class MergeFunction + MAX_NAME_LENGTH = 63 + + class << self + def execute(controller, row) + merge_function = lookup controller, row + merge_function.execute row + end + + def unique_name(table_name, selector_keys, setter_keys) + parts = [ + 'upsert', + table_name, + 'SEL', + selector_keys.join('_A_'), + 'SET', + setter_keys.join('_A_') + ].join('_') + if parts.length > MAX_NAME_LENGTH + # maybe i should md5 instead + crc32 = Zlib.crc32(parts).to_s + [ parts.first(MAX_NAME_LENGTH-11), crc32 ].join + else + parts + end + end + + def lookup(controller, row) + @lookup ||= {} + selector_keys = row.selector.keys + setter_keys = row.setter.keys + key = [controller.table_name, selector_keys, setter_keys] + @lookup[key] ||= new(controller, selector_keys, setter_keys) + end + end + + attr_reader :controller + attr_reader :selector_keys + attr_reader :setter_keys + + def initialize(controller, selector_keys, setter_keys) + @controller = controller + @selector_keys = selector_keys + @setter_keys = setter_keys + create! + end + + def name + @name ||= MergeFunction.unique_name table_name, selector_keys, setter_keys + end + + def connection + controller.connection + end + + def table_name + controller.table_name + end + + def quoted_table_name + controller.quoted_table_name + end + + def column_definitions + controller.column_definitions + end + end +end diff --git a/lib/upsert/merge_function/Mysql2_Client.rb b/lib/upsert/merge_function/Mysql2_Client.rb new file mode 100644 index 0000000..d37cbe6 --- /dev/null +++ b/lib/upsert/merge_function/Mysql2_Client.rb @@ -0,0 +1,87 @@ +require 'digest/md5' + +class Upsert + class MergeFunction + # @private + class Mysql2_Client < MergeFunction + MAX_NAME_LENGTH = 63 + + class << self + # http://stackoverflow.com/questions/733349/list-of-stored-procedures-functions-mysql-command-line + def clear(connection) + connection.execute("SHOW PROCEDURE STATUS WHERE Db = DATABASE() AND Name LIKE 'upsert_%'").map { |row| row['Name'] }.each do |name| + connection.execute "DROP PROCEDURE IF EXISTS #{connection.quote_ident(name)}" + end + end + end + + def execute(row) + first_try = true + begin + connection.execute sql(row) + rescue Mysql2::Error => e + if e.message =~ /PROCEDURE.*does not exist/i + if first_try + Upsert.logger.info %{[upsert] Function #{name.inspect} went missing, trying to recreate} + first_try = false + create! + retry + else + Upsert.logger.info %{[upsert] Failed to create function #{name.inspect} for some reason} + raise e + end + else + raise e + end + end + end + + def sql(row) + quoted_params = (row.selector.values + row.setter.values).map(&:quoted_value) + %{CALL #{name}(#{quoted_params.join(', ')})} + end + + # http://stackoverflow.com/questions/11371479/how-to-translate-postgresql-merge-db-aka-upsert-function-into-mysql/ + def create! + Upsert.logger.info "[upsert] Creating or replacing database function #{name.inspect} on table #{table_name.inspect} for selector #{selector_keys.map(&:inspect).join(', ')} and setter #{setter_keys.map(&:inspect).join(', ')}" + selector_column_definitions = column_definitions.select { |cd| selector_keys.include?(cd.name) } + setter_column_definitions = column_definitions.select { |cd| setter_keys.include?(cd.name) } + connection.execute(%{ + CREATE PROCEDURE #{name}(#{(selector_column_definitions.map(&:to_selector_arg) + setter_column_definitions.map(&:to_setter_arg)).join(', ')}) + BEGIN + DECLARE done BOOLEAN; + REPEAT + BEGIN + -- If there is a unique key constraint error then + -- someone made a concurrent insert. Reset the sentinel + -- and try again. + DECLARE ER_DUP_UNIQUE CONDITION FOR 23000; + DECLARE ER_INTEG CONDITION FOR 1062; + DECLARE CONTINUE HANDLER FOR ER_DUP_UNIQUE BEGIN + SET done = FALSE; + END; + + DECLARE CONTINUE HANDLER FOR ER_INTEG BEGIN + SET done = TRUE; + END; + + SET done = TRUE; + SELECT COUNT(*) INTO @count FROM #{quoted_table_name} WHERE #{selector_column_definitions.map(&:to_selector).join(' AND ')}; + -- Race condition here. If a concurrent INSERT is made after + -- the SELECT but before the INSERT below we'll get a duplicate + -- key error. But the handler above will take care of that. + IF @count > 0 THEN + -- UPDATE table_name SET b = b_SET WHERE a = a_SEL; + UPDATE #{quoted_table_name} SET #{setter_column_definitions.map(&:to_setter).join(', ')} WHERE #{selector_column_definitions.map(&:to_selector).join(' AND ')}; + ELSE + -- INSERT INTO table_name (a, b) VALUES (k, data); + INSERT INTO #{quoted_table_name} (#{setter_column_definitions.map(&:quoted_name).join(', ')}) VALUES (#{setter_column_definitions.map(&:quoted_setter_name).join(', ')}); + END IF; + END; + UNTIL done END REPEAT; + END + }) + end + end + end +end diff --git a/lib/upsert/merge_function/PG_Connection.rb b/lib/upsert/merge_function/PG_Connection.rb new file mode 100644 index 0000000..501f8d8 --- /dev/null +++ b/lib/upsert/merge_function/PG_Connection.rb @@ -0,0 +1,114 @@ +class Upsert + class MergeFunction + # @private + class PG_Connection < MergeFunction + MAX_NAME_LENGTH = 63 + + class << self + def clear(connection) + # http://stackoverflow.com/questions/7622908/postgresql-drop-function-without-knowing-the-number-type-of-parameters + connection.execute(%{ + CREATE OR REPLACE FUNCTION pg_temp.upsert_delfunc(text) + RETURNS void AS + $BODY$ + DECLARE + _sql text; + BEGIN + FOR _sql IN + SELECT 'DROP FUNCTION ' || quote_ident(n.nspname) + || '.' || quote_ident(p.proname) + || '(' || pg_catalog.pg_get_function_identity_arguments(p.oid) || ');' + FROM pg_catalog.pg_proc p + LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace + WHERE p.proname = $1 + AND pg_catalog.pg_function_is_visible(p.oid) -- you may or may not want this + LOOP + EXECUTE _sql; + END LOOP; + END; + $BODY$ + LANGUAGE plpgsql; + }) + connection.execute(%{SELECT proname FROM pg_proc WHERE proname LIKE 'upsert_%'}).each do |row| + k = row['proname'] + next if k == 'upsert_delfunc' + Upsert.logger.info %{[upsert] Dropping function #{k.inspect}} + connection.execute %{SELECT pg_temp.upsert_delfunc('#{k}')} + end + end + end + + def execute(row) + first_try = true + bind_selector_values = row.selector.values.map(&:bind_value) + bind_setter_values = row.setter.values.map(&:bind_value) + begin + connection.execute sql, (bind_selector_values + bind_setter_values) + rescue PG::Error => pg_error + if pg_error.message =~ /function #{name}.* does not exist/i + if first_try + Upsert.logger.info %{[upsert] Function #{name.inspect} went missing, trying to recreate} + first_try = false + create! + retry + else + Upsert.logger.info %{[upsert] Failed to create function #{name.inspect} for some reason} + raise pg_error + end + else + raise pg_error + end + end + end + + def sql + @sql ||= begin + bind_params = [] + 1.upto(selector_keys.length + setter_keys.length) { |i| bind_params << "$#{i}" } + %{SELECT #{name}(#{bind_params.join(', ')})} + end + end + + # the "canonical example" from http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE + # differentiate between selector and setter + def create! + Upsert.logger.info "[upsert] Creating or replacing database function #{name.inspect} on table #{table_name.inspect} for selector #{selector_keys.map(&:inspect).join(', ')} and setter #{setter_keys.map(&:inspect).join(', ')}" + selector_column_definitions = column_definitions.select { |cd| selector_keys.include?(cd.name) } + setter_column_definitions = column_definitions.select { |cd| setter_keys.include?(cd.name) } + connection.execute(%{ + CREATE OR REPLACE FUNCTION #{name}(#{(selector_column_definitions.map(&:to_selector_arg) + setter_column_definitions.map(&:to_setter_arg)).join(', ')}) RETURNS VOID AS + $$ + DECLARE + first_try INTEGER := 1; + BEGIN + LOOP + -- first try to update the key + UPDATE #{quoted_table_name} SET #{setter_column_definitions.map(&:to_setter).join(', ')} + WHERE #{selector_column_definitions.map(&:to_selector).join(' AND ') }; + IF found THEN + RETURN; + END IF; + -- not there, so try to insert the key + -- if someone else inserts the same key concurrently, + -- we could get a unique-key failure + BEGIN + INSERT INTO #{quoted_table_name}(#{setter_column_definitions.map(&:quoted_name).join(', ')}) VALUES (#{setter_column_definitions.map(&:quoted_setter_name).join(', ')}); + RETURN; + EXCEPTION WHEN unique_violation THEN + -- seamusabshere 9/20/12 only retry once + IF (first_try = 1) THEN + first_try := 0; + ELSE + RETURN; + END IF; + -- Do nothing, and loop to try the UPDATE again. + END; + END LOOP; + END; + $$ + LANGUAGE plpgsql; + }) + end + end + end +end diff --git a/lib/upsert/merge_function/SQLite3_Database.rb b/lib/upsert/merge_function/SQLite3_Database.rb new file mode 100644 index 0000000..ded35d6 --- /dev/null +++ b/lib/upsert/merge_function/SQLite3_Database.rb @@ -0,0 +1,29 @@ +class Upsert + class MergeFunction + # @private + class SQLite3_Database < MergeFunction + attr_reader :quoted_setter_names + attr_reader :quoted_selector_names + + def initialize(*) + super + @quoted_setter_names = setter_keys.map { |k| connection.quote_ident k } + @quoted_selector_names = selector_keys.map { |k| connection.quote_ident k } + end + + def create! + # not necessary + end + + def execute(row) + bind_setter_values = row.setter.values.map(&:bind_value) + + insert_or_ignore_sql = %{INSERT OR IGNORE INTO #{quoted_table_name} (#{quoted_setter_names.join(',')}) VALUES (#{Array.new(bind_setter_values.length, '?').join(',')})} + connection.execute insert_or_ignore_sql, bind_setter_values + + update_sql = %{UPDATE #{quoted_table_name} SET #{quoted_setter_names.map { |qk| "#{qk}=?" }.join(',')} WHERE #{quoted_selector_names.map { |qk| "#{qk}=?" }.join(' AND ')}} + connection.execute update_sql, (bind_setter_values + row.selector.values.map(&:bind_value)) + end + end + end +end diff --git a/lib/upsert/row.rb b/lib/upsert/row.rb index 413fd6c..d18bb28 100644 --- a/lib/upsert/row.rb +++ b/lib/upsert/row.rb @@ -1,7 +1,3 @@ -require 'upsert/row/mysql2_client' -require 'upsert/row/pg_connection' -require 'upsert/row/sqlite3_database' - class Upsert # @private class Row @@ -20,9 +16,9 @@ class Row attr_reader :setter - def initialize(parent, raw_selector, raw_setter) - connection = parent.connection - cell_class = parent.cell_class + def initialize(controller, raw_selector, raw_setter) + connection = controller.connection + cell_class = controller.cell_class @selector = raw_selector.inject({}) do |memo, (k, v)| memo[k.to_s] = cell_class.new(connection, k, v) diff --git a/lib/upsert/row/mysql2_client.rb b/lib/upsert/row/mysql2_client.rb index 5ae7284..ec4c2de 100644 --- a/lib/upsert/row/mysql2_client.rb +++ b/lib/upsert/row/mysql2_client.rb @@ -4,7 +4,7 @@ class Row class Mysql2_Client < Row attr_reader :original_setter_keys - def initialize(parent, raw_selector, raw_setter) + def initialize(controller, raw_selector, raw_setter) super @original_setter_keys = raw_setter.keys.map(&:to_s) end diff --git a/lib/upsert/version.rb b/lib/upsert/version.rb index 1b0459b..5295f47 100644 --- a/lib/upsert/version.rb +++ b/lib/upsert/version.rb @@ -1,3 +1,3 @@ class Upsert - VERSION = "0.5.0" + VERSION = "1.0.0" end diff --git a/spec/correctness_spec.rb b/spec/correctness_spec.rb index 12c5986..86d11a3 100644 --- a/spec/correctness_spec.rb +++ b/spec/correctness_spec.rb @@ -32,6 +32,20 @@ u.row(selector, setter) Pet.find_by_name('Jerry').tag_number.should == 20 end + + it "really limits its effects to the selector" do + p = Pet.new + p.name = 'Jerry' + p.gender = 'blue' + p.tag_number = 777 + p.save! + Pet.find_by_name_and_gender('Jerry', 'blue').tag_number.should == 777 + u = Upsert.new($conn, :pets) + selector = {name: 'Jerry', gender: 'red'} # this shouldn't select anything + setter = {tag_number: 888} + u.row(selector, setter) + Pet.find_by_name_and_gender('Jerry', 'blue').tag_number.should == 777 + end end describe "is just as correct as other ways" do describe 'compared to native ActiveRecord' do @@ -101,6 +115,6 @@ end end end - + end end diff --git a/spec/database_functions_spec.rb b/spec/database_functions_spec.rb index 3c85848..978e0ab 100644 --- a/spec/database_functions_spec.rb +++ b/spec/database_functions_spec.rb @@ -1,36 +1,42 @@ require 'spec_helper' require 'stringio' describe Upsert do - if ENV['ADAPTER'] == 'postgresql' - describe 'PostgreSQL database functions' do - it "re-uses merge functions across connections" do - begin - io = StringIO.new - old_logger = Upsert.logger - Upsert.logger = Logger.new io, Logger::INFO + def fresh_connection + case ENV['ADAPTER'] + when 'postgresql' + PGconn.new $conn_config + when 'mysql2' + Mysql2::Client.new $conn_config + end + end + describe 'database functions' do + it "re-uses merge functions across connections" do + begin + io = StringIO.new + old_logger = Upsert.logger + Upsert.logger = Logger.new io, Logger::INFO - # clear - Upsert.clear_database_functions(PGconn.new(:dbname => 'upsert_test')) - - # create - Upsert.new(PGconn.new(:dbname => 'upsert_test'), :pets).row :name => 'hello' + # clear + Upsert.clear_database_functions(fresh_connection) + + # create + Upsert.new(fresh_connection, :pets).row :name => 'hello' - # clear - Upsert.clear_database_functions(PGconn.new(:dbname => 'upsert_test')) + # clear + Upsert.clear_database_functions(fresh_connection) - # create (#2) - Upsert.new(PGconn.new(:dbname => 'upsert_test'), :pets).row :name => 'hello' + # create (#2) + Upsert.new(fresh_connection, :pets).row :name => 'hello' - # no create! - Upsert.new(PGconn.new(:dbname => 'upsert_test'), :pets).row :name => 'hello' - - io.rewind - hits = io.read.split("\n").grep(/Creating or replacing/) - hits.length.should == 2 - ensure - Upsert.logger = old_logger - end + # no create! + Upsert.new(fresh_connection, :pets).row :name => 'hello' + + io.rewind + hits = io.read.split("\n").grep(/Creating or replacing/) + hits.length.should == 2 + ensure + Upsert.logger = old_logger end end end -end +end if %w{ postgresql mysql2 }.include?(ENV['ADAPTER']) diff --git a/spec/logger_spec.rb b/spec/logger_spec.rb index 91625e2..72d52c5 100644 --- a/spec/logger_spec.rb +++ b/spec/logger_spec.rb @@ -1,18 +1,18 @@ require 'spec_helper' describe Upsert do describe "logger" do - it "logs to stderr by default" do + it "logs where you tell it" do begin - old_stderr = $stderr old_logger = Upsert.logger - Upsert.logger = nil - $stderr = StringIO.new - Upsert.logger.warn "hello" - $stderr.rewind - $stderr.read.chomp.should == 'hello' + io = StringIO.new + Thread.exclusive do + Upsert.logger = Logger.new(io) + Upsert.logger.warn "hello" + io.rewind + io.read.chomp.should == 'hello' + end ensure Upsert.logger = old_logger - $stderr = old_stderr end end diff --git a/spec/spec_helper.rb b/spec/spec_helper.rb index bc85219..c1b3602 100644 --- a/spec/spec_helper.rb +++ b/spec/spec_helper.rb @@ -14,23 +14,29 @@ system %{ dropdb upsert_test } system %{ createdb upsert_test } ActiveRecord::Base.establish_connection :adapter => 'postgresql', :database => 'upsert_test' - $conn = PGconn.new(:dbname => 'upsert_test') + $conn_config = { :dbname => 'upsert_test' } + $conn = PGconn.new $conn_config when 'mysql2' system %{ mysql -u root -ppassword -e "DROP DATABASE IF EXISTS upsert_test" } system %{ mysql -u root -ppassword -e "CREATE DATABASE upsert_test CHARSET utf8" } ActiveRecord::Base.establish_connection "#{RUBY_PLATFORM == 'java' ? 'mysql' : 'mysql2'}://root:password@127.0.0.1/upsert_test" - $conn = Mysql2::Client.new(:username => 'root', :password => 'password', :database => 'upsert_test') + $conn_config = { :username => 'root', :password => 'password', :database => 'upsert_test'} + $conn = Mysql2::Client.new $conn_config when 'sqlite3' ActiveRecord::Base.establish_connection :adapter => 'sqlite3', :database => ':memory:' $conn = ActiveRecord::Base.connection.raw_connection + $conn_config = :use_active_record_raw_connection_yo else raise "not supported" end -if ENV['UPSERT_DEBUG'] == 'true' - require 'logger' - ActiveRecord::Base.logger = Logger.new($stdout) +require 'logger' +ActiveRecord::Base.logger = Logger.new('test.log') + +if ENV['VERBOSE'] == 'true' ActiveRecord::Base.logger.level = Logger::DEBUG +else + ActiveRecord::Base.logger.level = Logger::WARN end class Pet < ActiveRecord::Base diff --git a/upsert.gemspec b/upsert.gemspec index 9650fef..ed77a44 100644 --- a/upsert.gemspec +++ b/upsert.gemspec @@ -27,6 +27,7 @@ Gem::Specification.new do |gem| gem.add_development_dependency 'active_record_inline_schema' gem.add_development_dependency 'faker' gem.add_development_dependency 'yard' + gem.add_development_dependency 'redcarpet' # github-flavored markdown gem.add_development_dependency 'activerecord-import' gem.add_development_dependency 'pry'