Handle arrays of two element arrays as filter hash values automatically

Previously, Sequel's behavior of arrays of two element arrays as hash values was suboptimal (some would say broken). Because the values were an array, an IN operator was used, but because arrays of two element arrays are considered condition specifiers, the argument was converted to a boolean condition: DB[:a].filter([:a, :b]=>[[1, 2], [3, 4]]) # SELECT * FROM a WHERE ((a, b) IN ((1 = 2) AND (3 = 4))) This changes the behavior so that you now get the SQL you expect: SELECT * FROM a WHERE ((a, b) IN ((1, 2), (3, 4))) Previously, you had to call sql_array (which was added in Sequel 2.7.0) on the array to get the correct SQL, but that is no longer necessary when the array is a filter hash value. However, it is still necessary to use a method that wraps the array to two element arrays if you are going to use it as a placeholder value: DB[:a].filter('(a, b) IN ?', [[1, 2], [3, 4]]) This is because Sequel doesn't have the context to know that the array of two element arrays is being used as a an IN predicate (as Sequel does not parse SQL). This actually started when I decided that SQLArray was a bad name for the class, since arrays in standard SQL (starting with SQL99) are very different that what SQLArray actually represented, which was an IN predicate value list. To that end, SQLArray has been renamed to ValueList, though SQLArray still exists as an alias. Likewise Array#sql_value_list has been added, though Array#sql_array still exists as an alias to it. However, in most cases, the methods are no longer necessary, and Sequel internally no longer uses ValueList, as the only time it wants to deal with arrays of two element arrays as value lists, it uses them as hash values, which Sequel can now handle automatically. Another main difference is that ValueList now inherits from ::Array, not from Expression. This simplified some code, and since you can't really perform any operations on an SQL value list by itself (until it is used as part of an IN expression), there's no reason it needs to be an Expression subclass. While here, fix a minor bug in Expression#inspect.
jeremyevans · Jun 24, 2010 · 65abb56 · 65abb56
1 parent b7c3059
commit 65abb56
Show file tree

Hide file tree

Showing 11 changed files with 66 additions and 51 deletions.
diff --git a/CHANGELOG b/CHANGELOG
@@ -1,5 +1,7 @@
 === HEAD
 
+* Handle arrays of two element arrays as filter hash values automatically (jeremyevans)
+
 * Allow :frame option for windows to take a string that is used literally (jeremyevans)
 
 * Support transaction isolation levels on PostgreSQL, MySQL, and MSSQL (jeremyevans)

diff --git a/lib/sequel/core.rb b/lib/sequel/core.rb
@@ -109,7 +109,7 @@ def self.condition_specifier?(obj)
     when Hash
       true
     when Array
-      !obj.empty? && obj.all?{|i| (Array === i) && (i.length == 2)}
+      !obj.empty? && !obj.is_a?(SQL::ValueList) && obj.all?{|i| (Array === i) && (i.length == 2)}
     else
       false
     end

diff --git a/lib/sequel/core_sql.rb b/lib/sequel/core_sql.rb
@@ -33,15 +33,19 @@ def case(default, expression = nil)
     ::Sequel::SQL::CaseExpression.new(self, default, expression)
   end
 
-  # Return a Sequel::SQL::Array created from this array.  Used if this array contains
-  # all two pairs and you want it treated as an SQL array instead of a ordered hash-like
-  # conditions.
-  #
-  #   [[1, 2], [3, 4]] # SQL: 1 = 2 AND 3 = 4
-  #   [[1, 2], [3, 4]].sql_array # SQL: ((1, 2), (3, 4))
-  def sql_array
-    ::Sequel::SQL::SQLArray.new(self)
-  end
+  # Return a Sequel::SQL::ValueList created from this array.  Used if this array contains
+  # all two element arrays and you want it treated as an SQL value list (IN predicate) 
+  # instead of a ordered hash-like conditions.  This is not necessary if you are using
+  # this array as a value in a filter, but may be necessary if you are using it as a
+  # value with placeholder SQL:
+  #
+  #   DB[:a].filter([:a, :b]=>[[1, 2], [3, 4]]) # SQL: (a, b) IN ((1, 2), (3, 4))
+  #   DB[:a].filter('(a, b) IN ?', [[1, 2], [3, 4]]) # SQL: (a, b) IN ((1 = 2) AND (3 = 4))
+  #   DB[:a].filter('(a, b) IN ?', [[1, 2], [3, 4]].sql_value_list) # SQL: (a, b) IN ((1, 2), (3, 4))
+  def sql_value_list
+    ::Sequel::SQL::ValueList.new(self)
+  end
+  alias sql_array sql_value_list
 
   # Return a Sequel::SQL::BooleanExpression created from this array, matching all of the
   # conditions.  Rarely do you need to call this explicitly, as Sequel generally

diff --git a/lib/sequel/dataset/sql.rb b/lib/sequel/dataset/sql.rb
@@ -261,10 +261,10 @@ def complex_expression_sql(op, args)
       when :IN, :"NOT IN"
         cols = args.at(0)
         vals = args.at(1)
-        col_array = true if cols.is_a?(Array) || cols.is_a?(SQL::SQLArray)
-        if vals.is_a?(Array) || vals.is_a?(SQL::SQLArray)
+        col_array = true if cols.is_a?(Array)
+        if vals.is_a?(Array)
           val_array = true
-          empty_val_array = vals.to_a == []
+          empty_val_array = vals == []
         end
         if col_array
           if empty_val_array
@@ -284,7 +284,10 @@ def complex_expression_sql(op, args)
               complex_expression_sql(op, [cols, vals.map!{|x| x.values_at(*val_cols)}])
             end
           else
-            "(#{literal(cols)} #{op} #{literal(vals)})"
+            # If the columns and values are both arrays, use array_sql instead of
+            # literal so that if values is an array of two element arrays, it
+            # will be treated as a value list instead of a condition specifier.
+            "(#{literal(cols)} #{op} #{val_array ? array_sql(vals) : literal(vals)})"
           end
         else
           if empty_val_array
@@ -817,8 +820,6 @@ def qualified_expression(e, table)
         SQL::Function.new(e.f, *qualified_expression(e.args, table))
       when SQL::ComplexExpression 
         SQL::ComplexExpression.new(e.op, *qualified_expression(e.args, table))
-      when SQL::SQLArray 
-        SQL::SQLArray.new(qualified_expression(e.array, table))
       when SQL::Subscript 
         SQL::Subscript.new(qualified_expression(e.f, table), qualified_expression(e.sub, table))
       when SQL::WindowFunction

diff --git a/lib/sequel/model/associations.rb b/lib/sequel/model/associations.rb
@@ -778,7 +778,7 @@ def def_many_to_many(opts)
             h = eo[:key_hash][left_pk]
             eo[:rows].each{|object| object.associations[name] = []}
             r = uses_rcks ? rcks.zip(opts.right_primary_keys) : [[right, opts.right_primary_key]]
-            l = uses_lcks ? [[lcks.map{|k| SQL::QualifiedIdentifier.new(join_table, k)}, SQL::SQLArray.new(h.keys)]] : [[left, h.keys]]
+            l = uses_lcks ? [[lcks.map{|k| SQL::QualifiedIdentifier.new(join_table, k)}, h.keys]] : [[left, h.keys]]
             model.eager_loading_dataset(opts, opts.associated_class.inner_join(join_table, r + l), Array(opts.select), eo[:associations], eo).all do |assoc_record|
               hash_key = if uses_lcks
                 left_key_alias.map{|k| assoc_record.values.delete(k)}
@@ -849,7 +849,7 @@ def def_many_to_one(opts)
             # Skip eager loading if no objects have a foreign key for this association
             unless keys.empty?
               klass = opts.associated_class
-              model.eager_loading_dataset(opts, klass.filter(uses_cks ? {opts.primary_keys.map{|k| SQL::QualifiedIdentifier.new(klass.table_name, k)}=>SQL::SQLArray.new(keys)} : {SQL::QualifiedIdentifier.new(klass.table_name, opts.primary_key)=>keys}), opts.select, eo[:associations], eo).all do |assoc_record|
+              model.eager_loading_dataset(opts, klass.filter(uses_cks ? {opts.primary_keys.map{|k| SQL::QualifiedIdentifier.new(klass.table_name, k)}=>keys} : {SQL::QualifiedIdentifier.new(klass.table_name, opts.primary_key)=>keys}), opts.select, eo[:associations], eo).all do |assoc_record|
                 hash_key = uses_cks ? opts.primary_keys.map{|k| assoc_record.send(k)} : assoc_record.send(opts.primary_key)
                 next unless objects = h[hash_key]
                 objects.each{|object| object.associations[name] = assoc_record}
@@ -899,7 +899,7 @@ def def_one_to_many(opts)
             end
             reciprocal = opts.reciprocal
             klass = opts.associated_class
-            model.eager_loading_dataset(opts, klass.filter(uses_cks ? {cks.map{|k| SQL::QualifiedIdentifier.new(klass.table_name, k)}=>SQL::SQLArray.new(h.keys)} : {SQL::QualifiedIdentifier.new(klass.table_name, key)=>h.keys}), opts.select, eo[:associations], eo).all do |assoc_record|
+            model.eager_loading_dataset(opts, klass.filter(uses_cks ? {cks.map{|k| SQL::QualifiedIdentifier.new(klass.table_name, k)}=>h.keys} : {SQL::QualifiedIdentifier.new(klass.table_name, key)=>h.keys}), opts.select, eo[:associations], eo).all do |assoc_record|
               hash_key = uses_cks ? cks.map{|k| assoc_record.send(k)} : assoc_record.send(key)
               next unless objects = h[hash_key]
               if one_to_one

diff --git a/lib/sequel/plugins/lazy_attributes.rb b/lib/sequel/plugins/lazy_attributes.rb
@@ -70,7 +70,7 @@ module InstanceMethods
         # the attribute for the current object.
         def lazy_attribute_lookup(a)
           primary_key = model.primary_key
-          model.select(*(Array(primary_key) + [a])).filter(primary_key=>::Sequel::SQL::SQLArray.new(retrieved_with.map{|o| o.pk})).all if model.identity_map && retrieved_with
+          model.select(*(Array(primary_key) + [a])).filter(primary_key=>retrieved_with.map{|o| o.pk}).all if model.identity_map && retrieved_with
           values[a] = this.select(a).first[a] unless values.include?(a)
           values[a]
         end

diff --git a/lib/sequel/plugins/many_through_many.rb b/lib/sequel/plugins/many_through_many.rb
@@ -185,7 +185,7 @@ def def_many_through_many(opts)
             ds = opts.associated_class 
             opts.reverse_edges.each{|t| ds = ds.join(t[:table], Array(t[:left]).zip(Array(t[:right])), :table_alias=>t[:alias])}
             ft = opts[:final_reverse_edge]
-            conds = uses_lcks ? [[left_keys.map{|k| SQL::QualifiedIdentifier.new(ft[:table], k)}, SQL::SQLArray.new(h.keys)]] : [[left_key, h.keys]]
+            conds = uses_lcks ? [[left_keys.map{|k| SQL::QualifiedIdentifier.new(ft[:table], k)}, h.keys]] : [[left_key, h.keys]]
             ds = ds.join(ft[:table], Array(ft[:left]).zip(Array(ft[:right])) + conds, :table_alias=>ft[:alias])
             model.eager_loading_dataset(opts, ds, Array(opts.select), eo[:associations], eo).all do |assoc_record|
               hash_key = if uses_lcks

diff --git a/lib/sequel/sql.rb b/lib/sequel/sql.rb
@@ -87,7 +87,7 @@ def hash
       # Show the class name and instance variables for the object, necessary
       # for correct operation on ruby 1.9.2.
       def inspect
-        "#<#{self.class} #{instance_variables.map{|iv| "#{iv}=>#{instance_variable_get(iv)}"}.join(', ')}>"
+        "#<#{self.class} #{instance_variables.map{|iv| "#{iv}=>#{instance_variable_get(iv).inspect}"}.join(', ')}>"
       end
 
       # Returns self, because SQL::Expression already acts like
@@ -132,13 +132,16 @@ class ComplexExpression < Expression
       # Hash of ruby operator symbols to SQL operators, used in BooleanMethods
       BOOLEAN_OPERATOR_METHODS = {:& => :AND, :| =>:OR}
 
+      # Operators that use IN/NOT IN for inclusion/exclusion
+      IN_OPERATORS = [:IN, :'NOT IN']
+
       # Operators that use IS, used for special casing to override literal true/false values
       IS_OPERATORS = [:IS, :'IS NOT']
 
       # Operator symbols that take exactly two arguments
       TWO_ARITY_OPERATORS = [:'=', :'!=', :LIKE, :'NOT LIKE', \
-        :~, :'!~', :'~*', :'!~*', :IN, :'NOT IN', :ILIKE, :'NOT ILIKE'] + \
-        INEQUALITY_OPERATORS + BITWISE_OPERATORS + IS_OPERATORS 
+        :~, :'!~', :'~*', :'!~*', :ILIKE, :'NOT ILIKE'] + \
+        INEQUALITY_OPERATORS + BITWISE_OPERATORS + IS_OPERATORS + IN_OPERATORS
 
       # Operator symbols that take one or more arguments
       N_ARITY_OPERATORS = [:AND, :OR, :'||'] + MATHEMATICAL_OPERATORS
@@ -157,7 +160,8 @@ class ComplexExpression < Expression
       # Raise an error if the operator doesn't allow boolean input and a boolean argument is given.
       # Raise an error if the wrong number of arguments for a given operator is used.
       def initialize(op, *args)
-        args.map!{|a| Sequel.condition_specifier?(a) ? SQL::BooleanExpression.from_value_pairs(a) : a}
+        orig_args = args
+        args = args.map{|a| Sequel.condition_specifier?(a) ? SQL::BooleanExpression.from_value_pairs(a) : a}
         case op
         when *N_ARITY_OPERATORS
           raise(Error, "The #{op} operator requires at least 1 argument") unless args.length >= 1
@@ -166,6 +170,10 @@ def initialize(op, *args)
           old_args.each{|a| a.is_a?(self.class) && a.op == op ? args.concat(a.args) : args.push(a)}
         when *TWO_ARITY_OPERATORS
           raise(Error, "The #{op} operator requires precisely 2 arguments") unless args.length == 2
+          # With IN/NOT IN, even if the second argument is an array of two element arrays,
+          # don't convert it into a boolean expression, since it's definitely being used
+          # as a value list.
+          args[1] = orig_args[1] if IN_OPERATORS.include?(op)
         when *ONE_ARITY_OPERATORS
           raise(Error, "The #{op} operator requires a single argument") unless args.length == 1
         else
@@ -319,7 +327,7 @@ module InequalityMethods
       ComplexExpression::INEQUALITY_OPERATORS.each do |o|
         define_method(o) do |ce|
           case ce
-          when BooleanExpression, TrueClass, FalseClass, NilClass, Hash, Array
+          when BooleanExpression, TrueClass, FalseClass, NilClass, Hash, ::Array
             raise(Error, "cannot apply #{o} to a boolean expression")
           else  
             BooleanExpression.new(o, self, ce)
@@ -338,7 +346,7 @@ module NoBooleanInputMethods
       def initialize(op, *args)
         args.each do |a|
           case a
-          when BooleanExpression, TrueClass, FalseClass, NilClass, Hash, Array
+          when BooleanExpression, TrueClass, FalseClass, NilClass, Hash, ::Array
             raise(Error, "cannot apply #{op} to a boolean expression")
           end
         end
@@ -486,7 +494,7 @@ def self.from_value_pairs(pairs, op=:AND, negate=false)
           ce = case r
           when Range
             new(:AND, new(:>=, l, r.begin), new(r.exclude_end? ? :< : :<=, l, r.end))
-          when Array, ::Sequel::Dataset, SQLArray
+          when ::Array, ::Sequel::Dataset
             new(:IN, l, r)
           when NegativeBooleanConstant
             new(:"IS NOT", l, r.constant)
@@ -864,22 +872,6 @@ def self.like_element(re) # :nodoc:
       private_class_method :like_element
     end
 
-    # Represents an SQL array.  Added so it is possible to deal with a
-    # ruby array of all two pairs as an SQL array instead of an ordered
-    # hash-like conditions specifier.
-    class SQLArray < Expression
-      # The array of objects this SQLArray wraps
-      attr_reader :array
-      alias to_a array
-
-      # Create an object with the given array.
-      def initialize(array)
-        @array = array
-      end
-
-      to_s_method :array_sql, '@array'
-    end
-
     # Represents an SQL array access, with multiple possible arguments.
     class Subscript < GenericExpression
       # The SQL array column
@@ -902,6 +894,15 @@ def |(sub)
       to_s_method :subscript_sql
     end
 
+    # Represents an SQL value list (IN/NOT IN predicate value).  Added so it is possible to deal with a
+    # ruby array of all two pairs as an SQL value list instead of an ordered
+    # hash-like conditions specifier.
+    class ValueList < ::Array
+    end
+
+    # Old name for +ValueList+, used for backwards compatibility
+    SQLArray = ValueList
+
     # The purpose of this class is to allow the easy creation of SQL identifiers and functions
     # without relying on methods defined on Symbol.  This is useful if another library defines
     # the methods defined by Sequel, or if you are running on ruby 1.9.

diff --git a/spec/core/core_sql_spec.rb b/spec/core/core_sql_spec.rb
@@ -51,14 +51,20 @@
   end
 end
 
-context "Array#sql_array" do
+context "Array#sql_value_list and #sql_array" do
   before do
     @d = Sequel::Dataset.new(nil)
   end
 
-  specify "should treat the array as an SQL array instead of conditions" do
-    @d.literal([[:x, 1], [:y, 2]]).should == '((x = 1) AND (y = 2))'
-    @d.literal([[:x, 1], [:y, 2]].sql_array).should == '((x, 1), (y, 2))'
+  specify "should treat the array as an SQL value list instead of conditions when used as a placeholder value" do
+    @d.filter("(a, b) IN ?", [[:x, 1], [:y, 2]]).sql.should == 'SELECT * WHERE ((a, b) IN ((x = 1) AND (y = 2)))'
+    @d.filter("(a, b) IN ?", [[:x, 1], [:y, 2]].sql_value_list).sql.should == 'SELECT * WHERE ((a, b) IN ((x, 1), (y, 2)))'
+    @d.filter("(a, b) IN ?", [[:x, 1], [:y, 2]].sql_array).sql.should == 'SELECT * WHERE ((a, b) IN ((x, 1), (y, 2)))'
+  end
+
+  specify "should be no difference when used as a hash value" do
+    @d.filter([:a, :b]=>[[:x, 1], [:y, 2]]).sql.should == 'SELECT * WHERE ((a, b) IN ((x, 1), (y, 2)))'
+    @d.filter([:a, :b]=>[[:x, 1], [:y, 2]].sql_value_list).sql.should == 'SELECT * WHERE ((a, b) IN ((x, 1), (y, 2)))'
     @d.filter([:a, :b]=>[[:x, 1], [:y, 2]].sql_array).sql.should == 'SELECT * WHERE ((a, b) IN ((x, 1), (y, 2)))'
   end
 end

diff --git a/spec/core/dataset_spec.rb b/spec/core/dataset_spec.rb
@@ -465,13 +465,15 @@ def v.values; {}; end
     @dataset.filter([:id1, :id2] => @d1.select(:id1, :id2)).sql.should == "SELECT * FROM test WHERE ((id1, id2) IN (SELECT id1, id2 FROM test WHERE (region = 'Asia')))"
     @dataset.filter([:id1, :id2] => []).sql.should == "SELECT * FROM test WHERE ((id1 != id1) AND (id2 != id2))"
     @dataset.filter([:id1, :id2] => [[1, 2], [3,4]].sql_array).sql.should == "SELECT * FROM test WHERE ((id1, id2) IN ((1, 2), (3, 4)))"
+    @dataset.filter([:id1, :id2] => [[1, 2], [3,4]]).sql.should == "SELECT * FROM test WHERE ((id1, id2) IN ((1, 2), (3, 4)))"
 
     @dataset.exclude(:id => @d1.select(:id)).sql.should == "SELECT * FROM test WHERE (id NOT IN (SELECT id FROM test WHERE (region = 'Asia')))"
     @dataset.exclude(:id => []).sql.should == "SELECT * FROM test WHERE (1 = 1)"
     @dataset.exclude(:id => [1, 2]).sql.should == "SELECT * FROM test WHERE (id NOT IN (1, 2))"
     @dataset.exclude([:id1, :id2] => @d1.select(:id1, :id2)).sql.should == "SELECT * FROM test WHERE ((id1, id2) NOT IN (SELECT id1, id2 FROM test WHERE (region = 'Asia')))"
     @dataset.exclude([:id1, :id2] => []).sql.should == "SELECT * FROM test WHERE (1 = 1)"
     @dataset.exclude([:id1, :id2] => [[1, 2], [3,4]].sql_array).sql.should == "SELECT * FROM test WHERE ((id1, id2) NOT IN ((1, 2), (3, 4)))"
+    @dataset.exclude([:id1, :id2] => [[1, 2], [3,4]]).sql.should == "SELECT * FROM test WHERE ((id1, id2) NOT IN ((1, 2), (3, 4)))"
   end
 
   specify "should handle IN/NOT IN queries with multiple columns and an array where the database doesn't support it" do
@@ -3453,8 +3455,8 @@ def @ds.fetch_rows(sql, &block)
     @ds.filter{(a+b)<(c-3)}.qualify_to_first_source.sql.should == 'SELECT t.* FROM t WHERE ((t.a + t.b) < (t.c - 3))'
   end
 
-  specify "should handle SQL::SQLArrays" do
-    @ds.filter(:a=>[:b, :c].sql_array).qualify_to_first_source.sql.should == 'SELECT t.* FROM t WHERE (t.a IN (t.b, t.c))'
+  specify "should handle SQL::ValueLists" do
+    @ds.filter(:a=>[:b, :c].sql_value_list).qualify_to_first_source.sql.should == 'SELECT t.* FROM t WHERE (t.a IN (t.b, t.c))'
   end
 
   specify "should handle SQL::Subscripts" do

diff --git a/spec/extensions/lazy_attributes_spec.rb b/spec/extensions/lazy_attributes_spec.rb
@@ -24,7 +24,6 @@ def ds.fetch_rows(sql)
           end
         else
           i = where.args.last
-          i = i.instance_variable_get(:@array) if i.is_a?(Sequel::SQL::SQLArray)
           Array(i).each do |x|
             if sql =~ /SELECT name FROM/
               yield(block[:name=>x.to_s])