Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Preloading more than 1000 associated records causes ActiveRecord::StatementInvalid when using Oracle #585

Closed
lighthouse-import opened this Issue · 44 comments

9 participants

@lighthouse-import

Imported from Lighthouse. Original ticket at: http://rails.lighthouseapp.com/projects/8994/tickets/1533
Created by Ed Lebert - 2008-12-08 19:28:34 UTC

Rails often uses "in" sql clauses when preloading associations. Oracle has a 1000 term restriction on such clauses, so that you cannot say something like "WHERE parent_id IN (1,2,3,4 ... 1001)". One possible solution is to break it up like this:

"WHERE (parent_id IN (1,2,3,4,...) OR parent_id IN (1001,1002,...)".

It has been proposed that the adapters can supply their "in clause limit", and the abstract adapter can just default to return nil if it has no limit. But the oracle adapter can return 1000.

I searched to make sure this isn't a duplicate ticket first. Sorry if it is. This is my first rails bug.

@lighthouse-import

Imported from Lighthouse.
Comment by Frederick Cheung - 2008-12-08 20:19:02 UTC

I vaguely recall someone having the same issue, although I can't remember whether it was here or a post on one of the various mailing lists. This does seem reasonable, I suppose the only question I would ask would be is it preferable to do

Select * from foos where id in ( ... ) or id in ( ... )

or to do two queries, one for each chunk. I can't imagine that either would be very hard to do

@lighthouse-import

Imported from Lighthouse.
Comment by Ed Lebert - 2008-12-08 20:28:43 UTC

I think the first option is best, but you're the expert.

I used to just monkey patch ActiveRecord::AssociationPreload::ClassMethods::Find_assicoated_records. But another place where this popped up was a habtm relationship where activerecord performed a join query. I'm sure you know where this is in the code better than I do.

However, another question is how far reaching will you make this fix? It seems like a lot of work to make this fix complete, as you'd have to at least put it anywhere in activerecord where there's an 'in' clause. Yikes. Also, what about when a user does a condition on :conditions => ['column_name IN (?)', ids]

I hate oracle. But this is for a big corporate application the only database they allow is oracle. Unfortunately that is a reality for a lot of web developers.

@lighthouse-import

Imported from Lighthouse.
Comment by Giorgio Gonnella - 2008-12-17 11:57:09 UTC

I have this problem with Oracle too.

Here I found a discussion about it:
http://www.ruby-forum.com/topic/164681

The solution proposed there seems not to work in my case, however.

@lighthouse-import

Imported from Lighthouse.
Comment by Frederick Cheung - 2008-12-17 12:53:08 UTC

Personally I wouldn't worry about the case when the user does :conditions => {:foos => list_of_ids}. Users can work their way round that, whereas with the associations stuff you'd be stuck. I suspect that in the case with a big number of records you'd be better off triggering a fallback to the older joins based include

@lighthouse-import

Imported from Lighthouse.
Comment by Giorgio Gonnella - 2008-12-17 13:11:42 UTC

OK. The solution in the ruby-forum is of course incomplete, as patches only find_associated_records and not e.g. preload_belongs_to_association. (which is why is not working for me)

@lighthouse-import

Imported from Lighthouse.
Comment by Giorgio Gonnella - 2008-12-17 14:48:04 UTC

I "quick"-fixed it as in the following diff:

@lighthouse-import

Imported from Lighthouse.
Comment by Giorgio Gonnella - 2009-03-02 17:08:33 UTC

Here is a patch (based on the rails-2.2 branch, can be easily adapted for the master branch)

@lighthouse-import

Imported from Lighthouse.
Comment by Frederick Cheung - 2009-03-02 17:50:12 UTC

2 things:

it looks to me like you are missing some parentheses in your sql - it will generate stuff that looks like

id in (...) or id in (...) AND other conditions

instead of

(id in (...) or id in (...)) AND other conditions

secondly, no tests :-)

@lighthouse-import

Imported from Lighthouse.
Comment by Giorgio Gonnella - 2009-03-02 20:56:37 UTC

Hi Frederick, thank you for reviewing it.

Reply to your objections:

  • missing parenthesis: of course a couple of parenthesis more around it can only be good, you're right, I will correct it

  • no tests: I should definitely prepare them. I probably have still too less experience in testing ActiveRecord features, but I will have a look on the tests suite you guys provide

@lighthouse-import

Imported from Lighthouse.
Comment by Elad Meidar - 2009-08-09 21:45:04 UTC

-1 Not sure it's the right way to patch it, someone considered the performance issues with such a select? it's one of those things that are better left for the DBA to solve (that limit is configurable if i recall correctly).

anyway, should not be patched in ActiveRecord, but rather on the OracleAdapter.

@lighthouse-import

Imported from Lighthouse.
Comment by Elad Meidar - 2009-08-09 21:45:48 UTC

+1 verified on a sample app and a none-configured Oracle database with limit on.

@lighthouse-import

Imported from Lighthouse.
Comment by Nick M - 2010-01-15 21:20:09 UTC

From what I can tell, it's not possible to change this limit within Oracle. I also don't think that it's possible for the oracle adapter to cleanly fix this problem since it's Rails that generates all of this raw SQL. This means that any query that happens to be including more than 1000 associated records fails with this error: "OCIError: ORA-01795: maximum number of expressions in a list is 1000".

The earlier patch fixes this problem, but I'm also unsure if there might be performance implications of constructing the query that way for all other databases. Although, personally, for queries loading 1000s of associations, I'm more concerned with getting actual results than performance.

I'm also attaching an simpler monkey patch that surely has worse performance. The only real benefit is that you can drop this into config/initializers and it only overrides one method to fix things, and it's only triggered if there are more than 1000 items. The earlier patch is a better solution, but I didn't want to maintain fully patched copies of the various methods involved.

@lighthouse-import

Imported from Lighthouse.
Comment by Nick M - 2010-01-15 21:24:19 UTC

Oops, here's the simpler monkey patch that can be dropped in config/initializers

@lighthouse-import

Imported from Lighthouse.
Comment by Steve - 2010-01-20 17:14:12 UTC

Hi Giorgio,
I have the same issue and tried your solution and this is not working for me.
Can you please upload your file /vendor/rails/activerecord/lib/active_record/association_preload.rb ?
btw, I'm using rails 2.2

Thanks

@lighthouse-import

Imported from Lighthouse.
Comment by Rizwan Reza - 2010-02-12 12:46:13 UTC

[bulk edit]

@lighthouse-import

Imported from Lighthouse.
Comment by Student - 2010-03-04 15:42:55 UTC

I've done a bit of checking. It seems that the database adapters are not shy about opening ActiveRecord::Base to make changes. (See, for instance vendor/rails/activerecord/lib/active_record/connection_adapters/mysql_adapter.rb, and gems/activerecord-oracle-adapter-1.0.0.9250/lib/active_record/connection_adapters/oracle_adapter.rb).

This indicates that the fix should probably be in the oracle adapter.

Also, while it has been commented that the user should take care of :conditions => {:foos => list_of_ids} themselves, I have problems with this. If someone is using a plugin, the plugin might well generate the list. So they have to hack someone else's code. Furthermore, it is not a fix, but fixes that we need, as every instance would need to be corrected.

Note that ActiveRecord::Base#quote_bound_value does not go through in_or_equals_for_ids, so the wrapper proposed is incomplete.

@lighthouse-import

Imported from Lighthouse.
Comment by Student - 2010-03-04 23:25:02 UTC

Well, ... I lied. I have a fix which should cover all issues. The best solution, it would seem to me, is to adjust ActiveRecord::Base just a little to support the fix in the Oracle adapter:

diff --git a/activerecord/lib/active_record/base.rb b/activerecord/lib/active_record/base.rb
index cd67490..68dec91 100755
--- a/activerecord/lib/active_record/base.rb
+++ b/activerecord/lib/active_record/base.rb
@@ -1639,13 +1639,17 @@ module ActiveRecord #:nodoc:
if value.respond_to?(:empty?) && value.empty?
connection.quote(nil)
else

  • value.map { |v| connection.quote(v) }.join(',')
  •          join_quoted_values_for_condition(value.map{|v| connection.quote(v)})
         end
       else
         connection.quote(value)
       end
     end
    
  •    def join_quoted_values_for_condition(values) #:nodoc:
    
  • values * ','
  • end + def raise_if_bind_arity_mismatch(statement, expected, provided) #:nodoc: unless expected == provided raise PreparedStatementInvalid, "wrong number of bind variables (#{provided} for #{expected}) in: #{statement}" -- 1.6.0.4

This allows a clean fix to the oracle adapter (enhanced)
diff --git a/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb b/lib/active_record/connection_adapters/oracle_enhance
index e01b07d..f92167a 100644
--- a/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb
+++ b/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb
@@ -107,6 +107,10 @@ module ActiveRecord
end
private :enhanced_write_lobs

  • def join_quoted_values_for_condition(values) #:nodoc:
  • values * ','
  • end + class << self # patch ORDER BY to work with LOBs def add_order_with_lobs!(sql, order, scope = :auto)
@lighthouse-import

Imported from Lighthouse.
Comment by Student - 2010-03-05 00:43:51 UTC

Uggh. Formatting problems AND I grabbed the wrong changes... (The patch above should be good)

---
activerecord/lib/active_record/base.rb |    6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/activerecord/lib/active_record/base.rb b/activerecord/lib/active_record/base.rb
index cd67490..68dec91 100755
--- a/activerecord/lib/active_record/base.rb
+++ b/activerecord/lib/active_record/base.rb
@@ -1639,13 +1639,17 @@ module ActiveRecord #:nodoc:
            if value.respond_to?(:empty?) && value.empty?
              connection.quote(nil)
            else
-              value.map { |v| connection.quote(v) }.join(',')
+              join_quoted_values_for_condition(value.map{|v| connection.quote(v)})
            end
          else
            connection.quote(value)
          end
        end

+        def join_quoted_values_for_condition(values) #:nodoc:
+          values * ','
+        end
+
        def raise_if_bind_arity_mismatch(statement, expected, provided) #:nodoc:
          unless expected == provided
            raise PreparedStatementInvalid, "wrong number of bind variables (#{provided} for #{expected}) in: #{statement}"
-- 
1.6.0.4

This supports the fix in the oracle adapter:

---
 .../connection_adapters/oracle_enhanced_adapter.rb |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb b/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb
index e01b07d..5f62266 100644
--- a/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb
+++ b/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb
@@ -107,6 +107,20 @@ module ActiveRecord
     end
     private :enhanced_write_lobs

+    ORACLE_IN_LIMIT = 1000
+
+    def join_quoted_values_for_condition(values)
+      return values * ',' unless values.length > ORACLE_IN_LIMIT
+
+      values.uniq!
+      return values * ',' unless values.length > ORACLE_IN_LIMIT
+
+      quoted_chunks = values.in_groups_of(ORACLE_IN_LIMIT) do |chunk|
+        "(SELECT * FROM TABLE(sys.odcinumberlist(#{chunk * ','})))"
+      end
+      quoted_chunks * " UNION "
+    end
+
     class << self
       # patch ORDER BY to work with LOBs
       def add_order_with_lobs!(sql, order, scope = :auto)
-- 
1.6.0.4
@lighthouse-import

Imported from Lighthouse.
Comment by Student - 2010-03-05 14:46:05 UTC

Well.. I found another place that needs to be touched...

---
 activerecord/lib/active_record/base.rb |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/activerecord/lib/active_record/base.rb b/activerecord/lib/active_record/base.rb
index cd67490..563a645 100755
--- a/activerecord/lib/active_record/base.rb
+++ b/activerecord/lib/active_record/base.rb
@@ -695,7 +695,7 @@ module ActiveRecord #:nodoc:
         }.join(", ")

         if id.is_a?(Array)
-          ids_list = id.map {|i| quote_value(i)}.join(', ')
+          ids_list = join_quoted_values_for_condition(id.map{|i| quote_value(i)})
           condition = "IN  (#{ids_list})"
         else
           condition = "= #{quote_value(id)}"
@@ -1639,13 +1639,17 @@ module ActiveRecord #:nodoc:
             if value.respond_to?(:empty?) && value.empty?
               connection.quote(nil)
             else
-              value.map { |v| connection.quote(v) }.join(',')
+              join_quoted_values_for_condition(value.map{|v| connection.quote(v)})
             end
           else
             connection.quote(value)
           end
         end

+        def join_quoted_values_for_condition(values) #:nodoc:
+          values * ','
+        end
+
         def raise_if_bind_arity_mismatch(statement, expected, provided) #:nodoc:
           unless expected == provided
             raise PreparedStatementInvalid, "wrong number of bind variables (#{provided} for #{expected}) in: #{statement}"
-- 
1.6.0.4
@lighthouse-import

Imported from Lighthouse.
Comment by Jeremy Kemper - 2010-03-05 16:32:58 UTC

Needs a test case as well.

@lighthouse-import

Imported from Lighthouse.
Comment by Student - 2010-03-05 17:11:14 UTC

This patch is a nop for the rails core. Its purpose is to provide a hook that the Oracle adapter can use. Or do we need a test for join_quoted_values_for_condition ?

@lighthouse-import

Imported from Lighthouse.
Comment by Jeremy Kemper - 2010-03-05 18:38:52 UTC

Ah, right. No test needed.

@lighthouse-import

Imported from Lighthouse.
Comment by Rizwan Reza - 2010-05-16 01:41:08 UTC

[bulk edit]

@lighthouse-import

Imported from Lighthouse.
Comment by Jeremy Kemper - 2010-05-23 16:54:42 UTC

[bulk edit]

@lighthouse-import

Imported from Lighthouse.
Comment by Jeremy Kemper - 2010-05-24 08:40:50 UTC

[bulk edit]

@lighthouse-import

Imported from Lighthouse.
Comment by Jeremy Kemper - 2010-05-25 22:45:37 UTC

[bulk edit]

@lighthouse-import

Imported from Lighthouse.
Comment by Jeremy Kemper - 2010-08-30 01:28:48 UTC

[bulk edit]

@lighthouse-import

Imported from Lighthouse.
Comment by Santiago Pastorino - 2011-02-02 16:33:07 UTC

This issue has been automatically marked as stale because it has not been commented on for at least three months.

The resources of the Rails core team are limited, and so we are asking for your help. If you can still reproduce this error on the 3-0-stable branch or on master, please reply with all of the information you have about it and add "[state:open]" to your comment. This will reopen the ticket for review. Likewise, if you feel that this is a very important feature for Rails to include, please reply with your explanation so we can consider it.

Thank you for all your contributions, and we hope you will understand this step to focus our efforts where they are most helpful.

@lighthouse-import

Imported from Lighthouse.
Comment by Robert Tarrall - 2011-03-01 22:58:24 UTC

[state:open]
This is still an issue in 3.0.4. For example, assuming User has_many :posts and Post has_one :content... this:

  User.where(:username=>'somebody').includes(:posts)

works fine, but this:

  User.where(:username=>'somebody').includes(:posts => :content)

will give us "ORA-01795: maximum number of expressions in a list is 1000" when the user has more than 1000 posts.

The SQL generated looks like this:

  SELECT "contents".* FROM "contents"
    WHERE ("contents"."post_id" IN (ginormous list))

which is extremely non-performant in Oracle. Ed's original suggestion of splitting the list of IDs should generate SQL which works in Oracle, but will be even worse performance-wise.

If possible, we're much better off if the Content eager-load generates SQL using JOINs rather than "where ID in (x,y,z,...)":

  SELECT contents.* FROM contents
    INNER JOIN posts ON posts.id=contents.post_id
    INNER JOIN users ON users.id=posts.user_id
    WHERE users.nid=1
@cmrichards

Why was this ticket closed if it has not been resolved?

@carlosantoniodasilva

@cmrichards please feel free to provide a new example showing the issue, and we can either reopen or start a new issue. Thanks!

@rafaelfranca

It was closed by the importer. We can reopen it

@rafaelfranca rafaelfranca reopened this
@steveklabnik
Collaborator

That said, even though it's re-opened, an example that's confirmed to demonstrate the issue would be great.

@yahonda

Just glanced this reopened issue because its subject has Oracle. but it would require some tasks to create a test case from these discussions. It would be very helpful if exact test case with exact Models, associations are provided.

@steveklabnik
Collaborator

@cmrichards can you provide @yahonda with a test case of some kind? If we don't have a way to reproduce the error, we should just close this.

@al2o3cr

Note that the request (from 2011) by Robert Tarrall is now at least partly possible - adding references to the query will force a load with joins rather than IDs.

@senny
Owner

Is this still an issue? While browsing the preloader source I noticed the following lines:

# Some databases impose a limit on the number of ids in a list (in Oracle it's 1000)
# Make several smaller queries if necessary or make one query if the adapter supports it
sliced  = owner_keys.each_slice(klass.connection.in_clause_length || owner_keys.size)
records = sliced.map { |slice| records_for(slice).to_a }.flatten
@senny
Owner

Thanks for linking the issue. I think we can close this then.

@senny senny closed this
@plentz

@senny yes, the preloading issue is solved. The problem that remains(as far as I know) is when you do something like I said here:

SomeModel.where(:id => array_with_one_thousand_elements)

Do you know if there's an issue for that?

@senny
Owner

@plentz I don't think there is an issue for that but I'm not sure this is an actual active_record problem. I see it as a limitation of the database that the user of the database needs to know.

@plentz

@senny I agree that's a database limitation, but I think that active_record could handle that case the same way we're handling this one. In fact, is the same limitation, but one case active_record handles it and in the other don't.

@senny
Owner

I think the two cases are very different. The case in this ticket is about preloading. This is an AR internal functionality that the user can't control. It must work on all different database adapters.

The case you mentioned is a normal query. There the user has full control and needs to act within the limitations of the database he is using.

@plentz

I think that even if the user "can" control the number of elements inside a IN query, it doesn't mean that he should be obligated to do it.

If we can ease users lives, so, why not? (even if his database sucks)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.