Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eager loading/preloading should be worked regardless of large number of records #33844

Merged
merged 1 commit into from Sep 13, 2018

Conversation

kamipo
Copy link
Member

@kamipo kamipo commented Sep 11, 2018

Since 213796f, bind params are used for IN clause if enabled prepared
statements.

Unfortunately, most adapter modules have a limitation for # of bind
params (mysql2 65535, pg 65535, sqlite3 250000). So if eager loading
large number of records at once, that query couldn't be sent to the
database.

Since eager loading/preloading queries are auto-generated by Active
Record itself, so it should be worked regardless of large number of
records like as before.

Fixes #33702.

…of records

Since 213796f, bind params are used for IN clause if enabled prepared
statements.

Unfortunately, most adapter modules have a limitation for # of bind
params (mysql2 65535, pg 65535, sqlite3 250000). So if eager loading
large number of records at once, that query couldn't be sent to the
database.

Since eager loading/preloading queries are auto-generated by Active
Record itself, so it should be worked regardless of large number of
records like as before.

Fixes rails#33702.
@kamipo kamipo merged commit 823f9e0 into rails:master Sep 13, 2018
@kamipo kamipo deleted the too_many_eager_load_ids branch September 13, 2018 12:30
kamipo added a commit that referenced this pull request Sep 13, 2018
Eager loading/preloading should be worked regardless of large number of records
@matthewd
Copy link
Member

While I would prefer that we fix/avoid the too-many-parameters problem, but I don't like the idea of globally ditching bind params for this edge case... we're getting to the point where I'd almost consider anything that doesn't use a bind to be a bug.

Could we teach the connection adapters about the limits, and have them re-substitute the binds back to plain values?

@kamipo
Copy link
Member Author

kamipo commented Sep 13, 2018

Do you mean like here?

diff --git a/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb b/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb
index ad148efcfe..0a6651a4af 100644
--- a/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb
+++ b/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb
@@ -60,6 +60,10 @@ def in_clause_length
         nil
       end
 
+      def bind_params_length
+        65535
+      end
+
       # Returns the maximum length of an SQL query.
       def sql_query_length
         1048575
diff --git a/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb b/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb
index fdc9ffa688..4ee9a4a3c6 100644
--- a/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb
+++ b/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb
@@ -46,7 +46,10 @@ def cacheable_query(klass, arel) # :nodoc:
       def select_all(arel, name = nil, binds = [], preparable: nil)
         arel = arel_from_relation(arel)
         sql, binds = to_sql_and_binds(arel, binds)
-        if !prepared_statements || (arel.is_a?(String) && preparable.nil?)
+        if binds.length > bind_params_length
+          sql, binds = unprepared_statement { to_sql_and_binds(arel) }
+          preparable = false
+        elsif !prepared_statements || (arel.is_a?(String) && preparable.nil?)
           preparable = false
         else
           preparable = visitor.preparable

@matthewd
Copy link
Member

Yeah I think so.

It's probably less efficient, but does mean we're protected against any other pathological attempt to construct an over-sized bind list -- and only affects queries that are already going to be Not Very Efficient just due to their length.

@kamipo
Copy link
Member Author

kamipo commented Sep 13, 2018

I see, agreed with your point. I'll make that fix after dinner.

kamipo added a commit to kamipo/rails that referenced this pull request Sep 13, 2018
This is a follow up and/or an alternative of rails#33844.

Unlike rails#33844, this would attempt to construct unprepared statement only
when bind params limit (mysql2 65535, pg 65535, sqlite3 249999) is
exceeded.

I only defined 65535 as the limit, not defined 249999 for sqlite3, since
it is an edge case, I'm not excited to add less worth extra code.
@dzohrob
Copy link

dzohrob commented Sep 17, 2018

@kamipo Many thanks for tackling this! 💯

kamipo added a commit to kamipo/rails that referenced this pull request Oct 24, 2018
Since rails#33844, eager loading/preloading with too many and/or too large
ids won't be broken by pre-checking whether the value is constructable
or not.

But the pre-checking caused the type to be evaluated at relation build
time instead of at the query execution time, that is breaking an
expectation for some apps.

I've made the pre-cheking lazy as much as possible, that is no longer
happend at relation build time.
rafaelfranca pushed a commit to rafaelfranca/omg-rails that referenced this pull request Oct 24, 2018
Since rails#33844, eager loading/preloading with too many and/or too large
ids won't be broken by pre-checking whether the value is constructable
or not.

But the pre-checking caused the type to be evaluated at relation build
time instead of at the query execution time, that is breaking an
expectation for some apps.

I've made the pre-cheking lazy as much as possible, that is no longer
happend at relation build time.
@s-lee-kwong
Copy link

I have updated to 5.2.2 where this fix was implemented but anytime I do a where on more than the limit I still get the error that it is beyond the bounds.

@thaiden
Copy link

thaiden commented Jan 22, 2019

on Rails 5.2.2 and also seeing this issue
Failed to submit event: ActiveRecord::StatementInvalid: PG::UnableToSend: number of parameters must be between 0 and 65535

Some info on this would be appreciated, thx!

kamipo added a commit to kamipo/rails that referenced this pull request May 10, 2020
Before IN clause optimization 70ddb8a, Active Record had generated an
SQL with binds when `prepared_statements: true`:

```ruby
# prepared_statements: true
#
#   SELECT `authors`.* FROM `authors` WHERE `authors`.`id` IN (?, ?, ?)
#
# prepared_statements: false
#
#   SELECT `authors`.* FROM `authors` WHERE `authors`.`id` IN (1, 2, 3)
#
Author.where(id: [1, 2, 3]).to_a
```

But now, binds in IN clause is substituted regardless of whether
`prepared_statements: true` or not:

```ruby
# prepared_statements: true
#
#   SELECT `authors`.* FROM `authors` WHERE `authors`.`id`IN (1,2,3)
#
# prepared_statements: false
#
#   SELECT `authors`.* FROM `authors` WHERE `authors`.`id`IN (1,2,3)
#
Author.where(id: [1, 2, 3]).to_a
```

I suppose that is considered as a regression for the context:

> While I would prefer that we fix/avoid the too-many-parameters
problem, but I don't like the idea of globally ditching bind params for
this edge case... we're getting to the point where I'd almost consider
anything that doesn't use a bind to be a bug.

rails#33844 (comment)

This makes binds consider whether `prepared_statements: true` or not
(i.e. restore the original behavior as before), but still gain that
optimization when need the substitute binds (`prepared_statements: false`,
`relation.to_sql`). Even when `prepared_statements: true`, it still
much faster than before by optimized (bind node less) binds generation.

```ruby
class Post < ActiveRecord::Base
end

ids = (1..1000).each.map do |n|
  Post.create!.id
end

puts "prepared_statements: #{Post.connection.prepared_statements.inspect}"

Benchmark.ips do |x|
  x.report("where with ids") do
    Post.where(id: ids).to_a
  end
end
```

* Before (200058b)

`prepared_statements: true`:

```
Warming up --------------------------------------
      where with ids     6.000  i/100ms
Calculating -------------------------------------
      where with ids     63.806  (± 7.8%) i/s -    318.000  in   5.015903s
```

`prepared_statements: false`:

```
Warming up --------------------------------------
      where with ids     7.000  i/100ms
Calculating -------------------------------------
      where with ids     73.550  (± 8.2%) i/s -    371.000  in   5.085672s
```

* Now with this change

`prepared_statements: true`:

```
Warming up --------------------------------------
      where with ids     9.000  i/100ms
Calculating -------------------------------------
      where with ids     91.992  (± 7.6%) i/s -    459.000  in   5.020817s
```

`prepared_statements: false`:

```
Warming up --------------------------------------
      where with ids    10.000  i/100ms
Calculating -------------------------------------
      where with ids    104.335  (± 8.6%) i/s -    520.000  in   5.026425s
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants