New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eager loading/preloading should be worked regardless of large number of records #33844
Conversation
…of records Since 213796f, bind params are used for IN clause if enabled prepared statements. Unfortunately, most adapter modules have a limitation for # of bind params (mysql2 65535, pg 65535, sqlite3 250000). So if eager loading large number of records at once, that query couldn't be sent to the database. Since eager loading/preloading queries are auto-generated by Active Record itself, so it should be worked regardless of large number of records like as before. Fixes rails#33702.
Eager loading/preloading should be worked regardless of large number of records
While I would prefer that we fix/avoid the too-many-parameters problem, but I don't like the idea of globally ditching bind params for this edge case... we're getting to the point where I'd almost consider anything that doesn't use a bind to be a bug. Could we teach the connection adapters about the limits, and have them re-substitute the binds back to plain values? |
Do you mean like here? diff --git a/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb b/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb
index ad148efcfe..0a6651a4af 100644
--- a/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb
+++ b/activerecord/lib/active_record/connection_adapters/abstract/database_limits.rb
@@ -60,6 +60,10 @@ def in_clause_length
nil
end
+ def bind_params_length
+ 65535
+ end
+
# Returns the maximum length of an SQL query.
def sql_query_length
1048575
diff --git a/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb b/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb
index fdc9ffa688..4ee9a4a3c6 100644
--- a/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb
+++ b/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb
@@ -46,7 +46,10 @@ def cacheable_query(klass, arel) # :nodoc:
def select_all(arel, name = nil, binds = [], preparable: nil)
arel = arel_from_relation(arel)
sql, binds = to_sql_and_binds(arel, binds)
- if !prepared_statements || (arel.is_a?(String) && preparable.nil?)
+ if binds.length > bind_params_length
+ sql, binds = unprepared_statement { to_sql_and_binds(arel) }
+ preparable = false
+ elsif !prepared_statements || (arel.is_a?(String) && preparable.nil?)
preparable = false
else
preparable = visitor.preparable |
Yeah I think so. It's probably less efficient, but does mean we're protected against any other pathological attempt to construct an over-sized bind list -- and only affects queries that are already going to be Not Very Efficient just due to their length. |
I see, agreed with your point. I'll make that fix after dinner. |
This is a follow up and/or an alternative of rails#33844. Unlike rails#33844, this would attempt to construct unprepared statement only when bind params limit (mysql2 65535, pg 65535, sqlite3 249999) is exceeded. I only defined 65535 as the limit, not defined 249999 for sqlite3, since it is an edge case, I'm not excited to add less worth extra code.
@kamipo Many thanks for tackling this! 💯 |
Since rails#33844, eager loading/preloading with too many and/or too large ids won't be broken by pre-checking whether the value is constructable or not. But the pre-checking caused the type to be evaluated at relation build time instead of at the query execution time, that is breaking an expectation for some apps. I've made the pre-cheking lazy as much as possible, that is no longer happend at relation build time.
Since rails#33844, eager loading/preloading with too many and/or too large ids won't be broken by pre-checking whether the value is constructable or not. But the pre-checking caused the type to be evaluated at relation build time instead of at the query execution time, that is breaking an expectation for some apps. I've made the pre-cheking lazy as much as possible, that is no longer happend at relation build time.
I have updated to 5.2.2 where this fix was implemented but anytime I do a where on more than the limit I still get the error that it is beyond the bounds. |
on Rails 5.2.2 and also seeing this issue Some info on this would be appreciated, thx! |
Before IN clause optimization 70ddb8a, Active Record had generated an SQL with binds when `prepared_statements: true`: ```ruby # prepared_statements: true # # SELECT `authors`.* FROM `authors` WHERE `authors`.`id` IN (?, ?, ?) # # prepared_statements: false # # SELECT `authors`.* FROM `authors` WHERE `authors`.`id` IN (1, 2, 3) # Author.where(id: [1, 2, 3]).to_a ``` But now, binds in IN clause is substituted regardless of whether `prepared_statements: true` or not: ```ruby # prepared_statements: true # # SELECT `authors`.* FROM `authors` WHERE `authors`.`id`IN (1,2,3) # # prepared_statements: false # # SELECT `authors`.* FROM `authors` WHERE `authors`.`id`IN (1,2,3) # Author.where(id: [1, 2, 3]).to_a ``` I suppose that is considered as a regression for the context: > While I would prefer that we fix/avoid the too-many-parameters problem, but I don't like the idea of globally ditching bind params for this edge case... we're getting to the point where I'd almost consider anything that doesn't use a bind to be a bug. rails#33844 (comment) This makes binds consider whether `prepared_statements: true` or not (i.e. restore the original behavior as before), but still gain that optimization when need the substitute binds (`prepared_statements: false`, `relation.to_sql`). Even when `prepared_statements: true`, it still much faster than before by optimized (bind node less) binds generation. ```ruby class Post < ActiveRecord::Base end ids = (1..1000).each.map do |n| Post.create!.id end puts "prepared_statements: #{Post.connection.prepared_statements.inspect}" Benchmark.ips do |x| x.report("where with ids") do Post.where(id: ids).to_a end end ``` * Before (200058b) `prepared_statements: true`: ``` Warming up -------------------------------------- where with ids 6.000 i/100ms Calculating ------------------------------------- where with ids 63.806 (± 7.8%) i/s - 318.000 in 5.015903s ``` `prepared_statements: false`: ``` Warming up -------------------------------------- where with ids 7.000 i/100ms Calculating ------------------------------------- where with ids 73.550 (± 8.2%) i/s - 371.000 in 5.085672s ``` * Now with this change `prepared_statements: true`: ``` Warming up -------------------------------------- where with ids 9.000 i/100ms Calculating ------------------------------------- where with ids 91.992 (± 7.6%) i/s - 459.000 in 5.020817s ``` `prepared_statements: false`: ``` Warming up -------------------------------------- where with ids 10.000 i/100ms Calculating ------------------------------------- where with ids 104.335 (± 8.6%) i/s - 520.000 in 5.026425s ```
Since 213796f, bind params are used for IN clause if enabled prepared
statements.
Unfortunately, most adapter modules have a limitation for # of bind
params (mysql2 65535, pg 65535, sqlite3 250000). So if eager loading
large number of records at once, that query couldn't be sent to the
database.
Since eager loading/preloading queries are auto-generated by Active
Record itself, so it should be worked regardless of large number of
records like as before.
Fixes #33702.