Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All of queries should return correct result even if including large number #30000

Merged

Conversation

kamipo
Copy link
Member

@kamipo kamipo commented Jul 30, 2017

Currently several queries cannot return correct result due to incorrect
RangeError handling.

First example:

assert_equal true, Topic.where(id: [1, 9223372036854775808]).exists?
assert_equal true, Topic.where.not(id: 9223372036854775808).exists?

The first example is obviously to be true, but currently it returns
false.

Second example:

assert_equal topics(:first), Topic.where(id: 1..9223372036854775808).find(1)

The second example also should return the object, but currently it
raises RecordNotFound.

It can be seen from the examples, the queries including large number
assuming empty result is not always correct.

Therefore, This change handles RangeError to generate executable SQL
instead of raising RangeError to users to always return correct
result. By this change, it is no longer raised RangeError to users.

@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch 2 times, most recently from 73bdbcb to 09bc8b5 Compare July 30, 2017 08:02
@matthewd
Copy link
Member

Ugh... I'm sure I was worried about these sorts of cases when we added that rescue 😕

I don't think replacing binds with inline values is the solution, though. It provides a trivial means of blowing out the prepared statement cache, and even in the best case gives a surprise inefficient query.. what was an index lookup is now a full table scan.

If we can't pick out just the known cases, like find(hugenum), I think we have to just let the exception raise.

@kamipo
Copy link
Member Author

kamipo commented Jul 31, 2017

I changed the implementation to build "1=0" predicate rather than replacing binds with inline values. I think it is no longer given inefficient query.

@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from bfb5d4e to 04b1c71 Compare August 1, 2017 08:15
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 04b1c71 to aa54da6 Compare August 19, 2017 08:48
@rafaelfranca
Copy link
Member

I'm thing this new implementation is good. @matthewd do you still have concerns?

Copy link
Contributor

@sgrif sgrif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build_query_attribute appears to be doing something really implicit. I'd like to see it be made much more explicit, but this seems fine to me overall. The additional test cases make sense.

I would like to see the intent of build_query_attribute made more clear before this is merged though

Copy link
Member

@matthewd matthewd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still opens the possibility of 2**N query cache entries for a single query with N integer parameters, but that's far more bounded than infinity. And it avoids wasting database effort on the known-false condition, so that's good.

I agree the current rescues are catching too much, so something must change. If this can fix it while avoiding even more exceptions, so much the better.

@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from aa54da6 to 176c6b8 Compare January 18, 2018 03:50
@rafaelfranca
Copy link
Member

What Topic.where(id: [1, 9223372036854775808]).to_a would do? Raise an exception or return an empty array? I feel if it return an empty array it will be really hard to track down that your query is not returning any object because one or more of the ids you passed are too big.

Copy link
Member

@rafaelfranca rafaelfranca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, but let's wait @matthewd's approval

@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from cedd0b2 to cc507e3 Compare January 29, 2018 02:07
@kamipo
Copy link
Member Author

kamipo commented Jan 30, 2018

In this PR, finder methods no longer raise RangeError.
So StatementCache#execute is the only place to raise the exception for finder queries.

#29988 closely related with this PR to avoid catching the exception in much places.
Thus I cherry-picked #29988 in this PR.

@@ -106,6 +106,8 @@ def execute(params, connection, &block)
sql = query_builder.sql_for bind_values, connection

klass.find_by_sql(sql, bind_values, preparable: true, &block)
rescue ::RangeError
[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rescue makes me nervous, just because it sounds far-reaching.

I agree it's fine to put it here in practice, because of the very limited set of statements that are eligible for this cache, as you noted in the commit message. I think that's worth a comment here too, though: both to hopefully draw attention if the cache ever gets used for other queries, and just to reassure future readers me next time I read through this code 😅

@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch 2 times, most recently from 59d7e9f to 611e8b7 Compare February 2, 2018 04:24
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 611e8b7 to 368d4aa Compare February 23, 2018 02:35
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 368d4aa to 0bb5bc7 Compare June 21, 2018 19:32
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 0bb5bc7 to 694909f Compare September 17, 2018 19:38
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 694909f to 317ce94 Compare October 4, 2018 23:18
@kamipo kamipo added this to the 6.0.0 milestone Oct 17, 2018
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 317ce94 to 0b8bf09 Compare November 3, 2018 06:13
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 0b8bf09 to 2d32895 Compare January 18, 2019 06:39
…umber

Currently several queries cannot return correct result due to incorrect
`RangeError` handling.

First example:

```ruby
assert_equal true, Topic.where(id: [1, 9223372036854775808]).exists?
assert_equal true, Topic.where.not(id: 9223372036854775808).exists?
```

The first example is obviously to be true, but currently it returns
false.

Second example:

```ruby
assert_equal topics(:first), Topic.where(id: 1..9223372036854775808).find(1)
```

The second example also should return the object, but currently it
raises `RecordNotFound`.

It can be seen from the examples, the queries including large number
assuming empty result is not always correct.

Therefore, This change handles `RangeError` to generate executable SQL
instead of raising `RangeError` to users to always return correct
result. By this change, it is no longer raised `RangeError` to users.
Since 31ffbf8, finder methods no longer raise `RangeError`. So
`StatementCache#execute` is the only place to raise the exception for
finder queries.

`StatementCache` is used for simple equality queries in the codebase.
This means that if `StatementCache#execute` raises `RangeError`, the
result could always be regarded as empty.
So `StatementCache#execute` just return nil in that range error case,
and treat that as empty in the caller side, then we can avoid catching
the exception in much places.
@kamipo kamipo force-pushed the all_of_queries_should_return_correct_result branch from 2d32895 to c196ca7 Compare January 18, 2019 07:02
@kamipo
Copy link
Member Author

kamipo commented Jan 18, 2019

I'd be going to merge this PR since I think I've addressed all suggestions, and this PR already got two approvals and the intent of the code is clear enough.

Let me know if anyone has any suggestions, I think I could address that before the final 6.0 release.

cc @rafaelfranca @matthewd

@rafaelfranca
Copy link
Member

:shipit:

@kamipo kamipo merged commit ff3d1a4 into rails:master Jan 18, 2019
@kamipo kamipo deleted the all_of_queries_should_return_correct_result branch January 18, 2019 19:19
suketa added a commit to suketa/rails_sandbox that referenced this pull request Jul 26, 2019
All of queries should return correct result even if including large number
rails/rails#30000
kamipo added a commit that referenced this pull request Mar 11, 2021
brendon pushed a commit to brendon/rails that referenced this pull request Jun 17, 2021
wbotelhos pushed a commit to wbotelhos/rails that referenced this pull request Jul 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants