Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ActiveRecord::FinderMethods.find when passing multiple ids and primary key is not selected #45695

Merged
merged 1 commit into from
Sep 2, 2022

Conversation

fatkodima
Copy link
Member

Fixes #45694.

I think it is easier and better to just raise than to introduce an implicit behavior by adding a primary key to the selected values.

@xjunior
Copy link
Contributor

xjunior commented Jul 29, 2022

The behavior in the bug report is what I would expect, not an exception. Selecting attributes without the primary key is a valid scenario.

Also, checking if id is "truthy" doesn't work when the primary key is something else.

@fatkodima
Copy link
Member Author

Selecting attributes without the primary key is a valid scenario.

Sure, but you should not call find on it. You can get the relation from somewhere else and calling find on it will get empty results and you will be very surprised (not to mention it should take some time when this surprise will show itself), like the OP.

Also, checking if id is "truthy" doesn't work when the primary key is something else.

No, because the primary can not be nil or false.

@xjunior
Copy link
Contributor

xjunior commented Jul 29, 2022

No, because the primary can not be nil or false.

I mean if the model in question doesn't use id as the primary key.

@fatkodima
Copy link
Member Author

If you mean id as a column name, then calling #id method actually returns the record's primary key - https://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/PrimaryKey.html#method-i-id

@xjunior
Copy link
Contributor

xjunior commented Aug 1, 2022

If you mean id as a column name, then calling #id method actually returns the record's primary key - https://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/PrimaryKey.html#method-i-id

that I didn't know, good info!

@simi
Copy link
Contributor

simi commented Aug 2, 2022

Why is ordering done in Ruby world? Can't we just rewrite the logic to not use in_order_of on Array, but on Relation (https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-in_order_of) instead? That way there is no need for primary_key selection and we can guarantee the order.

@fatkodima
Copy link
Member Author

We can't use it, at least currently, because, probably, not every dbms supports such ordering (but I was unable to find any by checking right now all the most popular). But at least not all activerecord adapters support it -

.order!(connection.field_ordered_value(arel_column, values))

We can move field_ordered_value to the AbstractAdapter, because all the SQL dbms have basically the same syntax. We need to merge #45670 first, I think.

@simi
Copy link
Contributor

simi commented Aug 2, 2022

We can move field_ordered_value to the AbstractAdapter, because all the SQL dbms have basically the same syntax. We need to merge #45670 first, I think.

Ok, let's focus on that one for now. 👀

@fatkodima
Copy link
Member Author

Another reason to order in memory: find queries are very popular and the generated SQL queries are nice, clean and simple, and should perform faster too. So if to order in database, this will change. If we find 10 records, that query will became monstrous already.

@simi
Copy link
Contributor

simi commented Aug 2, 2022

Another reason to order in memory: find queries are very popular and the generated SQL queries are nice, clean and simple, and should perform faster too. So if to order in database, this will change. If we find 10 records, that query will became monstrous already.

I'm not sure that's the main purpose of AR to create nice clean and simple queries. Check the common query generated by eager_load. AFAIK AR is in here so you don't need to care about SQL queries at all if possible.

@fschwahn
Copy link
Contributor

I did a simple benchmark, and sorting in DB (using in_order_of) is noticeably slower:

require "benchmark/ips"
id_list = 100.times.map { User.create!.id }.shuffle

Benchmark.ips do |x|
  x.report("sort in ruby") { User.find(id_list) }
  x.report("sort in DB") { User.in_order_of(:id, id_list).to_a }

  x.compare!
end
Warming up --------------------------------------
        sort in ruby    80.000  i/100ms
          sort in DB    50.000  i/100ms
Calculating -------------------------------------
        sort in ruby    795.172  (± 5.2%) i/s -      4.000k in   5.044808s
          sort in DB    512.828  (± 1.8%) i/s -      2.600k in   5.071509s

Comparison:
        sort in ruby:      795.2 i/s
          sort in DB:      512.8 i/s - 1.55x  (± 0.00) slower

The effect increases when id_list becomes larger.

@casperisfine
Copy link
Contributor

Ok, I looked at the issue a bit, and I don't think either of the proposed resolutions so far are really satisfactory.

Raising an explicit error while better than returning an empty array seems like giving up to me.

Ordering in DB is mentioned before may have very detrimental performance effects.

I'm wondering wether implicitly selecting the primary key when passing a list to find() wouldn't be better.

@@ -507,6 +508,7 @@ def find_some_ordered(ids)
result = except(:limit, :offset).where(primary_key => ids).records

if result.size == ids.size
raise ArgumentError, "Primary key is not included in the custom select clause" unless result.first.id
result.in_order_of(:id, ids.map { |id| @klass.type_for_attribute(primary_key).cast(id) })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, it's an orthogonal concern, but I think we should probably raise an error if in_order_of returns less elements than result.

@fatkodima
Copy link
Member Author

@byroot Thank you for the review!

I'm wondering wether implicitly selecting the primary key when passing a list to find() wouldn't be better.

Updated with this approach.

@byroot byroot merged commit 4d4f0b2 into rails:main Sep 2, 2022
@byroot
Copy link
Member

byroot commented Sep 2, 2022

thanks!

@@ -448,6 +448,7 @@ def find_with_ids(*ids)
ids = ids.flatten.compact.uniq

model_name = @klass.name
_select!(table[primary_key]) unless select_values.empty?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, actually, this mutates the current instance, I don't think we should do that. Can you do a quick followup that doesn't do this?

I think you can just do it in find_some_ordered, at the same point where where(primary_key => ids).

Sorry, I should have been more careful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Find with array returns nothing if primary key is not selected
6 participants