Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn on instantiating records w/ ambiguous columns #44236

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

khasinski
Copy link
Contributor

Summary

It is possible to query for models with a result that has duplicate
columns. For example:

Post.joins(:author).select("*").first

will have at least two id columns (and possibly more).

Loading record matches the column names and will overwrite
the resulting column several times on duplicates which might
lead to unexpected query results.

Fixes #41151

Other Information

I'd love to have a better warning message and some feedback if this is the right place to issue this warning.

I've decided to check for duplicates in-line instead of adding another method as it seems ActiveRecord::Querying is intentionally slim. I also decided against any clever way of finding duplicate names faster than O(n^2) since the number of columns isn't usually large.

@rails-bot
Copy link

rails-bot bot commented Apr 23, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

@rails-bot rails-bot bot added the stale label Apr 23, 2022
@rails-bot rails-bot bot closed this Apr 30, 2022
@khasinski
Copy link
Contributor Author

Bump, since the ticket is still open

@khasinski khasinski force-pushed the warn-when-instantiating-records-with-duplicate-column-names branch 2 times, most recently from f119e66 to 99066a2 Compare November 5, 2022 13:26
@khasinski khasinski requested a review from p8 November 5, 2022 13:27
@khasinski khasinski force-pushed the warn-when-instantiating-records-with-duplicate-column-names branch from 99066a2 to 4b50a62 Compare November 5, 2022 13:40
@khasinski
Copy link
Contributor Author

I'm not sure why the build is failing, but I've rebased it on top of main since it was quite a lot behind the current version.

@@ -69,6 +69,11 @@ def _load_from_sql(result_set, &block) # :nodoc:
column_types = column_types.reject { |k, _| attribute_types.key?(k) }
end

duplicate_column_names = result_set.columns.tally.select { |_, v| v > 1 }.keys
if duplicate_column_names.any?
logger.warn "Duplicate column names: #{duplicate_column_names.join(', ')}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message isn't actionable. And the SQL can be executed very far away form the code that generated it, so we need to include which SQL has the duplicated names. In fact, I'm not sure a warning is the right thing to do here. Is there any case were this behavior is intended and not a mistake?

Copy link
Contributor Author

@khasinski khasinski Nov 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour might be intended (in terms of user who can explicitly name two columns the same), but results in a model that might have odd properties. I'd say the most common problem is when you have multiple identity columns, since you can inadvertently fetch a record, change it and save it with a wrong id, overwriting some data.

I've stumbled upon this behaviour in client's app code that's why I'm interested in at least logging this case. Perhaps we can just add a warning for multiple identity columns?

Also - any suggestions for a better error message?

@khasinski khasinski force-pushed the warn-when-instantiating-records-with-duplicate-column-names branch from 63a3b60 to 33fa86e Compare February 21, 2023 14:38
@khasinski
Copy link
Contributor Author

This got stale (and had some conflicts 🤔), so I've rebased it on top of main

@khasinski khasinski force-pushed the warn-when-instantiating-records-with-duplicate-column-names branch 3 times, most recently from e84aa54 to 6730b63 Compare February 21, 2023 16:10
@zzak
Copy link
Member

zzak commented Feb 22, 2023

There are a bunch of failures in CI, would you mind taking a look?

It is possible to query for models with a result that has duplicate
columns. For example

```ruby
Post.joins(:author).select("*").first
```

Will have at least two `id` columns (and possibly more).

Loading record matches the column names and will overwrite
the resulting column several times on duplicates which might
lead to unexpected query results.

Fixes rails#41151
@khasinski khasinski force-pushed the warn-when-instantiating-records-with-duplicate-column-names branch from 6730b63 to b152b0c Compare February 22, 2023 09:48
@khasinski
Copy link
Contributor Author

@zzak all done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Join table data assigned to Model
5 participants