-
Notifications
You must be signed in to change notification settings - Fork 21.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to skip joins for associations. #41937
Add option to skip joins for associations. #41937
Conversation
I went back and forth about the option a few times - in our gem it's |
activerecord/CHANGELOG.md
Outdated
|
||
```ruby | ||
class Dog | ||
has_many :treats, through: :humans, skip_joins: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think :disable_joins
is more descriptive on the intention of disabling the join in the association table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that, will update tomorrow.
records = @records | ||
|
||
if records.length > TOO_MANY_RECORDS | ||
warn("You've requested to order #{records.length} in memory. This may have an impact on the performance of this query. Use with caution.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt this warning will be useful. It will be buried in a lot of logs and depending in the configuration, since it is not going to the logger but to STDOUT it will just not be seeing. I think it is better to remove.
abf5635
to
69c9e80
Compare
So the postgres tests that are failing are related to that change I mistakingly made about a month ago where the The tests that are failing aren't failing on the disabled joins, they're actually failing on the joins because it's trying to compare a bigint with a string.
Is this supposed to work or is it supposed to fail? I'm not sure what the correct fix here is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some skip_joins
in the doc but other than that
|
||
```ruby | ||
class Dog < AnimalsRecord | ||
has_many :treats, through: :humans, skip_joins: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has_many :treats, through: :humans, skip_joins: true | |
has_many :treats, through: :humans, disable_joins: true |
|
||
As of Rails 7.0+, Active Record has an option for handling associations that would perform | ||
a join across multiple databases. If you have a has many through association that you want to | ||
skip joining and perform 2 or more queries, pass the `skip_joins: true` option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skip joining and perform 2 or more queries, pass the `skip_joins: true` option. | |
skip joining and perform 2 or more queries, pass the `disable_joins: true` option. |
69c9e80
to
7fd9e5a
Compare
For what I could get this test is going to fail because the join really can't work with type cast in the query. The type added in #14855 is correct thought, it will not work for joins but it should work for association accessors which is what that PR is fixing. Maybe we should change the test of that PR to use a different model/association or this PR should use a different model/association? |
Yea I think that makes sense. I'll do that tomorrow or later in the week. |
Is this generally necessary? In theory we already know the two models live in different databases, so it feels like we should be able to infer that an in-DB join is inappropriate 🤔 (Note that's not "I don't think we should have this option", but "I don't think people should normally need to specify it themselves"... am I missing something that makes it impractical for us to detect this situation automatically, or is that just a future possible improvement?) |
Aaron and I spent a lot of time trying to skip the whole needing to specify the disable joins, but it's not really possible with how Rails association loading works. The problem is that associations are lazily loaded so by the time the |
def initialize(klass, *args) | ||
key, ids = args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def initialize(klass, *args) | |
key, ids = args | |
def initialize(klass, key, ids) |
de24b93
to
1e20d0d
Compare
1e20d0d
to
e2302dd
Compare
In a multiple database application, associations can't join across databases. When set, this option tells Rails to make 2 or more queries rather than using joins for associations. Set the option on a has many through association: ```ruby class Dog has_many :treats, through: :humans, disable_joins: true has_many :humans end ``` Then instead of generating join SQL, two queries are used for `@dog.treats`: ``` SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ? [["dog_id", 1]] SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (?, ?, ?) [["human_id", 1], ["human_id", 2], ["human_id", 3]] ``` This code is extracted from a gem we use internally at GitHub which means the implementation here is used in production daily and isn't experimental. I often get the question "why can't Rails do this automatically" so I figured I'd include the answer in the commit. Rails can't do this automatically because associations are lazily loaded. `dog.treats` needs to load `Dog`, then `Human` and then `Treats`. When `dog.treats` is called Rails pre-generates the SQL that will be run and puts that information into a reflection object. Because the SQL parts are pre-generated, as soon as `dog.treats` is loaded it's too late to skip a join. The join is already available on the object and that join is what's run to load `treats` from `dog` through `humans`. I think the only way to avoid setting an option on the association is to rewrite how and when the SQL is generated for associations which is a large undertaking. Basically the way that Active Record associations are designed, it is currently impossible to have Rails figure out to not join (loading the association will cause the join to occur, and that join will raise an error if the models don't live in the same db). The original implementation was written by me and Aaron. Lee helped port over tests, and I refactored the extraction to better match Rails style. Co-authored-by: Lee Quarella <leequarella@gmail.com> Co-authored-by: Aaron Patterson <aaron@rubyonrails.org>
e2302dd
to
de6b4ef
Compare
Thank you for adding this! I would love to have this option for Edit: I've opened #42079 to do this. |
@eileencodes Just curious if there is a reason why this was added only on |
Because a regular |
Got it. thank you! |
@eileencodes I was wondering if you could help me clarify this; According to my findings, the so considering the following pseudo-rails-code
Then However, Are you able to confirm I'm not missing something here? Any chance you can comment on a possible workaround, or suggest any different approach? (this is a problem for me because Warehouse Many thanks, and my apologies for reviving an old thread.. |
I don't have the bandwidth to offer application support. If there's a bug please open a new issue, if it's just you're not sure how to use the feature start a thread on the forum. But calling |
no problem. I did not ask for application support. I'm only suggesting your feedback was not quite accurate and was wondering if you could contribute any internal Rails knowledge to confirm my findings.
Rails DOES produce a join query with normal has_many queries, when a where condition is used and that where condition references a relationship. So
is not accurate unless I'm missing something. |
A has many in which you are not manually calling joins does not produce a join query. Given the following associations: class Post
has_many :comments
end
class Comment
belongs_to :post
end Calling Post.first.comments The SQL: Post Load (0.0ms) SELECT "posts".* FROM "posts" ORDER BY "posts"."id" ASC LIMIT ? [["LIMIT", 1]]
Comment Load (0.0ms) SELECT "comments".* FROM "comments" WHERE "comments"."post_id" = ? /* loading for inspect */ LIMIT ? [["post_id", 1], ["LIMIT", 11]] |
Thanks @eileencodes Anyone else landing here with the same problem, please read my previous comment:
As such: Post.joins(:comments).where(:comments => { approved:true }) is what triggers a JOIN to be created instead (or variations thereof) Post.joins(:comments).where(:comments => { approved:true }).first
Post.joins(:comments).where(:comments => { approved:true }).first.comments The SQL: Post Load (1.0ms) SELECT "posts".* FROM "posts" INNER JOIN "comments" ON "comments"."post_id" = "posts"."id" WHERE "comments"."approved" = ? ORDER BY "posts"."id" ASC LIMIT ? [["approved", 1], ["LIMIT", 1]] This is all valid SQL and valid operations, however the lack of a Rails Documentation is arguably not wrong, but not exactly accurate either. |
But you're calling |
Ha! Sorry, yes, my attempt to simplify my code has obviously led to using a badly chosen example! 🤣 After subsequent investigation I discovered that the source of the problem originated from some Rails magic ( I was something along the lines of (don't have my notes with me atm)
|
In a multiple database application, associations can't join across
databases. When set, this option tells Rails to make 2 or more queries
rather than using joins for associations.
Set the option on a has many through association:
Then instead of generating join SQL, two queries are used for
@dog.treats
:This code is extracted from a gem we use internally at GitHub which
means the implementation here is used in production daily and isn't
experimental.
I often get the question "why can't Rails do this automatically" so I
figured I'd include the answer in the commit. Rails can't do this
automatically because associations are lazily loaded.
dog.treats
needsto load
Dog
, thenHuman
and thenTreats
. Whendog.treats
iscalled Rails pre-generates the SQL that will be run and puts that
information into a reflection object. Because the SQL parts are pre-generated,
as soon as
dog.treats
is loaded it's too late to skip a join. The joinis already available on the object and that join is what's run to load
treats
fromdog
throughhumans
. I think the only way to avoid settingan option on the association is to rewrite how and when the SQL is
generated for associations which is a large undertaking. Basically the
way that Active Record associations are designed, it is currently
impossible to have Rails figure out to not join (loading the association
will cause the join to occur, and that join will raise an error if the
models don't live in the same db).
The original implementation was written by me and Aaron. Lee helped port
over tests, and I refactored the extraction to better match Rails style.
Co-authored-by: Lee Quarella leequarella@gmail.com
Co-authored-by: Aaron Patterson aaron@rubyonrails.org