Keep INNER JOIN when merging relations #27063

Merged 1 commit on Jun 21, 2017

Conversation

MaxLap
Contributor

@MaxLap MaxLap commented Nov 16, 2016

Doing Author.joins(:posts).merge(Post.joins(:comments)) does this
SELECT ... INNER JOIN posts ON... LEFT OUTER JOIN comments ON...
instead of doing
SELECT ... INNER JOIN posts ON... INNER JOIN comments ON....

This behavior is unexpected and makes little sense as, basically, doing
Post.joins(:comments) means I want posts that have comments. Turning
it to a LEFT JOIN means I want posts and join the comments data, if
any.
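
The difference between the two join types can be shown without a database. What follows is a minimal plain-Ruby sketch, not ActiveRecord code: the `POSTS` and `COMMENTS` rows and the `inner_join` / `left_outer_join` helpers are hypothetical names made up for illustration.

```ruby
# Hypothetical in-memory rows standing in for the posts and comments tables.
POSTS = [
  { id: 1, title: "Has comments" },
  { id: 2, title: "No comments" }
].freeze

COMMENTS = [
  { id: 10, post_id: 1, body: "First!" }
].freeze

# INNER JOIN: keep only the left rows that have at least one matching right row.
def inner_join(left, right, foreign_key)
  left.flat_map do |l|
    right.select { |r| r[foreign_key] == l[:id] }.map { |r| [l, r] }
  end
end

# LEFT OUTER JOIN: keep every left row; pair it with nil when nothing matches.
def left_outer_join(left, right, foreign_key)
  left.flat_map do |l|
    matches = right.select { |r| r[foreign_key] == l[:id] }
    matches.empty? ? [[l, nil]] : matches.map { |r| [l, r] }
  end
end

inner = inner_join(POSTS, COMMENTS, :post_id)
outer = left_outer_join(POSTS, COMMENTS, :post_id)

puts inner.map { |post, _| post[:title] }.inspect # => ["Has comments"]
puts outer.map { |post, _| post[:title] }.inspect # => ["Has comments", "No comments"]
```

The inner join acts as a filter ("posts that have comments"), while the left outer join only attaches comment data when it exists.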

We can see this problem directly in the existing tests.
The test_relation_merging_with_merged_joins_as_symbols only does joins
from posts to comments to ratings while the ratings fixture isn't
loaded, but the count is non-zero.
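
To see why a non-zero count with an empty ratings table points at LEFT OUTER JOINs, here is a small plain-Ruby model of a chained join. It is only a sketch under assumptions: the `TABLES` data, the `comment_id` foreign key, and the `join` / `count_posts` helpers are hypothetical and are not taken from the Rails test suite.

```ruby
# Hypothetical fixture data: posts and comments are loaded, ratings is empty
# (mirroring a fixture that was never loaded).
TABLES = {
  posts:    [{ id: 1 }, { id: 2 }],
  comments: [{ id: 10, post_id: 1 }, { id: 11, post_id: 2 }],
  ratings:  []
}.freeze

# Join `rows` (each row is a hash of table name => record) against `table`.
# `from` names the already-joined table that the foreign key `fk` points at.
def join(rows, table, from:, fk:, outer:)
  rows.flat_map do |row|
    parent = row[from]
    matches = parent ? TABLES[table].select { |r| r[fk] == parent[:id] } : []
    if matches.empty?
      # An outer join keeps the row with a nil placeholder; an inner join drops it.
      outer ? [row.merge(table => nil)] : []
    else
      matches.map { |m| row.merge(table => m) }
    end
  end
end

# Count the rows of posts -> comments -> ratings with the given join type.
def count_posts(outer:)
  rows = TABLES[:posts].map { |post| { posts: post } }
  rows = join(rows, :comments, from: :posts, fk: :post_id, outer: outer)
  rows = join(rows, :ratings, from: :comments, fk: :comment_id, outer: outer)
  rows.length
end

puts count_posts(outer: true)  # LEFT OUTER JOINs => 2
puts count_posts(outer: false) # INNER JOINs      => 0
```

With inner joins an empty table at the end of the chain forces the count to zero, so a non-zero count can only come from outer joins.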

The only thing I'm not sure about is if my fix should use make_outer_joins as it was before and as is used in walk or if it should use make_left_outer_joins which is used above in the same method.

@rails-bot

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @pixeltrix (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@mijoharas

Is anyone available to review this PR? It fixes a bug that has stung me (and, I assume, other people). Thanks for the work, @MaxLap.

@mijoharas

Follows from #26195

@maclover7
Contributor

I believe this would also solve #28219

@pixeltrix
Contributor

From the discussion on #16140 and #12933, and given that @sgrif seems to have looked at this behaviour in his comment on #12933, I reckon that this must be by design: at the time it was implemented we didn't have the left_joins method, so it would have been a way of getting a LEFT JOIN without writing raw SQL. As other people have pointed out, you can use a hash inside joins to construct an inner join, and you can also add a has_many :through association definition, so it's not like there's no way to achieve what you want.

Personally, I've never used merge as it seems ill-defined conceptually what merging two relations should do - I prefer to use the more explicit query builder methods.

@MaxLap thanks for your PR but sorry I can't merge it.

@pixeltrix pixeltrix closed this Mar 12, 2017
@MaxLap
Contributor Author

MaxLap commented Mar 14, 2017

Nobody can explain the rationale because there is none.

I agree that merge can feel ill-defined in what it should do exactly, but the doc says: "Merges in the conditions from other, if other is an ActiveRecord::Relation [...]"

You can use a joins for 3 reasons:

  • You want to have some kinds of duplicates
  • You want to add conditions using that joins
  • You want to ignore some data if it doesn't map to some data (by using an INNER JOIN instead of a LEFT JOIN)

Sounds to me like the last 2 choices are "conditions". So changing the INNER JOIN to LEFT JOIN seems like changing the condition, which is clearly unexpected.

To me, the point of merge is to allow code reuse. Say you have a scope on posts which uses a joins, and you want to use that scope to find only the authors that have matching posts. Then you can do: Author.joins(:posts).merge(Post.some_scope). Making a has_many through will require either duplicating the code or manually doing what merge should be doing to allow code reuse.

You can't pass this off as a feature that could create LEFT JOIN before the left_joins methods appeared:

If you want to have a LEFT JOIN as 2nd join, you can, instead of writing it in text, do this:
Author.joins(:posts).merge(Post.joins(:comments))
Which will do the "expected":
SELECT "authors".* FROM "authors" INNER JOIN "posts" ON "posts"."author_id" = "authors"."id" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id"
Note that this only works when it is nested! Doing:
Author.merge(Author.joins(:posts))
will keep the classical join behavior:
SELECT "authors".* FROM "authors" INNER JOIN "posts" ON "posts"."author_id" = "authors"."id"
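
The nested versus same-root difference described above, and what the proposed change does to it, can be mimicked with a toy model. This is a sketch of the behaviour as reported in this thread, not ActiveRecord's implementation: `Join`, `ToyRelation`, `merge_before_fix` and `merge_after_fix` are hypothetical names, and the ON clauses are elided.

```ruby
# A join recorded on a toy relation: target table plus join type (:inner or :outer).
Join = Struct.new(:table, :type)

# Toy stand-in for a relation: a root table and an ordered list of joins.
ToyRelation = Struct.new(:root, :joins) do
  # Behaviour reported in this thread before the fix: joins coming from a
  # relation with a different root were emitted as LEFT OUTER JOINs.
  def merge_before_fix(other)
    merged = other.joins.map do |j|
      other.root == root ? j : Join.new(j.table, :outer)
    end
    ToyRelation.new(root, joins + merged)
  end

  # Behaviour after the fix: merged joins keep their original type.
  def merge_after_fix(other)
    ToyRelation.new(root, joins + other.joins)
  end

  # Render a simplified SQL string with the ON conditions left out.
  def to_sql
    clauses = joins.map do |j|
      keyword = j.type == :outer ? "LEFT OUTER JOIN" : "INNER JOIN"
      "#{keyword} #{j.table} ON ..."
    end
    (["SELECT ... FROM #{root}"] + clauses).join(" ")
  end
end

authors = ToyRelation.new(:authors, [Join.new(:posts, :inner)])
posts   = ToyRelation.new(:posts, [Join.new(:comments, :inner)])

puts authors.merge_before_fix(posts).to_sql
# => SELECT ... FROM authors INNER JOIN posts ON ... LEFT OUTER JOIN comments ON ...
puts authors.merge_after_fix(posts).to_sql
# => SELECT ... FROM authors INNER JOIN posts ON ... INNER JOIN comments ON ...
```

In this model a same-root merge such as `ToyRelation.new(:authors, []).merge_before_fix(authors)` keeps the INNER JOIN, which matches the `Author.merge(Author.joins(:posts))` case above.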

Honestly, I just noticed this bug while trying to make a PR which got refused on the mailing list, and thought it was a pretty easy fix. I don't need this feature; I'm just trying to help others who have run into this bug.

@mijoharas

Just my two cents. My reading of the situation was that no-one was sure if it was by design, and given that it does cause bugs without this fix, and no-one can find a case where it would be needed, shouldn't we change it?

@pixeltrix
Contributor

Honestly, I just noticed this bug while trying to make a PR which got refused in the mailing list and thought it was a pretty easy fix. I don't need this feature, i'm just trying to help others that landed on this bug.

@MaxLap and thanks for your contribution, but often someone's bug is another person's feature so we have to trade off both sides when we decide about whether to merge something. In this instance it's a definite change in behaviour that we can revisit when we've shipped 5.1 but I think it's too risky to do it now.

@pixeltrix
Contributor

Just my two cents. My reading of the situation was that no-one was sure if it was by design

@mijoharas from @thedarkone's comment:

@joanniclaborde it is by design, all "association" joins are merged
(via scope.merge(relation)) as OUTER JOINS, I can't give you a rationale,
because I don't know it, but if I "fix" merge to use INNER JOINS all kinds
of ActiveRecord tests start failing.

If tests start failing that's a pretty good indication that something was by design - even for Rails 😉

@MaxLap
Contributor Author

MaxLap commented Mar 14, 2017

My PR fixes this issue and only needs to change a single test: a broken test that relied on missing fixtures, which was introduced in the PR that added the joins handling to merge (the PR at the root of this problem).

There are many ways to fix bugs. The use of double quotes around "fix" in that comment makes it sound like it was just a quick change that may not have been as complete or thorough as it needed to be. For example, if the change also made #includes use an INNER JOIN, of course all hell would break loose.

@MaxLap
Contributor Author

MaxLap commented Mar 14, 2017

Also, as a previous comment of mine mentions, Author.merge(Author.joins(:posts)) will leave it as an INNER JOIN, so not all "association" joins are merged as OUTER JOIN. Only the nested ones are.

@pixeltrix
Contributor

@MaxLap it's still a change in long-standing behavior after we've shipped a beta release. I promise to look again as soon as we branch for 5.1.

@MaxLap
Contributor Author

MaxLap commented Mar 14, 2017

@pixeltrix Thank you!

@meinac
Contributor

meinac commented Mar 17, 2017

I would love to see this behaviour change, thanks @MaxLap.

  if join_root.match? oj.join_root
    walk join_root, oj.join_root
  else
    oj.join_root.children.flat_map { |child|
-     make_outer_joins oj.join_root, child
+     if join_type == Arel::Nodes::OuterJoin
+       make_left_outer_joins oj.join_root, child
Member

Why use make_left_outer_joins instead of the previous make_outer_joins?

Contributor Author

@MaxLap MaxLap Jun 20, 2017

This is a question I asked in the initial message of the pull request:

The only thing I'm not sure about is if my fix should use make_outer_joins as it was before and as is used in walk or if it should use make_left_outer_joins which is used above in the same method.

Since my code does basically the same thing as the code at line 110, I used the same method that is used there. The code of those 2 methods is similar, except that one of them uses table_aliases_for, which is very obscure to me.

So I leave that one last choice up to someone more experienced for those internals.

Note that this has been changed on HEAD: the 2 methods were replaced by a single one, which has a parameter to specify whether aliasing is to be done or not. Once the choice is made, I'll rebase with the chosen way.

Member

I think it is fine to keep it the same as on line 110. Could you please rebase? I'll merge it to master. I'm not going to backport it since, even though it is a bug, the fix may cause a behavior change in a stable branch for some people.

@rafaelfranca The upgrade guide for whichever version this is included in probably needs to warn people about this loudly. It could lead to a lot of stuff being excluded from queries where it was previously included.

Member

This is a good point. Usually we don't start the upgrade guide until before the release but I agree we should warn about this.

@@ -1,3 +1,8 @@
* Merging two relations which have joins no longer transforms the joins of
the merged relation into LEFT OUTER JOIN.

Member

Can you add a new line here with: TODO: Add to the Rails 5.2 upgrade guide

@MaxLap MaxLap force-pushed the merge_keep_inner_join branch from 31b2d61 to 486b825 Compare June 21, 2017 00:46
@MaxLap MaxLap force-pushed the merge_keep_inner_join branch from 486b825 to 249ddd0 Compare June 21, 2017 00:47
@MaxLap
Contributor Author

MaxLap commented Jun 21, 2017

Rebased, added the message to CHANGELOG for the update guide. I also clarified in the CHANGELOG what actually changed.
