Fix association with scope including joins #29413

Merged

Conversation

@kamipo
Member

@kamipo kamipo commented Jun 11, 2017

Fixes #28324.

@robotdana
Contributor

@robotdana robotdana commented Jun 14, 2017

I was going to set up a new issue then found this, so I have some extra test cases:

begin
  require "bundler/inline"
rescue LoadError => e
  $stderr.puts "Bundler version 1.10 or later is required. Please update your Bundler"
  raise e
end

gemfile(true) do
  source "https://rubygems.org"
  # gem "rails", github: "rails/rails"
  gem "rails", github: "kamipo/rails", branch: "fix_association_with_scope_including_joins"
  gem "arel", github: "rails/arel"
  gem "sqlite3"
end

require "active_record"
require "minitest/autorun"
require "logger"

# This connection will do for database-independent bug reports.
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
ActiveRecord::Base.logger = Logger.new(STDOUT)

ActiveRecord::Schema.define do
  create_table :posts, force: true do |t|
  end

  create_table :comments, force: true do |t|
    t.integer :post_id
    t.integer :author_id
  end

  create_table :likes, force: true do |t|
    t.integer :comment_id
    t.integer :author_id
  end

  create_table :authors, force: true do |t|
    t.boolean :verified
  end
end

class Post < ActiveRecord::Base
  has_many :comments
  has_many :likes, through: :comments
  has_many :likes_workaround, -> { joins(:author) }, through: :comments, source: :likes, class_name: 'Like'
  # `has_many :likes, -> { joins(:author) }, through: :comments` is all that would be necessary if I didn't rename it
end

class Comment < ActiveRecord::Base
  belongs_to :post
  has_many :likes, -> { verified }
end

class Like < ActiveRecord::Base
  belongs_to :comment
  belongs_to :author
  scope :verified, -> { joins(:author).where(authors: { verified: true }) }
end

class Author < ActiveRecord::Base
  has_many :likes
end

class BugTest < Minitest::Test
  def setup
    @post = Post.create!
    @comment = Comment.create!(post: @post)
    @author = Author.create!(verified: true)
    @like = Like.create!(author: @author, comment: @comment)
    @post_relation = Post.where(id: @post.id)
  end

  def test_stuff_that_should_work
    assert_equal 1, @post.comments.count
    assert_equal 1, @comment.likes.count
    assert_equal 1, @post.likes_workaround.count # works because we repeated the join in the has_many through
    assert_equal 1, @post.likes.joins(:author).count # works because we repeated the join here.
    # breaking out `includes` into `preload` and `eager_load` because they work differently, a.k.a: "includes sometimes works."
    assert_equal [@post], @post_relation.preload(:likes).to_a # works because it's a separate query
  end

  def test_has_many_through_eager_load_the_join_scope_too
    # passes in master, fails on your branch
    assert_equal [@post], @post_relation.eager_load(likes: :author).to_a
  end

  def test_has_many_through_count_with_a_join_scope
    # fails in your branch & master
    assert_equal 1, @post.likes.count # the authors joins in the scope gets lost.
  end

  def test_has_many_through_eager_load_with_a_join_scope
    # passes in your branch, fails in master
    assert_equal [@post], @post_relation.eager_load(:likes).to_a # the authors joins in the scope gets lost
  end

  def test_has_many_through_eager_load_with_a_repeated_join_scope
    # passes in your branch, fails in master
    assert_equal [@post], @post_relation.eager_load(:likes_workaround).to_a # even the repeated join in the has_many through is forgotten
  end
end

@robotdana
Contributor

@robotdana robotdana commented Jun 14, 2017

These all look like the same issue:
#28703
#28440
#26780
#22538
#14110

@kamipo kamipo force-pushed the fix_association_with_scope_including_joins branch from 063f361 to f688f38 Jun 18, 2017
@rafaelfranca
Member

@rafaelfranca rafaelfranca commented Jun 28, 2017

Could you rebase this and add a CHANGELOG entry?

@kamipo
Member Author

@kamipo kamipo commented Jun 28, 2017

@dnl Thank you for your testing. I investigated the failures.

About `test_has_many_through_eager_load_the_join_scope_too`: the test passes only with the sqlite3 adapter, because SQLite3 doesn't care about the order of JOINed tables. Most adapters seem to fail this test even on the master branch (at least the mysql2 and postgresql adapters do). The failure on this PR happens because the join scope includes the same table as the main relation, and constructing the join scope doesn't take the main relation's join table list into account. It is hard to fix for now.

About `test_has_many_through_count_with_a_join_scope`: with this PR, `Post.preload(:likes).first.likes.to_a` and `Post.eager_load(:likes).first.likes.to_a` will work, but `Post.first.likes.to_a` doesn't. It seems to be an issue in AssociationScope. I'll fix that issue first and then revisit this PR.
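
To make "the authors join in the scope gets lost" concrete, here is a minimal plain-Ruby sketch. It is a toy model and does not reproduce ActiveRecord's AssociationScope; `ToyRelation`, `missing_tables`, and the SQL fragments are hypothetical names and strings chosen for illustration. It only shows the symptom: a WHERE condition on `authors` survives while the JOIN that brings in `authors` does not.

```ruby
# Toy relation that tracks only joins and WHERE conditions (hypothetical,
# not ActiveRecord's data structure).
ToyRelation = Struct.new(:table, :joins, :wheres) do
  def to_sql
    (["SELECT #{table}.* FROM #{table}"] + joins + ["WHERE #{wheres.join(' AND ')}"]).join(" ")
  end

  # Tables referenced in WHERE that are neither the base table nor joined.
  def missing_tables
    referenced = wheres.map { |w| w[/\A(\w+)\./, 1] }.compact.uniq
    referenced.reject { |t| t == table || joins.any? { |j| j.include?("JOIN #{t} ") } }
  end
end

# The through part of `has_many :likes, through: :comments`.
through_joins  = ["INNER JOIN comments ON likes.comment_id = comments.id"]
through_wheres = ["comments.post_id = 1"]

# The scope from the test case: joins(:author).where(authors: { verified: true })
scope_joins  = ["INNER JOIN authors ON authors.id = likes.author_id"]
scope_wheres = ["authors.verified = 1"]

# Carrying over only the scope's WHERE part loses the authors join...
lossy = ToyRelation.new("likes", through_joins, through_wheres + scope_wheres)
# ...while carrying over both parts keeps the query self-consistent.
complete = ToyRelation.new("likes", through_joins + scope_joins, through_wheres + scope_wheres)

puts lossy.to_sql
puts complete.to_sql
```

In this toy, `lossy.missing_tables` reports `authors` and `complete.missing_tables` is empty, which mirrors the difference between the failing and passing cases described above.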

@kamipo kamipo force-pushed the fix_association_with_scope_including_joins branch from f688f38 to db3ff25 Jul 3, 2017
@kamipo
Member Author

@kamipo kamipo commented Jul 4, 2017

Rebased and added a CHANGELOG entry.

@rafaelfranca rafaelfranca merged commit 990a4db into rails:master Jul 4, 2017
2 checks passed
@kamipo kamipo deleted the fix_association_with_scope_including_joins branch Jul 4, 2017
@joelvh

@joelvh joelvh commented Dec 5, 2017

@rafaelfranca will this be released in 5.1.x or wait for 5.2.0?

tbrisker added a commit to tbrisker/foreman that referenced this issue Aug 27, 2018
This commit improves on the fix in 0180919 which has been found to
cause significant performance regressions. It also adds several other
improvements to the API scope performance, the most significant one
being only checking that the scope works in `parent_scope` rather than
loading all of it into an in-memory array which can be very heavy.

The original fix was needed because of a bug in the way Rails merges
scopes, which will be fixed in Rails 5.2 by
rails/rails#29413. Once we upgrade to Rails 5.2
the workaround can be removed since it still comes with worse
performance compared to the previous implementation.
louietyj added a commit to louietyj/rails that referenced this issue Oct 2, 2018
…_including_joins

Fix association with scope including joins
louietyj added a commit to louietyj/coursemology2 that referenced this issue Oct 2, 2018
louietyj added a commit to louietyj/coursemology2 that referenced this issue Oct 3, 2018
louietyj added a commit to louietyj/coursemology2 that referenced this issue Oct 4, 2018
kamipo added a commit to kamipo/rails that referenced this issue Aug 7, 2020
…iation with join scope

I found the issue while working on fixing rails#33525.

If a duplicated association has a scope whose `where` uses an explicit
table name (e.g. `where("categories.name": "General")`), that condition
filters only the first of the duplicated associations. The others are
not filtered, because every duplicated join except the first one is
aliased (e.g. `INNER JOIN "categories" "categories_categorizations"`).

```ruby
class Author < ActiveRecord::Base
  has_many :general_categorizations, -> { joins(:category).where("categories.name": "General") }, class_name: "Categorization"
  has_many :general_posts, through: :general_categorizations, source: :post
end

authors = Author.eager_load(:general_categorizations, :general_posts).to_a
```

Generated eager loading query:

```sql
SELECT "authors"."id" AS t0_r0, ... FROM "authors"

-- `has_many :general_categorizations, -> { joins(:category).where("categories.name": "General") }`
LEFT OUTER JOIN "categorizations" ON "categorizations"."author_id" = "authors"."id"
INNER JOIN "categories" ON "categories"."id" = "categorizations"."category_id" AND "categories"."name" = ?

-- `has_many :general_posts, through: :general_categorizations, source: :post`
---- duplicated `through: :general_categorizations` part
LEFT OUTER JOIN "categorizations" "general_categorizations_authors_join" ON "general_categorizations_authors_join"."author_id" = "authors"."id"
INNER JOIN "categories" "categories_categorizations" ON "categories_categorizations"."id" = "general_categorizations_authors_join"."category_id" AND "categories"."name" = ? -- <-- filtering `"categories"."name" = ?` won't work
---- `source: :post` part
LEFT OUTER JOIN "posts" ON "posts"."id" = "general_categorizations_authors_join"."post_id"
```
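
The reason the second condition filters nothing can be sketched in plain Ruby. This is a toy model of "the first join keeps the table name, later joins of the same table get an alias", not ActiveRecord's real alias tracker; `ToyAliasTracker`, `toy_join`, and `filters_own_join?` are hypothetical names introduced only for this illustration.

```ruby
# Toy alias tracker (hypothetical): the first join of a table uses the bare
# table name, any later join of the same table uses the given alias.
class ToyAliasTracker
  def initialize
    @seen = Hash.new(0)
  end

  # Returns the name the join will be known by in the SQL.
  def aliased_name_for(table, fallback_alias)
    @seen[table] += 1
    @seen[table] == 1 ? table : fallback_alias
  end
end

# Builds one INNER JOIN whose scope condition was written with an explicit
# table name, as in where("categories.name": "General").
def toy_join(tracker, table, fallback_alias, on_column, explicit_condition)
  name = tracker.aliased_name_for(table, fallback_alias)
  target = name == table ? %("#{table}") : %("#{table}" "#{name}")
  { name: name,
    sql: %(INNER JOIN #{target} ON "#{name}"."id" = #{on_column} AND #{explicit_condition}) }
end

# Does the condition mention the name this join is actually known by?
def filters_own_join?(join, condition)
  condition.include?(%("#{join[:name]}".))
end

tracker = ToyAliasTracker.new
condition = %("categories"."name" = ?)

first = toy_join(tracker, "categories", "categories_categorizations",
                 %("categorizations"."category_id"), condition)
second = toy_join(tracker, "categories", "categories_categorizations",
                  %("general_categorizations_authors_join"."category_id"), condition)

puts first[:sql]
puts second[:sql]
puts filters_own_join?(first, condition)
puts filters_own_join?(second, condition)
```

The condition string is fixed when the scope is written, so it keeps pointing at `"categories"` even after the second join has been renamed to `"categories_categorizations"`.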

Originally, eager loading with a join scope didn't work before Rails 5.2
(rails#29413), and a duplicated through association with a join scope
raised a duplicated alias error before alias tracking was improved in
590b045.

But now it can silently return an incorrect result instead of raising an
error, which is worse than an error.

To fix the issue, this makes eager loading deduplicate / re-use a
duplicated through association where possible, as `preload` does.

```sql
SELECT "authors"."id" AS t0_r0, ... FROM "authors"

-- `has_many :general_categorizations, -> { joins(:category).where("categories.name": "General") }`
LEFT OUTER JOIN "categorizations" ON "categorizations"."author_id" = "authors"."id"
INNER JOIN "categories" ON "categories"."id" = "categorizations"."category_id" AND "categories"."name" = ?

-- `has_many :general_posts, through: :general_categorizations, source: :post`
---- `through: :general_categorizations` part is deduplicated / re-used
LEFT OUTER JOIN "posts" ON "posts"."id" = "categorizations"."post_id"
```

Fixes rails#32819.
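
The de-duplication described in this commit message can be pictured with a small plain-Ruby sketch. It is a toy and not how ActiveRecord's join dependency is actually structured; `ToyJoinList` and its `add` method are hypothetical names. The idea shown is only that a join already registered for an association is re-used, so later joins reference its table name instead of a fresh alias.

```ruby
# Toy join list (hypothetical): one join per association, re-used on repeat.
class ToyJoinList
  attr_reader :joins

  def initialize
    @joins = []
  end

  # Adds a join for `association` unless one already exists, and returns the
  # table name that later joins should reference.
  def add(association, table, on)
    existing = @joins.find { |j| j[:association] == association }
    return existing[:table] if existing

    @joins << { association: association, table: table, on: on }
    table
  end
end

list = ToyJoinList.new

# Eager loading :general_categorizations directly.
list.add(:general_categorizations, "categorizations",
         %("categorizations"."author_id" = "authors"."id"))

# Eager loading :general_posts goes through the same association: re-use it.
through = list.add(:general_categorizations, "categorizations",
                   %("categorizations"."author_id" = "authors"."id"))
list.add(:general_posts, "posts", %("posts"."id" = "#{through}"."post_id"))

puts list.joins.map { |j| %(LEFT OUTER JOIN "#{j[:table]}" ON #{j[:on]}) }
```

Because the through join is never duplicated, no alias is introduced, and a condition written against the explicit table name keeps applying to the only join of that table.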