Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix different `count` calculation when using `size` with DISTINCT `select` #35361

Merged
merged 1 commit into from Feb 26, 2019

Conversation

Projects
None yet
3 participants
@jvillarejo
Copy link
Contributor

jvillarejo commented Feb 21, 2019

Summary

When using select with 'DISTINCT( ... )' if you use method size on a non loaded relation it overrides the column selected by passing :all so it returns different value than count.

Fixes #35214

@rails-bot rails-bot bot added the activerecord label Feb 21, 2019

@jvillarejo jvillarejo changed the title fixed different count calculation when using size with DISTINCT select Fix different count calculation when using size with DISTINCT select Feb 21, 2019

@jvillarejo jvillarejo changed the title Fix different count calculation when using size with DISTINCT select Fix different `count` calculation when using `size` with DISTINCT `select` Feb 21, 2019

@jvillarejo jvillarejo force-pushed the jvillarejo:fix_wrong_size_query_with_distinct_select branch from c4020d9 to 957ea79 Feb 21, 2019

@kamipo

This comment has been minimized.

Copy link
Member

kamipo commented Feb 21, 2019

This change also affects to count(:all).
Can you add test case for count(:all) as well (e.g. ebc09ed)?

@jvillarejo

This comment has been minimized.

Copy link
Contributor Author

jvillarejo commented Feb 22, 2019

Thanks for your reply! Totally! I will add more tests cases.

@jvillarejo

This comment has been minimized.

Copy link
Contributor Author

jvillarejo commented Feb 22, 2019

Hey @kamipo I added more test cases, as some of them didn't gave the correct original count value when reassigning the column_name.

So in order to fix the issue and keep the functionality instead of changing the column_name, distinct is assigned with the select_for_count values.

I think this fixes the issue with the minor impact.

However in terms of code I don't think it's very understandable. Assigning distinct to select_for_count but then 4 lines below assign distinct to nil when column_name isn't :all is very confusing.

What do you think would be better?

Also I think the test cases about size shouldn't be on test/cases/calculations_test.rb file but inside test/cases/relation folder. Do you think that creating a file test/cases/relation/size_test.rb there would be a good approach?

Thanks!

end

def test_size_with_manual_select_with_group_with
c = Account.select("DISTINCT accounts.firm_id").group("accounts.firm_id").size

This comment has been minimized.

@kamipo

kamipo Feb 23, 2019

Member

This should return 5, since it should be the same result with Account.select("DISTINCT accounts.firm_id").group("accounts.firm_id").load.size.

This comment has been minimized.

@jvillarejo

jvillarejo Feb 25, 2019

Author Contributor

Somehow executing:

Account.select("DISTINCT accounts.firm_id").group("accounts.firm_id").size
# { nil => 1, 1 => 1, 2 => 1, 6 => 2, 9 => 1 }

Now returns the hash. But I agree that it should return the same of:

Account.select("DISTINCT accounts.firm_id").group("accounts.firm_id").load.size
# 5

Should I investigate this for this PR ?

This comment has been minimized.

@kamipo

kamipo Feb 25, 2019

Member

Feel free to create another PR.

This comment has been minimized.

@localhostdotdev

localhostdotdev Feb 26, 2019

As an user, I expect .group.size (same as .group.count) to return an hash. And doing .load.size / to_a.size would return a collection of the instances of the model and then how many of them there are.

This comment has been minimized.

@kamipo

kamipo Feb 26, 2019

Member

Originally, relation.size behaves fragile, that is a breaking change though.

relation = Post.left_joins(:comments).group(:id)
relation.size # => return a hash
relation.each { |record| puts record.id }
relation.size # => return an integer

relation = Post.left_joins(:comments).group(:id)
relation.count # => return a hash
relation.each { |record| puts record.id }
relation.count # => return a hash

relation = Post.left_joins(:comments).group(:id)
relation.length # => return an integer
relation.each { |record| puts record.id }
relation.length # => return an integer

This comment has been minimized.

@jvillarejo

jvillarejo Feb 26, 2019

Author Contributor

But in my opinion size and load.size should have the same return value the difference should be that when using just size if the collection is loaded it will not execute a query but if it isn't it executes a query without loading the collection instances.

That they return different values when the collection is loaded and when it isn't is confusing.

def test_size_with_manual_distinct_select_with_loaded_size
size = Account.select("DISTINCT accounts.firm_id").size
loaded_size = Account.select("DISTINCT accounts.firm_id").load.size
assert_equal size, loaded_size

This comment has been minimized.

@kamipo

kamipo Feb 23, 2019

Member

Also I think the test cases about size shouldn't be on test/cases/calculations_test.rb file

Yes, just put them in test/cases/relations_test.rb for now.

def test_size_with_distinct
posts = Post.distinct.select(:author_id, :comments_count)
assert_queries(1) { assert_equal 8, posts.size }
assert_queries(1) { assert_equal 8, posts.load.size }
end
def test_size_with_eager_loading_and_custom_order
posts = Post.includes(:comments).order("comments.id")
assert_queries(1) { assert_equal 11, posts.size }
assert_queries(1) { assert_equal 11, posts.load.size }
end
def test_size_with_eager_loading_and_custom_order_and_distinct
posts = Post.includes(:comments).order("comments.id").distinct
assert_queries(1) { assert_equal 11, posts.size }
assert_queries(1) { assert_equal 11, posts.load.size }
end

This comment has been minimized.

@jvillarejo

jvillarejo Feb 25, 2019

Author Contributor

There is a weird situation added a test with includes:

def test_size_with_eager_loading_and_manual_distinct_select_and_includes
  accounts = Account.select("DISTINCT accounts.firm_id").includes(:firm).order("firms.id")

  assert_queries(1) { assert_equal 5, accounts.size }
  assert_queries(1) { assert_equal 5, accounts.load.size }
end

It raises:
# ActiveRecord::StatementInvalid: SQLite3::SQLException: near "DISTINCT": syntax error
I can investigate further if you want.

But this also gets raised when using:

Account.distinct(:credit_limit).select("DISTINCT accounts.firm_id").count(:all)
# ActiveRecord::StatementInvalid: SQLite3::SQLException: near "DISTINCT": syntax error

What do you think about this?

This comment has been minimized.

@kamipo

kamipo Feb 25, 2019

Member

This is an issue for count with eager loading.
I'd prefer to address it in another PR, since it is a little complex than what this PR fixes.

This comment has been minimized.

@jvillarejo

jvillarejo Feb 26, 2019

Author Contributor

Ok I'll work it in another PR!

@kamipo

kamipo approved these changes Feb 26, 2019

Copy link
Member

kamipo left a comment

Looks good to me.
Can you add changelog entry and squash your commits into one?

@jvillarejo jvillarejo force-pushed the jvillarejo:fix_wrong_size_query_with_distinct_select branch from 1b0a059 to fa2c61f Feb 26, 2019

fixes different `count` calculation when using `size` manual `select`…
… with DISTINCT

When using `select` with `'DISTINCT( ... )'` if you use method `size` on a non loaded relation it overrides the column selected by passing `:all` so it returns different value than count.

This fixes #35214
@jvillarejo

This comment has been minimized.

Copy link
Contributor Author

jvillarejo commented Feb 26, 2019

Done! Thank you so much for your help! This is my first Rails PR 😄. Is it correct?

@kamipo kamipo merged commit fa2c61f into rails:master Feb 26, 2019

3 checks passed

buildkite/rails Build #59105 passed (15 minutes, 19 seconds)
Details
codeclimate All good!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

kamipo added a commit that referenced this pull request Feb 26, 2019

Merge pull request #35361 from jvillarejo/fix_wrong_size_query_with_d…
…istinct_select

Fix different `count` calculation when using `size` with DISTINCT `select`
@kamipo

This comment has been minimized.

Copy link
Member

kamipo commented Feb 26, 2019

Thanks! And congrats on your first Rails contribution 🎉

@jvillarejo jvillarejo deleted the jvillarejo:fix_wrong_size_query_with_distinct_select branch Feb 26, 2019

kamipo added a commit that referenced this pull request Feb 28, 2019

Merge pull request #35361 from jvillarejo/fix_wrong_size_query_with_d…
…istinct_select

Fix different `count` calculation when using `size` with DISTINCT `select`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.