Optimize none? and one? relation query methods to use LIMIT 1 and COUNT. #12670

Closed
Contributor

egilburg commented Oct 28, 2013

I was surprised to find that a relation's none? method loads the entire collection as an array,
even when no block or limit is given. This contrasts with how any? and many? work: they use LIMIT 1 and COUNT respectively when no block or limit is given.

IMO it's more natural to use the .none? query than empty? (empty? is an Array method, while none? is an Enumerable method and thus seems more suitable for a generic set such as a set of database rows). I therefore changed none? to follow the existing behavior of any? and many?.

I also implemented an optimized one? for similar reasons and for completeness' sake. Like me, others could be surprised that one? behaves differently from any? and many?, and I don't see a drawback to consistently providing an efficient implementation where possible.

This change applies to relations (e.g. User.all) as well as associations (e.g. account.users).

Before:

users.none?
SELECT "users".* FROM "users"

users.one?
SELECT "users".* FROM "users"

After:

users.none?
SELECT 1 AS one FROM "users" LIMIT 1

users.one?
SELECT COUNT(*) FROM "users"

NullRelation has been updated to short-circuit none? and one?, as it already does with any? and many?.
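To make the dispatch described above concrete, here is a minimal sketch in plain Ruby. It is not the code from this PR and does not use ActiveRecord: `SketchRelation`, its `log`, and its `limit:` option are hypothetical stand-ins that only record which kind of query each predicate would issue.

```ruby
# Hypothetical stand-in for a relation (NOT ActiveRecord). It records the
# query strategy each predicate would use instead of touching a database.
class SketchRelation
  attr_reader :log

  def initialize(rows, limit: nil)
    @rows = rows
    @limit = limit
    @log = []
  end

  # Loads every row, like SELECT "users".* FROM "users".
  def to_a
    @log << :load_all
    @limit ? @rows.first(@limit) : @rows.dup
  end

  # Existence check, like SELECT 1 AS one FROM "users" LIMIT 1.
  def exists?
    @log << :limit_1
    !@rows.empty?
  end

  # Row count, like SELECT COUNT(*) FROM "users".
  def count
    @log << :count
    @rows.size
  end

  # With a block or a limit, fall back to loading the rows;
  # otherwise a LIMIT 1 existence check is enough.
  def none?
    return to_a.none? { |*args| yield(*args) } if block_given?
    @limit ? to_a.empty? : !exists?
  end

  # With a block or a limit, fall back to loading the rows;
  # otherwise a COUNT is enough.
  def one?
    return to_a.one? { |*args| yield(*args) } if block_given?
    @limit ? to_a.size == 1 : count == 1
  end
end

users = SketchRelation.new(%w[alice bob])
users.none? # false, via the LIMIT 1 style check
users.one?  # false, via the COUNT style check
users.log   # [:limit_1, :count]
```

The point of the sketch is only the branching: a block or a limit forces a full load, and everything else is answered without materializing the rows.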

Also, improved method documentation a bit.

I added the same set of tests (appropriately modified) that exist for the any? and many? methods to the relations_test.rb and associations/has_many_associations_test.rb unit tests.

Contributor

egilburg commented Oct 28, 2013

Seems the build is also erroring out on master branch.

@egilburg egilburg Optimize none? and one? relation query methods to use LIMIT and COUNT.
Use SQL COUNT and LIMIT 1 queries for none? and one? methods if no block or limit is given,
instead of loading the entire collection to memory. The any? and many? methods already
follow this behavior.
d657eec
Contributor

egilburg commented Oct 28, 2013

@core team: I'm not sure whether I should define one? and none? on CollectionAssociation in addition to defining them on Relation. Tests pass without these definitions (because CollectionAssociation forwards to Relation), but any? and many? are defined on both CollectionAssociation and Relation with slight implementation differences, and I'm not sure what those differences actually do or whether they're needed in my case.

Member

seuros commented May 27, 2014

👍

@cefigueiredo cefigueiredo commented on the diff Jun 16, 2014

...test/cases/associations/has_many_associations_test.rb
+  def test_calling_one_should_return_false_if_zero
+    firm = companies(:another_firm)
+    assert ! firm.clients_like_ms.one?
+    assert_equal 0, firm.clients_like_ms.size
+  end
+
+  def test_calling_one_should_return_true_if_one
+    firm = companies(:first_firm)
+    assert firm.limited_clients.one?
+    assert_equal 1, firm.limited_clients.size
+  end
+
+  def test_calling_one_should_return_false_if_more_than_one
+    firm = companies(:first_firm)
+    assert ! firm.clients.one?
+    assert_equal 2, firm.clients.size

cefigueiredo Jun 16, 2014

Contributor

The test database has changed since you wrote that test, so it is now failing because there are 3 clients.
I think you should rebase and change the test: it depends on a row count that can change over time in the test database, so it breaks whenever the fixture data grows.

Contributor

cefigueiredo commented Jun 16, 2014

When no block is given to one?/none?, there is no point in loading the collection just to check whether it has exactly one element, even when a limit is set.

Even with a limit, the attributes of the elements do not matter. All that matters is whether there is exactly one element.

So it would be better to do:

def one?
  if block_given?
    to_a.one? { |*block_args| yield(*block_args) }
  else
    size == 1
  end
end

With that, calling one? with a block still loads the entire collection as an array and calls one? from Enumerable. When no block is given, it calls size, which translates to just a SELECT COUNT(*) query, far more efficient than retrieving all the data, and then checks whether the size equals 1.
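The comment mentions none? as well but only shows one?. By analogy, a none? following the same suggestion might look like the sketch below. The `CountBasedPredicates` module and the `LimitedRows` class are hypothetical, plain-Ruby stand-ins (not ActiveRecord) used only to show that the blockless path never loads the rows, even when a limit is set.

```ruby
# Hypothetical mixin; the host class is assumed to define `to_a` and `size`.
module CountBasedPredicates
  def one?
    if block_given?
      to_a.one? { |*block_args| yield(*block_args) }
    else
      size == 1
    end
  end

  # Assumed analogue of the suggested one?: no block means no load.
  def none?
    if block_given?
      to_a.none? { |*block_args| yield(*block_args) }
    else
      size.zero?
    end
  end
end

# Plain-Ruby stand-in for a relation with a limit applied.
class LimitedRows
  include CountBasedPredicates

  attr_reader :loads

  def initialize(rows, limit)
    @rows = rows
    @limit = limit
    @loads = 0
  end

  # Counts how often the full collection is materialized.
  def to_a
    @loads += 1
    @rows.first(@limit)
  end

  # Stands in for a COUNT query that respects the limit.
  def size
    [@rows.size, @limit].min
  end
end

rows = LimitedRows.new(%w[a b c], 1)
rows.one?  # true, answered from size alone
rows.none? # false, answered from size alone
rows.loads # 0, nothing was loaded
```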

@rafaelfranca rafaelfranca modified the milestone: 4.2.0, 5.0.0 Aug 18, 2014

Owner

rafaelfranca commented Feb 12, 2015

Merged at d7a7a05

@rafaelfranca rafaelfranca modified the milestone: 5.0.0 [temp], 5.0.0 Dec 30, 2015
