New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of ActiveRecord::Relation#blank? #10539

Merged
merged 1 commit into from Jun 7, 2013

Conversation

Projects
None yet
7 participants
@davidcelis
Contributor

davidcelis commented May 9, 2013

This is an SQL improvement to ActiveRecord::Relation#blank?. Currently,
it calls to_a on the Relation, which loads all records in the
association, and calls blank? on the loaded Array. There are other
ways, however, to check the emptiness of an association that are far
more performant. #empty?, #exists? and #any? all attach a LIMIT 1 to the SQL query before firing it off, which is a nice query
improvement. #blank? should do the same! This is easily done by
making #blank? an alias of #empty?.

Bonus performance improvements will also happen for #present?, which
merely calls the negation of #blank?

Signed-off-by: David Celis me@davidcel.is

@egilburg

This comment has been minimized.

Contributor

egilburg commented May 9, 2013

There is a distinction between SQL-specific syntax and basic Ruby array syntax. For example as of Rails 4, .distinct would do an SQL count, but .uniq would load all into memory and use Array's / Enumerable's uniq. For your case, there is a method called .exists? that uses SQL, and empty? seems to map to it, while blank? seems not to.

That being said, I agree with latest direction of making all commands that have an SQL equivalent use that equivalent, unless the user explicitly states they want Ruby array, such as the latest change that makes .all return scope, forcing user to explicitly use .to_a if they really want an array. This way novice users won't get burned if they use .all because they don't know about .scoped.

So I support this change - IMO all common methods (.empty?, .blank?, .any?, .uniq, I'd even go as far as .select and .detect) should map to their nearest equivalent SQL scope (.exists? for blank?/any?/ empty?, .distinct for uniq, .where for select, .where with LIMIT 1for .detect etc.), unless user explicitly does .to_a. first. The idea is to make the commands work best for the common case and for the novice user - experienced users who have edge case requirements would figure out how to explicitly cast to array on their own.


More importantly, however, I think Rails should pick a direction and stick to it - either map all common Ruby Array / Enumerable methods (and their Active Support extensions) to an SQL equivalent if it exists, or map none of them. The case where we map .empty? to .exists? but not .blank?, and similar cases like this, seem like a rather bad discrepancy.

Edit - realized isn't not quite as simple - exists? comes from Querying while blank? is on Relation, but the idea stands.

@schuetzm

This comment has been minimized.

Contributor

schuetzm commented May 10, 2013

@egilburg It's even more complicated - uniq (and distinct), when applied to a CollectionProxy, return an array (and thus are non-chainable), but for scopes and models return a Relation. Do you want to open a bug report/feature request? Unfortunately I don't have time to do it myself right now.

@egilburg

This comment has been minimized.

Contributor

egilburg commented May 10, 2013

Unfortunately I never quite understood the intricate relationship details between Relation and Association - I kinda feel this should be better documented within Active Record, perhaps even mentioned in the Guides. It is indeed very confusing to see some calls dispatched here and others there, and even more confusing when you throw the occasional alias of some Array/Enumerable methods (but not others) into the mix.

I remember a while back seeing a PR wanting to make Association subclass Relation, but not sure what happened to that.

@rafaelfranca

This comment has been minimized.

Member

rafaelfranca commented May 12, 2013

I agree with the change proposed here but I want to hear @jonleighton too.

@klass.connection.update(
um,

This comment has been minimized.

@vipulnsward

vipulnsward May 13, 2013

Member

Should these lines change?

This comment has been minimized.

@davidcelis

davidcelis May 13, 2013

Contributor

Those are just whitespace changes

@davidcelis

This comment has been minimized.

Contributor

davidcelis commented May 22, 2013

@rafaelfranca @jonleighton Where are we at with this?

@davidcelis

This comment has been minimized.

Contributor

davidcelis commented Jun 4, 2013

Any update? I really think this should be a no brainer.

@@ -21,6 +21,7 @@ class Relation
alias :model :klass
alias :loaded? :loaded
alias :default_scoped? :default_scoped
alias :empty? :blank?

This comment has been minimized.

@jonleighton

jonleighton Jun 7, 2013

Member

This alias appears to be the wrong way around, hence as far as I can see this change actually makes empty? slower rather than blank? faster.

This comment has been minimized.

@davidcelis

davidcelis Jun 7, 2013

Contributor

Wow, you're right. I accidentally swapped the alias. I have no idea how I didn't catch that; fixing!

ActiveRecord::Relation#blank? should `LIMIT 1`
This is an SQL improvement to ActiveRecord::Relation#blank?. Currently,
it calls `to_a` on the Relation, which loads all records in the
association, and calls `blank?` on the loaded Array. There are other
ways, however, to check the emptiness of an association that are far
more performant. `#empty?`, `#exists?` and `#any?` all attach a `LIMIT
1` to the SQL query before firing it off, which is a nice query
improvement. `#blank?` should do the same!

Bonus performance improvements will also happen for `#present?`, which
merely calls the negation of `#blank?`

Signed-off-by: David Celis <me@davidcel.is>

jonleighton added a commit that referenced this pull request Jun 7, 2013

Merge pull request #10539 from davidcelis/ar-sql-improvements
Improve performance of ActiveRecord::Relation#blank?

@jonleighton jonleighton merged commit 257fa68 into rails:master Jun 7, 2013

@davidcelis davidcelis deleted the davidcelis:ar-sql-improvements branch Jun 7, 2013

jonleighton added a commit that referenced this pull request Jun 7, 2013

@jonleighton

This comment has been minimized.

Member

jonleighton commented Jun 7, 2013

Hi, the build failed and I have reverted. If you want to fix and submit a new PR, please ensure that the tests pass.

@davidcelis

This comment has been minimized.

Contributor

davidcelis commented Jun 7, 2013

I've fixed the actual build errors, it seems, but there's a failure proving a bit difficult to debug in RelationsTest#test_presence. It appears that #empty? is not properly loading the relation, resulting in additional queries for presence. That test probably passed because #present? calls !blank? which called to_a and explicitly loaded records. This makes #test_presence smell like a bad test to me.

@davidcelis

This comment has been minimized.

Contributor

davidcelis commented Jun 7, 2013

And, indeed, adding an explicit to_a into that test makes it pass:

def test_presence
    topics = Topic.all

    # the first query is triggered because there are no topics yet.
    assert_queries(1) { assert topics.to_a.present? } # Test fails without this to_a

    # checking if there are topics is used before you actually display them,
    # thus it shouldn't invoke an extra count query.
    assert_no_queries { assert topics.present? }
    assert_no_queries { assert !topics.blank? }

    # shows count of topics and loops after loading the query should not trigger extra queries either.
    assert_no_queries { topics.size }
    assert_no_queries { topics.length }
    assert_no_queries { topics.each }

    # count always trigger the COUNT query.
    assert_queries(1) { topics.count }

    assert topics.loaded?
  end
@@ -21,6 +21,7 @@ class Relation
alias :model :klass
alias :loaded? :loaded
alias :default_scoped? :default_scoped
alias :blank? :empty?

This comment has been minimized.

@harmdewit

harmdewit Nov 5, 2013

Should this line not be removed and let blank? just fall back to it's default implementation in Object i.e.:

def blank?
  respond_to?(:empty?) ? empty? : !self
end

That way blank? and present? etc. won't give errors if the collection is nil and will still use the optimized sql of empty? in case of an activerecord object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment