ActiveRecord::Base.map method for direct select by single column #1915

Merged
merged 1 commit into from
@bogdan

To select a single column from an ActiveRecord scope, I implemented an
ActiveRecord::Base.map method that runs a direct query when a column name
is given and behaves like Array#map when a block is given.

Person.map(:id) # SELECT people.id FROM people
Person.map(:role, :distinct => true) # SELECT DISTINCT role FROM people
Person.map(:id, :limit => 5, :conditions => {:confirmed => true})

The method is also available on associations:

project.team_members.map(:role)

Backward compatibility - process as regular Array#map if block given:

Person.limit(5).map(&:id)
project.team_members.map(&:role)
@sikachu
Ruby on Rails member

What's the difference between this and using

Person.select(:id)
Person.select(:role, :distinct => true)
Person.select(:id).limit(5).where(:confirmed => true)

I think it does the same thing. Please correct me if I'm wrong.

@bogdan

The main difference is that #select returns an Array of AR models and #map returns an Array of primitive types like Fixnum, Date, String etc.

The following two lines return the same result:

Person.select(:id).map(&:id)
Person.map(:id)

But map requires less memory because it runs the SQL directly and does not build a model for every row. You can query a million records without a memory usage tsunami.
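
The memory argument can be sketched in plain Ruby with no database: the select-then-map style builds one object per row before reading the value, while a direct single-column query never builds them. FakePerson, ROWS, ids_via_models and ids_via_pluck are hypothetical stand-ins for illustration, not ActiveRecord APIs.

```ruby
# Hypothetical stand-in for rows as returned by a database driver.
ROWS = [{ "id" => 1, "role" => "admin" }, { "id" => 2, "role" => "dev" }].freeze

# Hypothetical stand-in for a model class: one object is built per row.
class FakePerson
  @instances = 0

  class << self
    attr_reader :instances

    # Counts every object built, to make the per-row cost visible.
    def instantiate(row)
      @instances += 1
      new(row)
    end
  end

  def initialize(row)
    @attributes = row
  end

  def id
    @attributes["id"]
  end
end

# select(:id).map(&:id) style: build an object per row, then call the reader.
def ids_via_models(rows)
  rows.map { |row| FakePerson.instantiate(row) }.map(&:id)
end

# Direct single-column style: read the value straight from each row.
def ids_via_pluck(rows)
  rows.map { |row| row["id"] }
end
```

Both helpers return the same ids for ROWS, but only ids_via_models increments FakePerson.instances.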

The second argument for #map is implemented only for compatibility with #sum, #count, etc.

@sikachu
Ruby on Rails member

That sounds good then. @tenderlove do you mind looking into this?

@jeremy
Ruby on Rails member

Calling it map gives this feature too much prominence. On connection, it's select_value. How about Person.select_attribute(:id), project.team_members.select_attribute(:role).

@bogdan

select and find already behave this way: if a block is given, they run the Array method, and if no block is given, they use SQL.
https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/finder_methods.rb#L95

I called this method map for consistency.

@jeremy
Ruby on Rails member

What's it consistent with?

Map yields each element to a block and returns an array of the results. Map with &:attr would call attr on all those records. But now map with :attr would return raw column values, having not iterated over the records at all, or called the attr method, which may be overridden.
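
A plain-Ruby sketch of that distinction, assuming a hypothetical FakeMember class whose role reader is overridden. None of these names come from the patch.

```ruby
# Hypothetical model whose reader decorates the stored column value.
class FakeMember
  def initialize(attributes)
    @attributes = attributes
  end

  # Overridden reader: this is what map(&:role) calls on each record.
  def role
    @attributes["role"].to_s.upcase
  end

  # Raw column value: roughly what a direct single-column query returns.
  def raw(column)
    @attributes[column.to_s]
  end
end

MEMBERS = [
  FakeMember.new({ "role" => "admin" }),
  FakeMember.new({ "role" => "dev" })
].freeze

VIA_BLOCK  = MEMBERS.map(&:role)                 # goes through the overridden method
VIA_COLUMN = MEMBERS.map { |m| m.raw(:role) }    # bypasses it
```

Here VIA_BLOCK holds the upcased values and VIA_COLUMN holds the stored ones, so the two spellings are not interchangeable once a reader is overridden.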

This could make sense on the relation itself, but that's just project.

@bogdan

See:

Person.first do |p|
  p.id > 2
end

Person.first(:conditions => "id > 2")

Same behavior for select, first and sum. I understand this is really tricky and not clean, but it exists and people rely on it.

Calling it map would be consistent with this behavior.
If you consider this a weak argument, I would gladly accept the select_value name. In my use case it doesn't matter.

@fxn
Ruby on Rails member

We've discussed this a bit in Campfire. The feature looks good, but "map" is a name that has a well-established meaning in Ruby, and in particular people already use it in collections. Behaving differently depending on an ampersand does not seem to be good.

The name at that level of abstraction could be something related to "attributes". What about "read_attribute"? or "select_attribute"? Something in that line.

@jonleighton
Ruby on Rails member

How about 'pluck' (borrowing from Backbone). Or 'fetch'. I agree about not using map.

@fxn
Ruby on Rails member

The AR query interface guide should also be updated within this PR.

@jeremy
Ruby on Rails member

pluck does capture it nicely

@bogdan

ok. let it be pluck

@bogdan

Renamed.

@vijaydev vijaydev commented on an outdated diff
activerecord/lib/active_record/relation.rb
@@ -12,7 +12,7 @@ module ActiveRecord
include FinderMethods, Calculations, SpawnMethods, QueryMethods, Batches
# These are explicitly delegated to improve performance (avoids method_missing)
- delegate :to_xml, :to_yaml, :length, :collect, :map, :each, :all?, :include?, :to => :to_a
@vijaydev Ruby on Rails member

Why is :map removed here?

@bogdan
bogdan added a note

My mistake.

But it was not covered by the test suite and I didn't notice it.

@vijaydev vijaydev commented on an outdated diff
activerecord/lib/active_record/relation/calculations.rb
@@ -166,6 +166,31 @@ module ActiveRecord
0
end
+
+
+ # This method is designed to perform select by a single column as direct SQL query
+ # Returns <tt>Array</tt> with values of the specied column name
@vijaydev Ruby on Rails member

*specified

@vijaydev
Ruby on Rails member

@bogdan Please update the AR query interface guide as well.

@vijaydev
Ruby on Rails member

Also, can you squash the commits please and change the commit message to reflect the name pluck

@tenderlove tenderlove commented on an outdated diff
activerecord/lib/active_record/relation/calculations.rb
((4 lines not shown))
+
+
+ # This method is designed to perform select by a single column as direct SQL query
+ # Returns <tt>Array</tt> with values of the specied column name
+ # The values has same data type as column.
+ # See +calculate+ for list of supported options.
+ #
+ # Examples:
+ #
+ # Person.pluck(:id) # SELECT people.id FROM people
+ # Person.pluck(:role, :distinct => true) # SELECT DISTINCT role FROM people
+ # Person.pluck(:id, :limit => 5, :conditions => {:confirmed => true})
+ #
+ def pluck(column_name, options = {})
+ distinct = options.delete(:distinct)
+ if options.present?
@tenderlove Ruby on Rails member

Can we just use options.any?? Let's not call AS from AR if we don't need to.
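
For reference, Hash#any? is core Ruby, while present? is added by ActiveSupport. A minimal sketch of the suggested check, with split_options as a hypothetical helper name loosely modelled on the diff above:

```ruby
# Hypothetical helper: pull out :distinct, then report whether other options remain.
def split_options(options)
  options = options.dup                 # avoid mutating the caller's hash
  distinct = options.delete(:distinct)  # nil when the key is absent
  [distinct, options.any?]              # Hash#any? is core Ruby, no ActiveSupport needed
end

split_options({ :distinct => true, :limit => 5 }) # => [true, true]
```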

@bogdan
bogdan added a note

sure

@tenderlove
Ruby on Rails member

pluck seems good. :+1:

@jonleighton
Ruby on Rails member

I don't want us to support arbitrary finder options on this method. We should be moving away from such a style, and therefore not adding it to new parts of the API.

E.g. bad:

Foo.pluck(:bla, :conditions => { :omg => true })

good:

Foo.where(:omg => true).pluck(:bla)

I think we should also add a uniq method to Relation, so we could do Foo.uniq.pluck(:bla), but that's a separate addition so for now the :distinct => true option is ok with me.
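
A rough sketch of why the chained style composes well, using a hypothetical in-memory ToyRelation (not ActiveRecord's Relation): each scope returns a new relation, so pluck itself only needs the column name.

```ruby
# Hypothetical in-memory relation; every scope method returns a new relation.
class ToyRelation
  def initialize(rows, distinct = false)
    @rows = rows
    @distinct = distinct
  end

  # Keep only rows matching every key/value pair in the conditions hash.
  def where(conditions)
    kept = @rows.select { |row| conditions.all? { |key, value| row[key] == value } }
    ToyRelation.new(kept, @distinct)
  end

  # Mark the relation as distinct; applied when values are read.
  def uniq
    ToyRelation.new(@rows, true)
  end

  # Takes only the column name; filtering lives in the scopes above.
  def pluck(column)
    values = @rows.map { |row| row[column] }
    @distinct ? values.uniq : values
  end
end

PEOPLE = [
  { :id => 1, :role => "admin", :confirmed => true },
  { :id => 2, :role => "dev",   :confirmed => false },
  { :id => 3, :role => "admin", :confirmed => true }
].freeze

CONFIRMED_IDS  = ToyRelation.new(PEOPLE).where(:confirmed => true).pluck(:id)
DISTINCT_ROLES = ToyRelation.new(PEOPLE).uniq.pluck(:role)
```

With this shape, adding a new filter means adding a scope method, and the signature of pluck never has to grow an options hash.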

@jonleighton jonleighton commented on an outdated diff
...rd/lib/active_record/associations/collection_proxy.rb
@@ -121,6 +121,10 @@ module ActiveRecord
proxy_association.reload
self
end
+
+ def pluck(*args)
@jonleighton Ruby on Rails member

for consistency, please just add this to the delegate call at the top of the file

@bogdan

Updates:

  • Updated AR query guide
  • Add changelog entry
  • Removed support of :conditions, :order, :limit .... options
  • Squashed commits
  • Removed ActiveSupport dependency

Thanks

@vijaydev
Ruby on Rails member

You have to rebase; the PR can't be merged. Also, in the changelog, it looks like you merged in others' entries as yours?

@bogdan

I did the rebase. After that rebase, the changes appeared in the changelog as a result of resolving the merge conflict.

https://github.com/bogdan/rails/blob/master/activerecord/CHANGELOG

@vijaydev
Ruby on Rails member

Ok. Can you please rebase once again?

@bogdan

done

@vijaydev
Ruby on Rails member

Sorry, but there are two 'AR::Relation#pluck' commits now. Can you please squash and also get rid of that merge commit if possible?

@bogdan

I tried doing that before.
In that case the diff of the pluck commit will include the diff of the merge.
That is probably a bad idea for the history.

@vijaydev
Ruby on Rails member

Needs another rebase :( @tenderlove can this be merged once the rebase is done?

@bogdan

I've updated the PR according to latest changes:

  • Rebase with master
  • Squashed commits
  • Changelog in markdown
  • Support the :distinct option via the #uniq scope (according to @jonleighton's suggestion)
@jonleighton jonleighton commented on an outdated diff
activerecord/lib/active_record/relation/calculations.rb
((7 lines not shown))
+ # Returns <tt>Array</tt> with values of the specified column name
+ # The values has same data type as column.
+ #
+ # Options:
+ #
+ # * <tt>:distinct</tt> - Set this to true to make this a distinct select
+ #
+ # Examples:
+ #
+ # Person.pluck(:id) # SELECT people.id FROM people
+ # Person.pluck(:role, :distinct => true) # SELECT DISTINCT role FROM people
+ # Person.where(:confirmed => true).limit(5).pluck(:id)
+ #
+ def pluck(column_name, options = {})
+ scope = self.select(column_name)
+ scope = scope.uniq if options[:distinct]
@jonleighton Ruby on Rails member

Now that there is a Relation#uniq method, I don't think it's necessary to allow :distinct as an option. So could you remove the options hash please?

@jonleighton Ruby on Rails member

I should add that it's still good to keep the example, but make it Person.uniq.pluck(:role).

@jonleighton jonleighton commented on an outdated diff
.../active_record/associations/collection_association.rb
@@ -171,6 +171,10 @@ module ActiveRecord
end
end
+ def pluck(*args)
@jonleighton Ruby on Rails member

Unless I am missing something, this method doesn't need to exist.

@bogdan

Removed options from code and doc. Removed unneeded method.

@jonleighton jonleighton merged commit 271308c into rails:master
@jeremy
Ruby on Rails member

Great job @bogdan!

@pokonski

Damn, that took a while, guys :o

@jtmkrueger

+1 This looks super useful!

+1 agree

As someone who has literally had to write "Order.select("id").collect(&:id)" four times today, this is a big +1 from me!

I've wanted a method like this for so long! I'm always writing little bits of code to do this. Thanks!

+1 !

Agree with @kennon, no SomeThing.select('col').map(&:col) any more!

+1 Very useful!

Can't try this out at the moment, so here's a question: How would this compare to https://github.com/ernie/valium ? Same thing / functionality?

pluck yeah ;)

+1 really useful piece of code

+1 ;)

Why not several columns? Wouldn't it be as useful? or am I missing something here?

+2 because it's that good

Thanks a lot!

Does it work with serialization like ernie/valium?

@dmitriy-kiriyenko Since no one answered my previous question about how it compares to valium, and judging by the other comments, I'd say no one here actually knew about valium.

@cvshepherd just got to know valium, and I think it's better.

is "pluck" really the right name for this ?

+1: Is "pluck" really the right name for this ?

Seriously, I would propose 'project' as a better name

'project' is just as un-obvious. For a method that returns an array of columns I'd expect the method name to have something to do with getting an array of columns....

Something like 'selective_columns' and extend it to return optionally more than one column.

It's not un-obvious if you've heard of an SQL projection

values_at? =)

But I definitely like project. +1 for project.
We'll have another reserved word to avoid in business code.

+1, but I doubt it'll be changed anyway.

Being able to "pluck" multiple columns would be quite useful as well.

+1 although I think values_at (what valium uses) would be a better name than pluck.

+1, but I actually like "pluck". It's used pretty rarely in everyday speech, and to me seeing it is odd enough that I'll remember it as a method name. -- like "tap", the other most awesome method name ever! Plus, "pluck" is such a fun word!

:+1: , I actually think values_at is better than pluck or project

+1 multiple columns

+1 multiple columns

Ruby on Rails member

bike shed

Green??? What a stupid color for a bikeshed.

But seriously, project is a better name ;)

@whitethunder, I told them, but they don't trust me. Seriously, project is an excellent name. More, I'm looking forward for a method like "user" or "company". Also a great idea would be methods "topic", "post" and "comment". =)

Ruby on Rails member

For those who were asking, it looks like this patch handles serialized columns, because Column#type_cast decodes encoded columns in current master.

This wasn't the case in 3-0-stable, which is why Valium's implementation is (only slightly) more involved.

I agree with @jeremy, though -- this is a whole lot of discussion for a very simple change. In fact, I'd have submitted Valium's implementation as a patch long ago if I'd thought it had a chance to be accepted. One of those things where it was so ridiculously simple that I figured there was a reason it wasn't part of the AR API already. ;)

Ruby on Rails member

Hmm. I take back my comment about working with serialization. It looks like the only place that a Column's coder is being set in the current AR code is in 3 tests in column_definition_test at this point, unless I missed something. It doesn't look like SchemaCache would be the right place to handle this, either.

Anyway, I have a rough version of Valium's take on this ported to a Rails 3.2 patch and passing all but the serialization tests (due to the issue mentioned). I can work out the remaining issues there and submit a value(s)_of implementation for Rails 3.2 if the core team is interested.

Ruby on Rails member

@ernie Cool, yeah, let's see it. Wish we'd known you had Valium already implemented, sorry about that. Pull request came in; didn't look for prior art. Thanks for pitching in in any case.

Ruby on Rails member

@jeremy pushing it up now -- thanks!

Ruby on Rails member

For those visiting this thread: See #3871 for the pull request with alternate implementation supporting serialization, multiple values, etc.

Ruby on Rails member

unicorn

Ruby on Rails member

@tenderlove unicorns, rainbows and ponies would be the "etc" part

Love it! +1

wow, amazing +1

Commits on Nov 30, 2011
  1. @bogdan
9 activerecord/CHANGELOG.md
@@ -1,5 +1,14 @@
## Rails 3.2.0 (unreleased) ##
+
+* Implemented ActiveRecord::Relation#pluck method
+
+ Method returns Array of column value from table under ActiveRecord model
+
+ Client.pluck(:id)
+
+ *Bogdan Gusiev*
+
* Automatic closure of connections in threads is deprecated. For example
the following code is deprecated:
2 activerecord/lib/active_record/associations/collection_proxy.rb
@@ -39,7 +39,7 @@ class CollectionProxy # :nodoc:
instance_methods.each { |m| undef_method m unless m.to_s =~ /^(?:nil\?|send|object_id|to_a)$|^__|^respond_to|proxy_/ }
delegate :group, :order, :limit, :joins, :where, :preload, :eager_load, :includes, :from,
- :lock, :readonly, :having, :to => :scoped
+ :lock, :readonly, :having, :pluck, :to => :scoped
delegate :target, :load_target, :loaded?, :scoped,
:to => :@association
2 activerecord/lib/active_record/base.rb
@@ -449,7 +449,7 @@ class << self # Class methods
delegate :select, :group, :order, :except, :reorder, :limit, :offset, :joins,
:where, :preload, :eager_load, :includes, :from, :lock, :readonly,
:having, :create_with, :uniq, :to => :scoped
- delegate :count, :average, :minimum, :maximum, :sum, :calculate, :to => :scoped
+ delegate :count, :average, :minimum, :maximum, :sum, :calculate, :pluck, :to => :scoped
def inherited(child_class) #:nodoc:
# force attribute methods to be higher in inheritance hierarchy than other generated methods
17 activerecord/lib/active_record/relation/calculations.rb
@@ -166,6 +166,23 @@ def calculate(operation, column_name, options = {})
0
end
+ # This method is designed to perform select by a single column as direct SQL query
+ # Returns <tt>Array</tt> with values of the specified column name
+ # The values has same data type as column.
+ #
+ # Examples:
+ #
+ # Person.pluck(:id) # SELECT people.id FROM people
+ # Person.uniq.pluck(:role) # SELECT DISTINCT role FROM people
+ # Person.where(:confirmed => true).limit(5).pluck(:id)
+ #
+ def pluck(column_name)
+ scope = self.select(column_name)
+ self.connection.select_values(scope.to_sql).map! do |value|
+ type_cast_using_column(value, column_for(column_name))
+ end
+ end
+
private
def perform_calculation(operation, column_name, options = {})
25 activerecord/test/cases/calculations_test.rb
@@ -1,5 +1,6 @@
require "cases/helper"
require 'models/company'
+require "models/contract"
require 'models/topic'
require 'models/edge'
require 'models/club'
@@ -446,4 +447,28 @@ def test_distinct_is_honored_when_used_with_count_operation_after_group
distinct_authors_for_approved_count = Topic.group(:approved).count(:author_name, :distinct => true)[true]
assert_equal distinct_authors_for_approved_count, 2
end
+
+ def test_pluck
+ assert_equal [1,2,3,4], Topic.order(:id).pluck(:id)
+ end
+
+ def test_pluck_type_cast
+ topic = topics(:first)
+ relation = Topic.where(:id => topic.id)
+ assert_equal [ topic.approved ], relation.pluck(:approved)
+ assert_equal [ topic.last_read ], relation.pluck(:last_read)
+ assert_equal [ topic.written_on ], relation.pluck(:written_on)
+
+ end
+
+ def test_pluck_and_uniq
+ assert_equal [50, 53, 55, 60], Account.order(:credit_limit).uniq.pluck(:credit_limit)
+ end
+
+ def test_pluck_in_relation
+ company = Company.first
+ contract = company.contracts.create!
+ assert_equal [contract.id], company.contracts.pluck(:id)
+ end
+
end
9 railties/guides/source/active_record_querying.textile
@@ -1146,6 +1146,15 @@ h3. +select_all+
Client.connection.select_all("SELECT * FROM clients WHERE id = '1'")
</ruby>
+h3. +pluck+
+
+<tt>pluck</tt> can be used to query single column from table under model. It accepts column name as argument and returns Array of values of the specified column with corresponding data type.
+
+<ruby>
+Client.where(:active => true).pluck(:id) # SELECT id FROM clients WHERE clients.active
+Client.uniq.pluck(:role) # SELECT DISTINCT role FROM clients
+</ruby>
+
h3. Existence of Objects
If you simply want to check for the existence of the object there's a method called +exists?+. This method will query the database using the same query as +find+, but instead of returning an object or collection of objects it will return either +true+ or +false+.