Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add value_of/values_of to AR::Base/AR::Relation #3871

Closed
wants to merge 1 commit into from
Closed

Add value_of/values_of to AR::Base/AR::Relation #3871

wants to merge 1 commit into from

Conversation

ernie
Copy link
Contributor

@ernie ernie commented Dec 6, 2011

This patch basically amounts to a merge of Valium functionality into Rails core. Implements select_column/select_columnsfunctionality as exists in https://github.com/ernie/valium.

Superset of the recent pluck functionality, with the following enhancements:

  • Only runs the DB query if required. If the records have already been loaded, simply maps through them and reads the attributes
  • Supports multiple attributes:
User.where(:active => true).select_columns(:name, :email).each do |name, email|
  puts "#{name}'s e-mail address is #{email}"
end
  • deserializes serialized attributes

Added a few new tests, corrected test names. All tests pass.

[Updated: this part is no longer true since the method names were changed - All Valium specs pass without loading the Valium gem when this code is active, so this should be backwards-compatible for those people who are already using Valium.]

Thanks!

@bcardarella
Copy link
Contributor

Much nicer than pluck

@josevalim
Copy link
Contributor

/cc @dhh

If we are going to change the name from pluck to values_of, we should also remove pluck from activesupport.

@gmile
Copy link
Contributor

gmile commented Dec 6, 2011

👍

@fabianoalmeida
Copy link

Sounds good! 👍

@bobbytables
Copy link

+1

@ernie
Copy link
Contributor Author

ernie commented Dec 6, 2011

@josevalim as Much as I dislike the name "pluck" I do see it's been added to AS as well. I could rewrite this to use the method name "pluck" and accept 1 or more parameters easily enough, though I do prefer the value(s)_of syntax.

@allochi
Copy link

allochi commented Dec 6, 2011

+1, I like this :)

@dpickett
Copy link
Contributor

dpickett commented Dec 6, 2011

+1 and values_of is more expressive, imo

@ghost
Copy link

ghost commented Dec 6, 2011

+1, Awesome work Ernie!

@jonleighton
Copy link
Member

Thanks for the patch.

To be honest I am not massively keen on this myself though. I like pluck because it's pretty damn simple. I think other complexities to deal with serialisation etc are not sufficiently common that they should be handled in Rails core. Plus, I am loathe to add another place where we have to deal with attribute stuff.

I'd be up for having pluck take multiple arguments, though. (e.g. Person.pluck(:name, :age) #=> [["Jon", 22], ...])

I'd be pleased to hear actual use cases rather than a chorus of "+1" :)

@bcardarella
Copy link
Contributor

@jonleighton my issue with #pluck is I have no idea what the method does. When I see #values_of I immediately know what this does. It's a big readability factor that #pluck does not have.

@jonleighton
Copy link
Member

@bcardarella ok, but that's kinda unrelated to this patch, which actually changes the implementation substantially.

@ernie
Copy link
Contributor Author

ernie commented Dec 6, 2011

My 2¢:

If we don't support deserialization, then we don't achieve even the
basic goal of behaving like the ActiveSupport pluck method. Asking for
the attribute value should behave the same as if you called the reader
from an instantiated object.

Additionally, much of the deserialization complexity could be later
factored out if the intent is to set each Column's @coder at some
point. The code that exists thus far certainly seems to imply this was
the direction we were heading in (see Column#type_cast).

In short, AR#pluck isn't sufficiently like Enumerable#pluck to share
its name, in any case.

On Dec 6, 2011, at 5:18 PM, Jon Leighton
reply@reply.github.com
wrote:

Thanks for the patch.

To be honest I am not massively keen on this myself though. I like pluck because it's pretty damn simple. I think other complexities to deal with serialisation etc are not sufficiently common that they should be handled in Rails core. Plus, I am loathe to add another place where we have to deal with attribute stuff.

I'd be up for having pluck take multiple arguments, though. (e.g. Person.pluck(:name, :age) #=> [["Jon", 22], ...])

I'd be pleased to hear actual use cases rather than a chorus of "+1" :)


Reply to this email directly or view it on GitHub:
#3871 (comment)

@exviva
Copy link
Contributor

exviva commented Dec 6, 2011

@ernie values_of doesn't behave like Enumerable#pluck anyway, since it doesn't support instance methods, e.g.:

class Person < ActiveRecord::Base
  def full_name
    "#{first_name} #{last_name}"
  end
end

full_names = Person.pluck(:full_name) # yikes!

I understand that this is by design, in order to avoid instantiating AR objects, but I guess this use case is much more common than serializable columns, and it's equally inconsistent between AR and AS.

So if achieving exactly the same semantics as AS#pluck is impossible, maybe the whole feature doesn't make sense?

@ernie
Copy link
Contributor Author

ernie commented Dec 7, 2011

@exviva you've got me, there. So let me amend a bit. Insofar as it's possible to provide similar functionality to pluck while still achieving our goal of avoiding AR instantiation overhead, we should. Deserialization can be done without it.

And, in any case, values_of avoids the expectation entirely by not being called "pluck". ;)

@exviva
Copy link
Contributor

exviva commented Dec 7, 2011

The way I see it, there are two goals we're trying to achieve here:

  1. provide convenient syntax sugar for User.map(&:email), possibly with variable argument number, regardless whether :email is a column or an instance method
  2. provide a lower-level resource-efficient way of projecting an SQL select to a subset of columns

I guess both are "kind of" useful, although mostly for hacking in the console. Most of the time we don't want to access models in isolation, without their business methods, associations, validations, etc.

I'd venture an opinion that the second goal is more valuable for most applications, in a specific context (e.g. batch jobs). How about we name it select_columns, project_columns, map_columns, projection, or something similar? It sounds more low-level (as it should for that specific context).

Honestly, I've been missing that kind of syntax goodness like pluck for months now, but maybe it's not worth to introduce such an API just for the sake of it.

@jaylevitt
Copy link
Contributor

As a valium addict... Wait. As someone who uses Valium every day on the job... hmm. From what I've heard on the street, Valium is invaluable for "I need to collect a bunch of IDs to use in a WHERE x IN (...)" clause, where there's no good way to do the whole query in Arel without instantiating all the objects. In fact, I'm not sure #1 and #2 are separate use cases; you start with User.map(&:email) to see if it gets you in the right direction, and you end up three has_many :through later, caching the foreign keys so you can get back into Arel-world.

I haven't followed the serializer stuff yet, but if that is moving toward "column of Postgres type X should really instantiate Rubjy objjects of type X1 like so"... that seems nirvana-like.

@bogdan
Copy link
Contributor

bogdan commented Dec 7, 2011

I didn't implement multiple columns for pluck because of the same reason that @jonleighton described:
No real world use case.

@ernie
Copy link
Contributor Author

ernie commented Dec 7, 2011

@bogdan I don't want to speak for @jonleighton but I believe he was referring to the serialized column part, not the multiple column part. The latter's use case is self-evident. One such case is for large pages of records in administrative interfaces and reporting. Even find_in_batches won't beat the performance/memory savings of Valium.

I'd propose the former is an issue of correctness more than necessity, but again, in such cases as those outlined above, it would be useful to get access to attributes stored in a serialized column (user preferences, for instance) in their intended object.

@bogdan
Copy link
Contributor

bogdan commented Dec 7, 2011

IMHO. large administrative pages is not under Rails core scope.

@ernie
Copy link
Contributor Author

ernie commented Dec 7, 2011

@bogdan I'm not even sure what to say to that. Are you just trolling,
now? Ever since DHH did the very first Rails screencast illustrating
scaffolding an admin page for blogs, providing the underpinnings for
creating admin pages (large or otherwise) has been under Rails core's
scope.

Giving just one use case doesn't mean it's the only use case. It's
useful at the model layer, as well.

That's like saying that the cycle view helper doesn't belong in
rails because admin pages use it sometimes.

On Dec 7, 2011, at 7:50 AM, Bogdan Gusiev
reply@reply.github.com
wrote:

Don't think that large administrative pages is under Rails core scope.


Reply to this email directly or view it on GitHub:
#3871 (comment)

@amalagaura
Copy link

An extension of this sort should not be called 'pluck' because Enumerable.pluck specifically works on methods of the enumerable. I don't think the active support pluck which is already written supports that. If it does, it must be instantiating, correct? Otherwise there is no way to run an instance method.

I vote we should use value_of and values_of and use the functionality of the valerium gem. I can't speak about the deserialization because I don't know enough of that.

To summarize

  1. name pluck doesn't make sense because it implies instantiation and pulling value of any method
  2. value_of with only 1 field is incomplete, valerium already has values_of, so we should just use those

@chaffeqa
Copy link
Contributor

chaffeqa commented Dec 8, 2011

@ernie I am a Valium... fan... as well, and I love when I can use it.

However, my problem is: when do you draw the line between Rails Core and a slick Gem?

I am a fan of keeping Rails codebase useful yet not bloated. I am torn, but I feel this should just remain a brilliant gem that someone can simply add to their Gemfile if they have the correct use-case.

In unrelated news: I ❤️ Valium... thanks a ton @ernie for the super gem!

@jaylevitt
Copy link
Contributor

One advantage of putting it in core is that IIRC there are things it can't quite pull off (values off joins, maybe?) without monkeypatching AR. Ernie?

In general, I think it's valuable for core to provide both a query engine (arel), a full ORM (AR), and a way to fetch lightweight values through the same query engine. These seem nicely complementary, and promoting Valium to a first-class citizen could well relieve some of the reputation for GC memory pressure that Rails has accreted.

@amalagaura
Copy link

@chaffeqa Yes the Rails commit for pluck is very small and is limited to 1 value.

The kind of thing that valium (or pluck) does I think should be part of a regular activerecord adapter. Because it is giving you lower level access to your activerecord data. It is not so much of a higher abstraction (like a gem to make your model a tree, or auditing, or storing at attachment, etc.)

So since it is giving you lower level access help you optimize, I think core is a good place.

I still don't like the name pluck, because it has different meaning in enumerable (where objects are already instantiated). And since they decided they will do 1 column, why not more?

@chaffeqa
Copy link
Contributor

chaffeqa commented Dec 8, 2011

@jaylevitt I love the idea of putting it in core! :thumbs:
However, would the implementation change?

@amalagaura ok you convinced me, that is a good point concerning higher vs. lower abstraction

+1 for putting it in core

@ernie
Copy link
Contributor Author

ernie commented Dec 14, 2011

@jeremy So, this is the pull request I promised to post at your request on the pluck thread. I'm happy to close it if it doesn't stand a chance of getting applied -- could you just let me know if there's any hope? ;)

@aaronchi
Copy link
Contributor

Personally, If you are going to add a 'pluckish' method to Rails 3.2, I would like to see the valium implementation make it in.

@ernie
Copy link
Contributor Author

ernie commented Dec 20, 2011

3.2rc is out. Should we assume we're all "plucked"?

@capitalist
Copy link

@bogdan A perfect use case for multiple columns in pluck/values_of is providing an array of arrays to options for select.

@amalagaura
Copy link

I think they are really shipping an incomplete and less than fully thought out feature with this 'pluck' And rolling it back will be next to impossible now that it is in beta. That use case is absolutely correct with options for select!

There is limited functionality with pluck, maybe it is enough for some people and they don't want to put more stuff in core.

@jeremy
Copy link
Member

jeremy commented Dec 20, 2011

I'm sympathetic to @jonleighton's concerns about complexity and serialized attr handling, but looking at the implementation, it's not adding much, and everyone will expect serialized attrs to "just work."

Having separate methods makes sense for consistent return values, too (Array of values vs Array of Arrays of values).

Writing out account.users.value_of :id still reads oddly to me. It's asking for a singular value of... a collection. account.users.pluck :id has a good, natural feel in comparison. Maybe a more explicit action would clear this up, like account.users.select_value :id to match the Connection API.

@ernie
Copy link
Contributor Author

ernie commented Dec 20, 2011

@jeremy I'd be happy to alter this to select_value/select_values if it means the difference between getting merged or not.

@jonleighton
Copy link
Member

Ok... there are several issues here...

  1. I am persuaded that this ought to work with serialization. However I am not keen on the proposed code for doing this. I have spent quite a bit of time trying to unify the AR attributes code recently and this feels like it's taking a step backwards. I have a half-written patch to make pluck work with serialization using existing code paths, which I'll aim to finish over the next few days. I think this aspect is uncontroversial so we can put it in 3.2.
  2. I am not any more keen on select_value as a name than value_of - both of them sound like they return a single value when in fact they do not. pluck is by far the most natural name that has been considered IMO.
  3. There's a question of whether we should support multiple values. This seems ok to me though I don't think the feature is 'broken' without it, so it doesn't seem crucial to solve for 3.2. The main difficulty is naming/API. Supposing we stuck with pluck for the singular, I potentially think pluck_attributes could work, as 'attributes' is synonymous with 'hash of attribute/value pairs' in AR.

So I would like to get serialization working, leave the name as pluck, and come back to the question of multiple attributes after 3.2 unless a consensus emerges about how the API should be.

@jonleighton
Copy link
Member

/cc @dhh please give your thoughts about the API stuff

@ernie
Copy link
Contributor Author

ernie commented Dec 20, 2011

So, select_value/select_values is out for the reason that should have seemed obvious at first -- namely, select_values is the array of selects for relations. I'm working on a version using select_column/select_columns now, and rebasing to your recent patch breaking out AR#Base into modules..

I think from a development perspective, it's easy enough to use a single method name (like pluck) and splat all incoming params and determine whether there is more than one to invoke the multiple column handling, or behave differently depending on whether the param given is an array or not if we're concerned about allocating a new array with the splat.

From an API perspective, though, two method names, with specific return values (array/nested array) makes the most sense.

@ernie
Copy link
Contributor Author

ernie commented Dec 20, 2011

Updated to work with latest 3-2-atable, change method names along the lines of @jeremy's suggestion.

@ernie
Copy link
Contributor Author

ernie commented Dec 20, 2011

Bleh. GitHub didn't like this rebase very much. Hang tight, fixing.

@masterkain
Copy link
Contributor

select_column/s feels much more natural to me and I bet this applies to someone else.

I think there's room for @ernie work as it is actually super useful in my eyes as it's a nice addition to the AR core, my 2 cents.

@aaronchi
Copy link
Contributor

As a native english speaker, I have to admit that 'pluck' is a bit weird to me as well as it makes me think of chickens more than it does development :)

That said, I'll let the core team figure this out. In the end, the name is not that important but would love to see ernie get some recognition for his work.

@ernie
Copy link
Contributor Author

ernie commented Dec 21, 2011

@aaronchi awfully nice of you to say, but I'm just hoping for the best possible implementation to make it into 3.2 final. Plus, hey, one less gem for me to maintain. :)

@aaronchi
Copy link
Contributor

To be honest, I'm really just trying to get some meta_where/squeel functionality included in the next release of AR. I hate writing SQL in Rails ;)

@jonleighton
Copy link
Member

Hi folks,

Thanks everyone for input on this.

I've just pushed some code that makes pluck work with serialized attributes, but using existing code paths.

We have discussed the name change and the possibility of supporting multiple attributes, and came to the following conclusions:

  • We're happy with "pluck" as a name. There have been some valid reservations about this but ultimately it's just a name, and one that is already used by Underscore.js and Prototype.
  • At the moment, we're not going to add support for plucking multiple attributes. It was felt that plucking a single attribute is going to be 99% of the use case. Those who need multiple attributes can of course create a plugin. And we can come back to this question in the future if necessary.

I'd be happy to review a patch to map the existing records if they are already loaded.

Thanks for your contributions.

@ernie
Copy link
Contributor Author

ernie commented Dec 22, 2011

Thanks, Jon. I'll take a look at whatever refactoring went into getting the column doing deserialization and update Valium accordingly. Looks like it'll live through 3.2 at least -- haven't checked as I'm on the road right now but I assume it's a continuation of the work that looked to be partially done toward setting the coder on serialized columns?

Thanks again.

@asterite
Copy link

Hi. In several ocassions I needed to pluck many attributes, not just a single one. And seeing people using the valium gem, I don't understand why you conclude that "plucking a single value is going to be 99% of the use case". When I need more performance it hurts me to go to my Gemfile and add something more. Rails should give me that efficiency out of the box. And it didn't cost much to accept the pull request. :-(

@amalagaura
Copy link

@asterite @jonleighton I don't understand how select boxes are 1% of use cases in selecting attributes. In my Rails 3.2.1 app I still have the valium gem because I have select boxes. Am I the 1%? (joke)

@schmurfy
Copy link

I need that feature for 1 attribute nearly as much as I do for multiple attributes, it just feels stupid to have something partly implemented in core. In this case why even bother adding to rails instead of keeping it in a separate gem like valium which covers 100% of the use cases then ?

I surely missed the logic there, I especially love when random percentages are thrown in the discussion coming from nowhere ;)

@deefour
Copy link

deefour commented Feb 10, 2012

You can add my +1 too for having @ernie's work integrated in place of .pluck.

@ernie
Copy link
Contributor Author

ernie commented Feb 10, 2012

Hey guys, I appreciate all the warm fuzzies here, but the core team has made their opinion known some time ago, and Valium continues to work just fine and to be maintained -- I'd say we would do best to drop this thread for now and let folks focus on updates to 3.2 and 4.0 development.

Thanks again, though!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet