Add `Hash#map_values` to ActiveSupport to simplify a common pattern #15819

Merged
merged 1 commit into from Jun 29, 2014

Conversation

Projects
None yet
7 participants
@sgrif
Member

sgrif commented Jun 19, 2014

Keys remain the same, but the values change.

Didn't get a chance to convert existing code, I'll skim through the code base to make use of this later this afternoon.

@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Jun 19, 2014

Member

@chancancode I remember you wanted to add this some time ago.

@jeremy I'm not good with names WDYT?

Member

rafaelfranca commented Jun 19, 2014

@chancancode I remember you wanted to add this some time ago.

@jeremy I'm not good with names WDYT?

@chancancode

This comment has been minimized.

Show comment
Hide comment
@chancancode

chancancode Jun 19, 2014

Member

@jeremy / @zzak told me we can use Enumerable#to_h on 2.1 (?)... Maybe we can backport that? 😄

Member

chancancode commented Jun 19, 2014

@jeremy / @zzak told me we can use Enumerable#to_h on 2.1 (?)... Maybe we can backport that? 😄

@egilburg

This comment has been minimized.

Show comment
Hide comment
@egilburg

egilburg Jun 19, 2014

Contributor

Hash has method transform_keys (also the ! and deep_ friends). Should this method be consistently named transform_values ?

Contributor

egilburg commented Jun 19, 2014

Hash has method transform_keys (also the ! and deep_ friends). Should this method be consistently named transform_values ?

@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Jun 19, 2014

Member

Urgh! This make me think this method should be private API for Rails. I really don't want to have a deep_transform_values!

Member

rafaelfranca commented Jun 19, 2014

Urgh! This make me think this method should be private API for Rails. I really don't want to have a deep_transform_values!

@chancancode

This comment has been minimized.

Show comment
Hide comment
@chancancode

chancancode Jun 19, 2014

Member
  • 👍 on having a method to do this
  • 👍 on transform_values as a better name
  • 👎 on making it private – I needed this in a lot of apps/scripts I wrote
  • 👍 that deep_transform_values doesn't make any sense and we should never attempt to provide that

Perhaps we can...

  1. Backport Enumerable#to_h
  2. Add Hash#transform and perhaps implement that using Enumerable#to_h
  3. Implement Hash#*transform_* using Hash#transform?
Member

chancancode commented Jun 19, 2014

  • 👍 on having a method to do this
  • 👍 on transform_values as a better name
  • 👎 on making it private – I needed this in a lot of apps/scripts I wrote
  • 👍 that deep_transform_values doesn't make any sense and we should never attempt to provide that

Perhaps we can...

  1. Backport Enumerable#to_h
  2. Add Hash#transform and perhaps implement that using Enumerable#to_h
  3. Implement Hash#*transform_* using Hash#transform?
@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 19, 2014

Member

I am 👍 on backporting Enumerable#to_h, but I still think this provides a useful API even with #to_h, (this version will also be more performant than one implemented with #to_h).

Member

sgrif commented Jun 19, 2014

I am 👍 on backporting Enumerable#to_h, but I still think this provides a useful API even with #to_h, (this version will also be more performant than one implemented with #to_h).

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 19, 2014

Member

What do you think about renaming transform_keys to map_keys, leaving transform_ as an alias for now?

Member

sgrif commented Jun 19, 2014

What do you think about renaming transform_keys to map_keys, leaving transform_ as an alias for now?

@egilburg

This comment has been minimized.

Show comment
Hide comment
@egilburg

egilburg Jun 19, 2014

Contributor

Is the behavior of map_keys identical to keys.map? Because keys.each is identical to .each_key (save for performance difference), so IMO 👎 on naming it map_keys if it behaves differently.

Contributor

egilburg commented Jun 19, 2014

Is the behavior of map_keys identical to keys.map? Because keys.each is identical to .each_key (save for performance difference), so IMO 👎 on naming it map_keys if it behaves differently.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 19, 2014

Member

No, it is not. See the API docs for transform_keys...

Member

sgrif commented Jun 19, 2014

No, it is not. See the API docs for transform_keys...

@egilburg

This comment has been minimized.

Show comment
Hide comment
@egilburg

egilburg Jun 20, 2014

Contributor

That's what I thought, hence 👎 on having keys.map be different from map_keys, while each_key is the same as keys.each

Contributor

egilburg commented Jun 20, 2014

That's what I thought, hence 👎 on having keys.map be different from map_keys, while each_key is the same as keys.each

@chancancode

This comment has been minimized.

Show comment
Hide comment
@chancancode

chancancode Jun 20, 2014

Member

@sgrif I'm indifferent about the two as a name, but since I wanted a version that allows remapping of both the key and value for cases like these, I prefer the name Hash#transform_* since Hash#map is already take and Hash#transform is not.

Member

chancancode commented Jun 20, 2014

@sgrif I'm indifferent about the two as a name, but since I wanted a version that allows remapping of both the key and value for cases like these, I prefer the name Hash#transform_* since Hash#map is already take and Hash#transform is not.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 20, 2014

Member

I'm fine with that.
On Jun 19, 2014 6:54 PM, "Godfrey Chan" notifications@github.com wrote:

@sgrif https://github.com/sgrif I'm indifferent about the two as a
name, but since I wanted a version that allows remapping of both the key
and value for cases like these
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/json/encoding.rb#L86,
I prefer the name Hash#transform_* since Hash#map is already take and
Hash#transform is not.


Reply to this email directly or view it on GitHub
#15819 (comment).

Member

sgrif commented Jun 20, 2014

I'm fine with that.
On Jun 19, 2014 6:54 PM, "Godfrey Chan" notifications@github.com wrote:

@sgrif https://github.com/sgrif I'm indifferent about the two as a
name, but since I wanted a version that allows remapping of both the key
and value for cases like these
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/json/encoding.rb#L86,
I prefer the name Hash#transform_* since Hash#map is already take and
Hash#transform is not.


Reply to this email directly or view it on GitHub
#15819 (comment).

@matthewd

This comment has been minimized.

Show comment
Hide comment
@matthewd

matthewd Jun 20, 2014

Member

Strong 👎 on renaming an existing ActiveSupport method just because its precedent doesn't match your new method's proposed name 😐

I think avoiding "map" would be a better move anyway: Hash#map establishes a precedent for how "mappy" things work on Hash, and returning a new hash ain't it. Given that constraint, transform_values does sound like a supportable choice.

Hash#transform excites me much less: that's array-happy by definition. To me, the thing that recommends transform_values (or whatever you call it) is that the most obvious alternative spelling performs far worse. If you're looking for a transform, you should probably be using an each_pair_with_object instead.

Member

matthewd commented Jun 20, 2014

Strong 👎 on renaming an existing ActiveSupport method just because its precedent doesn't match your new method's proposed name 😐

I think avoiding "map" would be a better move anyway: Hash#map establishes a precedent for how "mappy" things work on Hash, and returning a new hash ain't it. Given that constraint, transform_values does sound like a supportable choice.

Hash#transform excites me much less: that's array-happy by definition. To me, the thing that recommends transform_values (or whatever you call it) is that the most obvious alternative spelling performs far worse. If you're looking for a transform, you should probably be using an each_pair_with_object instead.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 20, 2014

Member

I'm fine with transform_values. My reason for map_keys was less because
of this method and more because of what map means in every language. I
will submit transform as a separate PR so the discussion on that method
can happen there.
On Jun 19, 2014 7:47 PM, "Matthew Draper" notifications@github.com wrote:

Strong [image: 👎] on renaming an existing ActiveSupport method just
because its precedent doesn't match your new method's proposed name [image:
😐]

I think avoiding "map" would be a better move anyway: Hash#map
establishes a precedent for how "mappy" things work on Hash, and returning
a new hash ain't it. Given that constraint, transform_values does sound
like a supportable choice.

Hash#transform excites me much less: that's array-happy by definition. To
me, the thing that recommends transform_values (or whatever you call it)
is that the most obvious alternative spelling performs far worse. If you're
looking for a transform, you should probably be using an
each_pair_with_object instead.


Reply to this email directly or view it on GitHub
#15819 (comment).

Member

sgrif commented Jun 20, 2014

I'm fine with transform_values. My reason for map_keys was less because
of this method and more because of what map means in every language. I
will submit transform as a separate PR so the discussion on that method
can happen there.
On Jun 19, 2014 7:47 PM, "Matthew Draper" notifications@github.com wrote:

Strong [image: 👎] on renaming an existing ActiveSupport method just
because its precedent doesn't match your new method's proposed name [image:
😐]

I think avoiding "map" would be a better move anyway: Hash#map
establishes a precedent for how "mappy" things work on Hash, and returning
a new hash ain't it. Given that constraint, transform_values does sound
like a supportable choice.

Hash#transform excites me much less: that's array-happy by definition. To
me, the thing that recommends transform_values (or whatever you call it)
is that the most obvious alternative spelling performs far worse. If you're
looking for a transform, you should probably be using an
each_pair_with_object instead.


Reply to this email directly or view it on GitHub
#15819 (comment).

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 20, 2014

Member

Updated.

Member

sgrif commented Jun 20, 2014

Updated.

@@ -6,3 +6,4 @@
require 'active_support/core_ext/hash/keys'
require 'active_support/core_ext/hash/reverse_merge'
require 'active_support/core_ext/hash/slice'
+require 'active_support/core_ext/hash/transform_values'

This comment has been minimized.

@egilburg

egilburg Jun 20, 2014

Contributor

transform_keys and friends live in /core_ext/hash/keys.rb. To be consistent, this file could be called /core_ext/hash/values.rb.

@egilburg

egilburg Jun 20, 2014

Contributor

transform_keys and friends live in /core_ext/hash/keys.rb. To be consistent, this file could be called /core_ext/hash/values.rb.

+ # { a: 1, b: 2, c: 3 }.transform_values { |x| x * 2 }
+ # # => { a: 2, b: 4, c: 6 }
+ def transform_values(&block)
+ dup.transform_values!(&block)

This comment has been minimized.

@egilburg

egilburg Jun 20, 2014

Contributor

Given that this is a lower-level utility function, I think it's worth optimizing it to not use .dup:

require 'active_support/all'

class Hash
  def transform_values_without_dup
    new_hash = self.class.new
    new_hash.default = default
    new_hash.default_proc = default_proc if default_proc

    each do |key, value|
      new_hash[key] = yield(value)
    end

    new_hash
  end

  def transform_values(&block)
    dup.transform_values!(&block)
  end

  def transform_values!
    each do |key, value|
      self[key] = yield(value)
    end
  end
end

require 'benchmark'

HASH = { a: 'foo', b: 'bar', c: 'baz' }
HASH_WITH_DEFAULT = { a: 'foo', b: 'bar', c: 'baz' }; 
HASH_WITH_DEFAULT.default = :abc
HASH_WITH_DEFAULT_PROC = { a: 'foo', b: 'bar', c: 'baz' }; 
HASH_WITH_DEFAULT_PROC.default_proc = Proc.new { 'def' }
HWIA = { a: 'foo', b: 'bar', c: 'baz' }.with_indifferent_access

REPEAT = 100_000

Benchmark.bmbm do |x|
  x.report("transform_values") do 
    REPEAT.times do
      HASH.transform_values { |v| v + '!' }
      HASH_WITH_DEFAULT.transform_values { |v| v + '!' }
      HASH_WITH_DEFAULT_PROC.transform_values { |v| v + '!' }
      HWIA.transform_values { |v| v + '!' }
    end
  end

  x.report("transform_values_without_dup") do
    REPEAT.times do
      HASH.transform_values_without_dup { |v| v + '!' }
      HASH_WITH_DEFAULT.transform_values_without_dup { |v| v + '!' }
      HASH_WITH_DEFAULT_PROC.transform_values_without_dup { |v| v + '!' }
      HWIA.transform_values_without_dup { |v| v + '!' }
    end
  end
end
Rehearsal ----------------------------------------------------------------
transform_values               2.700000   0.020000   2.720000 (  2.719595)
transform_values_without_dup   2.280000   0.000000   2.280000 (  2.286633)
------------------------------------------------------- total: 5.000000sec

                                   user     system      total        real
transform_values               2.580000   0.000000   2.580000 (  2.573546)
transform_values_without_dup   2.290000   0.010000   2.300000 (  2.297113)
=> [  2.580000   0.000000   2.580000 (  2.573546)
,   2.2900

Not using dup gives ~20% performance gain.

@egilburg

egilburg Jun 20, 2014

Contributor

Given that this is a lower-level utility function, I think it's worth optimizing it to not use .dup:

require 'active_support/all'

class Hash
  def transform_values_without_dup
    new_hash = self.class.new
    new_hash.default = default
    new_hash.default_proc = default_proc if default_proc

    each do |key, value|
      new_hash[key] = yield(value)
    end

    new_hash
  end

  def transform_values(&block)
    dup.transform_values!(&block)
  end

  def transform_values!
    each do |key, value|
      self[key] = yield(value)
    end
  end
end

require 'benchmark'

HASH = { a: 'foo', b: 'bar', c: 'baz' }
HASH_WITH_DEFAULT = { a: 'foo', b: 'bar', c: 'baz' }; 
HASH_WITH_DEFAULT.default = :abc
HASH_WITH_DEFAULT_PROC = { a: 'foo', b: 'bar', c: 'baz' }; 
HASH_WITH_DEFAULT_PROC.default_proc = Proc.new { 'def' }
HWIA = { a: 'foo', b: 'bar', c: 'baz' }.with_indifferent_access

REPEAT = 100_000

Benchmark.bmbm do |x|
  x.report("transform_values") do 
    REPEAT.times do
      HASH.transform_values { |v| v + '!' }
      HASH_WITH_DEFAULT.transform_values { |v| v + '!' }
      HASH_WITH_DEFAULT_PROC.transform_values { |v| v + '!' }
      HWIA.transform_values { |v| v + '!' }
    end
  end

  x.report("transform_values_without_dup") do
    REPEAT.times do
      HASH.transform_values_without_dup { |v| v + '!' }
      HASH_WITH_DEFAULT.transform_values_without_dup { |v| v + '!' }
      HASH_WITH_DEFAULT_PROC.transform_values_without_dup { |v| v + '!' }
      HWIA.transform_values_without_dup { |v| v + '!' }
    end
  end
end
Rehearsal ----------------------------------------------------------------
transform_values               2.700000   0.020000   2.720000 (  2.719595)
transform_values_without_dup   2.280000   0.000000   2.280000 (  2.286633)
------------------------------------------------------- total: 5.000000sec

                                   user     system      total        real
transform_values               2.580000   0.000000   2.580000 (  2.573546)
transform_values_without_dup   2.290000   0.010000   2.300000 (  2.297113)
=> [  2.580000   0.000000   2.580000 (  2.573546)
,   2.2900

Not using dup gives ~20% performance gain.

This comment has been minimized.

@sgrif

sgrif Jun 20, 2014

Member

Not using dup would cause default values/default procs to be lost.

@sgrif

sgrif Jun 20, 2014

Member

Not using dup would cause default values/default procs to be lost.

This comment has been minimized.

@egilburg

egilburg Jun 20, 2014

Contributor

Updated with copying defaults, average performance gain is about ~15% now.

Arguably using .dup is more friendly to custom hash subclasses that add other behavior and rely on .dup to clone it.

Whether .dup is used or not, there should also be tests for the disrepancy you caught (ensuring defaults are copied over when using transform_values.

@egilburg

egilburg Jun 20, 2014

Contributor

Updated with copying defaults, average performance gain is about ~15% now.

Arguably using .dup is more friendly to custom hash subclasses that add other behavior and rely on .dup to clone it.

Whether .dup is used or not, there should also be tests for the disrepancy you caught (ensuring defaults are copied over when using transform_values.

This comment has been minimized.

@sgrif

sgrif Jun 29, 2014

Member

Updated not to use dup. Opted not to copy default, default_proc, etc to be consistent with Ruby. Ruby only copies them on reject, but not select, currently, and that behavior is deprecated (will be removed in 2.2).

@sgrif

sgrif Jun 29, 2014

Member

Updated not to use dup. Opted not to copy default, default_proc, etc to be consistent with Ruby. Ruby only copies them on reject, but not select, currently, and that behavior is deprecated (will be removed in 2.2).

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 21, 2014

Member

Submitted upstream as well if anyone would like to comment there. https://bugs.ruby-lang.org/issues/9970

Member

sgrif commented Jun 21, 2014

Submitted upstream as well if anyone would like to comment there. https://bugs.ruby-lang.org/issues/9970

@chancancode chancancode added JRuby and removed JRuby labels Jun 26, 2014

@guilleiguaran

This comment has been minimized.

Show comment
Hide comment
@guilleiguaran

guilleiguaran Jun 28, 2014

Member

@sgrif can you rebase this?

Member

guilleiguaran commented Jun 28, 2014

@sgrif can you rebase this?

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 28, 2014

Member

Done

On Sat, Jun 28, 2014 at 8:14 AM, Guillermo Iguaran <notifications@github.com

wrote:

@sgrif https://github.com/sgrif can you rebase this?


Reply to this email directly or view it on GitHub
#15819 (comment).

Thanks,
Sean Griffin

Member

sgrif commented Jun 28, 2014

Done

On Sat, Jun 28, 2014 at 8:14 AM, Guillermo Iguaran <notifications@github.com

wrote:

@sgrif https://github.com/sgrif can you rebase this?


Reply to this email directly or view it on GitHub
#15819 (comment).

Thanks,
Sean Griffin

Add `Hash#map_values` to ActiveSupport
Didn't get a chance to convert existing code, I'll skim through the code
base to make use of this later this afternoon.
@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Jun 29, 2014

Member

Any more feedback on this? Writing a lot of methods in AR right now that could certainly make good use of this.

Member

sgrif commented Jun 29, 2014

Any more feedback on this? Writing a lot of methods in AR right now that could certainly make good use of this.

guilleiguaran added a commit that referenced this pull request Jun 29, 2014

Merge pull request #15819 from sgrif/sg-hash-map-values
Add `Hash#map_values` to ActiveSupport to simplify a common pattern

@guilleiguaran guilleiguaran merged commit f123da5 into rails:master Jun 29, 2014

1 check was pending

continuous-integration/travis-ci The Travis CI build is in progress
Details

@sgrif sgrif deleted the sgrif:sg-hash-map-values branch Jun 29, 2014

sgrif added a commit that referenced this pull request Jul 5, 2014

+ #
+ # { a: 1, b: 2, c: 3 }.transform_values { |x| x * 2 }
+ # # => { a: 2, b: 4, c: 6 }
+ def transform_values(&block)

This comment has been minimized.

@bkuhlmann

bkuhlmann Jul 12, 2014

In case it helps, maybe refactor the code as follows?:

def transform_values(&block)
  each_with_object(self.class.new) do |(key, value), result|
    result[key] = yield value
  end
end

This would eliminate the need for an instance variable.

@bkuhlmann

bkuhlmann Jul 12, 2014

In case it helps, maybe refactor the code as follows?:

def transform_values(&block)
  each_with_object(self.class.new) do |(key, value), result|
    result[key] = yield value
  end
end

This would eliminate the need for an instance variable.

This comment has been minimized.

@matthewd

matthewd Jul 12, 2014

Member

It's a local variable, not an instance variable. And that spelling would severely curtail the advantage of this method: it currently avoids doing an allocation per hash entry.

@matthewd

matthewd Jul 12, 2014

Member

It's a local variable, not an instance variable. And that spelling would severely curtail the advantage of this method: it currently avoids doing an allocation per hash entry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment