New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Hash#transform_keys and Hash#transform_values #4385

Merged
merged 3 commits into from Aug 17, 2018

Conversation

Projects
None yet
6 participants
@deepj
Contributor

deepj commented May 6, 2017

This PR adds similar functionality in Ruby 2.4 which isHash#transform_keys and Hash#transform_values.

Hash#transform_keys returns a new hash with all keys converted using the block operation.

hash = {:a => 1, :b => 2, :c => 3}
hash.transform_keys { |key| key.to_s } # => {"A" => 1, "B" => 2, "C" => 3}

Hash#transform_values returns a new hash with the results of running block once for every value.

hash = {:a => 1, :b => 2, :c => 3}
hash.transform_values { |value| value + 1 } # => {:a => 2, :b => 3, :c => 4}

I haven't added bang equivalents since what I know it's not possible to modify types of keys/values in self.

@deepj deepj force-pushed the deepj:transform-keys-values branch 2 times, most recently from 303e966 to fd1e266 May 7, 2017

@bcardiff

This comment has been minimized.

Member

bcardiff commented May 8, 2017

@deepj the bang version can be implemented, as long as they don't change the type of the values nor keys.

Would you prefer to implement those in this PR as well?

@deepj

This comment has been minimized.

Contributor

deepj commented May 8, 2017

@bcardiff I meant the bang versions cannot work in the same way (including type casting)

Yes, I can add them including a notice they don't work with type casting.

@deepj

This comment has been minimized.

Contributor

deepj commented May 8, 2017

I tried to implement transform_keys! the following way

class Hash
  def transform_keys!(&block  : K -> K)
    each do |key, value|
      self[yield(key)] = value
      delete(key)
    end
  end
end

h = {'a' => 1, 'b' => 2}
h.transform_keys!(&.upcase)
puts h

But h returns an empty hash and I don't know why. Crystal 0.22.0

@bcardiff

This comment has been minimized.

Member

bcardiff commented May 8, 2017

@deepj it's because the hash in changed inside the iterator.

For example if you swap delete and self[..] = you will end in an infinite loop.

I think we will need to duplicate the keys. Unless we go down with something that deals with the hash internals for reallocating the entries. So something as follow could work:

class Hash
  def transform_keys!(&block  : K -> K)
    each.to_a.each do |(key, value)|
      delete(key)
      self[yield(key)] = value
    end
  end
end

h = {'a' => 1, 'b' => 2}
h.transform_keys!(&.upcase)
puts h

Note that deleting the key should happen before inserting. Just in case the key transformation did nothing for that key.

But, now i am thinking of the corner cases:

h = {'a' => 1, 'b' => 2}
h.transform_keys!(&.succ)
puts h

The above change leads of {'c' => 2}

So we would need something like

class Hash
  def transform_keys!(&block  : K -> K)
    pairs = each.to_a
    clear
    pairs.each do |(key, value)|
      self[yield(key)] = value
    end
  end
end

which is essentially h = h.transform_keys(&.succ).

So, unless we really do it in-place there is not real value for bang of keys. Let's leave that out.

But the transform_values! could be there. Doing something efficient for this overload seems straight forward. Check Hash#each.

@RX14

This comment has been minimized.

Member

RX14 commented May 8, 2017

This looks like map for keys/values on hash. Why not call it map_keys/map_values?

@Sija

This comment has been minimized.

Contributor

Sija commented May 8, 2017

@RX14 One reason would be familiarity, since there's already Hash#transform_keys in Rails.

@bew

This comment has been minimized.

Contributor

bew commented May 8, 2017

I'm all for map_keys! & map_values! they're the perfect names IMO.

The implementation needs to use internals I think: https://carc.in/#/r/1zwm

class Hash
  def map_keys!
    current = @first
    while current
      current.key = yield current.key
      current = current.fore
    end
  end

  private class Entry
    setter key : K # key has a getter only, adding a setter here
  end
end

h = {'a' => 1, 'b' => 2}
h.map_keys!(&.succ)
puts h # => {'b' => 1, 'c' => 2}

This impl has no copy (well maybe when copying from the block result to the Entry, but I'm not sure..), so it should be the fastest and the safest (no allocations or anything that could fail)

@konovod

This comment has been minimized.

Contributor

konovod commented May 9, 2017

What if two keys would be mapped to the same value? It should raise, not create inconsistent UB'ish state imho. https://carc.in/#/r/200a

@deepj deepj force-pushed the deepj:transform-keys-values branch 2 times, most recently from b2e8dc4 to fba15fb May 9, 2017

src/hash.cr Outdated
#
# ```
# hash = {"a" => 1, "b" => 2, "c" => 3}
# hash.transform_key! { |key| key.succ }

This comment has been minimized.

@bew

bew May 9, 2017

Contributor

Should be transform_keys! (you missed the s)

src/hash.cr Outdated
# hash = {"a" => 1, "b" => 2, "c" => 3}
# hash.transform_key! { |key| key.succ }
# hash # => {"b" => 1, "c" => 2, "d" => 3}
def transform_keys!(&block : K -> K)

This comment has been minimized.

@bcardiff

bcardiff May 9, 2017

Member

This is breaking the Hash invariant. A call to rehash is needed at the end or the final state won't be useful for user.

I don't fully like that the Entry#key is changed from getter to property since that was preventing the invariant to be broken.

I would say to either base the implementation on rehash (zeroing the buckets, creating new Entry, iterate the old entries, storing new entries in the right place) or remove this implementation.

If any implementation is added, I would say to assert that actual.should eq(expected) and expected.should eq(actual) so we can detect if the code messes up with the invariant.

This comment has been minimized.

@bcardiff

bcardiff May 9, 2017

Member

From @konovod comment I would say that insert_in_bucket should be used also so we avoid having more than one entry with the same key.

If the user provides a function that collide two keys into one there is no guarantee which value will be preserved. It's that or raising.

@bew

This comment has been minimized.

Contributor

bew commented Feb 23, 2018

I thought this was merged, @deepj do you want to continue working on it?

If we can't find a consensus for mutating methods, at least it would be great to have this for non-mutating ones.

@deepj

This comment has been minimized.

Contributor

deepj commented Feb 23, 2018

@bew Hi, I totally forgot on this PR. But my conclusion is, the best things here is keeping only #transform_keys and #transform_values.

What do you think @bcardiff @RX14

@bew

This comment has been minimized.

Contributor

bew commented Apr 26, 2018

I think you can also keep transform_values! too, that can be useful at times.

@deepj deepj force-pushed the deepj:transform-keys-values branch from fba15fb to 378262a Aug 2, 2018

@deepj deepj force-pushed the deepj:transform-keys-values branch from 8266e31 to 1527126 Aug 2, 2018

@deepj

This comment has been minimized.

Contributor

deepj commented Aug 2, 2018

@bew So, I left #transform_keys, #transform_values, #transform_values!

@bcardiff Can be this reviewed again, please? Is something else what I can do here and push things forward?

@bew

This comment has been minimized.

Contributor

bew commented Aug 15, 2018

LGTM, can this have some reviews?

@RX14 RX14 requested a review from bcardiff Aug 16, 2018

@bcardiff

LGTM

@RX14

RX14 approved these changes Aug 17, 2018

@RX14 RX14 added this to the 0.26.1 milestone Aug 17, 2018

@RX14 RX14 merged commit 011aa64 into crystal-lang:master Aug 17, 2018

4 checks passed

ci/circleci: test_darwin Your tests passed on CircleCI!
Details
ci/circleci: test_linux Your tests passed on CircleCI!
Details
ci/circleci: test_linux32 Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

omarroth added a commit to omarroth/crystal that referenced this pull request Nov 5, 2018

Add Hash#transform_keys and Hash#transform_values (crystal-lang#4385)
* Add Hash#transform_keys and Hash#transform_values

* Add bang equivalents for transform_keys and transform_values

* Remove #transform_key!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment