
Optimize Hash#transform_{keys,values} #14502

Merged

Conversation

@jgaskins jgaskins commented Apr 17, 2024

Preallocating the correct-size hash is faster for hash sizes > 4, but starting from an empty hash is faster for sizes <= 4. The gap would be a lot closer, except that Hash#upsert allocates an initial capacity of 4 when the hash is empty, whereas Hash#initialize sets the floor for initial_capacity at 8 entries.

This PR gives the best of both worlds: it uses the current approach (an empty hash) when the source hash has <= 4 keys, and a fully preallocated hash when it has more.
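The hybrid described above can be sketched roughly as follows. This is an illustrative reconstruction, not the exact diff, and `transform_keys_hybrid` is a hypothetical name:

```crystal
class Hash(K, V)
  # Hypothetical sketch of the size-conditional allocation described above.
  # For sizes <= 4 an empty hash wins, because Hash#initialize rounds
  # initial_capacity up to at least 8 while lazy growth starts at 4.
  def transform_keys_hybrid(& : K, V -> K2) : Hash(K2, V) forall K2
    copy = size <= 4 ? Hash(K2, V).new : Hash(K2, V).new(initial_capacity: size)
    each do |key, value|
      copy[yield(key, value)] = value
    end
    copy
  end
end
```

Hash#transform_values would get the same treatment with the roles of key and value swapped.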

Benchmark code
require "benchmark"

{
  1,
  4,
  5,
  10,
  100,
  1_000,
}.each do |size|
  hash = Array
    .new(size) { |i| {i.to_s, i.to_s} }
    .to_h

  unless hash.transform_keys(&.itself) == hash.transform_keys_optimized(&.itself)
    raise "Oops! #{hash.transform_keys_optimized(&.itself).pretty_inspect}"
  end

  puts
  puts
  puts "### #{size}"
  puts "Keys"
  Benchmark.ips do |x|
    x.report "baseline" { hash.transform_keys(&.itself) }
    x.report "optimized" { hash.transform_keys_optimized(&.itself) }
  end
  puts
  puts "Values"
  Benchmark.ips do |x|
    x.report "baseline" { hash.transform_values(&.itself) }
    x.report "optimized" { hash.transform_values_optimized(&.itself) }
  end
end

class Hash(K, V)
  def transform_keys_optimized(& : K, V -> K2) : Hash(K2, V) forall K2
    hash = Hash(K2, V).new(initial_capacity: size)
    each do |key, value|
      hash[yield(key, value)] = value
    end
    hash
  end

  def transform_values_optimized(& : V, K -> V2) : Hash(K, V2) forall V2
    hash = Hash(K, V2).new(initial_capacity: size)
    each do |key, value|
      hash[key] = yield(value, key)
    end
    hash
  end
end
### 1
Keys
 baseline  28.39M ( 35.23ns) (± 1.34%)  176B/op        fastest
optimized  20.34M ( 49.18ns) (± 1.41%)  272B/op   1.40× slower

Values
 baseline  29.40M ( 34.01ns) (± 1.61%)  176B/op        fastest
optimized  20.30M ( 49.27ns) (± 1.96%)  272B/op   1.45× slower


### 4
Keys
 baseline  17.18M ( 58.19ns) (± 2.63%)  176B/op        fastest
optimized  13.56M ( 73.76ns) (± 1.19%)  272B/op   1.27× slower

Values
 baseline  17.19M ( 58.17ns) (± 2.61%)  176B/op        fastest
optimized  13.56M ( 73.74ns) (± 1.16%)  272B/op   1.27× slower


### 5
Keys
 baseline  10.14M ( 98.62ns) (± 1.09%)  384B/op   1.23× slower
optimized  12.45M ( 80.34ns) (± 1.00%)  272B/op        fastest

Values
 baseline  10.07M ( 99.34ns) (± 4.05%)  384B/op   1.23× slower
optimized  12.40M ( 80.66ns) (± 3.20%)  272B/op        fastest


### 10
Keys
 baseline   5.04M (198.33ns) (± 4.11%)  832B/op   1.28× slower
optimized   6.46M (154.82ns) (± 0.96%)  512B/op        fastest

Values
 baseline   5.08M (196.95ns) (± 2.39%)  832B/op   1.27× slower
optimized   6.43M (155.58ns) (± 3.46%)  512B/op        fastest


### 100
Keys
 baseline 560.97k (  1.78µs) (± 1.02%)  7.09kB/op   1.35× slower
optimized 757.56k (  1.32µs) (± 1.02%)  3.34kB/op        fastest

Values
 baseline 562.41k (  1.78µs) (± 1.02%)  7.09kB/op   1.34× slower
optimized 754.39k (  1.33µs) (± 2.92%)  3.34kB/op        fastest


### 1000
Keys
 baseline  49.88k ( 20.05µs) (± 1.27%)  56.4kB/op   1.47× slower
optimized  73.53k ( 13.60µs) (± 1.89%)  28.1kB/op        fastest

Values
 baseline  49.67k ( 20.13µs) (± 1.23%)  56.4kB/op   1.47× slower
optimized  73.05k ( 13.69µs) (± 3.42%)  28.1kB/op        fastest

straight-shoota commented Apr 18, 2024

Initializing the transformed hash with the current size seems like a good idea. We could also consider using the current capacity in order to make it a copy with identical properties, just transformed values/keys.

But I'm not sure about optimizing the capacity for small hashes. The default minimum capacity for a hash exists for a reason: it's not highly optimized for very small hashes, but the overhead is minimal, and a smaller capacity will be less efficient overall if more items are added afterwards. I suppose an argument could be made that transformed hashes are less likely to receive new additions, but that's speculative, and surely there are many use cases where it doesn't hold.

So my suggestion would be to drop the special handling for small hashes and just allocate the transformed hash with an initial capacity.

@ysbaddaden

Let's do some archeology:

In #8017 Hash#initialize called #malloc_entries(initial_capacity / 2) which correlated with the other #malloc_entries(4) since initial capacity is supposed to be 8.

Then #8182 patched #initialize to call #malloc_entries(initial_capacity), but it didn't change the other #malloc_entries(4), so the two no longer correlate.

Said differently: magic numbers 😮‍💨

All that to say: magic numbers really are as bad as we say they are. And I agree with @straight-shoota's statement above: let's just always use the capacity, as the other calls to #malloc_entries do.
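The consensus above, always passing the source size as initial capacity and letting Hash#initialize enforce its own minimum, reduces to something like this sketch (illustrative method name, not necessarily the merged diff):

```crystal
class Hash(K, V)
  # Sketch of the agreed-upon simplification: always preallocate from the
  # source size; Hash#initialize applies the minimum capacity itself.
  def transform_values_sketch(& : V, K -> V2) : Hash(K, V2) forall V2
    copy = Hash(K, V2).new(initial_capacity: size)
    each do |key, value|
      copy[key] = yield(value, key)
    end
    copy
  end
end
```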

@ysbaddaden ysbaddaden added this to the 1.14.0 milestone Aug 27, 2024
@straight-shoota straight-shoota merged commit e6b5b94 into crystal-lang:master Sep 2, 2024
65 checks passed
@jgaskins jgaskins deleted the optimize-hash-transform branch October 17, 2024 20:04