Optimize Enum.map/2 for lists #3811
Out of 3 options:

* using for comprehensions
* using explicit recursion with accumulator
* using simple consing

The third option seems to be the fastest. That's also the one that is used by :lists.map/2.
Just to be doubly sure, please run every benchmark in a separate process. José Valim
I must admit I took the lazy path this time - the new testing code Results:
Yes. Comprehensions have been rewritten to always use Enum.reduce, which is why they became slower. If you find any other use of comprehensions in
Optimize Enum.map/2 for lists

Signed-off-by: José Valim <jose.valim@plataformatec.com.br>
@josevalim Interesting, could you explain why? Should we always run benchmarks in separate processes?
I think the concern was to eliminate the effects of garbage collection. Each process starts with a new heap, so garbage collection will happen at the same point in every run. When you run everything in the same process it's not as predictable, as you can accumulate garbage from previous runs.
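The advice above can be sketched roughly as follows. This is only an illustration, not the testing code used in this PR: `IsolatedBench` and `measure/1` are hypothetical names, and the sketch assumes that timing with `:timer.tc/1` inside a `Task` is good enough for a rough comparison.

```elixir
defmodule IsolatedBench do
  # Hypothetical helper (not part of Elixir or of this PR).
  # Runs `fun` in a freshly spawned process, so the measurement starts
  # from a new heap, and returns the elapsed time in microseconds.
  def measure(fun) when is_function(fun, 0) do
    task =
      Task.async(fn ->
        {micros, _result} = :timer.tc(fun)
        micros
      end)

    Task.await(task, :infinity)
  end
end

list = Enum.to_list(1..100_000)

# Each call gets its own process, so garbage left behind by one
# measurement cannot affect the next one.
micros = IsolatedBench.measure(fn -> Enum.map(list, &(&1 + 1)) end)
IO.puts("Enum.map/2 took #{micros} µs")
```

Note that the list captured by the anonymous function is copied into the new process before the timer starts, so the copy is not part of the measured time.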
@michalmuskala That makes sense. Thank you for the explanation :)
I'm usually a proponent of body-recursive functions, but in the stdlib we should strive for the highest performance. @michalmuskala did you also measure the tail-recursive version that reverses the list at the end? Tail-recursive should be ~30% faster on x86 according to Erlang's efficiency guide.
@ericmj I believe tail-recursive is the implementation I called I've read that part of efficiency guide too - if my benchmarking was correct it seems both solutions are equally performant currently.
I think using
In my previous benchmarks, tail-recursive is faster for long collections, body-recursive for shorter ones.
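For readers following the body-recursive versus tail-recursive discussion, here is a minimal sketch of the three approaches compared in this thread. These are not the implementations that were benchmarked; the module name `MapSketch` and the function names are made up for illustration.

```elixir
defmodule MapSketch do
  # Body-recursive ("simple consing"): the cons cell is built after the
  # recursive call returns, so no accumulator or final reverse is needed.
  def map_body([head | tail], fun), do: [fun.(head) | map_body(tail, fun)]
  def map_body([], _fun), do: []

  # Tail-recursive: results are collected in an accumulator in reverse
  # order, and the accumulator is reversed once at the end.
  def map_tail(list, fun), do: map_tail(list, fun, [])

  defp map_tail([head | tail], fun, acc), do: map_tail(tail, fun, [fun.(head) | acc])
  defp map_tail([], _fun, acc), do: :lists.reverse(acc)

  # Comprehension-based variant.
  def map_for(list, fun), do: for(item <- list, do: fun.(item))
end

double = fn x -> x * 2 end
IO.inspect(MapSketch.map_body([1, 2, 3], double))
IO.inspect(MapSketch.map_tail([1, 2, 3], double))
IO.inspect(MapSketch.map_for([1, 2, 3], double))
```

All three return the same list; they differ only in how the result is constructed, which is what the benchmarks in this thread compare.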
The reasoning is the same as in elixir-lang#3811 and 99e0d8e
* Avoid do_ prefixing
* Refactor list functions to be body-recursive where sensible

  The reasoning is the same as in #3811 and 99e0d8e.
* Use recursive functions to optimize filter/reject for lists

      ----- With input: Big (10 Million) -----
      Name      ips        average      deviation    median
      tail      1.64       610.99 ms    ±13.78%      577.63 ms
      body      1.48       675.90 ms    ±10.06%      681.08 ms
      for       1.46       687.19 ms    ±12.84%      693.82 ms

      Comparison:
      tail      1.64
      body      1.48 - 1.11x slower
      for       1.46 - 1.12x slower

      ----- With input: Medium (100 Thousand) -----
      Name      ips        average      deviation    median
      tail      201.60     4.96 ms      ±15.12%      4.81 ms
      body      199.52     5.01 ms      ±14.03%      4.76 ms
      for       178.95     5.59 ms      ±14.50%      5.39 ms

      Comparison:
      tail      201.60
      body      199.52 - 1.01x slower
      for       178.95 - 1.13x slower

      ----- With input: Small (1 Thousand) -----
      Name      ips        average      deviation    median
      body      23.98 K    41.70 μs     ±38.90%      38.00 μs
      tail      21.35 K    46.84 μs     ±35.64%      44.00 μs
      for       18.64 K    53.63 μs     ±31.60%      50.00 μs

      Comparison:
      body      23.98 K
      tail      21.35 K - 1.12x slower
      for       18.64 K - 1.29x slower
* Add function guard to group_by/3 back

  It was used for dispatch between the new and deprecated version. The guard was erroneously removed in 99e44a1#diff-6881431a92cd4e3ea0de82bf2338f8eaL1032.
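As a rough illustration of what "use recursive functions to optimize filter/reject for lists" could look like, here is a body-recursive sketch. It is an assumption about the general shape only, not the code from the referenced commit; `FilterSketch` is a hypothetical module name.

```elixir
defmodule FilterSketch do
  # Body-recursive filter: keep the head when the predicate returns a
  # truthy value, otherwise skip it and continue with the tail.
  def filter([head | tail], fun) do
    if fun.(head) do
      [head | filter(tail, fun)]
    else
      filter(tail, fun)
    end
  end

  def filter([], _fun), do: []

  # Body-recursive reject: the mirror image of filter/2.
  def reject([head | tail], fun) do
    if fun.(head) do
      reject(tail, fun)
    else
      [head | reject(tail, fun)]
    end
  end

  def reject([], _fun), do: []
end

even? = fn x -> rem(x, 2) == 0 end
IO.inspect(FilterSketch.filter([1, 2, 3, 4], even?))
IO.inspect(FilterSketch.reject([1, 2, 3, 4], even?))
```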
When writing an article about lists, I ventured into implementing a couple of the core list functions myself. I was surprised to discover that the current implementation of Enum.map/2 is not optimal. I came up with 3 options:

* using for comprehensions
* using explicit recursion with accumulator
* using simple consing

The third option seems to be the fastest. That's also the one that is used by :lists.map/2.
Tested implementations:
Results:
Full testing code
Performance gain is steadily around 30%. The difference is not huge (and probably less significant if you do any work inside the mapping function), but given how often Enum.map/2 is used, I think this warrants the change.