Use size to set array capa when possible #444

Closed

Enumerable#to_a works by creating an empty array with small capacity, then populating it and expanding the capacity as it goes. For large enumerables, this causes several resizes, which can hurt performance. When an enumerable exposes a size method, we can guess that the resulting array's size will usually be equal to the enumerable's size. If we're right, we only have to set capacity once, and if we're wrong, we don't lose anything.
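A rough way to see why the size hint helps is to count reallocations in a toy model. The sketch below is not the C patch to enum.c. `CountingBuffer`, `capacity_hint`, and `collect` are hypothetical names, and the doubling policy and starting capacity of 16 are assumptions for illustration rather than CRuby's actual growth strategy.

```ruby
# Toy model of a growable array that counts how often it must reallocate.
# ASSUMPTION: doubling growth and a default capacity of 16 are illustrative
# only; they are not CRuby's real policy.
class CountingBuffer
  attr_reader :resizes, :items

  def initialize(capacity = 16)
    @capacity = [capacity, 1].max
    @items = []
    @resizes = 0
  end

  def push(item)
    if @items.length == @capacity
      @capacity *= 2      # simulate a reallocation
      @resizes += 1
    end
    @items << item
    self
  end
end

# Hypothetical helper: treat size as a capacity hint only when it is a
# non-negative Integer (it may be nil, or Float::INFINITY for endless sources).
def capacity_hint(enum)
  return nil unless enum.respond_to?(:size)
  size = enum.size
  size.is_a?(Integer) && size >= 0 ? size : nil
end

# Collect every element, optionally pre-sizing the buffer from the hint.
def collect(enum, use_hint:)
  hint = use_hint ? capacity_hint(enum) : nil
  buffer = hint ? CountingBuffer.new(hint) : CountingBuffer.new
  enum.each { |item| buffer.push(item) }
  buffer
end

range = (1..100_000)
without_hint = collect(range, use_hint: false)
with_hint = collect(range, use_hint: true)

puts "resizes without hint: #{without_hint.resizes}"
puts "resizes with hint:    #{with_hint.resizes}"
```

Under these assumptions the unhinted run reallocates each time the buffer fills, while the hinted run allocates once up front. An enumerable whose size is nil or not an Integer simply falls back to the default path.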

This PR adjusts enum.c's to_a method to take advantage of the size method when it's there. In my tests this makes Range#to_a about 10% faster, and doesn't have any significant effect on a vanilla enum with no size method. I couldn't find any existing benchmark that this consistently made better or worse.
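For anyone wanting to reproduce a comparison along these lines, a minimal timing sketch might look like the following. It is not the benchmark code used for this PR. `UnsizedCounter`, `N`, and `ROUNDS` are hypothetical, and measuring the effect of the patch would mean running the same script against builds with and without the change.

```ruby
require "benchmark"

# Hypothetical enumerable with no size method, standing in for a
# "vanilla enum". Enumerable itself does not define size.
class UnsizedCounter
  include Enumerable

  def initialize(limit)
    @limit = limit
  end

  def each
    return enum_for(:each) unless block_given?
    i = 0
    while i < @limit
      yield i
      i += 1
    end
    self
  end
end

N = 200_000   # elements per collection (arbitrary choice)
ROUNDS = 20   # repetitions per measurement (arbitrary choice)

sized = (0...N)                  # Range exposes size
unsized = UnsizedCounter.new(N)  # no size method available

sized_time = Benchmark.realtime { ROUNDS.times { sized.to_a } }
unsized_time = Benchmark.realtime { ROUNDS.times { unsized.to_a } }

puts format("sized   (Range#to_a):      %.4fs", sized_time)
puts format("unsized (Enumerable#to_a): %.4fs", unsized_time)
```

Timings vary by machine and Ruby version, so only the ratio between two builds on the same machine is meaningful.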

If you like this idea, this could also be done in other classes with custom to_a, like Hash.

(Sorry if this isn't a proper way to submit a patch, by the way. I wasn't sure.)

HonoreDB added some commits Nov 11, 2013

Speedup ratios (with three runs)

app_answer                       1.031
app_aobench                      1.017
app_erb                          1.012
app_factorial                    1.009
app_fib                          0.997
app_mandelbrot                   0.990
app_pentomino                    1.005
app_raise                        1.023
app_strconcat                    0.993
app_tak                          1.032
app_tarai                        1.019
app_uri                          0.987
enum_to_a_sized                  1.094
enum_to_a_unsized                1.019
hash_shift                       1.003
io_file_create                   1.012
io_file_read                     1.003
io_file_write                    0.990
io_select                        1.006
io_select2                       1.003
io_select3                       0.986
loop_for                         1.021
loop_generator                   1.004
loop_times                       0.942
loop_whileloop                   1.015
loop_whileloop2                  0.983
so_ackermann                     1.024
so_array                         1.007
so_binary_trees                  1.009
so_concatenate                   1.013
so_count_words                   1.023
so_exception                     1.018
so_fannkuch                      1.012
so_fasta                         1.016
so_k_nucleotide                  0.989
so_lists                         0.995
so_mandelbrot                    0.992
so_matrix                        1.018
so_meteor_contest                1.002
so_nbody                         1.003
so_nested_loop                   0.976
so_nsieve                        1.007
so_nsieve_bits                   0.995
so_object                        0.998
so_partial_sums                  0.988
so_pidigits                      0.996
so_random                        1.016
so_reverse_complement            1.002
so_sieve                         0.981
so_spectralnorm                  0.933
vm1_attr_ivar*                   0.981
vm1_attr_ivar_set*               1.054
vm1_block*                       1.010
vm1_const*                       1.055
vm1_ensure*                      0.747
vm1_float_simple*                1.003
vm1_gc_short_lived*              1.006
vm1_gc_short_with_complex_long*  1.003
vm1_gc_short_with_long*          1.006
vm1_gc_short_with_symbol*        1.009
vm1_gc_wb_ary*                   1.014
vm1_gc_wb_obj*                   1.004
vm1_ivar*                        1.007
vm1_ivar_set*                    0.963
vm1_length*                      1.015
vm1_lvar_init*                   0.937
vm1_lvar_set*                    1.001
vm1_neq*                         0.969
vm1_not*                         0.991
vm1_rescue*                      0.980
vm1_simplereturn*                0.991
vm1_swap*                        1.023
vm1_yield*                       0.970
vm2_array*                       1.012
vm2_bigarray*                    0.994
vm2_bighash*                     0.999
vm2_case*                        1.057
vm2_defined_method*              1.014
vm2_dstr*                        1.029
vm2_eval*                        1.098
vm2_method*                      0.987
vm2_method_missing*              1.007
vm2_method_with_block*           1.001
vm2_mutex*                       1.022
vm2_poly_method*                 1.046
vm2_poly_method_ov*              1.004
vm2_proc*                        0.967
vm2_raise1*                      1.050
vm2_raise2*                      1.024
vm2_regexp*                      0.998
vm2_send*                        1.005
vm2_super*                       1.012
vm2_unif1*                       0.998
vm2_zsuper*                      1.003
vm3_backtrace                    0.938
vm3_clearmethodcache             1.008
vm3_gc                           0.992

@mmasaki mmasaki closed this in d908180 May 10, 2015

@hsbt hsbt reopened this May 14, 2015

nurse pushed a commit to nurse/ruby that referenced this pull request May 25, 2015

* enum.c (enum_to_a): Use size to set array capa when possible.
  the patch is from HonoreDB <aweiner at mdsol.com>.
  [fix GH-444]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@50457 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

mmasaki added a commit to mmasaki/ruby that referenced this pull request May 30, 2015

@mmasaki mmasaki closed this Nov 10, 2015
