Use size to set array capa when possible #444

@HonoreDB

Enumerable#to_a works by creating an empty array with small capacity, then populating it and expanding the capacity as it goes. For large enumerables, this causes several resizes, which can hurt performance. When an enumerable exposes a size method, we can guess that the resulting array's size will usually be equal to the enumerable's size. If we're right, we only have to set capacity once, and if we're wrong, we don't lose anything.
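To make the cost of repeated resizing concrete, here is a minimal sketch that counts how many times a buffer has to grow before it can hold n elements. It assumes capacity simply doubles each time it runs out, which is only an illustrative model and not necessarily MRI's actual growth policy; `resizes_needed` and the starting capacity of 16 are hypothetical.

```ruby
# Hypothetical model: count how many times a buffer must grow to hold n
# elements, assuming capacity doubles whenever it is exhausted.
# This is an illustration only, not MRI's exact array growth policy.
def resizes_needed(n, initial_capa)
  capa = [initial_capa, 1].max
  resizes = 0
  while capa < n
    capa *= 2
    resizes += 1
  end
  resizes
end

n = 2**15
puts "starting small (16): #{resizes_needed(n, 16)} resizes"
puts "presized from #size: #{resizes_needed(n, n)} resizes"
```

Under this model a correct size guess removes every resize, while a wrong guess only changes the starting capacity and the array still grows as needed.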

This PR adjusts enum.c's to_a method to take advantage of the size method when it's there. In my tests this makes Range#to_a about 10% faster, and doesn't have any significant effect on a vanilla enum with no size method. I couldn't find any existing benchmark that this consistently made better or worse.
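As a Ruby-level illustration of "use size when it's there", the sketch below shows what different enumerables report. `capacity_hint` and `NoSizeEnum` are hypothetical names, not part of the patch. The sketch also assumes a caller would want to ignore non-Integer sizes: `Enumerator#size` can return `nil` or `Float::INFINITY`, and the PR text does not say how such values should be treated.

```ruby
# Hypothetical stand-in for an Enumerable that defines no #size method.
class NoSizeEnum
  include Enumerable

  def initialize(n)
    @n = n
  end

  def each
    @n.times { |i| yield i }
  end
end

# Hypothetical helper (not part of the patch): returns a non-negative
# Integer usable as an initial capacity, or nil when no hint is available.
def capacity_hint(enum)
  return nil unless enum.respond_to?(:size)

  size = enum.size
  size.is_a?(Integer) && size >= 0 ? size : nil
end

p capacity_hint(0...10)                         # Range reports an Integer size
p capacity_hint(NoSizeEnum.new(10))             # no #size method at all
p capacity_hint(Enumerator.new { |y| y << 1 })  # Enumerator#size is nil here
p capacity_hint(loop)                           # size is Float::INFINITY
```

Only the first call yields a usable hint; the other three fall back to `nil`, which corresponds to starting from the default small capacity.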

If you like this idea, this could also be done in other classes with custom to_a, like Hash.

(Sorry if this isn't a proper way to submit a patch, by the way. I wasn't sure.)

HonoreDB added some commits
@HonoreDB HonoreDB Use enum size to set array capa d891476
@HonoreDB HonoreDB Inline enum_size check for performance 959a4d6
@HonoreDB HonoreDB Benchmark Enumerable#to_a
28fb6bf
@HonoreDB

Speedup ratios (with three runs)

app_answer  1.031
app_aobench 1.017
app_erb 1.012
app_factorial   1.009
app_fib 0.997
app_mandelbrot  0.990
app_pentomino   1.005
app_raise   1.023
app_strconcat   0.993
app_tak 1.032
app_tarai   1.019
app_uri 0.987
enum_to_a_sized 1.094
enum_to_a_unsized   1.019
hash_shift  1.003
io_file_create  1.012
io_file_read    1.003
io_file_write   0.990
io_select   1.006
io_select2  1.003
io_select3  0.986
loop_for    1.021
loop_generator  1.004
loop_times  0.942
loop_whileloop  1.015
loop_whileloop2 0.983
so_ackermann    1.024
so_array    1.007
so_binary_trees 1.009
so_concatenate  1.013
so_count_words  1.023
so_exception    1.018
so_fannkuch 1.012
so_fasta    1.016
so_k_nucleotide 0.989
so_lists    0.995
so_mandelbrot   0.992
so_matrix   1.018
so_meteor_contest   1.002
so_nbody    1.003
so_nested_loop  0.976
so_nsieve   1.007
so_nsieve_bits  0.995
so_object   0.998
so_partial_sums 0.988
so_pidigits 0.996
so_random   1.016
so_reverse_complement   1.002
so_sieve    0.981
so_spectralnorm 0.933
vm1_attr_ivar*  0.981
vm1_attr_ivar_set*  1.054
vm1_block*  1.010
vm1_const*  1.055
vm1_ensure* 0.747
vm1_float_simple*   1.003
vm1_gc_short_lived* 1.006
vm1_gc_short_with_complex_long* 1.003
vm1_gc_short_with_long* 1.006
vm1_gc_short_with_symbol*   1.009
vm1_gc_wb_ary*  1.014
vm1_gc_wb_obj*  1.004
vm1_ivar*   1.007
vm1_ivar_set*   0.963
vm1_length* 1.015
vm1_lvar_init*  0.937
vm1_lvar_set*   1.001
vm1_neq*    0.969
vm1_not*    0.991
vm1_rescue* 0.980
vm1_simplereturn*   0.991
vm1_swap*   1.023
vm1_yield*  0.970
vm2_array*  1.012
vm2_bigarray*   0.994
vm2_bighash*    0.999
vm2_case*   1.057
vm2_defined_method* 1.014
vm2_dstr*   1.029
vm2_eval*   1.098
vm2_method* 0.987
vm2_method_missing* 1.007
vm2_method_with_block*  1.001
vm2_mutex*  1.022
vm2_poly_method*    1.046
vm2_poly_method_ov* 1.004
vm2_proc*   0.967
vm2_raise1* 1.050
vm2_raise2* 1.024
vm2_regexp* 0.998
vm2_send*   1.005
vm2_super*  1.012
vm2_unif1*  0.998
vm2_zsuper* 1.003
vm3_backtrace   0.938
vm3_clearmethodcache    1.008
vm3_gc  0.992
@mmasaki mmasaki closed this pull request from a commit
@mmasaki mmasaki * enum.c (enum_to_a): Use size to set array capa when possible.
  the patch is from HonoreDB <aweiner at mdsol.com>.
  [fix GH-444]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@50457 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
d908180
@mmasaki mmasaki closed this in d908180
@hsbt hsbt reopened this
@nurse nurse referenced this pull request from a commit in nurse/ruby
@mmasaki mmasaki * enum.c (enum_to_a): Use size to set array capa when possible.
  the patch is from HonoreDB <aweiner at mdsol.com>.
  [fix GH-444]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@50457 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
4962a8b
@mmasaki mmasaki referenced this pull request from a commit in mmasaki/ruby
@mmasaki mmasaki * enum.c (enum_to_a): Use size to set array capa when possible.
  the patch is from HonoreDB <aweiner at mdsol.com>.
  [fix GH-444]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@50457 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
772f6d5
Commits on Nov 11, 2013
  1. @HonoreDB
Commits on Nov 12, 2013
  1. @HonoreDB
  2. @HonoreDB

    Benchmark Enumerable#to_a

    HonoreDB authored
3  benchmark/bm_enum_to_a_sized.rb
@@ -0,0 +1,3 @@
+(2**15).times do |i|
+  ary = (0...i).to_a
+end
16 benchmark/bm_enum_to_a_unsized.rb
@@ -0,0 +1,16 @@
+class SizelessEnum
+  include Enumerable
+
+  def initialize(n)
+    @n = n
+  end
+
+  def each
+    @n.times { |i| yield i }
+  end
+
+end
+
+(2**15).times do |i|
+  ary = SizelessEnum.new(i).to_a
+end
3  enum.c
@@ -501,7 +501,8 @@ enum_flat_map(VALUE obj)
 static VALUE
 enum_to_a(int argc, VALUE *argv, VALUE obj)
 {
-    VALUE ary = rb_ary_new();
+    VALUE size = rb_check_funcall(obj, id_size, 0, 0);
+    VALUE ary = rb_ary_new2(size == Qundef ? RARRAY_EMBED_LEN_MAX : NUM2LONG(size));
     rb_block_call(obj, id_each, argc, argv, collect_all, ary);
     OBJ_INFECT(ary, obj);