srawlins
Contributor

Currently #each_slice creates a new Array each time it yields, to hold the elements (see each_slice_i). However, if the block given to #each_slice immediately separates the elements into multiple block arguments, that new Array is discarded right away, when a single Array could have been reused for every yield from #each_slice. Take these two examples:

arrays = []
(1..10).each_slice(3) { |a| arrays << a }

(1..9).each_slice(3) { |a,b,c| sum = a+b+c }

In the first example, the yielded Array is a, so we can't reuse the allocated Array for the next yield (we must create a new one). In the second example, however, Ruby immediately separates the elements of the yielded Array into a, b, and c, discarding the Array instance. This can be made faster. (Everything above is true for #each_cons as well.)
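The difference between the two block styles can be checked from plain Ruby. The following is a small sketch, not part of the patch; the variable names (kept, distinct, sums) are made up for illustration, and the second range stops at 9 so that every slice has three elements.

```ruby
# First style: the block keeps each yielded Array, so every slice
# has to be a distinct object or earlier slices would be overwritten.
kept = []
(1..10).each_slice(3) { |a| kept << a }
distinct = kept.map(&:object_id).uniq.size == kept.size

# Second style: the block destructures the Array into a, b and c
# and never holds a reference to the Array itself.
sums = []
(1..9).each_slice(3) { |a, b, c| sums << a + b + c }

p kept      # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
p distinct  # true
p sums      # [6, 15, 24]
```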

The patch

This patch uses rb_block_arity() to test how many arguments the given block can take, and only creates new Arrays if the arity is 1 or -1 (this should mean that the block either takes exactly 1 argument (no separation), only a splat argument (no separation), or is a lambda with 0 required arguments (unknown separation); see the #arity docs). This performance trick is already used in Hash#each_pair:

if (rb_block_arity() > 1)
    rb_hash_foreach(hash, each_pair_i_fast, 0);
else
    rb_hash_foreach(hash, each_pair_i, 0);
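
The arity values that the description refers to can be inspected from Ruby with Proc#arity. This is only an illustration of the numbers involved; it assumes that rb_block_arity() on the C side reports the same value that Proc#arity would for the same block.

```ruby
# Arity of the block shapes named in the description.
arities = {
  "proc { |a| }"       => proc { |a| }.arity,        # exactly 1 argument
  "proc { |*a| }"      => proc { |*a| }.arity,       # only a splat argument
  "lambda { |a = 1| }" => lambda { |a = 1| }.arity,  # lambda, 0 required arguments
  "proc { |a, b, c| }" => proc { |a, b, c| }.arity   # separates into 3 arguments
}

arities.each { |source, arity| puts "#{source}.arity = #{arity}" }
```

The first three shapes report 1, -1 and -1, which are the cases where the patch still allocates a new Array; the last reports 3.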

Speedup

In a micro-benchmark, the speedup is decent, approximately 30% for arg-separating blocks. Here are some numbers for #each_cons. Scores are seconds of user-time (smaller is better) running variants of 200_000.times { (1..100).each_cons(3) { |a,b,c| sum = a+b+c } }. Some accept 1 arg, and some accept 3 args. Some used a block, some a proc, and one used a lambda:

yield recipient  trunk  patch
block (1 arg)     5.40   4.53
block (3 args)    4.00   3.07
proc (1 arg)      4.77   4.52
proc (3 args)     4.06   3.04
lambda (3 args)   3.86   2.84

@nobu nobu closed this in 2067516 Apr 15, 2014
@mame
Member

mame commented Apr 16, 2014

I found an incompatibility.

$ ruby -ve 'ary = []; (1..10).each_slice(3, &lambda {|a, *| ary << a }); p ary'
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-linux]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

$ ./miniruby -ve 'ary = []; (1..10).each_slice(3, &lambda {|a, *| ary << a }); p ary'
ruby 2.2.0dev (2014-04-16 trunk 45599) [x86_64-linux]
[[10], [10], [10], [10]]
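
One plausible reading of this output is that the lambda has arity -2, so the patched code takes the reuse path while the lambda still keeps a reference to the yielded Array. The pure-Ruby sketch below imitates that with a hypothetical helper, each_slice_reusing, which clears and refills one buffer. It is a stand-in for illustration, not the C code from the patch.

```ruby
# Hypothetical helper: yields the SAME Array object for every slice.
def each_slice_reusing(enum, n, &block)
  buffer = []
  enum.each do |element|
    buffer << element
    if buffer.size == n
      block.call(buffer)
      buffer.clear            # reuse the allocation for the next slice
    end
  end
  block.call(buffer) unless buffer.empty?
end

aliased = []
each_slice_reusing(1..10, 3, &lambda { |a, *| aliased << a })

# Every stored element is the same buffer, which ends up holding the last slice.
p aliased                        # [[10], [10], [10], [10]]
p (1..10).each_slice(3).to_a     # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```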

@eregon
Member

eregon commented Apr 18, 2014

This patch uses rb_block_arity() to test how many arguments the given block can take, and only creates new Arrays if the arity is 1 or -1 (this should mean that the block either takes exactly 1 argument (no separation), only a splat argument (no separation), or is a lambda with 0 required arguments (unknown separation); see the #arity docs).

Only cases of strictly positive arity can likely be optimized directly in such a way.
arity = -n-1 with n > 0 means n mandatory arguments, and some optional args (mame's example has -2 arity).
arity is not enough for some complex cases like lambda { |a,b=2,c| }.arity => -3
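
The arity values mentioned here can be checked directly. The sketch below also includes a hypothetical predicate, reuse_candidate, that mirrors the rb_block_arity() > 1 test quoted from Hash#each_pair; it is shown only to illustrate which blocks such a stricter check would let through, not as the fix that was actually applied.

```ruby
mame_lambda    = lambda { |a, *| }          # one mandatory argument plus a splat
complex_lambda = lambda { |a, b = 2, c| }   # two mandatory arguments, one optional
one_arg_block  = proc   { |a| }
three_arg_proc = proc   { |a, b, c| }

# Hypothetical check, modelled on the arity > 1 test in Hash#each_pair.
reuse_candidate = ->(blk) { blk.arity > 1 }

[mame_lambda, complex_lambda, one_arg_block, three_arg_proc].each do |blk|
  puts "arity=#{blk.arity} reuse_candidate=#{reuse_candidate.call(blk)}"
end
```

Under this check only the three-argument proc qualifies; the negative arities (-2 and -3) and the single-argument block all fall back to allocating a new Array.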

mmasaki pushed a commit to mmasaki/ruby that referenced this pull request Apr 21, 2014
* enum.c (enum_each_slice, enum_each_cons): make more efficient by
  allocating less and recycling block argument arrays if possible.
  [Fixes rubyGH-596]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45589 b2dd03c8-39d4-4d8f-98ff-823fe69b080e