Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize compilation of large literal arrays #9721

Merged
merged 2 commits into from Jan 27, 2024

Conversation

jeremyevans
Copy link
Contributor

To avoid stack overflow, Ruby splits compilation of large arrays into smaller arrays, and concatenates the small arrays together. It previously used newarray/concatarray for this, which is inefficient. This switches the compilation to use pushtoarray, which is much faster. This makes almost all literal arrays only allocate a single array.

For cases where there is a large amount of static values in the array, Ruby will statically compile subarrays, and previously added them using concatarray. This switches to concattoarray, avoiding an array allocation for the append.

Keyword splats are also supported in arrays, and ignored if the keyword splat is empty. Previously, this used newarraykwsplat and concatarray. This still uses newarraykwsplat, but switches to concattoarray to save an allocation. So large arrays with keyword splats can allocate 2 arrays instead of 1.

Previously, for the following array sizes (assuming local variable access for each element), Ruby allocated the following number of arrays:

1000 elements: 7 arrays
10000 elements: 79 arrays
100000 elements: 781 arrays

With these changes, only a single array is allocated (or 2 for a large array with a keyword splat.

Results using the included benchmark:

                       array_1000
            miniruby:     34770.0 i/s
   ./miniruby-before:     10511.7 i/s - 3.31x  slower

                      array_10000
            miniruby:      4938.8 i/s
   ./miniruby-before:       483.8 i/s - 10.21x  slower

                     array_100000
            miniruby:       727.2 i/s
   ./miniruby-before:         4.1 i/s - 176.98x  slower

To avoid stack overflow, Ruby splits compilation of large arrays
into smaller arrays, and concatenates the small arrays together.
It previously used newarray/concatarray for this, which is
inefficient.  This switches the compilation to use pushtoarray,
which is much faster. This makes almost all literal arrays only
allocate a single array.

For cases where there is a large amount of static values in the
array, Ruby will statically compile subarrays, and previously
added them using concatarray.  This switches to concattoarray,
avoiding an array allocation for the append.

Keyword splats are also supported in arrays, and ignored if the
keyword splat is empty.  Previously, this used newarraykwsplat and
concatarray.  This still uses newarraykwsplat, but switches to
concattoarray to save an allocation.  So large arrays with keyword
splats can allocate 2 arrays instead of 1.

Previously, for the following array sizes (assuming local variable
access for each element), Ruby allocated the following number of
arrays:

  1000 elements: 7 arrays
 10000 elements: 79 arrays
100000 elements: 781 arrays

With these changes, only a single array is allocated (or 2 for a
large array with a keyword splat.

Results using the included benchmark:

```
                       array_1000
            miniruby:     34770.0 i/s
   ./miniruby-before:     10511.7 i/s - 3.31x  slower

                      array_10000
            miniruby:      4938.8 i/s
   ./miniruby-before:       483.8 i/s - 10.21x  slower

                     array_100000
            miniruby:       727.2 i/s
   ./miniruby-before:         4.1 i/s - 176.98x  slower
```
@jeremyevans jeremyevans requested a review from mame January 26, 2024 19:10
jeremyevans added a commit to jeremyevans/ruby that referenced this pull request Jan 26, 2024
Previously, a literal array with a splat and any other args resulted in
more than one array allocation:

```ruby
[1, *a]
[*a, 1]
[*a, *a]
[*a, 1, 2]

[*a, a]
[*a, 1, *a]
[*a, 1, a]
[*a, a, a]

[*a, a, *a]
[*a, 1, *a, 1]
[*a, 1, *a, *a]

[*a, a, *a, a]
```

This is because previously Ruby would use newarray and concatarray
to create the array, which both each allocate an array internally.

This changes the compilation to use concattoarray and pushtoarray,
which do not allocate arrays.  It also updates the peephole optimizer
to optimize the duparray/concattoarray sequence to
putobject/concattoarray, mirroring the existing duparray/concatarray
optimization.

These changes reduce the array allocations for the above examples to
a single array allocation, except for:

```
[*a, 1, a]
[*a, a, a]
```

The reason for this is because optimizing this case to only allocate
1 array requires changes to compile_array, which would currently
conflict with an unmerged pull request (ruby#9721).  After that pull
request is merged, it should be possible to refactor things to only
allocate a 1 array for all literal arrays (or 2 for arrays with
keyword splats).
compile.c Outdated Show resolved Hide resolved
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
@jeremyevans jeremyevans merged commit 2217e08 into ruby:master Jan 27, 2024
97 checks passed
jeremyevans added a commit that referenced this pull request Jan 27, 2024
Previously, a literal array with a splat and any other args resulted in
more than one array allocation:

```ruby
[1, *a]
[*a, 1]
[*a, *a]
[*a, 1, 2]

[*a, a]
[*a, 1, *a]
[*a, 1, a]
[*a, a, a]

[*a, a, *a]
[*a, 1, *a, 1]
[*a, 1, *a, *a]

[*a, a, *a, a]
```

This is because previously Ruby would use newarray and concatarray
to create the array, which both each allocate an array internally.

This changes the compilation to use concattoarray and pushtoarray,
which do not allocate arrays.  It also updates the peephole optimizer
to optimize the duparray/concattoarray sequence to
putobject/concattoarray, mirroring the existing duparray/concatarray
optimization.

These changes reduce the array allocations for the above examples to
a single array allocation, except for:

```
[*a, 1, a]
[*a, a, a]
```

The reason for this is because optimizing this case to only allocate
1 array requires changes to compile_array, which would currently
conflict with an unmerged pull request (#9721).  After that pull
request is merged, it should be possible to refactor things to only
allocate a 1 array for all literal arrays (or 2 for arrays with
keyword splats).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants