Improve performance of string interpolation #1626

Closed
wants to merge 5 commits into
base: trunk
from

south37 commented May 21, 2017

This patch adds pre-allocation to string interpolation, which avoids unnecessary capacity resizing.

For small strings, the optimized rb_str_resurrect operation is faster, so pre-allocation is done only when the concatenated strings are large.
MIN_PRE_ALLOC_SIZE was chosen by experimenting on my local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version 8.1.0 (clang - 802.0.42)).

String interpolation becomes about 72% faster when a large string is created.

  • Before
Calculating -------------------------------------
Large string interpolation
                          1.276M (± 5.9%) i/s -      6.358M in   5.002022s
Small string interpolation
                          5.156M (± 5.5%) i/s -     25.728M in   5.005731s
  • After
Calculating -------------------------------------
Large string interpolation
                          2.201M (± 5.8%) i/s -     11.063M in   5.043724s   <- 72% faster!!
Small string interpolation
                          5.192M (± 5.7%) i/s -     25.971M in   5.020516s   <- no degradation
  • Test code
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report "Large string interpolation" do |t|
    a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo"
    b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld"

    t.times do
      "#{a}, #{b}!"
    end
  end

  x.report "Small string interpolation" do |t|
    a = "Hello"
    b = "World"

    t.times do
      "#{a}, #{b}!"
    end
  end
end

Issue
https://bugs.ruby-lang.org/issues/13587
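
The strategy described above can be sketched in plain Ruby. This is only an illustration of the idea, not the C code in string.c: the method name `concat_with_prealloc`, the constant `SKETCH_MIN_PRE_ALLOC_SIZE`, and its value of 48 are assumptions made up for this sketch (the real MIN_PRE_ALLOC_SIZE value is not stated in this thread).

```ruby
# Hypothetical threshold for this sketch only; the patch's actual
# MIN_PRE_ALLOC_SIZE value is not given in the discussion.
SKETCH_MIN_PRE_ALLOC_SIZE = 48

# Joins the pieces of an interpolated string, pre-allocating the result
# buffer only when the total size is large.
def concat_with_prealloc(parts)
  return String.new if parts.empty?

  total = parts.sum(&:bytesize)
  if total < SKETCH_MIN_PRE_ALLOC_SIZE
    # Small result: copy the first piece and append the rest to it.
    buf = parts.first.dup
    rest = parts.drop(1)
  else
    # Large result: reserve the full capacity once so appends never resize.
    buf = String.new(capacity: total)
    # A fresh buffer is binary, so take the encoding from the first piece.
    buf.force_encoding(parts.first.encoding)
    rest = parts
  end

  rest.each { |piece| buf << piece }
  buf
end

a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo"
b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld"
puts concat_with_prealloc([a, ", ", b, "!"])
```

The threshold exists because, as the description says, copying the first string is the faster path for short results; only long results benefit from reserving capacity up front.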

@nobu

fix compile error

Fix coding style
This patch fixes the incorrect coding style.

south37 commented Jul 26, 2017

@nobu I fixed the coding style! 😄

@nobu

nobu approved these changes Jul 27, 2017

long len = 1;
if (UNLIKELY(!num)) return rb_str_new(0, 0);
if (UNLIKELY(num == 1)) return rb_str_resurrect(strary[0]);

@rhenium

rhenium Jul 27, 2017

Member

Would num ever be 0 or 1?

@south37

south37 Jul 27, 2017

Thanks for your comment! 😄

If rb_str_concat_literals is called from concatstrings (a YARV insn) as part of the instructions generated by the Ruby compiler, num does not seem to be 0 or 1.
But if rb_str_concat_literals is called directly from Ruby internals, num may be 0 or 1.
I wanted to handle that situation.

However, rb_str_concat_literals is currently used only in concatstrings, so it may be better to reject those values 💡

@rhenium

rhenium Sep 16, 2017

Member

Thanks for the explanation. I've just found that :"#{}" actually produces a concatstrings call with num==1 (there may be more cases that I didn't find). I think the concatstrings call can be eliminated in that case, but that would be out of scope for this PR.
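
One way to look at what the compiler emits is to disassemble a snippet. The sketch below assumes CRuby, where `RubyVM::InstructionSequence` is available; the helper name `interpolation_disasm` is made up for illustration. The exact output depends on the Ruby version, and versions that include the later compile.c peephole change may no longer show a single-string concatstrings.

```ruby
# Assumption: running on CRuby, which provides RubyVM::InstructionSequence.
# Returns the YARV disassembly of the given Ruby source as a String.
def interpolation_disasm(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# An ordinary interpolation with several pieces.
two_parts = interpolation_disasm('a = "x"; b = "y"; "#{a}, #{b}!"')
# A dynamic symbol built from a single interpolated piece.
dyn_symbol = interpolation_disasm('a = "x"; :"#{a}"')

# Print only the instructions relevant to this discussion.
puts two_parts.lines.grep(/concatstrings/)
puts dyn_symbol.lines.grep(/concatstrings|intern/)
```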

s = 1;
}
else {
str = rb_str_buf_new(len);

@rhenium

rhenium Jul 27, 2017

Member

The encoding is not preserved. s="."*50; p "#{s}x".encoding would result in ASCII-8BIT.

@south37

south37 Jul 27, 2017

Sorry, I didn't notice it...
Thanks for pointing it out!

I'll fix it and add a test case! 💡

@south37

south37 Jul 27, 2017

@rhenium

I fixed it in 1f6ad9b! 💡

I referred to the similar operation in array.c: https://github.com/ruby/ruby/blob/trunk/array.c#L1974

Fix ASCII-8BIT encoding bug
`s="."*50; p "#{s}x".encoding` should be UTF-8, but resulted in
ASCII-8BIT.
This patch fixes it.
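
The pitfall behind this bug can be reproduced at the Ruby level: a buffer created with only a capacity starts out as ASCII-8BIT, so its encoding has to be set from the source strings. The following is a rough sketch of that idea, not the actual C fix; `new_buffer_like` is a hypothetical helper name.

```ruby
# A freshly allocated buffer carries the binary (ASCII-8BIT) encoding.
raw = String.new(capacity: 64)
p raw.encoding

# Hypothetical helper: allocate a buffer with the given capacity that
# takes its encoding from the first piece to be appended.
def new_buffer_like(first, capacity)
  String.new(capacity: capacity).force_encoding(first.encoding)
end

s = "." * 50
out = new_buffer_like(s, s.bytesize + 1)
out << s << "x"
p out.encoding == s.encoding
```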

south37 commented Sep 9, 2017

@rhenium ping

matzbot pushed a commit that referenced this pull request Sep 17, 2017

compile.c: optimize unnecessary concatstrings
* compile.c (iseq_peephole_optimize): optimize away unnecessary
  concatenation of single string, following tostring which always
  puts a String instance.
  #1626 (comment)

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59945 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

@matzbot matzbot closed this in 80c5030 Oct 21, 2017

mrkn pushed a commit to mrkn/ruby that referenced this pull request Dec 1, 2017

Improve performance of string interpolation
This patch adds pre-allocation to string interpolation, which
avoids unnecessary capacity resizing.

For small strings, the optimized `rb_str_resurrect` operation is
faster, so pre-allocation is done only when the concatenated strings
are large.  `MIN_PRE_ALLOC_SIZE` was chosen by experimenting on a
local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version
8.1.0 (clang - 802.0.42)).

String interpolation becomes about 72% faster when a large string is created.

* Before
  ```
  Calculating -------------------------------------
  Large string interpolation
                            1.276M (± 5.9%) i/s -      6.358M in   5.002022s
  Small string interpolation
                            5.156M (± 5.5%) i/s -     25.728M in   5.005731s
  ```

* After
  ```
  Calculating -------------------------------------
  Large string interpolation
                            2.201M (± 5.8%) i/s -     11.063M in   5.043724s
  Small string interpolation
                            5.192M (± 5.7%) i/s -     25.971M in   5.020516s
  ```

* Test code
  ```ruby
  require 'benchmark/ips'

  Benchmark.ips do |x|
    x.report "Large string interpolation" do |t|
      a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo"
      b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld"

      t.times do
        "#{a}, #{b}!"
      end
    end

    x.report "Small string interpolation" do |t|
      a = "Hello"
      b = "World"

      t.times do
        "#{a}, #{b}!"
      end
    end
  end
  ```

[Fix rubyGH-1626]
From: Nao Minami <south37777@gmail.com>

git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@60320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e