New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip force_encoding in compiled code of erb #1147

Closed
wants to merge 1 commit into
base: trunk
from

Conversation

4 participants
@k0kubun
Member

k0kubun commented Dec 13, 2015

Since compiled script has magic comment for encoding, ERB#result will have valid encoding without calling force_encoding.

before

$ make benchmark-each COMPARE_RUBY= ITEM=bm_app_erb
benchmark results:
Execution time (sec)
name    built-ruby
app_erb      0.891

after

$ make benchmark-each COMPARE_RUBY= ITEM=bm_app_erb
benchmark results:
Execution time (sec)
name    built-ruby
app_erb      0.863
@hsbt

This comment has been minimized.

Show comment
Hide comment
@hsbt

hsbt Dec 14, 2015

Member

@k0kubun Travis fail is another issue probably. see http://rubyci.s3.amazonaws.com/osx1011/ruby-trunk/log/20151213T234500Z.fail.html.gz

I will notice you when test_option.rb is fixed. Please rebase trunk branch at it time.

Member

hsbt commented Dec 14, 2015

@k0kubun Travis fail is another issue probably. see http://rubyci.s3.amazonaws.com/osx1011/ruby-trunk/log/20151213T234500Z.fail.html.gz

I will notice you when test_option.rb is fixed. Please rebase trunk branch at it time.

@k0kubun

This comment has been minimized.

Show comment
Hide comment
@k0kubun

k0kubun Dec 14, 2015

Member

Travis fail is another issue probably.

I see.

I will notice you when test_option.rb is fixed. Please rebase trunk branch at it time.

I'll rebase after the fix and your notice.

Member

k0kubun commented Dec 14, 2015

Travis fail is another issue probably.

I see.

I will notice you when test_option.rb is fixed. Please rebase trunk branch at it time.

I'll rebase after the fix and your notice.

erb.rb: Use script encoding instead of force_encoding
The original intention of introducing `_erbout.force_encoding`
in r21170 was:

- "returns a string in the same character encoding as the input string."
- "When the input string has a magic comment, however, it returns a string
  in the encoding specified by the magic comment."

And they are tested by test/erb/test_erb_m17n.rb well and this patch
passes the test.
Since magic comment is always added in ERB compiled code, using ''.dup
instead of String.new will set correct encoding without calling
force_encoding method.

The benchmark results are:

* Before

$ ./ruby benchmark/run.rb --matzruby=./ruby -m bm_app_erb
MatzRuby:
ruby 2.5.0dev (2017-05-26 skip-force-enc.. 58903) [x86_64-linux]
last_commit=Skip force_encoding in compiled code of erb
Ruby:

app_erb:
matz 0.715

* After

$ ./ruby benchmark/run.rb --matzruby=./ruby -m bm_app_erb
MatzRuby:
ruby 2.5.0dev (2017-05-26 skip-force-enc.. 58903) [x86_64-linux]
last_commit=Skip force_encoding in compiled code of erb
Ruby:

app_erb:
matz 0.672

And perf(1) results are:

* Before

$ sudo perf stat ./ruby benchmark/bm_app_erb.rb

 Performance counter stats for './ruby benchmark/bm_app_erb.rb':

        709.571746      task-clock (msec)         #    1.000 CPUs utilized
                 5      context-switches          #    0.007 K/sec
                 1      cpu-migrations            #    0.001 K/sec
             1,337      page-faults               #    0.002 M/sec
     3,088,936,521      cycles                    #    4.353 GHz
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
     4,849,564,282      instructions              #    1.57  insns per cycle
     1,027,042,087      branches                  # 1447.411 M/sec
        19,983,456      branch-misses             #    1.95% of all branches

       0.709747823 seconds time elapsed

* After

$ sudo perf stat ./ruby benchmark/bm_app_erb.rb

 Performance counter stats for './ruby benchmark/bm_app_erb.rb':

        693.494673      task-clock (msec)         #    1.000 CPUs utilized
                 7      context-switches          #    0.010 K/sec
                 1      cpu-migrations            #    0.001 K/sec
             1,316      page-faults               #    0.002 M/sec
     3,025,639,349      cycles                    #    4.363 GHz
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
     4,694,848,271      instructions              #    1.55  insns per cycle
       994,496,704      branches                  # 1434.037 M/sec
        19,693,239      branch-misses             #    1.98% of all branches

       0.693724345 seconds time elapsed

[fix GH-1147]

@hsbt hsbt closed this in 5fc3236 May 26, 2017

@k0kubun k0kubun deleted the k0kubun:skip-force-encoding branch May 26, 2017

@eagletmt

This comment has been minimized.

Show comment
Hide comment
@eagletmt

eagletmt Apr 11, 2018

Hi @k0kubun , this change breaks output's encoding compatibility.

% RBENV_VERSION=2.4.4 ruby -rerb -ve 'buf = "にほんご".b; puts ERB.new("buf = <%= buf %>").result(binding).encoding'
ruby 2.4.4p296 (2018-03-28 revision 63013) [x86_64-linux]
UTF-8
% RBENV_VERSION=2.5.1 ruby -rerb -ve 'buf =  にほんご".b; puts ERB.new("buf = <%= buf %>").result(binding).encoding'
ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux]
ASCII-8BIT

Is this intentional?

eagletmt commented Apr 11, 2018

Hi @k0kubun , this change breaks output's encoding compatibility.

% RBENV_VERSION=2.4.4 ruby -rerb -ve 'buf = "にほんご".b; puts ERB.new("buf = <%= buf %>").result(binding).encoding'
ruby 2.4.4p296 (2018-03-28 revision 63013) [x86_64-linux]
UTF-8
% RBENV_VERSION=2.5.1 ruby -rerb -ve 'buf =  にほんご".b; puts ERB.new("buf = <%= buf %>").result(binding).encoding'
ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux]
ASCII-8BIT

Is this intentional?

@k0kubun

This comment has been minimized.

Show comment
Hide comment
@k0kubun

k0kubun Apr 13, 2018

Member

While I actually didn't write it in a commit message (I should've written it, sorry), this behavior change is totally intentional as I believe it's out of ERB's responsibility.

Still I'm open to change the behavior back if there is a real-world problem which should be fixed in ERB's layer. As far as I know, we have fguillen/simplecov-rcov#20 triggered by this change, and I thought it should be fixed by simplecov-rcov's side. See fguillen/simplecov-rcov#20 (comment) for details. Please let me know if you have another issue triggered by this or objection to my direction for simplecov-rcov's issue.

Member

k0kubun commented Apr 13, 2018

While I actually didn't write it in a commit message (I should've written it, sorry), this behavior change is totally intentional as I believe it's out of ERB's responsibility.

Still I'm open to change the behavior back if there is a real-world problem which should be fixed in ERB's layer. As far as I know, we have fguillen/simplecov-rcov#20 triggered by this change, and I thought it should be fixed by simplecov-rcov's side. See fguillen/simplecov-rcov#20 (comment) for details. Please let me know if you have another issue triggered by this or objection to my direction for simplecov-rcov's issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment