Broken KCODE handling in 1.8 mode #1982

Closed
dbussink opened this Issue Nov 3, 2012 · 0 comments

dbussink commented Nov 3, 2012

The change made in 0d5f8ea introduces the following problem in 1.8 mode:

$KCODE = "utf-8"

str = "こにちわ"
reg = %r!!

puts str.gsub(reg, ".")

The problem only appears when this code is run twice: the first run executes correctly, the second fails. On the second run the file is loaded from the rbc cache, and the string is read back with an encoding even in 1.8 mode. That breaks the cases where we use KCODE as a fallback when a string doesn't have an encoding:

https://github.com/rubinius/rubinius/blob/master/vm/builtin/string.cpp#L1514
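
To make the fallback concrete, here is a minimal sketch of the decision as described above. It is a hypothetical model, not the Rubinius implementation: `ModelString`, `effective_encoding`, and the encoding attached to the cached string are all assumptions made for illustration.

```ruby
# Hypothetical model of the fallback; none of these names come from Rubinius.
ModelString = Struct.new(:bytes, :encoding)

# Pick the encoding used for character-aware operations:
# the string's own encoding if it has one, otherwise the KCODE value.
def effective_encoding(str, kcode)
  str.encoding || kcode
end

kcode = "UTF-8"

# First run: compiled from source in 1.8 mode, no encoding attached.
fresh = ModelString.new("...", nil)

# Second run: loaded from the rbc cache with an encoding attached.
# Which encoding it carries is an assumption made for this sketch.
cached = ModelString.new("...", "ASCII-8BIT")

puts effective_encoding(fresh, kcode)   # UTF-8
puts effective_encoding(cached, kcode)  # ASCII-8BIT
```

In this model the cached string never reaches the KCODE branch, which is the behavior the report describes for the second run.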

Other places may also assume that strings don't have an encoding in 1.8 mode. Running CI multiple times without rebuilding produces several failures, all exposing similar issues caused by round-tripping through the rbc cache:

1)
String#gsub with pattern and replacement respects $KCODE when the pattern collapses FAILED
Expected ".?.?.\223.?.?.\253.?.?.\241.?.?.\217."
 to equal ".こ.に.ち.わ."

          { } in Object#__script__ at spec/ruby/core/string/gsub_spec.rb:24
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
     { } in Enumerable(Array)#all? at kernel/common/enumerable.rb:102
                        Array#each at kernel/bootstrap/array.rb:68
            Enumerable(Array)#all? at kernel/common/enumerable.rb:102
             Integer(Fixnum)#times at kernel/common/integer.rb:83
                        Array#each at kernel/bootstrap/array.rb:68
                 Object#__script__ at spec/ruby/core/string/gsub_spec.rb:5
                       Kernel.load at kernel/common/kernel.rb:580
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
                        Array#each at kernel/bootstrap/array.rb:68
  Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
  Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:118
           Rubinius::Loader#script at kernel/loader.rb:614
             Rubinius::Loader#main at kernel/loader.rb:815

2)
String#scan respects $KCODE when the pattern collapses to nothing FAILED
Expected ["", "", "", "", "", "", "", "", "", "", "", "", ""]
to equal ["", "", "", "", ""]

          { } in Object#__script__ at spec/ruby/core/string/scan_spec.rb:29
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
     { } in Enumerable(Array)#all? at kernel/common/enumerable.rb:102
                        Array#each at kernel/bootstrap/array.rb:68
            Enumerable(Array)#all? at kernel/common/enumerable.rb:102
             Integer(Fixnum)#times at kernel/common/integer.rb:83
                        Array#each at kernel/bootstrap/array.rb:68
                 Object#__script__ at spec/ruby/core/string/scan_spec.rb:5
                       Kernel.load at kernel/common/kernel.rb:580
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
                        Array#each at kernel/bootstrap/array.rb:68
  Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
  Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:118
           Rubinius::Loader#script at kernel/loader.rb:614
             Rubinius::Loader#main at kernel/loader.rb:815

3)
String#split with Regexp respects $KCODE when splitting between characters FAILED
Expected 12
 to equal 4

          { } in Object#__script__ at spec/ruby/core/string/split_spec.rb:251
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
     { } in Enumerable(Array)#all? at kernel/common/enumerable.rb:102
                        Array#each at kernel/bootstrap/array.rb:68
            Enumerable(Array)#all? at kernel/common/enumerable.rb:102
             Integer(Fixnum)#times at kernel/common/integer.rb:83
                        Array#each at kernel/bootstrap/array.rb:68
                 Object#__script__ at spec/ruby/core/string/split_spec.rb:158
                       Kernel.load at kernel/common/kernel.rb:580
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
                        Array#each at kernel/bootstrap/array.rb:68
  Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
  Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:118
           Rubinius::Loader#script at kernel/loader.rb:614
             Rubinius::Loader#main at kernel/loader.rb:815

4)
String#split with Regexp respects the encoding of the regexp when splitting between characters FAILED
Expected 2
 to equal 1

          { } in Object#__script__ at spec/ruby/core/string/split_spec.rb:261
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
     { } in Enumerable(Array)#all? at kernel/common/enumerable.rb:102
                        Array#each at kernel/bootstrap/array.rb:68
            Enumerable(Array)#all? at kernel/common/enumerable.rb:102
             Integer(Fixnum)#times at kernel/common/integer.rb:83
                        Array#each at kernel/bootstrap/array.rb:68
                 Object#__script__ at spec/ruby/core/string/split_spec.rb:158
                       Kernel.load at kernel/common/kernel.rb:580
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
                        Array#each at kernel/bootstrap/array.rb:68
  Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
  Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:118
           Rubinius::Loader#script at kernel/loader.rb:614
             Rubinius::Loader#main at kernel/loader.rb:815

5)
StringScanner#getch is multi-byte character sensitive FAILED
Expected "\244"
 to equal "??"

          { } in Object#__script__ at spec/ruby/library/stringscanner/getch_spec.rb:28
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
     { } in Enumerable(Array)#all? at kernel/common/enumerable.rb:102
                        Array#each at kernel/bootstrap/array.rb:68
            Enumerable(Array)#all? at kernel/common/enumerable.rb:102
             Integer(Fixnum)#times at kernel/common/integer.rb:79
                        Array#each at kernel/bootstrap/array.rb:68
                 Object#__script__ at spec/ruby/library/stringscanner/getch_spec.rb:5
                       Kernel.load at kernel/common/kernel.rb:580
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
                        Array#each at kernel/bootstrap/array.rb:68
  Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
  Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:118
           Rubinius::Loader#script at kernel/loader.rb:614
             Rubinius::Loader#main at kernel/loader.rb:815

6)
StringScanner#scan returns the matched string for a multi byte string with KCODE FAILED
Expected "П\321"
 to equal "Привет"

          { } in Object#__script__ at spec/ruby/library/stringscanner/scan_spec.rb:34
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
     { } in Enumerable(Array)#all? at kernel/common/enumerable.rb:102
                        Array#each at kernel/bootstrap/array.rb:68
            Enumerable(Array)#all? at kernel/common/enumerable.rb:102
             Integer(Fixnum)#times at kernel/common/integer.rb:79
                        Array#each at kernel/bootstrap/array.rb:68
                 Object#__script__ at spec/ruby/library/stringscanner/scan_spec.rb:6
                       Kernel.load at kernel/common/kernel.rb:580
      Kernel(Object)#instance_eval at kernel/common/eval18.rb:45
                        Array#each at kernel/bootstrap/array.rb:68
  Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
  Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:118
           Rubinius::Loader#script at kernel/loader.rb:614
             Rubinius::Loader#main at kernel/loader.rb:815

Finished in 90.822975 seconds

3861 files, 17451 examples, 45618 expectations, 6 failures, 0 errors
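
The counts in the gsub, scan and split failures are what you get when the 4-character, 12-byte string is walked byte by byte instead of character by character. A rough way to see this, assuming a Ruby 1.9 or later interpreter where `force_encoding` stands in for the 1.8-mode KCODE behavior (the helper name `empty_pattern_results` is made up for this sketch):

```ruby
# Returns what an empty pattern produces for the given string.
def empty_pattern_results(str)
  {
    :gsub  => str.gsub(//, "."),
    :scan  => str.scan(//).size,
    :split => str.split(//).size
  }
end

text = "こにちわ"

# Character-aware view: what the specs expect when $KCODE is honored.
as_chars = empty_pattern_results(text.dup.force_encoding("UTF-8"))

# Byte-wise view: what happens when the multi-byte handling is lost.
as_bytes = empty_pattern_results(text.dup.force_encoding("BINARY"))

puts as_chars[:gsub]   # .こ.に.ち.わ.
puts as_chars[:scan]   # 5
puts as_chars[:split]  # 4
puts as_bytes[:scan]   # 13
puts as_bytes[:split]  # 12
```

The byte-wise numbers (13 empty matches, 12 pieces) line up with the actual values in failures 2 and 3, and the character-wise numbers with the expected ones.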

@brixen closed this in f1e6b96 on Nov 4, 2012
