Unicode chars in method names aren't accepted #3778

Closed
jodosha opened this Issue Apr 4, 2016 · 6 comments

jodosha commented Apr 4, 2016

Environment

JRuby version:

jruby 9.1.0.0-SNAPSHOT (2.3.0) 2016-04-01 30c1276 Java HotSpot(TM) 64-Bit Server VM 25.60-b23 on 1.8.0_60-b27 +jit [darwin-x86_64]

OS:

Darwin escher 15.3.0 Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64 x86_64

Expected Behavior

Unicode chars in method names should be accepted.

I discovered this issue with hanami-utils and patched it to make the build pass with 9.0.0.0.

According to @enebo, this was supposed to be fixed in the nightly series, but that isn't the case.

MRI/expected behavior:

irb(main):001:0> RUBY_VERSION
=> "2.3.0"
irb(main):002:0> def тест
irb(main):003:1> puts "hi"
irb(main):004:1> end
=> :тест
irb(main):005:0> тест
hi
=> nil

Actual Behavior

irb(main):001:0> RUBY_VERSION
=> "2.3.0"
irb(main):002:0> def тест
irb(main):003:1> puts "hi"
irb(main):004:1> end
=> :????
irb(main):005:0> тест
NameError: uninitialized constant B5AB
    from org/jruby/RubyModule.java:3238:in `const_missing'
    from (irb):5:in `<eval>'
    from org/jruby/RubyKernel.java:983:in `eval'
    from org/jruby/RubyKernel.java:1290:in `loop'
    from org/jruby/RubyKernel.java:1103:in `catch'
    from org/jruby/RubyKernel.java:1103:in `catch'
    from /Users/luca/.rubies/jruby-9.1.0.0-SNAPSHOT/bin/irb:13:in `<top>'
irb(main):006:0> __send__(:тест)
hi
=> nil
irb(main):007:0> __send__("тест")
NoMethodError: undefined method `тест' for main:Object
    from (irb):7:in `<eval>'
    from org/jruby/RubyKernel.java:983:in `eval'
    from org/jruby/RubyKernel.java:1290:in `loop'
    from org/jruby/RubyKernel.java:1103:in `catch'
    from org/jruby/RubyKernel.java:1103:in `catch'
    from /Users/luca/.rubies/jruby-9.1.0.0-SNAPSHOT/bin/irb:13:in `<top>'

Member

headius commented Aug 16, 2016

It seems to be seeing тест as being preceded by a capital letter, causing the first error when it tries to access it. If instead you make that line тест() it invokes correctly.

The send issue is a known mismatch between how we store multibyte symbols and how we store methods.
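
The call forms mentioned in this thread can be compared side by side with a small script. This is only a sketch: the `call_forms` hash and its keys are names made up for this illustration, and the results noted in the comments are what MRI 2.3 gives in the report above. On the affected JRuby builds, the bare call raised NameError and the string form of `__send__` raised NoMethodError, so the script would stop at the first failing form there.

```ruby
# encoding: utf-8

# Define a method whose name consists only of non-ASCII (Cyrillic) characters.
def тест
  "hi"
end

# Collect the result of each call form. The hash name and keys are
# illustrative and do not come from the original report.
call_forms = {
  bare:        тест,              # bare identifier, parsed as a constant on affected JRuby
  parens:      тест(),            # explicit parentheses force a method call
  send_symbol: __send__(:тест),   # dispatch through a Symbol
  send_string: __send__("тест")   # dispatch through a String
}

call_forms.each { |form, result| puts "#{form}: #{result}" }
```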

Member

enebo commented Aug 16, 2016

@jodosha Also, an extremely minor issue I should bring up: we recently learned that multibyte methods defined in irb will not work (@donv opened an issue on that a couple of days ago, #4070). That is not related to this issue at all, but you might trip over it as well.

The send issue, and most likely the last of our encoding issues, should be addressed by: #3880 (comment)

This is a big change and we need to do it early in a dev cycle for bake time.

Member

headius commented Aug 16, 2016

I think the main issue here above and beyond #3880 is that тест is getting parsed as a constant access. Using both Ruby's String and Java's string, the т codepoint appears to be lowercase.

[] ~/projects/jruby $ jruby -e "puts 'тест'"
тест

[] ~/projects/jruby $ jruby -e "puts 'тест'.to_java"
тест

[] ~/projects/jruby $ jruby -e "puts 'тест'.codepoint(0)"
NoMethodError: undefined method `codepoint' for "тест":String
Did you mean?  codepoints
  <main> at -e:1

[] ~/projects/jruby $ jruby -e "puts 'тест'.codepoints[0]"
1090

[] ~/projects/jruby $ jruby -e "puts 'тест'.to_java.charAt(0)"
1090

[] ~/projects/jruby $ jruby -e "p java.lang.Character.isUpperCase(1090)"
-e:1: warning: ambiguous Java methods found, using isUpperCase(char)
false

So whatever the parser is using to determine capital letters for constants, it's not handling this MBC properly.
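
The same check can be written in plain Ruby without the Java interop calls. The following is a rough sketch and not JRuby's parser logic: `constant_like?` is a hypothetical helper that treats a name as constant-like only when its first character has the Unicode uppercase-letter property (`\p{Lu}`), which is a simplification of Ruby's actual identifier rules.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of JRuby or MRI): returns true when the
# first character of the name is an uppercase letter according to Unicode.
def constant_like?(name)
  !!(name.to_s =~ /\A\p{Lu}/)
end

first_codepoint = "тест".codepoints[0]   # U+0442 CYRILLIC SMALL LETTER TE

puts first_codepoint            # 1090
puts constant_like?("тест")     # false: lowercase Cyrillic first letter
puts constant_like?("Тест")     # true: uppercase Cyrillic first letter
puts constant_like?("Foo")      # true: ASCII uppercase first letter
puts constant_like?(:foo)       # false: ASCII lowercase, Symbol input
```

A check of this kind classifies тест as an ordinary method or local name, which matches the MRI behavior shown in the report.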

Member

headius commented Aug 16, 2016

Oh, and I agree this isn't going to make 9.1.3.0, so punting.

@headius headius modified the milestones: JRuby 9.1.4.0, JRuby 9.1.3.0 Aug 16, 2016

jodosha commented Aug 16, 2016

@enebo @headius Thanks for your follow-up. This isn't a blocking issue for Hanami; I just reported it to let you know 😄.

Good luck with the next release and again thank you for JRuby!

@enebo enebo closed this in cc1743c Aug 17, 2016

@enebo enebo modified the milestones: JRuby 9.1.4.0, JRuby 9.1.3.0 Aug 17, 2016

Member

enebo commented Aug 17, 2016

@jodosha The comment explains this, but it was pretty simple to fix and I feel it is not risky, so this is in 9.1.3.0.

The send with a string is still broken, but that is a different issue (alluded to in my previous comment).
