Fix edge cases with unicode method names #10978
@@ -3134,14 +3134,24 @@ module Crystal
       Slice.new(@reader.string.to_unsafe + start_pos, end_pos - start_pos)
     end

-    def ident_start?(char)
+    def self.ident_start?(char)
       char.ascii_letter? || char == '_' || char.ord > 0x9F
Shouldn't this be `ascii_lowercase?` instead of `ascii_letter?`?
Ideally yes. I think the lexer has already ruled out upper case letters as constants at any point where that might be relevant, though. So there should be no practical difference.
Or... maybe not: 😆
"".@FOO # Error: can't infer the type of instance variable '@FOO' of String
So we might want to tighten that down, probably in a follow-up. `char.ord > 0x9F` also looks very permissive.
Uppercase instance variable names are actually allowed in Ruby. Here are the other direct uses of `ident_start?` in the lexer:
- Reading a symbol literal. `:FOO` is definitely a valid symbol.
- Reading a percent string literal. This is probably acceptable, because `%Q()` is a valid string literal.
- Reading a global variable. The only ones that can be declared are lib external variables, and the parser rejects ones starting with an uppercase letter.
Uppercase def names are also allowed in Ruby. Even in Crystal they are allowed for lib funs, so a `Call` node with an uppercase method name is not necessarily invalid. I don't think we need to do anything here.
👍
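For anyone following the thread, here is a minimal standalone sketch of the two predicates being compared. The first copies the expression from the diff; `strict_ident_start?` is a hypothetical name for the `ascii_lowercase?` variant suggested above and is not part of this PR. It only illustrates which characters each check accepts, not how the lexer uses the result.

```crystal
# Copy of the predicate from the diff, lifted out of the lexer for illustration.
def ident_start?(char : Char) : Bool
  char.ascii_letter? || char == '_' || char.ord > 0x9F
end

# Hypothetical stricter variant from the review discussion (not in this PR):
# only ASCII lowercase letters count, so 'F' is rejected.
def strict_ident_start?(char : Char) : Bool
  char.ascii_lowercase? || char == '_' || char.ord > 0x9F
end

# '\u00A0' is a no-break space; its code point (0xA0) is above 0x9F,
# so both variants accept it, which is the permissiveness mentioned above.
['a', 'F', '_', '1', '😂', '\u00A0'].each do |c|
  puts "#{c.inspect}: ident_start?=#{ident_start?(c)} strict=#{strict_ident_start?(c)}"
end
```

The two variants differ only for ASCII uppercase letters; everything above code point 0x9F passes either way.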
Fixes #10970, where `😂` is considered an operator.

Also fixes some other places where setter methods are currently detected via `ascii_letter?`:
- `y.😂` is not recognized as a setter target inside a multi-assign: