New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for sigils containing integers #13448
base: main
Are you sure you want to change the base?
Conversation
This commit adds support for sigils containing integers. Integers must be proceeded by an upper case alpha, but can otherwise be mixed (e.g., `~A1B2C3`).
@@ -157,7 +157,7 @@ is_sigil({Name, 2}) -> | |||
case Letters of | |||
[L] when L >= $a, L =< $z -> true; | |||
[] -> false; | |||
Letters -> lists:all(fun(L) -> L >= $A andalso L =< $Z end, Letters) | |||
Letters -> lists:all(fun(L) -> L >= 0 andalso L =< $Z end, Letters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make this filter tighter and ensure the sequence is correct, OTOH the tokenizer should have caught an issue already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyone can generate any function. For example, I could do def unquote(:"sigil_\0") do
and we don't want to import that, so we should be strict here too. :)
@@ -1564,10 +1564,15 @@ tokenize_sigil_name([S | T], [], Line, Column, Scope, Tokens) when ?is_upcase(S) | |||
% If we have an uppercase letter, we keep tokenizing the name. | |||
tokenize_sigil_name([S | T], NameAcc, Line, Column, Scope, Tokens) when ?is_upcase(S) -> | |||
tokenize_sigil_name(T, [S | NameAcc], Line, Column + 1, Scope, Tokens); | |||
% A digit is allowed but an upcase or digit must proceed it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also make this tighter by checking that the first entry in the accumulator is an upper case, OTOH it's already caught.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should break this in two functions. The first clause will call either tokenize_upper_sigil_name or tokenize_lower_sigil_name. I think it will help remove conditionals. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know how I love to minimize conditionals!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josevalim do you think we should revise the error messages per the change such that it's limited to the potential error that can be encountered? In other words, we would have two different error messages depending on the path. If we are on the lower path, then we only emit an error about how a sigil may starting with lower case may only contain a single lower alpha; if we are on the upper path, we emit an error specific to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unifying into a single error message is also fine by me
Note: I will update sigil docs when a signal this is to be accepted is received :) |
I come here with zero context. Was this discussed in the ML or somewhere else? |
As is tradition, I give you a context fail 😄 No, this was not discussed on the ML, it simply came up informally on EEF slack. The original context is a potential need to add a sigil that corresponds to a library name that ends with a digit |
Gotcha @starbelly no probs, I was just checking. Thanks for the info. 🙏 |
% With a lowercase letter and a non-empty NameAcc we return an error. | ||
tokenize_sigil_name([S | _T] = Original, [_ | _] = NameAcc, _Line, _Column, _Scope, _Tokens) when ?is_downcase(S) -> | ||
Message = "invalid sigil name, it should be either a one-letter lowercase letter or a" ++ | ||
" sequence of uppercase letters only, got: ", | ||
" sequence of uppercase letters and digits, starting with uppercase letters only, got: ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
" sequence of uppercase letters and digits, starting with uppercase letters only, got: ", | |
" uppercase letter optionally followed by uppercase letters and digits, got: ", |
This commit adds support for sigils containing integers. Integers must be proceeded by an upper case alpha, but can otherwise be mixed (e.g.,
~A1B2C3
).