Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for sigils containing integers #13448

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

starbelly
Copy link
Contributor

This commit adds support for sigils containing integers. Integers must be proceeded by an upper case alpha, but can otherwise be mixed (e.g., ~A1B2C3).

This commit adds support for sigils containing integers. Integers must be proceeded
by an upper case alpha, but can otherwise be mixed (e.g., `~A1B2C3`).
@@ -157,7 +157,7 @@ is_sigil({Name, 2}) ->
case Letters of
[L] when L >= $a, L =< $z -> true;
[] -> false;
Letters -> lists:all(fun(L) -> L >= $A andalso L =< $Z end, Letters)
Letters -> lists:all(fun(L) -> L >= 0 andalso L =< $Z end, Letters)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make this filter tighter and ensure the sequence is correct, OTOH the tokenizer should have caught an issue already.

Copy link
Member

@josevalim josevalim Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyone can generate any function. For example, I could do def unquote(:"sigil_\0") do and we don't want to import that, so we should be strict here too. :)

@@ -1564,10 +1564,15 @@ tokenize_sigil_name([S | T], [], Line, Column, Scope, Tokens) when ?is_upcase(S)
% If we have an uppercase letter, we keep tokenizing the name.
tokenize_sigil_name([S | T], NameAcc, Line, Column, Scope, Tokens) when ?is_upcase(S) ->
tokenize_sigil_name(T, [S | NameAcc], Line, Column + 1, Scope, Tokens);
% A digit is allowed but an upcase or digit must proceed it.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also make this tighter by checking that the first entry in the accumulator is an upper case, OTOH it's already caught.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should break this in two functions. The first clause will call either tokenize_upper_sigil_name or tokenize_lower_sigil_name. I think it will help remove conditionals. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know how I love to minimize conditionals!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josevalim do you think we should revise the error messages per the change such that it's limited to the potential error that can be encountered? In other words, we would have two different error messages depending on the path. If we are on the lower path, then we only emit an error about how a sigil may starting with lower case may only contain a single lower alpha; if we are on the upper path, we emit an error specific to that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unifying into a single error message is also fine by me

@starbelly
Copy link
Contributor Author

Note: I will update sigil docs when a signal this is to be accepted is received :)

@whatyouhide
Copy link
Member

I come here with zero context. Was this discussed in the ML or somewhere else?

@starbelly
Copy link
Contributor Author

I come here with zero context. Was this discussed in the ML or somewhere else?

As is tradition, I give you a context fail 😄 No, this was not discussed on the ML, it simply came up informally on EEF slack. The original context is a potential need to add a sigil that corresponds to a library name that ends with a digit ~HL7.

@whatyouhide
Copy link
Member

Gotcha @starbelly no probs, I was just checking. Thanks for the info. 🙏

% With a lowercase letter and a non-empty NameAcc we return an error.
tokenize_sigil_name([S | _T] = Original, [_ | _] = NameAcc, _Line, _Column, _Scope, _Tokens) when ?is_downcase(S) ->
Message = "invalid sigil name, it should be either a one-letter lowercase letter or a" ++
" sequence of uppercase letters only, got: ",
" sequence of uppercase letters and digits, starting with uppercase letters only, got: ",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
" sequence of uppercase letters and digits, starting with uppercase letters only, got: ",
" uppercase letter optionally followed by uppercase letters and digits, got: ",

@whatyouhide whatyouhide changed the title Add support for sigils containing integers. Add support for sigils containing integers Mar 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants