Skip to content

Conversation

@azizk
Copy link
Collaborator

@azizk azizk commented Mar 19, 2019

Hi!

I'm using Elixir every moment at work and would love to see improvements to the syntax highlighting definition. That's why I'm contributing these changes.

I hope to add a syntax test file soon in order to ensure a better quality and to avoid possible regressions when we modify the regular expressions.

Besides, what do you think about adopting the style of other syntax definition files like the one for Python in the PythonImproved package? Maybe that's the path we should take to make ours better and more ideal?

Copy link
Collaborator

@princemaple princemaple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @azizk , thanks for the PR. I think you misunderstood how \w works.
Other than that, I asked a few questions.
Looking forward to your reply and future improvements.


variables:
module_name: '\b[A-Z]\w*\b'
module_name: '\b[A-Z][a-zA-Z0-9_]*\b'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\w is exactly that

iex(3)> "0" =~ ~r/\w/
true

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, because it also matches Unicode characters like "á". The compiler rejects those and only allows ASCII characters in module names. "\w" is a nice shorthand but it does more than ASCII letters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

- match: '{{module_name}}'
scope: entity.name.protocol.elixir
- match: '^\s*(def|defmacro)\s+([a-zA-Z_]\w*(?:!|\?)?)(?:(\()|\s*)'
- match: '^\s*(def|defmacro)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

identifiers cannot start with numbers. see my previous comment, \w can match a number

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's true. So it's a mistake. This should do:

Suggested change
- match: '^\s*(def|defmacro)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'
- match: '^\s*(def|defmacro)\s+([_[:alpha:]]\w*(?:!|\?)?)(?:(\()|\s*)'

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested, functions can actually start with unicode (or be entirely a unicode name).

Copy link
Collaborator

@princemaple princemaple Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does [:alpha:] cover unicode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it does 👍

3: punctuation.definition.parameters.elixir
push: function_body
- match: '^\s*(defp|defmacrop)\s+([a-zA-Z_]\w*(?:!|\?)?)(?:(\()|\s*)'
- match: '^\s*(defp|defmacrop)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- match: '^\s*(defp|defmacrop)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'
- match: '^\s*(defp|defmacrop)\s+([_[:alpha:]]\w*(?:!|\?)?)(?:(\()|\s*)'

pop: true
- include: main
- match: \b(is_atom|is_binary|is_bitstring|is_boolean|is_float|is_function|is_integer|is_list|is_map|is_nil|is_number|is_pid|is_port|is_record|is_reference|is_tuple|is_exception|abs|bit_size|byte_size|div|elem|hd|length|map_size|node|rem|round|tl|trunc|tuple_size)\b
- match: \b(is_(?:atom|binary|bitstring|boolean|float|function|integer|list|map|nil|number|pid|port|record|reference|tuple|exception)|abs|bit_size|byte_size|div|elem|hd|length|map_size|node|rem|round|tl|trunc|tuple_size)\b
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

comment: as above, just doesn't need a 'end' and does a logic operation
scope: keyword.operator.elixir
- match: '{{module_name}}'
- match: '{{module_name}}(?!:)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when would you do that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you have a map or list containing keys starting with a capital letter: [A: 0, B: 1]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems rare, but OK :)

- include: interpolated_elixir
- include: escaped_char
- match: '(?<!:)(:)(?>[a-zA-Z_][\w@]*(?>[?!]|=(?![>=]))?|\<\>|===?|!==?|<<>>|<<<|>>>|~~~|::|<\-|\|>|=>|=~|=|/|\\\\|\*\*?|\.\.?\.?|>=?|<=?|&&?&?|\+\+?|\-\-?|\|\|?\|?|\!|@|\%?\{\}|%|\[\]|\^(\^\^)?)'
- match: '(?<!:)(:)(?>[_[:alpha:]][\w@]*(?>[?!]|=(?![>=]))?|\<\>|===?|!==?|<<>>|<<<|>>>|~~~|::|<\-|\|>|=>|=~|=|/|\\\\|\*\*?|\.\.?\.?|>=?|<=?|&&?&?|\+\+?|\-\-?|\|\|?\|?|\!|@|\%?\{\}|%|\[\]|\^(\^\^)?)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this add?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allows Unicode atoms like or even :_

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess that answers my question above :)

@princemaple
Copy link
Collaborator

Besides, what do you think about adopting the style of other syntax definition files like the one for Python in the PythonImproved package? Maybe that's the path we should take to make ours better and more ideal?

What do you mean exactly? I'm not quite sure what PythonImproved does that you are referring to specifically. Some examples might help.

FYI I have no intention messing with tmLanguage in my lifetime 😆 .
There is https://github.com/elixir-editors/elixir-tmbundle for that.

@princemaple
Copy link
Collaborator

Hey @azizk , looks like you did not tick the allow maintainer to modify your code checkbox. You will have to commit the suggestions yourself. They look good 👍 .

@azizk
Copy link
Collaborator Author

azizk commented Mar 20, 2019

Thanks for the review! I added my suggestions.

Of course, I don't want any tmLanguage XML craziness either. The PythonImproved package uses YAML. It looks very extensive, complete and well thought out. I don't know Sublime's syntax file format fully yet, but one main difference to Elixir.sublime-syntax is that PythonImproved mainly uses beginCaptures and endCaptures which may enable Sublime to actually parse the text into a proper syntax tree for more accurate and contextful highlighting, which is better than just having some semi-tree structure with flat areas where regexes match blindly more or less. It's fine in the beginning but it can definitely be improved.

@azizk
Copy link
Collaborator Author

azizk commented Mar 20, 2019

Oh, but I think I did tick the box to allow changes by you. Don't know why it's not working.

I'll take a look again tomorrow. Until then!

@princemaple
Copy link
Collaborator

I'll take a look again tomorrow. Until then!

Thank you very much!

Of course, I don't want any tmLanguage XML craziness either. The PythonImproved package uses YAML. It looks very extensive, complete and well thought out. I don't know Sublime's syntax file format fully yet, but one main difference to Elixir.sublime-syntax is that PythonImproved mainly uses beginCaptures and endCaptures which may enable Sublime to actually parse the text into a proper syntax tree for more accurate and contextful highlighting, which is better than just having some semi-tree structure with flat areas where regexes match blindly more or less. It's fine in the beginning but it can definitely be improved.

Looks like it's just a yaml version of tm language. They probably got sick of XML and decided to write yaml first and compile to tmLanguage.
I believe sublime-syntax is more superior and does mostly everything tmLanguage does and more. If you intend to stick with Sublime for the foreseeable future like I do, we should just write sublime-syntax. The Elixir syntax is converted from the tmLanguage version, which is done by Sublime itself. The definition is not optimal, I already know and have made improvements to it. Let's make it better together!

@azizk
Copy link
Collaborator Author

azizk commented Mar 21, 2019

The definition is not optimal, I already know and have made improvements to it. Let's make it better together!

Yeah, let's do so! :)

I added one commit to fix the issue of matching numbers in function names like def 123() do end for example. While I was at it, I added a variable which should make the regex clearer.

@princemaple princemaple merged commit fb77c42 into elixir-editors:master Mar 21, 2019
@princemaple
Copy link
Collaborator

Thanks @azizk ! Merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants