Couple of improvements and fixes #2

azizk · 2019-03-19T23:02:29Z

Hi!

I use Elixir all day at work and would love to see the syntax highlighting definition improve. That's why I'm contributing these changes.

I hope to add a syntax test file soon to ensure better quality and to guard against regressions when we modify the regular expressions.

Besides, what do you think about adopting the style of other syntax definition files, like the one for Python in the PythonImproved package? Maybe that's the path we should take to improve ours?

princemaple

Hi @azizk, thanks for the PR. I think you misunderstood how \w works.
Other than that, I asked a few questions.
Looking forward to your reply and future improvements.

princemaple · 2019-03-19T23:04:38Z

Elixir.sublime-syntax


 variables:
-  module_name: '\b[A-Z]\w*\b'
+  module_name: '\b[A-Z][a-zA-Z0-9_]*\b'


\w is exactly that

iex(3)> "0" =~ ~r/\w/
true

Not quite, because it also matches Unicode characters like "á". The compiler rejects those and only allows ASCII characters in module names. "\w" is a nice shorthand, but it matches more than ASCII letters, digits, and underscore.
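
To make the difference concrete, here is a minimal sketch in Python. Python's `re` module only stands in for Sublime's regex engine (an assumption: both treat `\w` as Unicode-aware by default, but they are different engines), and the sample names are made up for illustration.

```python
import re

# The two candidate patterns for module_name from the diff above.
WORD_CLASS = re.compile(r'\b[A-Z]\w*\b')
ASCII_CLASS = re.compile(r'\b[A-Z][a-zA-Z0-9_]*\b')


def matched(pattern, text):
    """Return the first match as a string, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


# Hypothetical sample names: one plain ASCII, one containing "é" (U+00E9).
ascii_name = "MyModule"
unicode_name = "Caf\u00e9"

results = {
    "word_ascii": matched(WORD_CLASS, ascii_name),
    "word_unicode": matched(WORD_CLASS, unicode_name),
    "ascii_ascii": matched(ASCII_CLASS, ascii_name),
    "ascii_unicode": matched(ASCII_CLASS, unicode_name),
}

for key, value in results.items():
    print(key, "->", value)
```

In this sketch the `\w` version accepts the name containing "é", while the explicit ASCII class does not match it at all, because no word boundary exists between "f" and "é".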

princemaple · 2019-03-19T23:08:12Z

Elixir.sublime-syntax

        - match: '{{module_name}}'
          scope: entity.name.protocol.elixir
-    - match: '^\s*(def|defmacro)\s+([a-zA-Z_]\w*(?:!|\?)?)(?:(\()|\s*)'
+    - match: '^\s*(def|defmacro)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'


Identifiers cannot start with a digit. See my previous comment: \w can match a digit.

Oh, that's true. So it's a mistake. This should do:

Suggested change

-    - match: '^\s*(def|defmacro)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'
+    - match: '^\s*(def|defmacro)\s+([_[:alpha:]]\w*(?:!|\?)?)(?:(\()|\s*)'

Just tested, functions can actually start with unicode (or be entirely a unicode name).

Does [:alpha:] cover unicode?

Yep, it does 👍
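
For anyone who wants to check the leading-digit problem outside Sublime, here is a rough sketch in Python. Two assumptions apply: Python's `re` is not Sublime's engine, and it has no `[[:alpha:]]`, so `[^\W\d]` (a word character that is not a digit) stands in for `[_[:alpha:]]`. The sample lines are invented.

```python
import re

# Pattern from the PR before the fix: \w+ lets the name start with a digit.
BEFORE = re.compile(r'^\s*(def|defmacro)\s+(\w+(?:!|\?)?)(?:(\()|\s*)')

# Approximation of the suggested fix. [^\W\d] replaces [_[:alpha:]]
# because Python's re lacks POSIX bracket classes (assumption of this sketch).
AFTER = re.compile(r'^\s*(def|defmacro)\s+([^\W\d]\w*(?:!|\?)?)(?:(\()|\s*)')


def function_name(pattern, line):
    """Return the captured function name (group 2), or None if no match."""
    m = pattern.match(line)
    return m.group(2) if m else None


# Hypothetical Elixir source lines.
samples = [
    "def 123() do end",         # invalid: name starts with a digit
    "def valid?(x) do",         # trailing ? is allowed
    "def \u00e9t\u00e9() do",   # Unicode name
    "defmacro _private(x) do",  # leading underscore
]

for line in samples:
    print(repr(line), function_name(BEFORE, line), function_name(AFTER, line))
```

The old pattern captures `123` as a function name; the approximated fix rejects that line but still accepts the Unicode and underscore names.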

princemaple · 2019-03-19T23:08:23Z

Elixir.sublime-syntax

        3: punctuation.definition.parameters.elixir
      push: function_body
-    - match: '^\s*(defp|defmacrop)\s+([a-zA-Z_]\w*(?:!|\?)?)(?:(\()|\s*)'
+    - match: '^\s*(defp|defmacrop)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'


same as above

Suggested change

-    - match: '^\s*(defp|defmacrop)\s+(\w+(?:!|\?)?)(?:(\()|\s*)'
+    - match: '^\s*(defp|defmacrop)\s+([_[:alpha:]]\w*(?:!|\?)?)(?:(\()|\s*)'

princemaple · 2019-03-19T23:08:37Z

Elixir.sublime-syntax

          pop: true
        - include: main
-    - match: \b(is_atom|is_binary|is_bitstring|is_boolean|is_float|is_function|is_integer|is_list|is_map|is_nil|is_number|is_pid|is_port|is_record|is_reference|is_tuple|is_exception|abs|bit_size|byte_size|div|elem|hd|length|map_size|node|rem|round|tl|trunc|tuple_size)\b
+    - match: \b(is_(?:atom|binary|bitstring|boolean|float|function|integer|list|map|nil|number|pid|port|record|reference|tuple|exception)|abs|bit_size|byte_size|div|elem|hd|length|map_size|node|rem|round|tl|trunc|tuple_size)\b
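
A quick way to convince yourself that the grouped form matches the same names as the flat one is to compare them on a list of probes. This is only a sketch: it uses Python's `re` rather than Sublime's engine, and the extra probe words at the end are made up.

```python
import re

IS_SUFFIXES = ("atom|binary|bitstring|boolean|float|function|integer|list|map|"
               "nil|number|pid|port|record|reference|tuple|exception")
OTHERS = ("abs|bit_size|byte_size|div|elem|hd|length|map_size|node|rem|round|"
          "tl|trunc|tuple_size")

suffixes = IS_SUFFIXES.split('|')

# Flat form: every is_* name spelled out in full.
flat = re.compile(r'\b(' + '|'.join('is_' + s for s in suffixes)
                  + '|' + OTHERS + r')\b')

# Grouped form: the shared "is_" prefix is factored into a non-capturing group.
grouped = re.compile(r'\b(is_(?:' + IS_SUFFIXES + ')|' + OTHERS + r')\b')


def hit(pattern, word):
    """True if the pattern matches the whole word."""
    return pattern.fullmatch(word) is not None


names = ['is_' + s for s in suffixes] + OTHERS.split('|')
# Hypothetical near-misses that neither pattern should accept.
probes = names + ["is_", "is_foo", "is_atoms", "size", "my_is_atom"]

same = all(hit(flat, p) == hit(grouped, p) for p in probes)
print("equivalent on probes:", same)
print("pattern lengths:", len(flat.pattern), len(grouped.pattern))
```

The non-capturing `(?:...)` keeps capture group 1 covering the whole name, so the scope assignment does not change.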


princemaple · 2019-03-19T23:11:04Z

Elixir.sublime-syntax

      comment: as above, just doesn't need a 'end' and does a logic operation
      scope: keyword.operator.elixir
-    - match: '{{module_name}}'
+    - match: '{{module_name}}(?!:)'


when would you do that?

When you have a map or list containing keys starting with a capital letter: [A: 0, B: 1]

That seems rare, but OK :)
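
The effect of the `(?!:)` lookahead can be sketched in Python as well. Again, `re` is only a stand-in for Sublime's engine, and the sample Elixir snippets are invented for illustration.

```python
import re

# module_name as defined in the PR, with and without the negative lookahead.
MODULE_NAME = r'\b[A-Z][a-zA-Z0-9_]*\b'
plain = re.compile(MODULE_NAME)
guarded = re.compile(MODULE_NAME + r'(?!:)')

# Hypothetical Elixir snippets.
keyword_list = "[A: 0, B: 1]"
call = "Enum.map(list, &String.upcase/1)"
mixed = "%{Foo: 1, bar: Baz}"

print(plain.findall(keyword_list))    # capitalized keys are treated as modules
print(guarded.findall(keyword_list))  # keys followed by ":" are skipped
print(guarded.findall(call))          # real module references still match
print(guarded.findall(mixed))         # only the value, not the key
```

Without the lookahead, the keys `A` and `B` would be scoped as module names; with it, only names not immediately followed by a colon are.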

princemaple · 2019-03-19T23:12:41Z

Elixir.sublime-syntax

        - include: interpolated_elixir
        - include: escaped_char
-    - match: '(?<!:)(:)(?>[a-zA-Z_][\w@]*(?>[?!]|=(?![>=]))?|\<\>|===?|!==?|<<>>|<<<|>>>|~~~|::|<\-|\|>|=>|=~|=|/|\\\\|\*\*?|\.\.?\.?|>=?|<=?|&&?&?|\+\+?|\-\-?|\|\|?\|?|\!|@|\%?\{\}|%|\[\]|\^(\^\^)?)'
+    - match: '(?<!:)(:)(?>[_[:alpha:]][\w@]*(?>[?!]|=(?![>=]))?|\<\>|===?|!==?|<<>>|<<<|>>>|~~~|::|<\-|\|>|=>|=~|=|/|\\\\|\*\*?|\.\.?\.?|>=?|<=?|&&?&?|\+\+?|\-\-?|\|\|?\|?|\!|@|\%?\{\}|%|\[\]|\^(\^\^)?)'


What does this add?

Allows Unicode atoms like :á or even :_

guess that answers my question above :)

princemaple · 2019-03-19T23:19:04Z

Besides, what do you think about adopting the style of other syntax definition files, like the one for Python in the PythonImproved package? Maybe that's the path we should take to improve ours?

What do you mean exactly? I'm not quite sure what PythonImproved does that you are referring to specifically. Some examples might help.

FYI, I have no intention of messing with tmLanguage in my lifetime 😆.
There is https://github.com/elixir-editors/elixir-tmbundle for that.

princemaple · 2019-03-19T23:54:38Z

Hey @azizk, looks like you did not tick the "allow maintainers to modify your code" checkbox. You will have to commit the suggestions yourself. They look good 👍.

azizk · 2019-03-20T00:02:15Z

Thanks for the review! I added my suggestions.

Of course, I don't want any tmLanguage XML craziness either. The PythonImproved package uses YAML. It looks very extensive, complete and well thought out. I don't know Sublime's syntax file format fully yet, but one main difference from Elixir.sublime-syntax is that PythonImproved mainly uses beginCaptures and endCaptures. That may enable Sublime to parse the text into a proper syntax tree for more accurate, context-aware highlighting, rather than a semi-tree structure with flat areas where regexes match more or less blindly. The current approach is fine as a start, but it can definitely be improved.

azizk · 2019-03-20T00:03:24Z

Oh, but I think I did tick the box to allow changes by you. Don't know why it's not working.

I'll take a look again tomorrow. Until then!

princemaple · 2019-03-20T00:12:57Z

I'll take a look again tomorrow. Until then!

Thank you very much!

Of course, I don't want any tmLanguage XML craziness either. The PythonImproved package uses YAML. It looks very extensive, complete and well thought out. I don't know Sublime's syntax file format fully yet, but one main difference from Elixir.sublime-syntax is that PythonImproved mainly uses beginCaptures and endCaptures. That may enable Sublime to parse the text into a proper syntax tree for more accurate, context-aware highlighting, rather than a semi-tree structure with flat areas where regexes match more or less blindly. The current approach is fine as a start, but it can definitely be improved.

Looks like it's just a YAML version of tmLanguage. They probably got sick of XML and decided to write YAML first and compile it to tmLanguage.
I believe sublime-syntax is superior: it does almost everything tmLanguage does, and more. If you intend to stick with Sublime for the foreseeable future like I do, we should just write sublime-syntax. The Elixir syntax was converted from the tmLanguage version by Sublime itself. The definition is not optimal, I already know and have made improvements to it. Let's make it better together!

azizk · 2019-03-21T19:51:10Z

The definition is not optimal, I already know and have made improvements to it. Let's make it better together!

Yeah, let's do so! :)

I added one commit to fix the issue of matching numbers in function names like def 123() do end for example. While I was at it, I added a variable which should make the regex clearer.

princemaple · 2019-03-21T22:41:16Z

Thanks @azizk! Merged.

azizk added 6 commits March 19, 2019 23:29

Elixir: also apply syntax to files with elixirc and iex in hashbang.

29899c4

Elixir: fix: module names can only have ASCII letters.

47b1f89

Elixir: fix: don't highlight as module name if used as map/kwlist key.

099085a

Elixir: reduce regex length by grouping is_* function names.

7dd0671

Elixir: fix: atoms can also begin with a Unicode character or underscore.

0110e18

Elixir: fix: function names can begin with a Unicode character or underscore.

781e17f

princemaple requested changes Mar 19, 2019

Elixir: added id_begin variable; fixed matching function names.

94e8dd9

princemaple approved these changes Mar 21, 2019

princemaple merged commit fb77c42 into elixir-editors:master Mar 21, 2019
