feat(queries): add highlight queries #41

Open
wants to merge 1 commit into
base: master

@jimeh jimeh commented Nov 24, 2022

Adds a set of queries for syntax highlighting yaml, similar to other tree-sitter grammar projects.

These are the same queries as in my PR against Emacs' tree-sitter-langs project: emacs-tree-sitter/tree-sitter-langs#134

Screenshot

Here's how syntax highlighting turns out in Emacs, using the doom-vibrant theme:

[Screenshot: YAML syntax highlighting in Emacs with the doom-vibrant theme]

@lukepistrol

Would love this getting merged!

@lcrownover

Would this also support highlighting of variable interpolation? Ansible and Helm are two systems off the top of my head that use this and I'd love to be able to color the interpolation differently.

For example:

"My name is {{ name }}"

Here, "My name is " would be a normal string, {{ and }} would be one token type, and name would be another, allowing us to color those tokens differently.
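To make the requested split concrete, here is a minimal Python sketch of how the text of a scalar could be divided into those pieces. It is not part of this PR or of the tree-sitter YAML grammar: the `tokenize` function and the token kinds `string`, `open`, `expression`, and `close` are hypothetical names chosen for illustration, and the regular expression assumes simple, non-nested `{{ ... }}` spans.

```python
import re
from typing import List, Tuple

# (kind, text) pairs; the kind names are hypothetical, for illustration only.
Token = Tuple[str, str]

# Matches one `{{ ... }}` span. The non-greedy body stops at the first `}}`,
# so nested or escaped delimiters are NOT handled (an assumption of this sketch).
_INTERPOLATION = re.compile(r"(\{\{)(.*?)(\}\})")


def tokenize(text: str) -> List[Token]:
    """Split text into plain string pieces and interpolation pieces."""
    tokens: List[Token] = []
    pos = 0
    for match in _INTERPOLATION.finditer(text):
        # Anything between the previous match and this one is plain string.
        if match.start() > pos:
            tokens.append(("string", text[pos:match.start()]))
        tokens.append(("open", match.group(1)))
        # Surrounding whitespace is dropped from the expression text.
        tokens.append(("expression", match.group(2).strip()))
        tokens.append(("close", match.group(3)))
        pos = match.end()
    # Trailing plain text after the last interpolation, if any.
    if pos < len(text):
        tokens.append(("string", text[pos:]))
    return tokens


if __name__ == "__main__":
    for kind, piece in tokenize("My name is {{ name }}"):
        print(f"{kind:10} {piece!r}")
```

A real highlighter would need source ranges rather than stripped text, and it would need the parser to expose these pieces as nodes in the first place.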

Currently:

- name: check if the correct loki version is downloaded
  ansible.builtin.stat:
    path: "/opt/loki/{{loki_server_version}}/loki-linux-amd64.zip"
  register: loki_server_check_downloaded
  changed_when: false

TSPlayground:

      block_sequence_item [9, 0] - [13, 21]
        block_node [9, 2] - [13, 21]
          block_mapping [9, 2] - [13, 21]
            block_mapping_pair [9, 2] - [9, 55]
              key: flow_node [9, 2] - [9, 6]
                plain_scalar [9, 2] - [9, 6]
                  string_scalar [9, 2] - [9, 6]
              value: flow_node [9, 8] - [9, 55]
                plain_scalar [9, 8] - [9, 55]
                  string_scalar [9, 8] - [9, 55]
--- relevant section
            block_mapping_pair [10, 2] - [11, 66]
              key: flow_node [10, 2] - [10, 22]
                plain_scalar [10, 2] - [10, 22]
                  string_scalar [10, 2] - [10, 22]
              value: block_node [11, 4] - [11, 66]
                block_mapping [11, 4] - [11, 66]
                  block_mapping_pair [11, 4] - [11, 66]
                    key: flow_node [11, 4] - [11, 8]
                      plain_scalar [11, 4] - [11, 8]
                        string_scalar [11, 4] - [11, 8]
                    value: flow_node [11, 10] - [11, 66]
                      double_quote_scalar [11, 10] - [11, 66]
--- end relevant section
            block_mapping_pair [12, 2] - [12, 40]
              key: flow_node [12, 2] - [12, 10]
                plain_scalar [12, 2] - [12, 10]
                  string_scalar [12, 2] - [12, 10]
              value: flow_node [12, 12] - [12, 40]
                plain_scalar [12, 12] - [12, 40]
                  string_scalar [12, 12] - [12, 40]
            block_mapping_pair [13, 2] - [13, 21]
              key: flow_node [13, 2] - [13, 14]
                plain_scalar [13, 2] - [13, 14]
                  string_scalar [13, 2] - [13, 14]
              value: flow_node [13, 16] - [13, 21]
                plain_scalar [13, 16] - [13, 21]
                  boolean_scalar [13, 16] - [13, 21]

@jimeh
Author

jimeh commented Nov 14, 2023

@lcrownover I might be wrong, but I believe Ansible and Helm/Gotemplate style variable interpolation would require more than just syntax highlighting queries.

The syntax tree built by the parser only exposes string nodes, so there are no nodes for interpolated variables to query against for syntax highlighting purposes, I'm afraid.

@lcrownover

@jimeh Maybe I'm showing my ignorance of the inner workings of tree-sitter, but why is Python able to have this functionality:

print(f"Starting server on port {port}")

(string) ; [29:11 - 43]
 (string_start) ; [29:11 - 12]
 (string_content) ; [29:13 - 36]
 (interpolation) ; [29:37 - 42]
  expression: (identifier) ; [29:38 - 41]
 (string_end) ; [29:43 - 43]

Yet we can't modify the YAML parser to support something similar:

src: "{{ authselect_pam_access_conf_src }}"

value: (flow_node) ; [4:10 - 47]
 (double_quote_scalar) ; [4:10 - 47]

Isn't this the module that governs how the tokens get parsed?

@jimeh
Author

jimeh commented Nov 16, 2023

@lcrownover Apologies, I probably should have elaborated a bit more. The major difference is that in Python, string interpolation is part of the language and its syntax.

Ansible and Helm templating is not part of YAML's syntax. Hence I can't do anything with the highlight queries in this PR to syntax highlight Ansible and Helm things.

That said, the tree-sitter YAML parser could be modified to support parsing those things, but that's outside my area of expertise. And I would not bet on that happening, as they are not part of the YAML specification.

What's more likely is that someone forks the YAML parser here to try and make Ansible and Helm specific tree-sitter parsers.

I think Ansible would be a relatively simple fork, as it should only be a matter of dealing with string interpolation within quoted strings.

Helm however would be much more complex, because unlike Ansible, Helm templates are not valid YAML syntax. It's more like trying to parse Ansible Jinja2 template files that happen to output YAML. And it would need to fully support Go's text template engine that Helm uses as well.

Hopefully that's helped clarify things a bit :)

@lcrownover

@jimeh Ahh, I see. It's been so long since I made the original comment that I forgot that this is a PR for the module, not the module itself. My mistake!

I do agree with you that it'd be quite a pain to write a parser to handle all the edge cases that have been layered over the YAML standard. Having an Ansible tree-sitter module would maybe work, though due to the overloading of YAML syntax, it's already hard enough to differentiate between a standard YAML file, an Ansible file (Jinja2), or a Helm template, as they all share the same extension.

@jimeh
Author

jimeh commented Nov 16, 2023

Yeah, I don't blame you. To make things slightly more convoluted, Ansible technically has two common uses of YAML:

  • Regular playbook files, which are valid YAML syntax, with the addition of mustache-style string interpolation within double-quoted strings.
  • Full on template files which are rendered with Jinja2. These would live within a templates directory, and typically have a name that ends in .j2, though that's not required. And they don't have to produce YAML, they could produce anything.

</rant>
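As a rough illustration of why telling these files apart is awkward, here is a hedged Python sketch of a path-and-content heuristic built only from the hints mentioned above: a `.j2` suffix, a `templates` directory, and `{{` appearing in the text. The function name `guess_yaml_flavor` and its return labels are hypothetical, and none of these checks is reliable, since the `.j2` suffix is not required and plain YAML may contain `{{` for unrelated reasons.

```python
from pathlib import PurePosixPath


def guess_yaml_flavor(path: str, text: str) -> str:
    """Hypothetical heuristic; the labels returned are illustrative only."""
    p = PurePosixPath(path)

    # Jinja2 template files typically end in .j2, though that is not required.
    if p.suffix == ".j2":
        return "jinja2-template"

    # Files under a directory named "templates" are likely rendered templates
    # and may not be valid YAML before rendering.
    if "templates" in p.parts[:-1]:
        return "template"

    # Otherwise, mustache-style delimiters suggest YAML with interpolated strings.
    if "{{" in text:
        return "yaml-with-interpolation"

    return "plain-yaml"


if __name__ == "__main__":
    print(guess_yaml_flavor("roles/loki/templates/config.yml.j2", ""))
    print(guess_yaml_flavor("playbook.yml", 'path: "/opt/loki/{{ version }}/x"'))
```

An editor could use a guess like this to pick a parser, but any such rule will misclassify some files, which is the difficulty raised in this thread.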

3 participants