Allow { to appear in a concat statement string #6

Open
alexlafroscia opened this issue Mar 2, 2021 · 0 comments
Labels
bug Something isn't working

Collaborator

alexlafroscia commented Mar 2, 2021

As part of implementing concat_statement parsing, I had to take a shortcut to disambiguate a plain string used as an attribute value:

<div class="foo"></div>

from a "concat statement":

<div class="bar {{foo}}"></div>

This shortcut was to tell the parser that, when used as an attribute value, a string cannot contain a {. This allows the grammar, when encountering a {, to know that we're in fact parsing a concat_statement instead.

There are a few things we could do to resolve this:

  1. Remove string_literal from the possible attribute values

    This would resolve the issue by removing the ambiguity while parsing, by removing one of the possibilities. It should be safe from a syntax-highlighting perspective, because we can highlight the concat_statement as a string anyway. However, the grammar would be "less correct", which may or may not really matter to anyone.

  2. Re-write the _mustache_safe_* string literal matches to allow { but not {{

    Technically the invalid sequence is {{, not just {. However, I struggled to write a regular expression that matches a string only if it does not contain {{ as a consecutive pair, since the "contents of a string" regex is defined by excluding invalid characters rather than listing valid ones. A negated character class takes a list of individual characters; it cannot exclude a two-character sequence such as {{ while still allowing a lone {. Some other kind of regular expression would be needed.

  3. Move scanning for _mustache_safe_* string literals to C

    We would have more control over how the file is parsed in C, where we can use the lexer to go character by character and identify the string as a concat_statement when we find {{ in sequence.
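To make the `{` versus `{{` rule behind the regex and C-scanner ideas concrete, here is a minimal standalone sketch. It is not the grammar's actual code: the function names and the pattern are assumptions for illustration, it only covers double-quoted values, and a real external scanner would read from Tree-sitter's lexer rather than a C string. The pattern uses only character classes, alternation, and repetition (no lookahead), but whether Tree-sitter accepts an equivalent pattern in this grammar is untested.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pattern (POSIX ERE) for the body of a double-quoted string:
 * each `{` must be followed by a character that is neither `{` nor `"`,
 * except for a single optional `{` at the very end of the body. */
static const char *MUSTACHE_SAFE_DQ = "^([^\"{]|\\{[^\"{])*\\{?$";

/* Hypothetical helper: true if `body` is a plain string under the pattern. */
bool regex_is_plain_string(const char *body) {
    regex_t re;
    if (regcomp(&re, MUSTACHE_SAFE_DQ, REG_EXTENDED | REG_NOSUB) != 0) {
        return false; /* treat a compile failure as "not a plain string" */
    }
    bool ok = regexec(&re, body, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}

/* Hypothetical helper: the same rule written as a character-by-character
 * scan, which is roughly the shape a hand-written scanner would take. */
bool scan_is_plain_string(const char *body) {
    for (size_t i = 0; body[i] != '\0'; i++) {
        if (body[i] == '"') {
            return false; /* a quote would have ended the string earlier */
        }
        if (body[i] == '{' && body[i + 1] == '{') {
            return false; /* `{{` opens a mustache: this is a concat statement */
        }
    }
    return true; /* lone `{` characters are fine */
}
```

Both helpers accept `a { b` as a plain string and reject `bar {{foo}}`, which would be left for the concat_statement rule to parse.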


Overall, this would be nice to fix, but in my opinion it is low-priority. It only confuses the parser when an attribute value is a string containing a single {, and in that case the value is still highlighted as a concat_statement, so things will look correct. I'm not even sure that this bug is impactful enough to be worth fixing at all!

alexlafroscia added a commit that referenced this issue Mar 2, 2021
Not supporting this was reported in #2

Note that we use a short-cut here to enable mustache statements inside
of a string that disallows a `{` to appear inside a "concat statement".

Technically this should be allowed, though I would say it's quite rare.

Issue #6 was filed to keep track of eventually fixing that, though it
will likely require moving the scanner into C to pull that off.

The problem is that `{` alone is safe, but `{{` together is not, and I'm
not sure you can write a regex that TreeSitter supports that can capture
that correctly.

alexlafroscia added the bug label Mar 3, 2021