New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forbid comments with more than two dashes #87

Merged
merged 8 commits into from Jul 1, 2018

Conversation

Projects
None yet
3 participants
@TangSteven
Contributor

TangSteven commented Jun 27, 2018

Description of the Change

Any comment ending with more than 2 dashes is now considered as invalid.illegal.bad-comments-or-CDATA.xml

Alternate Designs

Benefits

Grammar no longer accepts ---> comment ending that is forbidden by XML grammar.

Possible Drawbacks

Applicable Issues

#84

@TangSteven TangSteven changed the title from :bug: Forbid comments from ending in triple dash to :bug: Forbid comments with more than two dashes Jun 27, 2018

@50Wliu

Thanks for the PR, @TangSteven! Just one question :).

Additionally, would you mind adding a new spec for this to ensure that this behavior doesn't regress? Thanks again 🙇

@@ -403,3 +403,9 @@
'name': 'punctuation.definition.comment.xml'
'end': '--%?>'
'name': 'comment.block.xml'
'patterns': [
{
'match': '-{3,}>'

This comment has been minimized.

@50Wliu

50Wliu Jun 28, 2018

Member

Is there a reason this is {3,} and not {3}?

This comment has been minimized.

@TangSteven

TangSteven Jun 28, 2018

Contributor

Well, if 3 dashes is invalid grammar, anything longer than 2 should also be invalid.

So if the end arrow was four long (---->)
{3} would only highlight - --->
{3,} highlights the entire arrow which makes more sense, ---->

This comment has been minimized.

@50Wliu

50Wliu Jun 28, 2018

Member

Hmm, I feel like we should only be highlighting the last three.
@Ingramz any opinions on this?

@TangSteven

This comment has been minimized.

Contributor

TangSteven commented Jun 28, 2018

I'm not quite sure how to add a new spec, would you mind pointing me in the right direction?

@50Wliu

This comment has been minimized.

Member

50Wliu commented Jun 28, 2018

I'm not quite sure how to add a new spec, would you mind pointing me in the right direction?

Yeah, of course! Specs are added to https://github.com/atom/language-xml/blob/master/spec/xml-spec.coffee. It looks like there aren't a lot of specs yet for this language, but you can take a look at language-html's for inspiration.

TangSteven added some commits Jun 28, 2018

@Ingramz

This comment has been minimized.

Ingramz commented Jun 28, 2018

The specification mentions that the grammar for comments is following:

Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

Simplified:

Comment ::= '<!--' (CharNoDash | ('-' CharNoDash))* '-->'

Which can be interpreted as -- is illegal if it is not followed by >.

This pull request in its current state only covers the example provided in specification, but not the general case, which leaves cases such as the following valid:

<!-- Hello -- World -->

Do note that the following is still valid:

<!--- Hello World -->

I think it would be easier to just make a rule within the body of comment that just searched for --(?!>) and highlighted that part as the source of error. This way it doesn't really matter if it is 3, 4 or 10 hyphens and whether they are part of the comment body or the ending tag. Should be cleaner future maintenance wise too.

Edit: Also the comments for JSP (<%-- --%>) and XML (<!-- -->) should be broken apart as I think the double hyphen is only disallowed in XML comments and not in JSP.

@50Wliu 50Wliu changed the title from :bug: Forbid comments with more than two dashes to Forbid comments with more than two dashes Jun 28, 2018

TangSteven added some commits Jun 28, 2018

@TangSteven

This comment has been minimized.

Contributor

TangSteven commented Jun 28, 2018

I added a negative lookbehind (?<!<[!%]), as without it incorrectly matches the <!-- of the next comment. Example:

capture

With the negative lookbehind:
capture2

How would I break apart the JSP and XML comments?

@Ingramz

This comment has been minimized.

Ingramz commented Jun 29, 2018

Just create two separate rules like this:

  'comments':
    'patterns': [
      {
        'begin': '<%--'
        'captures':
          '0':
            'name': 'punctuation.definition.comment.xml'
        'end': '--%>'
        'name': 'comment.block.xml'
      }
      {
        'begin': '<!--'
        'captures':
          '0':
            'name': 'punctuation.definition.comment.xml'
        'end': '-->'
        'name': 'comment.block.xml'
      }
    ]

Then work only on the second one.

The lookbehind should be unnecessary. From the point where the error is first registered, the rest of the document will be invalid. To make it a little clearer where the error comes from, we can highlight the first occurrence only by making use of begin/end rule:

        'patterns': [
          {
            'begin': '--(?!>)'
            'beginCaptures':
              '0':
                'name': 'invalid.illegal.bad-comments-or-CDATA.xml'
          }
        ]

TangSteven added some commits Jun 29, 2018

Split JSP and XML comments
Also, only highlight first occurrence of `--`
@TangSteven

This comment has been minimized.

Contributor

TangSteven commented Jun 29, 2018

Updated!

@50Wliu 50Wliu merged commit 7b54428 into atom:master Jul 1, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment