Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(shell) Shell tokenization doesn't treat hyphens as parts of words #2668

Closed
sirosen opened this issue Sep 1, 2020 · 2 comments · Fixed by #2669
Closed

(shell) Shell tokenization doesn't treat hyphens as parts of words #2668

sirosen opened this issue Sep 1, 2020 · 2 comments · Fixed by #2669
Labels
bug help welcome Could use help from community language

Comments

@sirosen
Copy link
Contributor

sirosen commented Sep 1, 2020

Describe the issue

I've run into a case similar to #1740 and other issues. In my case, we had --enable-foo on our site, and highlightjs thinks that enable is a keyword.

Sample Code to Reproduce

minimal full reproduction
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet"
      href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/10.1.2/styles/github.min.css">
<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/10.1.2/highlight.min.js"></script>
</head>
<body>

<pre class="highlightjs highlight"><code data-lang="sh" class="language-sh hljs">--enable-https</code></pre>

<script>hljs.initHighlightingOnLoad()</script>

</body>
</html>

Expected behavior

In this specific case, highlightjs shouldn't treat enable as a keyword. More generally, when parsing shell scripts, I believe that - should be considered a "word character".
I'm aware from reading previous issues that when this comes up there's hesitance to try to do much with the bash parsing unless someone is ready to write a real parser. But I think hyphenate commands and options are a common case. Even if there are cases this might start to fail on which worked before (I can't think of any, but maybe that's my limited imagination), they're probably much more exotic than --set-foo, --enable-bar, and, cmd-case-while-blahblahblah.

Additional context

I'm okay with the idea of highlightjs closing out this issue if it's just too much the same as its predecessors.

If backwards compatibility is a major concern, perhaps this could be an option which is set by the user somehow?
hljs.SetParserOption("shell", "hyphens-are-wordchars")? (I have very little notion of how highlightjs works on the inside and how configurable the parsers are or ought to be by design.)

@sirosen sirosen added bug help welcome Could use help from community language labels Sep 1, 2020
@joshgoebel
Copy link
Member

I'd be open to a patch with a few tests that changes the keyword $pattern to include "-" which I think would resolve this for most common cases... and we'll see if it conflicts with any existing tests or edge cases.

It looks like perhaps we already do this for "." and "_"... since none of our keywords have . or _ in them.

@sirosen Want to make PR? :)

@sirosen
Copy link
Contributor Author

sirosen commented Sep 1, 2020

I can try to take a crack at it, but I warn you that I have very little JS background! 😄

sirosen added a commit to sirosen/highlight.js that referenced this issue Sep 1, 2020
In bash, option, param, and command names may contain builtins as
hyphen-delimited components. As in `foo --set-bar`.

In order to handle this, just add `-` to the pattern's set of word
characters. Includes a new test for bash tokens containing keywords,
which checks hyphens and underscores for a few common cases.

resolves highlightjs#2668
sirosen added a commit to sirosen/highlight.js that referenced this issue Sep 1, 2020
In bash, option, param, and command names may contain builtins as
hyphen-delimited components. As in `foo --set-bar`.

In order to handle this, just add `-` to the pattern's set of word
characters. Includes a new test for bash tokens containing keywords,
which checks hyphens and underscores for a few common cases.

Against current `master`, the new test fails, which helps to confirm
the fix.

resolves highlightjs#2668
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help welcome Could use help from community language
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants