Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(bash) Language does not highlight heredocs #2622

Closed
bric3 opened this issue Jun 30, 2020 · 31 comments · Fixed by #2684
Closed

(bash) Language does not highlight heredocs #2622

bric3 opened this issue Jun 30, 2020 · 31 comments · Fixed by #2684
Labels
bug help welcome Could use help from community language

Comments

@bric3
Copy link

bric3 commented Jun 30, 2020

Describe the issue

Is is expected that the bash language do no highlight some specific characters. in the screenshot below
I would have expected the following would be highlighted

  • | a pipe
  • \ the continue command marker at the end of the line
  • > < Stream redirections
  • << and <<<, here doc, and here string
  • $(...) subshell variable (maybe not what's inside but at least the whole variable or just the parenthesis)

shell
image

bash
image

Which language seems to have the issue?

I believe it's the bash language which is also used by the shell language.

Are you using highlight or highlightAuto?

I'm not sure of the question, the script uses : hljs.initHighlightingOnLoad();

...

Sample Code to Reproduce

Reproducer : https://jsfiddle.net/kcyd487f/

Expected behavior

Highlight the menetionned element above. Maybe other characters elements may be interesting.

For heredoc I'm not sure everything could be highlighted, especially the multiline text, as the EOF marker can be anything.

Additional context

I'm using asciidoctor, but the issue can be reproduced in the JSfiddle too.

@bric3 bric3 added bug help welcome Could use help from community language labels Jun 30, 2020
@joshgoebel
Copy link
Member

joshgoebel commented Jul 1, 2020

Bash is a very simple grammar and could probably use a champion if one wanted to step up. We definitely could improve support for here docs and here strings (if we can detect them).

  • Pretty sure we already handle sub-shell variables. (ok, no, we only handle $() variables inside a string)
  • Shell doesn't know anything about \. That's possibly something we could fix that would improve shell substantially.

Operators is something we traditionally do not highlight (on purpose), but there is a separate discussion regarding that:

#2500

@joshgoebel joshgoebel changed the title Bash not highlighting special characters (bash) Language does not highlight heredocs, subshell variables, etc. Jul 1, 2020
@bric3
Copy link
Author

bric3 commented Jul 2, 2020

@yyyc514 Thanks for the pointers.

@reelbeelveel
Copy link

Also noticing numbers/integers aren't highlighted, as in:

x=3
phoneNum=5557778888

etc.

@joshgoebel
Copy link
Member

#2684

@joshgoebel
Copy link
Member

It's not immediately clear to me how or even if subshells should be highlighted... since it's currently not treated specially it'll currently be highlighted exactly the same as the surrounding shell itself, which is what you'd expect (for a subshell)... if the debate is just coloring the $() or not then I think we might already say "no" for the same reason we don't usually highlight operators.

@joshgoebel
Copy link
Member

joshgoebel commented Sep 11, 2020

But if you actually wanted to get involved yourself then some of this might indeed fall under: (As mentioned before)

#2500

@joshgoebel
Copy link
Member

joshgoebel commented Sep 11, 2020

To review:

@joshgoebel
Copy link
Member

joshgoebel commented Sep 11, 2020

I might also back out number support until we have a Bash person willing to work on this and think thru all the implications... it seems there are lots of edge cases where perhaps numbers should not be highlighted, and Bash provides no clues for when something is a Number vs a string, does it?

@egor-rogov
Copy link
Collaborator

or are we just wanting the <<< part to be highlighted as part of the string?

I think no.

@joshgoebel
Copy link
Member

Well that's how it's done for here-doc though... so if we don't do it for <<< is that being inconsistent?

@egor-rogov
Copy link
Collaborator

Hm, right, I missed it. I think it's better to not color <<-? (just like we do for redirection and other "operators").

@joshgoebel
Copy link
Member

joshgoebel commented Sep 11, 2020

I think it's better to not color <<-? (just like we do for redirection and other "operators").

But yet we ALWAYS color ", which is what that really is here... <<-BLAH and BLAH are opening and closing quotes, part of the string. :-)

Really <<< is for "words"... (which can be quoted but don't have to be) so <<<heyimalongword would be a string... and coloring it as a string WITHOUT the <<< is just confusing because then it looks like <<< is an operator, which it's not - it's "quoting" for the string.

I think I just persuaded myself that these things are "part of the string".

@joshgoebel
Copy link
Member

Of course if we ever had more nuance we might highlight:

<string>
<quote><<-BLAH</quote>
Content here!
<quote>BLAH</quote>
</sting>

But that's a talk for another day.

@egor-rogov
Copy link
Collaborator

I think we shouldn't do anything about highlighting <<< specially.
E. g. in <<<word word can be a quoted string. We should highlight <<< '$foo' and <<< "$foo" differently. Now it just works, because we actually don't pay attention to <<<. Making <<< part of a string complicates things (imho unnecessary).

And if so, we also shouldn't color here-doc's << for the sake of consistency (:

@joshgoebel
Copy link
Member

I don't really see any justification/logic for your reasoning though, just a preference. (well other than "it complicates") These "bookends" are quite literally string delimiters, which we have always highlighted AFAIK - and also currently highlight in other languages that support here-docs: Ruby, PHP, C++...

We should highlight <<< '$foo' and <<< "$foo" differently.

Not sure I follow this?

Now it just works

Or it's badly broken... Just highlighting word in <<<word seems quite strange and different than we handle strings in any other circumstance.

Making <<< part of a string complicates things (imho unnecessary).

It's just another variant... some languages have 20 string variants, doesn't seem that much more complex.

@joshgoebel
Copy link
Member

Long-term it seems perhaps we need a string.delimiter class and then we could split strings into their component pieces.

@egor-rogov
Copy link
Collaborator

We should highlight <<< '$foo' and <<< "$foo" differently.

Not sure I follow this?

Those are differently quoted strings:

$ export FOO=pwd
$ cat <<< '$FOO'
$FOO
$ cat <<< "$FOO"
pwd
$ cat <<< $FOO
pwd
$ cat <<< `$FOO`
/home/egor

So it's not just another variant, it's rather variants of BRACED_VAR, QUOTE_STRING, APOS_STRING and maybe some other modes.

@joshgoebel
Copy link
Member

joshgoebel commented Sep 11, 2020

Are you suggesting the string highlighting itself should change depending on whether it's a static string or allows interpolation? i don't think we do that in any other language.

I see now where you might consider <<< more of an operator... though I'm still pretty sure (based on legacy at least) that the << here-doc variant is "part" of the string.

If not, be clear... are you saying Bash is somehow different than other languages or that ALL the other languages with heredocs are currently highlighting them wrong?
Screen Shot 2020-09-11 at 1 53 32 PM

I know editors are often more particular, but we've never been AFAIK.

Screen Shot 2020-09-11 at 1 55 09 PM

@egor-rogov
Copy link
Collaborator

Are you suggesting the string highlighting itself should change depending on whether it's a static string or allows interpolation? i don't think we do that in any other language.

But we do highlight strings differently in bash right now:

"$FOO"
'$FOO'
<span class="hljs-string">&quot;<span class="hljs-variable">$FOO</span>&quot;</span>
<span class="hljs-string">&#x27;$FOO&#x27;</span>

I believe we must retain this behavior for here-strings, no matter how we highlight <<<.
Heredoc is different, it just doesn't allow interpolation.

Well, I'm ok with highlighting both << and <<< as part of the string. It's just require some more work. And I certainly agree with the idea of differentiating components of string in long-term.

@joshgoebel
Copy link
Member

But we do highlight strings differently in bash right now:

<span class="hljs-string">&quot;<span class="hljs-variable">$FOO</span>&quot;</span>
<span class="hljs-string">&#x27;$FOO&#x27;</span>

Oh yeah, of course breaking that would be no good, agreed. :-)

Heredoc is different, it just doesn't allow interpolation.

Good to know, and convenient.

Well, I'm ok with highlighting both << and <<< as part of the string.

But now you've persuaded me that <<< might be different LOL.

@egor-rogov
Copy link
Collaborator

LOL. But doesn't it seem natural to highlight << and <<< equally?

@joshgoebel
Copy link
Member

LOL. But doesn't it seem natural to highlight << and <<< equally?

That's what we're discussing. I'm seeing them an entirely different now. I'm seeing <<-BLAH BLAH as heredoc (string delimiters) but now I see <<< as a different type of piping operator... ie "pipe this string into stdin"... making it an operator... I'm not sure there is a 100% correct answer here.

Making <<< work the way I originally wanted it to is more effort though, so I'm in favor of the simpler way. :-)

Someone who uses and reads Bash code all day might have their own thoughts. VS Code colors them both as "operators".

@egor-rogov
Copy link
Collaborator

But heredoc is also a sort of piping. Bash manual describes << as no less than "redirection operator".
So imho the simplest (and consistent) approach is to exclude <<-? from highlighting, as I suggested earlier.

(And in most detailed way we'd highlight <<- as operator, BLAH as quote/delimiter, and everything in between as string itself.)

@joshgoebel
Copy link
Member

joshgoebel commented Sep 11, 2020

You're just saying what I just said: 😎

Josh: Someone who uses and reads Bash code all day might have their own thoughts.

Maybe in Bash it's piping, certainly not how I've ever thought about it in Ruby in 10+ years. :)

(And in most detailed way we'd highlight <<- as operator, BLAH as quote/delimiter, and everything in between as string itself.)

No disagreement. Same as VS Code does it.

So imho the simplest (and consistent) approach is to exclude <<-? from highlighting, as I suggested earlier.

Consistent with what? Are you indeed saying "Yes, these should be right now highlighted differently in Bash than they are in Ruby?" I'm not sure you answered that.

@egor-rogov
Copy link
Collaborator

Ah, Ruby. I meant consistency of << and <<< in bash. Either both are highlighted, or none is highlighted. But I really don't know what Ruby thinks of heredocs.
Quick googling shows that there are different opinions on that matter. Here are a couple of random links: https://www.rubyguides.com/2018/11/ruby-heredoc/, https://infinum.com/the-capsized-eight/multiline-strings-ruby-2-3-0-the-squiggly-heredoc
So shouldn't we reconsider heredoc highlighting in Ruby? (:

Anyway I think that internal consistency of one language is bit more important than consistency between two quite different languages. So answering

Are you indeed saying "Yes, these should be right now highlighted differently in Bash than they are in Ruby?"

I'd say yes I am.

@joshgoebel
Copy link
Member

Screen Shot 2020-09-28 at 7 25 18 PM

@egor-rogov So this?

@egor-rogov
Copy link
Collaborator

Hmm... << is okay now (:
But EOF is a sort of quote here, so shouldn't we highlight it as part of the string (as we do for ' and ")?

@joshgoebel
Copy link
Member

@bric3 Do you have any thoughts here?

@bric3
Copy link
Author

bric3 commented Oct 2, 2020

Hi, sorry for the delay.

I tend to think a bit, as in I like operator highlithing, as it it's easier to grasp how a code snippet is articulated.

  • For heredocs, I think highlighting the redirection and the delimiters make sense; highlighting the content as string is important but not essential to me. If herdoc delimeters are not possible at least in a first iteration, then highligting the cointent is probaly enough.
  • For herestrings, it's a tad more tricky, while the highlighting the redirection would be appreciated, differentiating the arguments would have it's benefits (even beyond the scope of herestring)

That leads me to says here(doc|string) operators and delimiters should be hightlighted the same way. Since it's yet possible to distinguish string delimiters I think it's ok to highlight them the same as the string and semantically they go together anyway.

I have a personal b

jshell -s - <<EOF
System.out.printf("Procs: %s%n", Runtime.getRuntime().availableProcessors())
EOF
jshell -s - <<-EOF
System.out.printf("Procs: %s%n", Runtime.getRuntime().availableProcessors())
EOF
jshell -s - <<< 'System.out.printf("Procs: %s%n", Runtime.getRuntime().availableProcessors())'

And maybe later herestring highlighting might be refined as needed.

$ (export FOO=pwd; cat <<< 'this is $FOO')
$FOO
$ ( export FOO=pwd; cat <<< "this is $FOO" )
pwd
$ {export FOO=pwd; cat <<< $FOO}
pwd
$ {export FOO=pwd; cat <<< ${FOO}}
pwd
$ { export FOO=pwd; cat <<< `$FOO` }
/home/egor

I know it's easy for me to express an "easy" opinion as I am not a highlightjs developer nor a JS developer. Sorry for that.

@joshgoebel
Copy link
Member

joshgoebel commented Oct 3, 2020

For heredocs, I think highlighting the redirection and the delimiters make sense; highlighting the content as string is important but not essential to me.

I think for now this issue is going to focus on just highlighting heredocs at all [as strings]... highlighting redirection and delimiters is a discussion for another day. In the past we have not highlighted such things and I'm not leaving an issue open per language for "higher fidelity" highlighting. We already have a global issue. If someone comes along that wants to actually do the work themselves then I'm happy to review a PR.

The question is between:

jshell -s - <<-EOF
<string>System.out.printf("Procs: %s%n", Runtime.getRuntime().availableProcessors())</string>
EOF

and

jshell -s - <<-<string>EOF
System.out.printf("Procs: %s%n", Runtime.getRuntime().availableProcessors())
EOF</string>

Should EOF be included as part of the string (it is in Ruby and other languages it's included in the same sense that we include " as a delimiter)... I think the answer is no though for Bash... resulting only line two being highlighted as a string, and nothing else.

@joshgoebel joshgoebel changed the title (bash) Language does not highlight heredocs, subshell variables, etc. (bash) Language does not highlight heredocs Oct 4, 2020
@bric3
Copy link
Author

bric3 commented Oct 4, 2020

Sorry I didn't express well enough my opinion. Heredoc delimiters are delimiters add should be part of the content in my opinion (just like other multiline strings in other languages*). But if for some reason this is tricky then only highlighting the content is ok.

* python, groovy, swift, java, ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help welcome Could use help from community language
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants