Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix highlighting when hljs=true #171

Merged
merged 1 commit into from Feb 17, 2020
Merged

Fix highlighting when hljs=true #171

merged 1 commit into from Feb 17, 2020

Conversation

@ppwwyyxx
Copy link
Contributor

ppwwyyxx commented Dec 29, 2019

I found that the highlighting is incorrect when options.hljs=true.

It should use the tryHighlight function instead, which handles multi-line codeblock.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Dec 29, 2019

Coverage Status

Coverage decreased (-0.005%) to 96.769% when pulling 097974d on ppwwyyxx:patch-1 into c876271 on hexojs:master.

@seaoak

This comment has been minimized.

Copy link
Member

seaoak commented Dec 30, 2019

I found that the highlighting is incorrect when options.hljs=true.

Could you show an example such case ?

Certainly, the existing code seems something strange.
But I think it works enough (already used by many users).
Of cource, it will works well for a multi-line codeblock.

I don't know the reason why this if() statement was introduced.
So I cannot decide to remove this if() statement.

Can you explain ?


BTW,

The API document of highlightjs package is:
https://highlightjs.readthedocs.io/en/latest/api.html#highlight-languagename-code-ignore-illegals-continuation

According to above API document, highlight() throws an exception in case of detecting illegal syntax for the language.
But existing code does not consider it.

I think try {...} catch() {...} statement will be necessary here
(in a similar way to tryHighlight()).

Please refer to issue #164.

@ppwwyyxx

This comment has been minimized.

Copy link
Contributor Author

ppwwyyxx commented Dec 30, 2019

Thanks for the response. I thought it cannot highlight multi-line block, but it turns out that, it simply cannot highlight certain multi-line block (bug report at highlightjs/highlight.js#2339):

> str =  'class B {\n' +
...     '  public:\n' +
...     '    explicit B(const A& a) : a_{a} {}\n' +   
...     '    const A& a_;\n' +                        
...     '};\n' +                                                                                                   
...     '\n' +                                           
...     'int main() {\n' +                               
...     '  A a; B b{a};\n' +
...     '  printf("HERE\\n");\n' +                        
...     '}'             
'class B {\n' +                                          
  '  public:\n' +                                        
  '    explicit B(const A& a) : a_{a} {}\n' +         
  '    const A& a_;\n' +
  '};\n' +                                                
  '\n' +                                                  
  'int main() {\n' +                                      
  '  A a; B b{a};\n' +                                    
  '  printf("HERE\\n");\n' +                              
  '}'                                                     
> hljs.highlight("cpp", str)                              
{                                                         
  illegal: true,                                          
  relevance: 0,                                           
  value: 'class B {\n' +                                  
    '  public:\n' +                                       
    '    explicit B(const A& a) : a_{a} {}\n' +   
    '    const A& a_;\n' +                        
    '};\n' +                                              
    '\n' +                                                
    'int main() {\n' +                                    
    '  A a; B b{a};\n' +                                  
    '  printf("HERE\\n");\n' +                            
    '}'                                                   
}                                 

After commenting out the if statement, it switches to per-line highlighting and works.

@ppwwyyxx

This comment has been minimized.

Copy link
Contributor Author

ppwwyyxx commented Dec 30, 2019

Given that highlight.js is not very robust (it failed for my above example), I think a good strategy is to:

  1. try highlight the whole block
  2. if failed, highlight each line one by one
  3. if failed, return original text

What do you think?

Still, I think the if-statement should be removed because there is no reason to use different logic to highlight code.

@ppwwyyxx

This comment has been minimized.

Copy link
Contributor Author

ppwwyyxx commented Dec 30, 2019

Or, an alternative to make highlighting more robust is to use highlight(..., ignore_illegals=true). Because according to https://highlightjs.readthedocs.io/en/latest/api.html#library-api, line-by-line highlighting (as is done now in tryHighlight) is not robust:

Note: continuation is NOT intended to support line-by-line highlighting because there is no requirement that a grammar handle linebreaks in any special way. It’s quite possible for a grammar to have a single mode/regex that matches MANY lines at once. This is not discouraged and entirely up to the grammar.

@curbengh

This comment has been minimized.

Copy link
Contributor

curbengh commented Jan 14, 2020

I don't know the reason why this if() statement was introduced.
So I cannot decide to remove this if() statement.

It's introduced along with hljs option #30. I don't know why either.

I tested this PR with/without line numbering, looks the same.

@ppwwyyxx

This comment has been minimized.

Copy link
Contributor Author

ppwwyyxx commented Jan 30, 2020

Any updates?
I think the if-statement is introduced for no reason in the first place. As a result it does lead to different behaviors depending on options.hljs.

@seaoak

This comment has been minimized.

Copy link
Member

seaoak commented Jan 30, 2020

I'm sorry, I'm too late.

I think @ppwwyyxx 's proposal will be better than current implementation.

That is,

  1. try highlight the whole block
  2. if failed, try highlight(..., ignore_illegals=true) for the whole block
  3. if failed, highlight each line one by one
  4. if failed, return original text (with the HTML class name plain)

Maybe the 2nd choice never fail.
But the document of the API does not say it clearly.
So the 3rd and the 4th choices will remain as current implementation.

In any case, the language name should be set as an HTML class name.
(see Issue #164)

@SukkaW

This comment has been minimized.

Copy link
Member

SukkaW commented Jan 30, 2020

@seaoak @ppwwyyxx LGTM then.

Copy link
Contributor

curbengh left a comment

LGTM

@SukkaW SukkaW merged commit 7e5633a into hexojs:master Feb 17, 2020
2 of 3 checks passed
2 of 3 checks passed
coverage/coveralls Coverage decreased (-0.005%) to 96.769%
Details
Travis CI - Pull Request Build Passed
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.