Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Invalid highlight html with rouge and linenos #2607 #3435

Closed
wants to merge 2 commits into from

Conversation

Projects
None yet
6 participants
@EdMcBane
Copy link

commented Feb 9, 2015

Fixes broken HTML being generated when using rouge highlighter with linenos feature enabled.

Caused by a naive regex replace done on rouge output to add <code></code> tags within <pre></pre> tags, that assumes there will be just one such <pre> block. Rouge actually generates two when linenos is enabled, one for the line numbers and one for the actual code block.

Regex replaces the first <pre> tag with <pre><code> and the first </pre> tag with </code></pre>. Should be the last.

Fixes issue #2067

Todo

  • The fix is more of a workaround, since regex-replacing tags within third-party library output should definitely be avoided, but a major change would be required and this needs input from project mantainers
  • The test for highlighting with rouge has been created as a separate file, may be merged into TestTags
  • The way the test themselves are written could be improved, in this case I kept it similar to what was already there.

Francesco Degrassi added some commits Feb 8, 2015

Francesco Degrassi
fix: add_code_tag breaks when multiple <pre> blocks are present in fo…
…rmatter output (e.g. when using rouge with linenos enabled)
@kleinfreund

This comment has been minimized.

Copy link
Contributor

commented Feb 9, 2015

The fix is more of a workaround, since regex-replacing tags within third-party library output should definitely be avoided

If this is an issue with how the Rouge highlighter is generating HTML, why is this not brought up over at jneen/rouge then?

@EdMcBane

This comment has been minimized.

Copy link
Author

commented Feb 9, 2015

It is actually an issue with how Jekyll does regex-replace on the output of rouge: rouge output is fine, it is Jekyll that is adding tags with a regex replace operation, and assuming that rouge will generate a single <pre> block.

There is not a clear interface between Jekyll and the highlighters. Pygments and rouge generate HTML, and Jekyll does regex-replace on it to inject a <code> block inside it.

IMHO the proper solution would be to "negotiate" with Pygments and rouge the structure of the output as an HTML document structure, and either:

  • wrap the output inside <code> tags, if applicable
  • or add parameters to pygments/rouge to include the <code> tags that Jekyll needs in the proper place

If this "negotiation" is not an option, it should be Jekyll responsibility to properly deal with whatever output both pygments and rouge generate, possibly avoiding fragile regex-replaces.

@parkr

This comment has been minimized.

Copy link
Member

commented Feb 10, 2015

it should be Jekyll responsibility to properly deal with whatever output both pygments and rouge generate, possibly avoiding fragile regex-replaces.

I agree here – any changes we make to the output are our bugs to handle.

As a super hacky but kind of clever way of handling this, StackOverflow suggests:

new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse

I don't think we'll be going with that, but it was too funny not to share.

def add_code_tag(code)
# Add nested <code> tags to code blocks
code = code.sub(/<pre>\n*/,'<pre><code class="language-' + @lang.to_s.gsub("+", "-") + '" data-lang="' + @lang.to_s + '">')
code = code.sub(/\n*<\/pre>/,"</code></pre>")
code = string_rsub(code, /\n*<\/pre>/,"</code></pre>")

This comment has been minimized.

Copy link
@parkr

parkr Feb 10, 2015

Member

@jneen We're trying to insert a <code> layer inside Rouge's your <pre>. This is a pretty crummy way of handling this. Would you consider adding support for another encasing tag (in our case, code with custom attributes)?

This comment has been minimized.

Copy link
@jneen

jneen Feb 10, 2015

Hi @parkr, I'm working on a major refactor of this. Likely what you'll want to do for rouge 1.x is pass :wrap => false to disable all wrapping, and provide the html context manually.

This comment has been minimized.

Copy link
@jneen

jneen Feb 10, 2015

(:wrap => false is an option to Formatters::HTML that causes it to output just the highlighted spans and nothing else.)

This comment has been minimized.

Copy link
@parkr

parkr Feb 10, 2015

Member

@jneen Wonderful to hear. I'm sorry to have bothered you – it turns out we're wrapping it manually (i.e. Rouge isn't wrapping it) in order to maintain similarity to Pygments. I'm going to play around with nowrap and pygments and see what happens.

parkr added a commit that referenced this pull request Feb 10, 2015

highlight: fix problem with linenos and rouge.
Found by @EdMcBane in #3435

The strange regexp we were doing to replace the <pre><code></pre></code>
bits in the Pygments output were wreaking havoc on Rouge output
because Rouge uses <pre>'s to wrap line numbers.

To be consistent, the output from render_* should *not* include
the wrapping <div> and <pre> tags. It should just be what was
inside. We can then wrap it in our own custom tags without using
any regular expressions, as God intended. Death to regular
expressions and HTML manipulation!

@parkr parkr closed this Feb 10, 2015

@parkr

This comment has been minimized.

Copy link
Member

commented Feb 10, 2015

Fixed in #3436, thank you for figuring this bug out!!

@kusera

This comment has been minimized.

Copy link

commented Oct 15, 2015

I'm still experiencing this issue on Jekyll 2.5.3.

Generated tags:
tags

Source:
markdown

I'm building off an offshoot of Type Theme if that makes any difference.

@jneen

This comment has been minimized.

Copy link

commented Oct 15, 2015

Are yall still using regexen on the output of rouge?

@parkr

This comment has been minimized.

Copy link
Member

commented Oct 15, 2015

We don't seem to be doing anything weird with rouge: https://github.com/jekyll/jekyll/blob/master/lib/jekyll/tags/highlight.rb

@kusera

This comment has been minimized.

Copy link

commented Oct 15, 2015

It's strange, I see that you're using <figure> tags so I can't imagine why I'm getting <div> when serving the site with 2.5.3.

@parkr

This comment has been minimized.

Copy link
Member

commented Oct 15, 2015

@asl97 asl97 referenced this pull request Feb 28, 2016

Closed

Faulty linenos #3049

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.