codehilite extension double escapes HTML #725

kkinder · 2018-10-07T16:12:41Z

Here's an example:

import markdown

example_content = """# Test 
  
    >>> print('hi') 
    hi 

The above is valid MarkDown."""

output = markdown.markdown(example_content, extensions=['markdown.extensions.codehilite'])
print(output)

As you can see, my example file includes a Markdown example with > characters. Unfortunately, with Codehilite enabled, it produces doubly-escaped content. Here's the output of that script:

<h1>Test</h1>
<div class="codehilite"><pre><span></span>&amp;gt;&amp;gt;&amp;gt; print(&#39;hi&#39;) 
hi
</pre></div>


<p>The above is valid MarkDown.</p>

See those &amp;gt; parts? That means your HTML output includes the > characters escaped twice. It should be just &gt;&gt;&gt;, not &amp;gt;&amp;gt;&amp;gt;.
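The symptom can be reproduced with the standard library alone. The following is a minimal sketch that assumes nothing about Python-Markdown's internals; it only uses `html.escape`, and the variable names are illustrative.

```python
import html

source = ">>> print('hi')"

# Escaping once yields the entities a browser needs to render the literal text.
once = html.escape(source, quote=False)

# Escaping already-escaped text rewrites each '&' as '&amp;',
# which is the pattern seen in the codehilite output above.
twice = html.escape(once, quote=False)

print(once)   # &gt;&gt;&gt; print('hi')
print(twice)  # &amp;gt;&amp;gt;&amp;gt; print('hi')
```

A browser renders the first string as `>>> print('hi')` and the second as the visible text `&gt;&gt;&gt; print('hi')`.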

facelessuser · 2018-10-07T16:39:17Z

I'll take a look at this today when I get some time.

kkinder · 2018-10-07T16:41:49Z

Thanks. I created a pull request that fixes it in a quick and dirty way. Since the extensions seem to get markup after it's been escaped, this seemed like the only obvious way to resolve the problem short of refactoring how Python-Markdown processes extensions.

facelessuser · 2018-10-07T16:49:02Z

I'll have to take a look, because this may unescape intentionally escaped, literal syntax.

facelessuser · 2018-10-07T16:55:04Z

I don't think this used to be the case. I imagine this was introduced in a recent refactoring or pull request. I'll get to the bottom of this though.

facelessuser · 2018-10-07T17:11:16Z

Yes, I think I introduced this when adjusting escapes in the serializer. We now escape the code when it is initially found by the code block processors. There should be no risk in unescaping literals. The proper approach is probably to unescape the content before processing it. I'll confirm in a bit.
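The "unescape before processing" idea can be sketched as follows. This is not the actual patch: `fake_highlight` is a hypothetical stand-in for a highlighter that escapes its own output, and `html.unescape` is used here for brevity, although it reverses more entities than a real fix would likely need to.

```python
import html


def fake_highlight(code: str) -> str:
    """Hypothetical stand-in for a highlighter that escapes its input itself."""
    return "<pre>" + html.escape(code, quote=False) + "</pre>"


# Text as a code block processor might store it: already escaped once.
stored = html.escape(">>> print('hi')", quote=False)

# Passing the stored text straight through escapes it a second time.
double_escaped = fake_highlight(stored)

# Undoing the first escape before highlighting leaves exactly one level.
fixed = fake_highlight(html.unescape(stored))

print(double_escaped)  # <pre>&amp;gt;&amp;gt;&amp;gt; print('hi')</pre>
print(fixed)           # <pre>&gt;&gt;&gt; print('hi')</pre>
```

Because the stored text was escaped exactly once, unescaping it exactly once restores the original source, so literals that the author wrote as entities are not lost.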

kkinder mentioned this issue Oct 7, 2018

Fix issue with < and > characters (among others) being double-escaped. #726

Closed

facelessuser mentioned this issue Oct 7, 2018

Fix double escaping of block code #727

Merged

waylan closed this as completed in 2b064ff Oct 7, 2018

facelessuser mentioned this issue Oct 16, 2018

foot note adjustments #728

Merged

brianhlin added a commit to opensciencegrid/doc-ci-scripts that referenced this issue Oct 19, 2018

Lock Markdown version due to codehilite bug (SOFTWARE-3442)

a1015b9

We can bump the Markdown version when Python-Markdown/markdown#725 is released

waylan mentioned this issue Oct 30, 2018

Special characters are double-escaped if the CodeHilite extension is enabled #746

Closed

mitya57 mentioned this issue Mar 7, 2019

the > is unexpected escaped twice. #802

Closed
