Possible XSS Vulnerability in Image and Hyperlinks (Markdown -> HTML) #1037

Preole · 2013-10-27T02:01:18Z

I have noticed that Pandoc's Markdown reader accepts the javascript: and data: URI schemes wherever a URL is expected in hyperlink and image elements. As a consequence, the HTML back-end (in both strict and non-strict mode) can produce output capable of XSS attacks.

Below is my input fed through Babelmark 2 @ http://johnmacfarlane.net/babelmark2/

[JSLink](javascript:alert("XSS");)

![JSImage](javascript:alert("XSS");)

[DataLink](data:text/html;base64,PHNjcmlwdD5hbGVydCgiSGVsbG8iKTs8L3NjcmlwdD4=)

![DataImage](data:text/html;base64,PHNjcmlwdD5hbGVydCgiSGVsbG8iKTs8L3NjcmlwdD4=)

Output:

<p><a href="javascript:alert(&quot;XSS&quot;);">JSLink</a>
</p>
<p>
    <img src="javascript:alert(&quot;XSS&quot;);" alt="JSImage" />
</p>
<p><a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgiSGVsbG8iKTs8L3NjcmlwdD4=">DataLink</a>
</p>
<p>
    <img src="data:text/html;base64,PHNjcmlwdD5hbGVydCgiSGVsbG8iKTs8L3NjcmlwdD4="
    alt="DataImage" />
</p>

When I clicked on either hyperlink carrying the JavaScript payload (the javascript: form or the data: URI form), an alert box popped up immediately, which means the embedded payload was executed. (Tested on Firefox 24.)
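To make the data: URI payload above easier to inspect, here is a minimal Python sketch that splits the URI and decodes its base64 body. The helper name `decode_data_uri` is made up for illustration and only handles the base64 form used in this report.

```python
import base64
from typing import Tuple


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data: URI into (media type, decoded payload)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    # Everything before the first comma is the header, the rest is the body.
    header, _, payload = uri[len("data:"):].partition(",")
    if not header.endswith(";base64"):
        raise ValueError("only base64 data: URIs are handled in this sketch")
    media_type = header[: -len(";base64")]
    return media_type, base64.b64decode(payload)


media_type, body = decode_data_uri(
    "data:text/html;base64,PHNjcmlwdD5hbGVydCgiSGVsbG8iKTs8L3NjcmlwdD4="
)
print(media_type)     # text/html
print(body.decode())  # <script>alert("Hello");</script>
```

The decoded body is a complete script element served as text/html, which is why following the link runs it.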

The image element appears to be safe from this kind of XSS attack, at least in modern web browsers, which do not execute javascript: URIs given as an image source.

If a malicious writer distributes an HTML file with a payload encoded using the above technique, the HTML file may be used for a phishing attack against the recipient.

I personally recommend disabling these two URI schemes altogether. At the same time, some authors would like to embed images in Markdown using data: URIs, which is a perfectly legitimate use of that scheme.
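One way to reconcile the two goals is a scheme allow-list that rejects javascript: everywhere and accepts data: only for images with an image media type. The sketch below is illustrative only: `is_allowed_url`, the scheme set, and the list of image types are assumptions made for this example, not anything pandoc provides.

```python
import re

# Hypothetical policy for this sketch; "" stands for a relative URL.
SAFE_SCHEMES = {"", "http", "https", "mailto"}
IMAGE_DATA_TYPES = {"image/png", "image/jpeg", "image/gif"}

_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")


def is_allowed_url(url: str, is_image: bool = False) -> bool:
    """Return True if the URL's scheme passes the allow-list above."""
    # Drop whitespace and control characters before looking for a scheme,
    # so inputs like " java\tscript:" are still recognised (conservative).
    cleaned = "".join(ch for ch in url if ord(ch) > 0x20)
    match = _SCHEME_RE.match(cleaned)
    scheme = match.group(1).lower() if match else ""
    if scheme in SAFE_SCHEMES:
        return True
    if scheme == "data" and is_image:
        # The media type is whatever precedes the first ';' or ','.
        media_type = re.split(r"[;,]", cleaned[len("data:"):], maxsplit=1)[0]
        return media_type.lower() in IMAGE_DATA_TYPES
    return False


print(is_allowed_url('javascript:alert("XSS");'))                             # False
print(is_allowed_url("data:text/html;base64,PHNjcmlwdD4=", is_image=True))    # False
print(is_allowed_url("data:image/png;base64,iVBORw0KGgo=", is_image=True))    # True
```

Under this policy the data: link and the text/html data: image from the report are both rejected, while an embedded PNG is still accepted.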


dashed · 2013-10-27T02:24:47Z

I believe pandoc relies on data: URIs to create self-contained HTML documents. So, rather than disallowing the URI schemes altogether, it would be better to convert every character that has an HTML character entity into that entity.

From babelmark2, it seems some markdown flavours do this by default.

So pandoc could implement options that escape characters into HTML entities, for example --escape-URLs and --escape-images.

The type of escaping I recommend is something like encodeURI() in JavaScript: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI

I think pandoc should leave all URLs alone by default unless an option like --escape-URLs is specified.
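For reference, here is a rough Python approximation of what encodeURI()-style escaping would do to the javascript: payload from the report. The `encode_uri` helper is hypothetical, and the set of characters left unescaped follows the MDN page linked above.

```python
from urllib.parse import quote

# Characters that encodeURI() leaves unescaped, besides letters and digits.
ENCODE_URI_SAFE = ";,/?:@&=+$-_.!~*'()#"


def encode_uri(url: str) -> str:
    """Approximate JavaScript's encodeURI() with urllib.parse.quote."""
    return quote(url, safe=ENCODE_URI_SAFE)


print(encode_uri('javascript:alert("XSS");'))
# javascript:alert(%22XSS%22);
```

Only the double quotes are percent-encoded here. The javascript: prefix passes through unchanged, so this kind of escaping would probably need to be combined with a scheme check if the aim is to block such links.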

jgm · 2013-10-27T17:30:37Z

Yes, I'm aware of this. At one point I had a --sanitize option in
pandoc, that stripped these things (and many others) out. But then I was
convinced that the most reliable way to sanitize untrusted markdown
input is to run the HTML output of pandoc through a sanitizer. There
are many battle-tested sanitizers out there, which you can use.
(In Haskell, there is xss-sanitize, which started out as the former
sanitization code from pandoc.)

Bottom line: You should always sanitize the output of markdown
conversions before displaying them on a website.

But this isn't a bug in pandoc.
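As a rough illustration of post-processing pandoc's HTML output, the sketch below uses Python's standard html.parser to list link and image URLs that use the javascript: or data: schemes. It is an audit aid only, written under the assumption that the output is available as a string; the class and function names are invented for this example, and actual sanitization of untrusted input should be left to an established sanitizer as described above.

```python
import re
from html.parser import HTMLParser

_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")
URL_ATTRS = {("a", "href"), ("img", "src")}
SUSPECT_SCHEMES = {"javascript", "data"}


class SuspectUrlFinder(HTMLParser):
    """Collect (tag, attribute, value) for URLs with a suspect scheme."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img ... />.
        for name, value in attrs:
            if (tag, name) not in URL_ATTRS or value is None:
                continue
            # Ignore whitespace and control characters when reading the scheme.
            cleaned = "".join(ch for ch in value if ord(ch) > 0x20)
            match = _SCHEME_RE.match(cleaned)
            if match and match.group(1).lower() in SUSPECT_SCHEMES:
                self.findings.append((tag, name, value))


def find_suspect_urls(html: str):
    parser = SuspectUrlFinder()
    parser.feed(html)
    parser.close()
    return parser.findings


sample = (
    '<p><a href="javascript:alert(&quot;XSS&quot;);">JSLink</a></p>'
    '<p><img src="data:text/html;base64,PHNjcmlwdD4=" alt="DataImage" /></p>'
    '<p><a href="http://johnmacfarlane.net/babelmark2/">Babelmark</a></p>'
)
for finding in find_suspect_urls(sample):
    print(finding)
```

Note that this flags every data: URI, including legitimate embedded images, so the findings would still need a policy decision before anything is stripped.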


Preole · 2013-10-29T04:02:08Z

> Yes, I'm aware of this. At one point I had a --sanitize option in pandoc, that stripped these things (and many others) out. But then I was convinced that the most reliable way to sanitize untrusted markdown input is to run the HTML output of pandoc through a sanitizer. There are many battle-tested sanitizers out there, which you can use. (In Haskell, there is xss-sanitize, which started out as the former sanitization code from pandoc.)

I understand, then. The verdict is to simply run an external library against the HTML output, rather than having the parser itself try to produce safe HTML. I suppose it's safe to close this since it's not a big deal.

Preole closed this as completed Oct 29, 2013

badlydrawnrob mentioned this issue Apr 26, 2019

Make json easier badlydrawnrob/anki#44

