Don't allow unsafe URIs #52

Closed
wants to merge 9 commits into
from

Conversation

Projects
None yet
6 participants

zauguin commented Jun 14, 2012

This fix filter out link and image URIs like javascript:alert(); that are used for XSS.
It uses whitelisting and only allows http(s) and ftp URIs.

Shedal commented Apr 14, 2013

Why hasn't this been pulled?

Collaborator

ashb commented Apr 15, 2013

Primarily because we forgot to look at it when this PR was first opened. A few things come to mind:

  1. Is that it is inflexible. Sane defaults are good but some people might want to allow javascript URLs (or schemes other than the hard-coded list)
  2. Secondly is that I think this is done at the wrong level and the filtering should be performed when converting the internal representation to HTML, not when converting it to JsonML
  3. No unit tests.

Do you fancy looking at these three points?

Collaborator

ashb commented Apr 15, 2013

Oh and one last thing - the checkURI function is exposed globally in a browser which is not ideal - exports should be kept to a minimum (having it 'per instance' rather than global is preferable too)

zauguin commented Apr 15, 2013

Can anyone try the last commit?
Currently I'm not using this and just wrote a quick hack to fix some problems.

@ashb ashb commented on an outdated diff Apr 17, 2013

lib/markdown.js
@@ -5,6 +5,22 @@
(function( expose ) {
+ //Just a little helper to check if the URI has a allowed schema
+ function checkURI(attribute, uri, options) {
+ options = options | {};
+ options.filteredTags = options.filteredTags | ['href', 'src'];
+ options.allowedSchemas = options.allowedSchemas | ['http', 'https', 'ftp', 'mailto'];
+ if(urlAttributes.indexOf(attribute) != -1) {
+ var schemaRegEx = /^\s*([A-Za-z][A-Za-z0-9\+\-\.]*):(.*)$/;
+ if(schemaRegEx.exec(uri)!=null) {
+ if(allowedSchemas.indexOf(RegExp.$1) == -1) {
@ashb

ashb Apr 17, 2013

Collaborator

This should be options.allowedSchemas

@ashb ashb commented on an outdated diff Apr 17, 2013

lib/markdown.js
@@ -5,6 +5,22 @@
(function( expose ) {
+ //Just a little helper to check if the URI has a allowed schema
+ function checkURI(attribute, uri, options) {
+ options = options | {};
+ options.filteredTags = options.filteredTags | ['href', 'src'];
+ options.allowedSchemas = options.allowedSchemas | ['http', 'https', 'ftp', 'mailto'];
+ if(urlAttributes.indexOf(attribute) != -1) {
@ashb

ashb Apr 17, 2013

Collaborator

urlAttributes should be attribute?

@ashb ashb commented on an outdated diff Apr 17, 2013

lib/markdown.js
@@ -5,6 +5,22 @@
(function( expose ) {
+ //Just a little helper to check if the URI has a allowed schema
+ function checkURI(attribute, uri, options) {
+ options = options | {};
+ options.filteredTags = options.filteredTags | ['href', 'src'];
@ashb

ashb Apr 17, 2013

Collaborator

Never used - either remove it or look at it. Though you aren't currently passing the tag in.

Are you filtering image urls? Do you need to?

Collaborator

ashb commented Apr 17, 2013

You've got a few bugs - did I mention that I'd like some unit tests? ;)

Let me know if you'd like some pointers on writing one - it shouldn't take much to add a few simple tests for this behaviour if you already have node.js installed.

Is this patch included in the master branch yet?

I've been using showdown to convert markdown->html, but I just realized showdown allows all sorts of XSS attacks, and I've been looking for an alternative (and found this project).

While some people might want to allow "javascript:" links, I think that most people who use markdown (as opposed to regular HTML), are actually trying to prevent the users from executing arbitrary scripts.

Owner

evilstreak commented Aug 22, 2013

I like this as a concept, but the execution isn't ready to be merged.

The lack of tests and bugs aside, the best way to handle this would be by running a filter over the whole JsonML tree before processing to the next step. We plan to make some architectural changes soon™ which will make filters like this a lot easier. Until then I'm going to close this issue unless someone wants to drive it forwards.

evilstreak closed this Aug 22, 2013

Just wanted to say that I think it is highly unprofessional to turn down a security fix and then not going fix the issue on over 9 months.

Owner

evilstreak commented May 19, 2014

There are a lot of ways to sneak exploits into HTML during conversion from Markdown. Here are a few: example exploits.

If you want to be sure your output is sanitised then you need to check that output — you shouldn't rely on the Markdown conversion tool to produce output matching your definition of safe.

Nevertheless, thanks for sharing your disappointment in this project.

This lib creates HTML from user input so it's the libs responsibility to sanitize the output. I can ofc create a markdown parser and sanitize stuff before I send it to this parser but then why do I use this library. Similarly I can create an HTML parser that runs through markdown html output but performance will suck and why do I have to do this, I just want to use the lib.

This lib has a markdown parser and the fix would be easy and perform well. No one wants to create JavaScript links with markdown and no one expects that he has to sanitize the output so it's the libs responsibility to guarantee safe output.

Shedal commented May 19, 2014

PageDown has a separate sanitizer module: https://code.google.com/p/pagedown/source/browse/Markdown.Sanitizer.js

This could be a way to go for you as well.

And if you still think that sanitation is no the libs job then please put a big fat disclaimer at the top of your readme that says that this lib is vulnerable to xss

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment