Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rename sanitize and add hook for URL sanitization #65

Closed
benatkin opened this issue Jul 20, 2012 · 7 comments
Closed

rename sanitize and add hook for URL sanitization #65

benatkin opened this issue Jul 20, 2012 · 7 comments
Labels

Comments

@benatkin
Copy link

Sanitize doesn't do all that's required for sanitized HTML. One thing that's important when accepting user input is to sanitize URLs.

The following JavaScript:

var marked = require('marked');
marked.setOptions({sanitize: true})
console.log(marked("<a href=\"javascript:alert('test')\"></a>\n\n[click me!][1]\n\n[1]: javascript:alert('test')"))

Will output the following HTML:

<p>&lt;a href=&quot;javascript:alert(&#39;test&#39;)&quot;&gt;&lt;/a&gt;

</p>
<p><a href="javascript:alert(&#39;test&#39;)">click me!</a>

</p>

I think the sanitize option would be better named escapeMarkup.

To solve the URL problem, I think a good path would be to have an URL filter function in the options, with a conservative URL filter as the default.

@chjj
Copy link
Member

chjj commented Jan 8, 2013

Yeah, this is a good point. Removing javascript: pseudo-protocol from links is now on the todo list for sanitize.

To solve the URL problem, I think a good path would be to have an URL filter function in the options, with a conservative URL filter as the default.

Also a possibility.

@jnordberg
Copy link

An url filter option would be great 👍

Now that the InlineLexer is exposed in the module i'm doing this to resolve urls, not pretty but at least a workaround until this functionality lands in some form.

# monkeypatch to add url resolving to marked
marked.InlineLexer.prototype._outputLink = marked.InlineLexer.prototype.outputLink
marked.InlineLexer.prototype._resolveLink = (href) -> href
marked.InlineLexer.prototype.outputLink = (cap, link) ->
  link.href = @_resolveLink link.href
  return @_outputLink cap, link

parseMarkdownSync = (content, baseUrl) ->
  ### Parse markdown *content* and resolve links using *baseUrl*, returns html. ###

  marked.InlineLexer.prototype._resolveLink = (href) ->
    url.resolve baseUrl, href

  tokens = marked.lexer content

  ...

  return marked.parser tokens

Note that autolinks and gfm-links wont be affected since they are not passed through the outputLink method.

@mshick
Copy link

mshick commented Oct 21, 2013

I also have need for some sort of uniform hook / processing for links. I'm looking to use markdown/marked in a commenting system, so there are some tight requirements for the linking features.

In particular I'm looking to add rel="nofollow", increment a count whenever a link appears (so I can flag posts as hasLink for moderation) and disable img processing.

I've had good luck overriding the outputLink method, but it isn't consistently utilized, so I'm having to override the output method as well, which is far from ideal.

Could outputLink, or some other method, perhaps become a clearing house for all urls, and could it potentially be self-contained (no references to "this", for instance)?

@firien
Copy link

firien commented Jan 15, 2014

you can use the new-ish Renderer feature (#129) to kill javascript: links

var renderer;
renderer = new marked.Renderer();
renderer.link = function(href, title, text) {
  if (/^javascript:/.test(href)) {
    return text;
  } else {
    return marked.Renderer.prototype.link.call(this, href, title, text);
  }
};

@alexcjohnson
Copy link

It appears that since this discussion sanitize: true has been updated to remove javascript: links, but it lets data:text/html;base64 links pass, which can also be used to execute javascript.

@scjody
Copy link

scjody commented Nov 3, 2016

The best way to sanitize links is to whitelist desired schemes, usually just http: and https:. People invent new schemes all the time and you can't be sure that they're all implemented securely or don't have unforeseen consequences.

@joshbruce
Copy link
Member

#984

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants