Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regular expressions aren’t the right way #223

Closed
charmander opened this issue Sep 26, 2013 · 8 comments
Closed

Regular expressions aren’t the right way #223

charmander opened this issue Sep 26, 2013 · 8 comments

Comments

@charmander
Copy link

A blacklist is also not the right way.

<a href="&amp;#106;avascript:&amp;#100;ocument.wr&amp;#105;te('\x3c\x73\x63\x72\x69\x70\x74\x3e\x61\x6c\x65\x72\x74\x28\x34\x32\x29\x3b\x3c\x2f\x73\x63\x72\x69\x70\x74\x3e');&amp;#100;ocument.close();">Hello</a>
@chriso
Copy link
Collaborator

chriso commented Sep 26, 2013

The xss() function is a port of the same function from CodeIgniter. I ported it across because there wasn't another solution for node.js at the time. A whitelist which doesn't resort to regex would be better than the current blacklist, you're right. Pull requests that improve the filter are always welcome.

FYI your example doesn't pass through the filter

<a>alert&#40;42&#41;;[removed]');&#100;ocument.close();">Hello</a>

@charmander
Copy link
Author

Well, I seem to be using the latest version (1.5.1), and here’s the result:

$ node
> var validator = require("./");
> var input = '<a href="&amp;#106;avascript:&amp;#100;ocument.wr&amp;#105;te(\'\\x3c\\x73\\x63\\x72\\x69\\x70\\x74\\x3e\\x61\\x6c\\x65\\x72\\x74\\x28\\x34\\x32\\x29\\x3b\\x3c\\x2f\\x73\\x63\\x72\\x69\\x70\\x74\\x3e\');&amp;#100;ocument.close();">Hello</a>';
> var output = validator.sanitize(input).xss();
> console.log(output);
<a href="&#106;avascript:&#100;ocument.wr&#105;te('\x3c\x73\x63\x72\x69\x70\x74\x3e\x61\x6c\x65\x72\x74\x28\x34\x32\x29\x3b\x3c\x2f\x73\x63\x72\x69\x70\x74\x3e');&#100;ocument.close();">Hello</a>

If you’d like a pull request that uses an HTML parser of some sort, by the way, I would be very happy to do that. Thanks.

@chriso
Copy link
Collaborator

chriso commented Sep 30, 2013

Ah, you're right. I wasn't escaping the backslashes inside the write().

I'd gladly accept a PR. The library is due for a major (api-breaking) update and cleanup soon so maybe it's best to wait for that.

@funseiki
Copy link

What about combining with Google's Caja sanitation engine? Here's a quick snippet comparing validator and a node port of Caja called sanitizer.

var validator = require('validator'),
    sanitizer = require('sanitizer');
var input = '<a href="&amp;#106;avascript:&amp;#100;ocument.wr&amp;#105;te(\'\\x3c\\x73\\x63\\x72\\x69\\x70\\x74\\x3e\\x61\\x6c\\x65\\x72\\x74\\x28\\x34\\x32\\x29\\x3b\\x3c\\x2f\\x73\\x63\\x72\\x69\\x70\\x74\\x3e\');&amp;#100;ocument.close();">Hello</a>';

var outVal = validator.sanitize(input).xss();
var outSan = sanitizer.sanitize(input);

console.log("Validator Output:\n", outVal);

console.log("Sanitizer Output:\n", outSan);

Validator Output:

 <a href="&#106;avascript:&#100;ocument.wr&#105;te('\x3c\x73\x63\x72\x69\x70\x74
\x3e\x61\x6c\x65\x72\x74\x28\x34\x32\x29\x3b\x3c\x2f\x73\x63\x72\x69\x70\x74\x3e
');&#100;ocument.close();">Hello</a>

Sanitizer Output:

 <a>Hello</a>

@charmander
Copy link
Author

Just tell people to use that instead; separate is better.

@chriso
Copy link
Collaborator

chriso commented Oct 29, 2013

Looks like Caja is written in Java

@chriso
Copy link
Collaborator

chriso commented Oct 29, 2013

The node port comes with the following warning

It’s use this at your own risk really - Caja HTML Sanitizer was written by people far cleverer than me. I have just repackaged it to solve a problem I had (sanitization on a Node server). It seems to work, and it passes all its tests in re-packaged form - however I don’t fully understand its internals so cannot guarantee its security.

I'll probably just remove xss() in the next version.

@funseiki
Copy link

Here's an SO post regarding (somewhat) its efficacy. There's a javascript port (Google written by the looks of it) that the post links to. It's apparently white-list based and configurable. I've just been using it out-of-the-box.

chriso added a commit that referenced this issue Oct 31, 2013
The xss() function was originally a port of the XSS filter from
CodeIgniter. I added it to the library because there wasn't an
alternative at the time. Unfortunately I don't have the time or
expertise to maintain the XSS filter or keep merging upstream
changes.

If you need one for your app, I suggest looking at Caja sanitisation
engine maintained by Google. (https://code.google.com/p/google-caja/
source/browse/trunk/src/com/google/caja/plugin/html-sanitizer.js)

Closes #123, #138, #181, #206, #210, #221, #223, #226, #227, #231, #232
@chriso chriso closed this as completed Oct 31, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants