New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extension point for intercepting HTML before DOM insertion #1747

Closed
mgol opened this Issue Oct 21, 2014 · 8 comments

Comments

Projects
None yet
5 participants
@mgol
Member

mgol commented Oct 21, 2014

Originally reported by nicholas@… at: http://bugs.jquery.com/ticket/14228

At my company, there's a desire to filter HTML strings as they go through methods like html() and append() to prevent XSS. Our code base is quite large and there are thousands of references to these types of methods that accept an HTML string for DOM insertion, so it's not feasible to replace each instance with our own method call.

As far as I can tell, there's no current way to intercept HTML strings before insertion. What I'm suggesting is adding some kind of extension point to jQuery Core that would allow someone to register a function that would receive the HTML before insertion and allow someone to change that HTML that would actually be inserted. This has the following potential uses:

  • custom XSS filters
  • auto linking filters (automatically turn an email address into a link, for example)
  • stripping out personal data

Just to be clear, I'm not suggesting including anything other than an extension point that would allow such filters to be written. Without this, I need to basically overwrite html(), append(), prepend(), and all the others to manfully check any string arguments.

Issue reported for jQuery 1.10.2

@mgol mgol added this to the 3.0.0 milestone Oct 21, 2014

@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: dmethvin

A lot of us on the team consider the fact that we allow scripts to run by injecting them to be a bug that we can't remove for back-compat reasons, so we're sympathetic. I'll leave the ticket open for discussion to see if there's any way we can fix this.

Member

mgol commented Oct 21, 2014

Comment author: dmethvin

A lot of us on the team consider the fact that we allow scripts to run by injecting them to be a bug that we can't remove for back-compat reasons, so we're sympathetic. I'll leave the ticket open for discussion to see if there's any way we can fix this.

@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: rwaldron

What is preventing something like this:

// Beginning of IIFE or AMD or whatever...
var html = jQuery.fn.html;
var append = jQuery.fn.append;
jQuery.fn.html = function() {
   // do the filtering
   return html.call(this, filtered/safe stuff);
};
jQuery.fn.append = function() {
   // do the filtering
   return append.call(this, filtered/safe stuff);
};
  
// End of IIFE or AMD or whatever...
Member

mgol commented Oct 21, 2014

Comment author: rwaldron

What is preventing something like this:

// Beginning of IIFE or AMD or whatever...
var html = jQuery.fn.html;
var append = jQuery.fn.append;
jQuery.fn.html = function() {
   // do the filtering
   return html.call(this, filtered/safe stuff);
};
jQuery.fn.append = function() {
   // do the filtering
   return append.call(this, filtered/safe stuff);
};
  
// End of IIFE or AMD or whatever...
@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: nicholas@…

Nothing is preventing it, however, it is inefficient and risky. There are probably many paths to inserting HTML through jQuery, and without tracking them all down, the risk remains.

This is essentially what I'm doing right now, but I'm afraid I might have missed some corner case (and of course, methods change between versions, I don't want to have to keep updating this functionality). It would be much more efficient and safe to have one place to insert this functionality.

Member

mgol commented Oct 21, 2014

Comment author: nicholas@…

Nothing is preventing it, however, it is inefficient and risky. There are probably many paths to inserting HTML through jQuery, and without tracking them all down, the risk remains.

This is essentially what I'm doing right now, but I'm afraid I might have missed some corner case (and of course, methods change between versions, I don't want to have to keep updating this functionality). It would be much more efficient and safe to have one place to insert this functionality.

@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: nicholas@…

I just wrote up a blog post on everything we had to do to secure jQuery:  http://tech.blog.box.com/2013/08/securing-jquery-against-unintended-xss/

I would love there to be a formal HTML filter capability, and it seems like the injection points for that functionality are fairly straightforward (see the blog post).

Member

mgol commented Oct 21, 2014

Comment author: nicholas@…

I just wrote up a blog post on everything we had to do to secure jQuery:  http://tech.blog.box.com/2013/08/securing-jquery-against-unintended-xss/

I would love there to be a formal HTML filter capability, and it seems like the injection points for that functionality are fairly straightforward (see the blog post).

@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: m_gol

I agree this would be nice to have.

Member

mgol commented Oct 21, 2014

Comment author: m_gol

I agree this would be nice to have.

@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: gibson042

Related: #14370; #14464

I don't know that we'd want to add another function for all input to pass through, but could definitely see exposing the arguments of our currently-existing .replace( rxhtmlTag, "<$1></$2>" ) as properties on jQuery. "Simple" extensions like no-innerHTML contents could still use a string replacement, but more complex ones like what this ticket is requesting could get wholesale filtering with something like the following (apologies for atrocious names):

var origPreMatch = jQuery.htmlPreMatch,
    origPreReplace = jQuery.htmlPreReplace;
jQuery.htmlPreMatch = /[\w\W]+/;
jQuery.htmlPreReplace = function( html ) {
    return extraProcess( html ).replace( origPreMatch, origPreReplace );
};
Member

mgol commented Oct 21, 2014

Comment author: gibson042

Related: #14370; #14464

I don't know that we'd want to add another function for all input to pass through, but could definitely see exposing the arguments of our currently-existing .replace( rxhtmlTag, "<$1></$2>" ) as properties on jQuery. "Simple" extensions like no-innerHTML contents could still use a string replacement, but more complex ones like what this ticket is requesting could get wholesale filtering with something like the following (apologies for atrocious names):

var origPreMatch = jQuery.htmlPreMatch,
    origPreReplace = jQuery.htmlPreReplace;
jQuery.htmlPreMatch = /[\w\W]+/;
jQuery.htmlPreReplace = function( html ) {
    return extraProcess( html ).replace( origPreMatch, origPreReplace );
};
@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: m_gol

Replying to gibson042:

Related: #14370; #14464

I don't know that we'd want to add another function for all input to pass through

If we passed it through $.escapeHTML only if it's a function then we're saved from one additional function invocation on each manip operation and the only overhead is one additional if. Is that a lot?

I feel regex might not be enough in some cases and we really should make it possible to secure our manip methods.

Member

mgol commented Oct 21, 2014

Comment author: m_gol

Replying to gibson042:

Related: #14370; #14464

I don't know that we'd want to add another function for all input to pass through

If we passed it through $.escapeHTML only if it's a function then we're saved from one additional function invocation on each manip operation and the only overhead is one additional if. Is that a lot?

I feel regex might not be enough in some cases and we really should make it possible to secure our manip methods.

@mgol

This comment has been minimized.

Show comment
Hide comment
@mgol

mgol Oct 21, 2014

Member

Comment author: gibson042

Replying to m_gol:

If we passed it through $.escapeHTML only if it's a function then we're saved from one additional function invocation on each manip operation and the only overhead is one additional if. Is that a lot?

Well, it's still guaranteed to do at least two passes over the input (once for custom escaping and once to fix up self-closing tags) when someone sets up preprocessing, even if they're only trying to fix our mangling of script/textarea content (#14370) or attribute values (#14464). Come to think of it, that brings up the related question of whether preprocessing should precede, follow, or replace the currently-existing .replace( rxhtmlTag, "<$1></$2>" ).

I feel regex might not be enough in some cases and we really should make it possible to secure our manip methods.

str = str.replace( /[\w\W]+/, fn ) is equivalent to str = fn( str ) for any string-returning single-parameter fn, and therefore guaranteed to support the same use cases. This is purely a question of API surface area.

Member

mgol commented Oct 21, 2014

Comment author: gibson042

Replying to m_gol:

If we passed it through $.escapeHTML only if it's a function then we're saved from one additional function invocation on each manip operation and the only overhead is one additional if. Is that a lot?

Well, it's still guaranteed to do at least two passes over the input (once for custom escaping and once to fix up self-closing tags) when someone sets up preprocessing, even if they're only trying to fix our mangling of script/textarea content (#14370) or attribute values (#14464). Come to think of it, that brings up the related question of whether preprocessing should precede, follow, or replace the currently-existing .replace( rxhtmlTag, "<$1></$2>" ).

I feel regex might not be enough in some cases and we really should make it possible to secure our manip methods.

str = str.replace( /[\w\W]+/, fn ) is equivalent to str = fn( str ) for any string-returning single-parameter fn, and therefore guaranteed to support the same use cases. This is purely a question of API surface area.

@mgol mgol added the Feature label Oct 21, 2014

@markelog markelog added the Core label Dec 3, 2014

@gibson042 gibson042 self-assigned this Dec 8, 2014

@timmywil timmywil added the Blocker label Feb 9, 2015

gibson042 added a commit to gibson042/jquery that referenced this issue Apr 13, 2015

gibson042 added a commit that referenced this issue Apr 30, 2015

Manipulation: Make an HTML interception point
Fixes gh-1747
Closes gh-2203

(cherry picked from commit 225bde3)

Conflicts:
	src/manipulation.js
	test/unit/manipulation.js

@gibson042 gibson042 closed this in 225bde3 Apr 30, 2015

gibson042 added a commit that referenced this issue Nov 10, 2015

@dmethvin dmethvin modified the milestones: 1.12/2.2, 3.0.0 Jan 7, 2016

@cssmagic cssmagic referenced this issue May 18, 2016

Open

jQuery #5

@lock lock bot locked as resolved and limited conversation to collaborators Jun 19, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.