Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manipulation: Make an HTML interception point #2203

Closed
wants to merge 3 commits into from

Conversation

gibson042
Copy link
Member

Fixes gh-1747

The failing unit test is blocked by #2206.

@dmethvin
Copy link
Member

Does anyone know when the regex transformation is really required? Simple cases like <div /> seem to pass through .innerHTML and get the right result in Chrome, Firefox, and IE11 even though they're not technically valid.

What's your thought on our recommendations for people using this? Perhaps we should just discuss pro/con of replacing the current method entirely vs punching it and calling the current method before/after?

@mgol
Copy link
Member

mgol commented Apr 15, 2015

Simple cases like <div /> seem to pass through .innerHTML and get the right result in Chrome, Firefox, and IE11 even though they're not technically valid.

Not really...

var div = document.createElement('div');
div.innerHTML = '<div /><span>aaa</span>';
div.innerHTML; // "<div><span>aaa</span></div>"

<tag /> is treated by innerHTML as <tag> without a matching closing tag; it's getting auto-closed at the end or at the first opening tag that cannot be contained inside the tag element.

@dmethvin
Copy link
Member

True, I was mostly thinking about the simple case like <div /> alone that die in old IE. Running that regex on really complex markup is a roll of the dice, we just had a bug reported with it. According to our docs that example is not valid input to jQuery methods taking HTML, but no doubt here are people depending on it.

When the parameter has a single tag (with optional closing tag or quick-closing) — $( "<img />" ) or $( "<img>" ), $( "<a></a>" ) or $( "<a>" ) — jQuery creates the element using the native JavaScript .createElement() function.

and

To ensure cross-platform compatibility, the snippet must be well-formed. Tags that can contain other elements should be paired with a closing tag ... (examples)

@@ -159,6 +159,10 @@ function fixInput( src, dest ) {
}

jQuery.extend({
htmlPrefilter: function( html ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So user would need to redefine this method? And if their would, they would loose html.replace( rxhtmlTag, "<$1></$2>" ) User would need to duplicate it? Isn't that null out the whole point of that function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyone overriding this method should keep and use a reference to what they're replacing, as in the test cases added here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be in the manipulation module? Not core?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good question. The dependencies basically look like this:

  • $collection.html( value )
    • jQuery.fn.html (src/manipulation.js)
      • jQuery.htmlPrefilter (src/manipulation.js)
  • $collection[ otherManipulationMethod ]( value )
    • jQuery.fn[ otherManipulationMethod ] (src/manipulation.js)
      • jQuery.fn.domManip (src/manipulation.js)
        • (if value is a function)jQuery.fn.html (src/manipulation.js)
        • jQuery.buildFragment (src/manipulation.js)
          • jQuery.htmlPrefilter (src/manipulation.js)
  • jQuery( html )
    • jQuery (src/core.js)
      • jQuery.fn.init (src/core/init.js)
        • jQuery.parseHTML (src/core/parseHTML.js)
          • jQuery.buildFragment (src/manipulation.js)
            • jQuery.htmlPrefilter (src/manipulation.js)
          • (if !keepScripts)jQuery.fn.remove (src/manipulation.js)
            • → … (src/traversing/findFilter.js, src/manipulation.js, src/data.js, …)

htmlPrefilter is meaningful only for the "manipulation" module, and should not appear in a hypothetical build excluding it. This can be revisited when buildFragment is made private, at which point the decision may be to put it in a new module providing jQuery.htmlPrefilter and jQuery( html )—upon which "manipulation" would depend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not necessarily a safe bet though.

Consider that the reference that you get is not the original method but one that has already been overridden by someone else in a plugin. If the owner of that original method then needed to remove themselves from the pipeline by simply restoring the original value, your function may inadvertently be lost as well.

On the other hand, you could attempt to remove yourself from the pipeline not by restoring the original value, but changing your definition to be something akin to an identify function. The problem with that is it becomes easy to create memory leaks via closures if you don't null things out properly.

@csnover pointed out that dojo/aspect was intended to support things like this, but you can see that's not a trivial bit of code there.

I do not have a concrete use case for removing functions from the pipeline ( although I can certainly see SPA wanting to to do possibly ), but I did implement something like what's proposed here for some internal functionality where I work and have come to regret it - it's hard to work with and leaks like a sieve unless you know how to work around it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a case of YAGNI. The jQuery analogy that comes to mind for me is the special events interface. There just aren't a lot of clients who need to use it, so coming up with a management interface for them is a lot of pain for no gain.

If you are writing a web page that incorporates lots of different code from relatively untrusted third-party sources, you will want to keep your one-and-only htmlPrefilter in place for the lifetime of the page because you don't trust that code to do the right thing and want to watch over it. In that case, having some other app mess with the htmlPrefilter is not good at all because you probably don't trust that code either.

If you are writing some kind of SPA then either the framework you are using should be ensuring sanitized HTML (and might provide hooks for that) or you will again be just doing one htmlPrefilter and keeping it installed permanently. I don't see the need for multiplexed functions here either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @dmethvin for expertly elaborating on my position. The farthest I would want to go is directly exposing an array, and even for that I'd want a concrete use case—the intention here is to support permanent changes to HTML prefiltering where a user knows exactly how our default behavior is failing them and how to fix it. This is expert functionality, and experts can deal with their own registration and memory management.

dojo/aspect is really nice, but tremendous overkill here.

@markelog
Copy link
Member

@gibson042
Copy link
Member Author

Do we need https://github.com/jquery/jquery/blob/892625b3c36eda6bb6ac226d15e0a158ba35cf21/src/core/parseHTML.js#L27-29 bit after this?

Why wouldn't we? This isn't changing default functionality.

@markelog
Copy link
Member

Default no, but it if user wants to intercept those and clean them - their can, basically you need to clean html only when you except input from user, which is rare i guess.

Also, hack that i mentioned, doesn't work in all browsers so it might create a false sense of security which is not good.

gibson042 added a commit that referenced this pull request Apr 30, 2015
Fixes gh-1747
Closes gh-2203

(cherry picked from commit 225bde3)

Conflicts:
	src/manipulation.js
	test/unit/manipulation.js
@gibson042 gibson042 closed this in 225bde3 Apr 30, 2015
gibson042 added a commit that referenced this pull request Nov 10, 2015
@lock lock bot locked as resolved and limited conversation to collaborators Jan 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Extension point for intercepting HTML before DOM insertion
6 participants