-
Notifications
You must be signed in to change notification settings - Fork 3.1k
HTML API: Prevent setting double escape script content #9560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTML API: Prevent setting double escape script content #9560
Conversation
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN:
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
This could allow some more safe inputs with a regular expression based check. There may also be ways to escape the script tag contents. Before exploring those improvements, I'd like to address this issue where some potentially dangerous script tag contents are allowed. |
if ( false !== stripos( $plaintext_content, '</script' ) ) { | ||
if ( | ||
false !== stripos( $plaintext_content, '</script' ) || | ||
false !== stripos( $plaintext_content, '<script' ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dangerous thing here is to enter the double escaped state. I wrote extensively about that here.
This check could be for both <!--
and <script
, which are necessary to enter the double escaped state, but I think the <script
check is sufficient for now.
Strictly speaking, the double escaped state is something like /<!---*(?!>).*<script[ \t\n\r\f\/>]/i
and the dangerous closer /<\/script[ \t\n\r\f\/>]/i
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with better understanding of this now it still makes me wonder about allowing vs. not allowing it. we can detect if we enter and exit the double-escaped state, so technically we could allow entering it before rejecting the update.
but assuming we can allow this, it carries the risk downstream to unaware code. not allowing it at all means simpler code doesn’t have to know about these nuances. but then again, we can’t always go out of our way to ensure that broken code doesn’t break.
no insight, just questions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been wondering about this as well. I'm confident in the logic around processing script tag contents. It would require some refactoring, but we could determine whether script tag contents are dangerous (with regards to HTML structure) in spec-compliant ways instead of these very basic tests.
- 👍 This check ensures unsafe scripts are not allowed
- 👎 This check disallows some safe scripts that appear dangerous.
A good example is that <!--<script></script>
is perfectly fine. It includes both of the substrings that are dangerous, but when combined in the correct way they become safe. The HTML API is smart enough to recognize and allow this, but it's very likely to confuse other simpler processing such as regular expression based approaches.
My feeling, at least at this time, is that it's preferable to be more restrictive now and fail on some inputs that are not safe but look dangerous. In the future it should be easy to make these checks more accurate and allow more safe but dangerous-seeming input.
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
'Comment with --!>' => array( '<!-- this is a comment -->', 'Invalid but legitimate comments end in --!>' ), | ||
'SCRIPT with </script>' => array( '<script>Replace me</script>', 'Just a </script>' ), | ||
'SCRIPT with </script attributes>' => array( '<script>Replace me</script>', 'before</script id=sneak>after' ), | ||
'SCRIPT with "<script " opener' => array( '<script>Replace me</script>', '<!--<script ' ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
earmarking for follow-up: we could probably create a new unit test that better spells out what we’re wanting to enshrine here
/**
* @param 'early-exit'|'prevents-exit' $violation Whether the violating contents close the SCRIPT element early or prevent its normal closure.
*/
public function rejects_script_contents_which_escape_the_script_element_boundaries( string $violating_script_contents, string $violation ) {
…
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good. Adding it to the Updates to the HTML API in 6.9 post!
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close. Developed in #9560. Props jonsurrell, westonruter, dmsnell. See #63738. git-svn-id: https://develop.svn.wordpress.org/trunk@60706 602fd350-edb4-49c9-b593-d223f7449a82
Merged in [60706]. |
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close. Developed in WordPress/wordpress-develop#9560. Props jonsurrell, westonruter, dmsnell. See #63738. Built from https://develop.svn.wordpress.org/trunk@60706 git-svn-id: http://core.svn.wordpress.org/trunk@60042 1a063a9b-81f0-0310-95a4-ce76da25c4cd
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close. Developed in WordPress/wordpress-develop#9560. Props jonsurrell, westonruter, dmsnell. See #63738. Built from https://develop.svn.wordpress.org/trunk@60706 git-svn-id: https://core.svn.wordpress.org/trunk@60042 1a063a9b-81f0-0310-95a4-ce76da25c4cd
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close. Developed in WordPress#9560. Props jonsurrell, westonruter, dmsnell. See #63738. git-svn-id: https://develop.svn.wordpress.org/trunk@60706 602fd350-edb4-49c9-b593-d223f7449a82
The
WP_Tag_Processor::set_modifiable_text()
method should treat<script
the same way as it treats</script
. Either of these sequences may affect the script elements close.Trac ticket: https://core.trac.wordpress.org/ticket/63738
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.