Member

@sirreal sirreal commented Aug 21, 2025

The WP_Tag_Processor::set_modifiable_text() method should treat <script the same way as it treats </script. Either of these sequences may affect the script element's close.

Trac ticket: https://core.trac.wordpress.org/ticket/63738


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@sirreal sirreal marked this pull request as ready for review August 21, 2025 19:17

github-actions bot commented Aug 21, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props jonsurrell, westonruter, dmsnell.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@sirreal
Member Author

sirreal commented Aug 21, 2025

This could allow some more safe inputs with a regular expression based check. There may also be ways to escape the script tag contents.

Before exploring those improvements, I'd like to address this issue where some potentially dangerous script tag contents are allowed.

- if ( false !== stripos( $plaintext_content, '</script' ) ) {
+ if (
+ 	false !== stripos( $plaintext_content, '</script' ) ||
+ 	false !== stripos( $plaintext_content, '<script' )
Member Author

The dangerous thing here is to enter the double escaped state. I wrote extensively about that here.

This check could be for both <!-- and <script, which are necessary to enter the double escaped state, but I think the <script check is sufficient for now.

Strictly speaking, the double escaped state is something like /<!---*(?!>).*<script[ \t\n\r\f\/>]/i and the dangerous closer /<\/script[ \t\n\r\f\/>]/i.
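As a rough illustration, the two patterns quoted above can be dropped into PHP's preg_match() as they stand. This is only a sketch: the constant and function names are hypothetical, and the patterns are the approximate ones from the comment ("something like"), not a spec-compliant tokenizer.

```php
<?php
// Patterns copied from the comment above. They are approximations, not the HTML spec.
// Note: without the `s` modifier, `.` does not match across line breaks.
const DOUBLE_ESCAPE_OPENER = '/<!---*(?!>).*<script[ \t\n\r\f\/>]/i';
const DANGEROUS_CLOSER     = '/<\/script[ \t\n\r\f\/>]/i';

// Hypothetical helper: true if the text looks like it enters the double escaped state.
function may_enter_double_escaped_state( string $text ): bool {
	return 1 === preg_match( DOUBLE_ESCAPE_OPENER, $text );
}

// Hypothetical helper: true if the text contains something that parses as a SCRIPT closer.
function has_dangerous_closer( string $text ): bool {
	return 1 === preg_match( DANGEROUS_CLOSER, $text );
}

var_dump( may_enter_double_escaped_state( '<!--<script ' ) );          // bool(true)
var_dump( may_enter_double_escaped_state( '<script>' ) );              // bool(false): no "<!--" before it
var_dump( has_dangerous_closer( 'before</script id=sneak>after' ) );   // bool(true)
var_dump( has_dangerous_closer( '</scripts' ) );                       // bool(false): "s" does not end the tag name
```

Compared with the plain substring test, these patterns require a tag-name terminator after "script", so text such as "</scripts" would no longer be rejected.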

Member

with better understanding of this now it still makes me wonder about allowing vs. not allowing it. we can detect if we enter and exit the double-escaped state, so technically we could allow entering it before rejecting the update.

but assuming we can allow this, it carries the risk downstream to unaware code. not allowing it at all means simpler code doesn’t have to know about these nuances. but then again, we can’t always go out of our way to ensure that broken code doesn’t break.

no insight, just questions

Member Author

I've been wondering about this as well. I'm confident in the logic around processing script tag contents. It would require some refactoring, but we could determine whether script tag contents are dangerous (with regard to HTML structure) in spec-compliant ways instead of these very basic tests.

  • 👍 This check ensures unsafe scripts are not allowed
  • 👎 This check disallows some safe scripts that appear dangerous.

A good example is that <!--<script></script> is perfectly fine. It includes both of the substrings that are dangerous, but when combined in the correct way they become safe. The HTML API is smart enough to recognize and allow this, but it's very likely to confuse other simpler processing such as regular expression based approaches.

My feeling, at least at this time, is that it's preferable to be more restrictive now and fail on some inputs that are safe but look dangerous. In the future it should be easy to make these checks more accurate and allow more safe but dangerous-seeming input.
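To make the trade-off concrete, here is a minimal sketch of the conservative substring test as a standalone function. The name script_text_is_rejected() is hypothetical; it only mirrors the condition in this PR's diff and is not the core method itself.

```php
<?php
// Hypothetical helper mirroring the case-insensitive substring test from this PR.
function script_text_is_rejected( string $plaintext_content ): bool {
	return (
		false !== stripos( $plaintext_content, '</script' ) ||
		false !== stripos( $plaintext_content, '<script' )
	);
}

$samples = array(
	'console.log( "hello" );', // No risky sequence.
	'Just a </script>',        // Would close the element early.
	'<!--<script ',            // Could enter the double escaped state.
	'<!--<script></script>',   // Safe per the discussion above, but still rejected.
);

foreach ( $samples as $sample ) {
	printf( "%-28s %s\n", $sample, script_text_is_rejected( $sample ) ? 'rejected' : 'allowed' );
}
```

The last sample is the false positive being accepted here: it contains both substrings, so the simple test refuses it even though the combination is harmless.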


Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@sirreal sirreal requested a review from westonruter August 26, 2025 09:35
'Comment with --!>' => array( '<!-- this is a comment -->', 'Invalid but legitimate comments end in --!>' ),
'SCRIPT with </script>' => array( '<script>Replace me</script>', 'Just a </script>' ),
'SCRIPT with </script attributes>' => array( '<script>Replace me</script>', 'before</script id=sneak>after' ),
'SCRIPT with "<script " opener' => array( '<script>Replace me</script>', '<!--<script ' ),
Member

earmarking for follow-up: we could probably create a new unit test that better spells out what we’re wanting to enshrine here

/**
 * @param 'early-exit'|'prevents-exit' $violation Whether the violating contents close the SCRIPT element early or prevent its normal closure.
 */
public function rejects_script_contents_which_escape_the_script_element_boundaries( string $violating_script_contents, string $violation ) {
	…
}
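One possible shape for the data feeding such a test, reusing the SCRIPT cases already in this PR's data provider. This is a sketch only: the function name is hypothetical, and the 'early-exit' / 'prevents-exit' label assigned to each case is an assumption based on the discussion in this thread, not something taken from the test suite.

```php
<?php
/**
 * Hypothetical data provider for the test sketched above.
 *
 * @return array<string, array{0: string, 1: string}> Violating contents and the assumed violation type.
 */
function data_script_contents_which_escape_the_script_element_boundaries(): array {
	return array(
		// A closing tag inside the contents ends the SCRIPT element early.
		'Closing tag'                 => array( 'Just a </script>', 'early-exit' ),
		'Closing tag with attributes' => array( 'before</script id=sneak>after', 'early-exit' ),
		// A comment opener followed by a script opener can enter the double escaped
		// state, so the element's real closing tag no longer closes it.
		'Comment plus script opener'  => array( '<!--<script ', 'prevents-exit' ),
	);
}
```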

Member

@dmsnell dmsnell left a comment

This is good. Adding it to the Updates to the HTML API in 6.9 post!

pento pushed a commit that referenced this pull request Sep 4, 2025
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close.

Developed in #9560.

Props jonsurrell, westonruter, dmsnell.
See #63738.


git-svn-id: https://develop.svn.wordpress.org/trunk@60706 602fd350-edb4-49c9-b593-d223f7449a82
@sirreal
Member Author

sirreal commented Sep 4, 2025

Merged in [60706].

@sirreal sirreal closed this Sep 4, 2025
@sirreal sirreal deleted the html-api/disallow-open-script-tags-in-script-contents branch September 4, 2025 14:40
markjaquith pushed a commit to markjaquith/WordPress that referenced this pull request Sep 4, 2025
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close.

Developed in WordPress/wordpress-develop#9560.

Props jonsurrell, westonruter, dmsnell.
See #63738.

Built from https://develop.svn.wordpress.org/trunk@60706


git-svn-id: http://core.svn.wordpress.org/trunk@60042 1a063a9b-81f0-0310-95a4-ce76da25c4cd
github-actions bot pushed a commit to gilzow/wordpress-performance that referenced this pull request Sep 4, 2025
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close.

Developed in WordPress/wordpress-develop#9560.

Props jonsurrell, westonruter, dmsnell.
See #63738.

Built from https://develop.svn.wordpress.org/trunk@60706


git-svn-id: https://core.svn.wordpress.org/trunk@60042 1a063a9b-81f0-0310-95a4-ce76da25c4cd
jonnynews pushed a commit to spacedmonkey/wordpress-develop that referenced this pull request Sep 24, 2025
Prevent WP_Tag_Processor::set_modifiable_text() from allowing SCRIPT contents with "<script" like it does with "</script". Either of these sequences may affect the script element's close.

Developed in WordPress#9560.

Props jonsurrell, westonruter, dmsnell.
See #63738.


git-svn-id: https://develop.svn.wordpress.org/trunk@60706 602fd350-edb4-49c9-b593-d223f7449a82