Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script Modules: Add data server->client data passing #6433

Closed
wants to merge 10 commits into from

Conversation

sirreal
Copy link
Member

@sirreal sirreal commented Apr 23, 2024

See https://make.wordpress.org/core/2024/05/06/proposal-server-to-client-data-sharing-for-script-modules/

Includes #6452
Requires WordPress/gutenberg#60952 (this PR builds and registers the @wordpress/api-fetch module).

Trac ticket:


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@sirreal sirreal force-pushed the add/script-module-data-passing branch from d0f1c02 to 529b670 Compare April 23, 2024 14:05
@sirreal sirreal force-pushed the add/script-module-data-passing branch from e3faa00 to 3dc7e5f Compare May 3, 2024 07:42
$config = apply_filters( 'scriptmoduledata_' . $module_id, array() );
if ( ! empty( $config ) ) {
wp_print_inline_script_tag(
wp_json_encode( $config, JSON_HEX_TAG | JSON_HEX_AMP ),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to include JSON_UNESCAPED_SLASHES here too? I suspect URLs of endpoints will be relatively common data, and avoiding the leaning toothpick syndrome would be nice. Having JSON_HEX_TAG should be sufficient to avoid problems with </script> that the default escaping of slashes was added to avoid.

JSON_UNESCAPED_UNICODE is another one to consider, but that would need testing for whether it breaks if pages are served with an encoding other than UTF-8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't consider this at all, it's copied from how the Interactivity API implements the same thing:

wp_print_inline_script_tag(
wp_json_encode(
$interactivity_data,
JSON_HEX_TAG | JSON_HEX_AMP
),
array(
'type' => 'application/json',
'id' => 'wp-interactivity-data',
)
);

I did some testing and it seems like we should be able to add JSON_UNESCAPED_SLASHES in both of these places, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON_UNESCAPED_UNICODE might also be nice to add, but as you mention it might need to be behind a UTF-8 check:

$opts = JSON_HEX_TAG | JSON_HEX_AMP | JSON_UNESCAPED_SLASHES;
if ( get_bloginfo( 'blog_charset' ) === 'UTF-8' ) {
  $opts |= JSON_UNESCAPED_UNICODE;
}

It looks like it would certainly improve readability and potentially size:

<?php
$opts = 0;
$t = '🏴󠁧󠁢󠁥󠁮󠁧󠁿';
var_dump(
    $t,
    strlen($t),
    json_encode($t, $opts),
    strlen(json_encode($t, $opts)),
    json_encode($t, $opts | JSON_UNESCAPED_UNICODE),
    strlen(json_encode($t, $opts | JSON_UNESCAPED_UNICODE)),
);
string(28) "🏴󠁧󠁢󠁥󠁮󠁧󠁿"
int(28)
string(86) ""\ud83c\udff4\udb40\udc67\udb40\udc62\udb40\udc65\udb40\udc6e\udb40\udc67\udb40\udc7f""
int(86)
string(30) ""🏴󠁧󠁢󠁥󠁮󠁧󠁿""
int(30)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll follow up for Interactivity API here: #6520

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some quick experimenting, I find Firefox 125 and Chromium 124 both do need the document to be utf-8 to interpret unescaped Unicode properly.

Also they seem to use the actual Content-Type HTTP header over <meta> tags if both specify a different charset, but they'll use <meta> tags if the HTTP header doesn't specify a charset. The deprecated charset attribute on a <script> tag doesn't seem to be used by either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is vital that JSON_UNESCAPED_SLASHES not be added, as it avoids a security vulnerability when encoding a string that includes </script>. When slashes are escaped, this becomes <\/script> and the browser won't prematurely close the script tag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for flagging that! I'd like to continue discussing on #6520 (comment), where I've replied. That PR is likely to land sooner, and whatever implementation is used there should also be used here. (In fact, I'd expect Interactivity API to use this proposal to print its data.)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter I already addressed that concern

Having JSON_HEX_TAG should be sufficient to avoid problems with </script> that the default escaping of slashes was added to avoid.

Is there any reason to think that's not accurate?

Copy link
Member

@westonruter westonruter May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. As long as JSON_HEX_TAG is present, then it seems safe to include JSON_UNESCAPED_SLASHES. Sorry I missed that. Maybe add a comment explaining why the particular set of flags are used, noting the security implications.

pento pushed a commit that referenced this pull request May 15, 2024
The Interactivity API has been rendering client data in a SCRIPT element with the
type `application/json` so that it's not executed as a script, but is available
to one. The data runs through `wp_json_encode()` and is encoded with some flags
to ensure that potentially-dangerous characters are escaped.

However, this can lead to some challenges. Eagerly escaping when not necessary
can make the data difficult to comprehend when reading the output HTML. For example,
all non-ASCII Unicode characters are escaped with their code point equivalent.
This results in `\ud83c\udd70` instead of `🅰`.

In this patch, the flags for JSON encoding are refined to ensure what's necessary
while relaxing other rules (leaving in those Unicode characters if the blog charset
is UTF-8). This makes for Interactivity API data that's quicker as a human reader
to decipher and diagnose.

In summary:

 - This data is JSON encoded and printed in a `<script type="application/json">` tag.

 - If we ensure that `<` is never printed inside the data, it should be impossible to
   break out of the script tag and the browser treats everything as the element's `textContent`.

 - All other escaping becomes unnecessary at that point, including unicode escaping 
   if the page uses the UTF-8 charset (the same encoding as JSON).

See #6433 (review)

Developed in #6520
Discussed in https://core.trac.wordpress.org/ticket/61170

Fixes: #61170
Follow-up to: [57563].
Props: bjorsch, dmsnell, jonsurrell, sabernhardt, westonruter.


git-svn-id: https://develop.svn.wordpress.org/trunk@58159 602fd350-edb4-49c9-b593-d223f7449a82
markjaquith pushed a commit to markjaquith/WordPress that referenced this pull request May 15, 2024
The Interactivity API has been rendering client data in a SCRIPT element with the
type `application/json` so that it's not executed as a script, but is available
to one. The data runs through `wp_json_encode()` and is encoded with some flags
to ensure that potentially-dangerous characters are escaped.

However, this can lead to some challenges. Eagerly escaping when not necessary
can make the data difficult to comprehend when reading the output HTML. For example,
all non-ASCII Unicode characters are escaped with their code point equivalent.
This results in `\ud83c\udd70` instead of `🅰`.

In this patch, the flags for JSON encoding are refined to ensure what's necessary
while relaxing other rules (leaving in those Unicode characters if the blog charset
is UTF-8). This makes for Interactivity API data that's quicker as a human reader
to decipher and diagnose.

In summary:

 - This data is JSON encoded and printed in a `<script type="application/json">` tag.

 - If we ensure that `<` is never printed inside the data, it should be impossible to
   break out of the script tag and the browser treats everything as the element's `textContent`.

 - All other escaping becomes unnecessary at that point, including unicode escaping 
   if the page uses the UTF-8 charset (the same encoding as JSON).

See WordPress/wordpress-develop#6433 (review)

Developed in WordPress/wordpress-develop#6520
Discussed in https://core.trac.wordpress.org/ticket/61170

Fixes: #61170
Follow-up to: [57563].
Props: bjorsch, dmsnell, jonsurrell, sabernhardt, westonruter.

Built from https://develop.svn.wordpress.org/trunk@58159


git-svn-id: http://core.svn.wordpress.org/trunk@57622 1a063a9b-81f0-0310-95a4-ce76da25c4cd
github-actions bot pushed a commit to platformsh/wordpress-performance that referenced this pull request May 15, 2024
The Interactivity API has been rendering client data in a SCRIPT element with the
type `application/json` so that it's not executed as a script, but is available
to one. The data runs through `wp_json_encode()` and is encoded with some flags
to ensure that potentially-dangerous characters are escaped.

However, this can lead to some challenges. Eagerly escaping when not necessary
can make the data difficult to comprehend when reading the output HTML. For example,
all non-ASCII Unicode characters are escaped with their code point equivalent.
This results in `\ud83c\udd70` instead of `🅰`.

In this patch, the flags for JSON encoding are refined to ensure what's necessary
while relaxing other rules (leaving in those Unicode characters if the blog charset
is UTF-8). This makes for Interactivity API data that's quicker as a human reader
to decipher and diagnose.

In summary:

 - This data is JSON encoded and printed in a `<script type="application/json">` tag.

 - If we ensure that `<` is never printed inside the data, it should be impossible to
   break out of the script tag and the browser treats everything as the element's `textContent`.

 - All other escaping becomes unnecessary at that point, including unicode escaping 
   if the page uses the UTF-8 charset (the same encoding as JSON).

See WordPress/wordpress-develop#6433 (review)

Developed in WordPress/wordpress-develop#6520
Discussed in https://core.trac.wordpress.org/ticket/61170

Fixes: #61170
Follow-up to: [57563].
Props: bjorsch, dmsnell, jonsurrell, sabernhardt, westonruter.

Built from https://develop.svn.wordpress.org/trunk@58159


git-svn-id: https://core.svn.wordpress.org/trunk@57622 1a063a9b-81f0-0310-95a4-ce76da25c4cd
@sirreal
Copy link
Member Author

sirreal commented May 30, 2024

This proof of concept is superseded by #6682.

@sirreal sirreal closed this Jun 13, 2024
@sirreal sirreal deleted the add/script-module-data-passing branch June 13, 2024 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants