Make sure the WP embeds security process is applied #4226

Merged
merged 1 commit into WordPress:master on Feb 21, 2018

4 participants
@imath
Contributor

imath commented Jan 2, 2018

Description

In #3548 @PareshRadadiya suggested a CSS fix to remove the link shown above WordPress embedded posts. @swissspidy advised having a look at the wp-embed.js script, and @aduth suggested injecting this script into the Sandbox component instead of adding CSS rules.
While exploring another issue (posts from the same site cannot be embedded), I found that the security mechanism @swissspidy was probably thinking of when saying "it does a bit more" is not applied during REST requests made with the WP_oEmbed_Controller class, which Gutenberg uses to fetch WordPress embeds.
This PR is my attempt to fix both issues: the remaining top link and "self" embeds.

How Has This Been Tested?

I've tested this PR on a regular AMP stack on my MacBook, with a Multisite configuration.

Screenshots (jpeg or gifs if applicable):

self-embed

Self embeds can be previewed with this PR and the above blockquote is removed as the security process is applied.

external-embeds

Embeds from other WordPress sites are still failing because, imho, the WordPress site does not include one of the functions of this PR: gutenberg_filter_oembed_result(). I've added inline comments above this function.

Types of changes

Fixes self-embedding of WP posts and removes the remaining blockquote above the iframe for this kind of WP embed.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code has proper inline documentation.
@imath

Contributor

imath commented Jan 2, 2018

I'll remove the multiple spaces found in blocks/library/embed/index.js. As for URL not being defined, should I use a comment to tell the linter to ignore it, or use something else like wpApiSettings.root.replace( '/wp-json', '' )?

@aduth

Member

aduth commented Jan 3, 2018

But about the fact URL is not defined

Generally, you could simply be explicit about accessing URL on the window global, i.e. window.URL.

However, URL does not fall within the required browser support, specifically lacking support for IE11:

https://caniuse.com/#feat=url

That said, Node supports the same URL API via the url built-in:

https://nodejs.org/api/url.html#url_constructor_new_url_input_base

You should be able to import this directly, which will be polyfilled by Webpack.

import { URL } from 'url';

If you run into issues, it might be that the WHATWG URL API is too new† to have been polyfilled by Webpack, in which case you can lean on the legacy url.parse function.

† I don't know that it's necessarily new, but I learned only today that url.parse was considered legacy, as it was what I'm accustomed to using. 😄
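
For illustration, a minimal sketch of reading the site hostname from the REST root with the WHATWG URL API. The restRoot value is a made-up stand-in for wpApiSettings.root, and the snippet assumes an environment that exposes URL as a global (recent Node versions and modern browsers); in a Webpack build it could instead be imported from 'url' as suggested above.

```javascript
// Hypothetical stand-in for wpApiSettings.root.
const restRoot = 'https://example.com/wp-json/';

// Parse the REST root; new URL() throws a TypeError if the string is not an absolute URL.
const rootUrl = new URL( restRoot );

console.log( rootUrl.hostname ); // host name only, without protocol, port or path
console.log( rootUrl.origin ); // scheme + host, an alternative to string replacement on the root
```

Reading rootUrl.hostname avoids string manipulation such as replacing '/wp-json' by hand.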

@@ -142,6 +142,8 @@ export default class Sandbox extends Component {
}
`;
const wpEmbedScript = 'wp-embed' === this.props.type && window._wpEmbedScript ? ( <script type="text/javascript" src={ window._wpEmbedScript }></script> ) : '';

@aduth

aduth Jan 3, 2018

Member

Rather than making the Sandbox component aware of types of scripts, I wonder if it should instead accept a children prop that allows additional elements to be injected into the sandboxed content.

@imath

imath Jan 4, 2018

Contributor

If there were such a property, i guess it should be an object with at least two keys, e.g. { head: 'to add styles', footer: 'to add JavaScripts' }, as so far there's only some specific code for Video Embeds.

Btw, I've cleaned up the mess I made when I rebased earlier.

@aduth

aduth Jan 5, 2018

Member

If there were such a property, i guess it should be an object with at least two keys

Perhaps. Though, to the best of my knowledge, loading a stylesheet at the end of a page is totally valid.

https://stackoverflow.com/a/21749882/995445

Alternatively, assuming appending styles and scripts are the common use case, we could specify styles and scripts as array props of URL strings, e.g.:

<Sandbox scripts={ [ window._wpEmbedScript ] } />

Related prior art: https://github.com/Automattic/wp-calypso/tree/master/client/lib/embed-frame-markup

If there were such a property

We can/should introduce it here, since this PR is what introduces the requirement to include an additional script in the sandbox code.
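
One way the styles/scripts array idea could look, as a rough sketch only. buildSandboxMarkup and escapeAttribute are hypothetical helpers written for this example (they are not part of the Sandbox component), and the markup is built as a plain string so the snippet runs without React.

```javascript
// Escape a value before placing it inside a double-quoted HTML attribute.
function escapeAttribute( value ) {
	return String( value )
		.replace( /&/g, '&amp;' )
		.replace( /"/g, '&quot;' )
		.replace( /</g, '&lt;' )
		.replace( />/g, '&gt;' );
}

// Hypothetical helper: build the sandboxed document from the embed HTML plus
// optional arrays of stylesheet and script URLs.
function buildSandboxMarkup( { html = '', styles = [], scripts = [] } = {} ) {
	const styleTags = styles
		.map( ( href ) => `<link rel="stylesheet" href="${ escapeAttribute( href ) }">` )
		.join( '' );
	const scriptTags = scripts
		.map( ( src ) => `<script type="text/javascript" src="${ escapeAttribute( src ) }"></script>` )
		.join( '' );

	// Scripts are appended after the embed HTML so they run once the content exists.
	return `<!DOCTYPE html><html><head>${ styleTags }</head><body>${ html }${ scriptTags }</body></html>`;
}

const markup = buildSandboxMarkup( {
	html: '<blockquote class="wp-embedded-content">Example</blockquote>',
	scripts: [ '/wp-includes/js/wp-embed.min.js' ],
} );
console.log( markup );
```

With this shape the component stays unaware of embed types: the caller decides which URLs to pass.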

@@ -100,7 +100,14 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
event.preventDefault();
}
const { url } = this.props.attributes;
const apiURL = addQueryArgs( wpApiSettings.root + 'oembed/1.0/proxy', {
const rootUrl = new URL( wpApiSettings.root );
let action = 'proxy';

@aduth

aduth Jan 3, 2018

Member

A ternary could work well here:

const action = includes( url, rootUrl.hostname ) ? 'embed' : 'proxy';

See also: https://lodash.com/docs/4.17.4#includes
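
A self-contained sketch of that selection, for illustration. chooseOembedAction is a hypothetical name, and String.prototype.includes stands in for lodash's includes so the snippet needs no dependency.

```javascript
// Decide which oEmbed REST route to call for a given URL.
// rootHostname would come from the parsed REST root (e.g. rootUrl.hostname).
function chooseOembedAction( url, rootHostname ) {
	// Non-string input falls back to the proxy route instead of throwing.
	return typeof url === 'string' && url.includes( rootHostname ) ? 'embed' : 'proxy';
}

console.log( chooseOembedAction( 'https://example.com/hello-world/', 'example.com' ) ); // 'embed'
console.log( chooseOembedAction( 'https://other.example.org/post/', 'example.com' ) ); // 'proxy'
```

Because this is a substring test, a URL such as https://notexample.com/ would also select 'embed'; comparing parsed hostnames would be stricter.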

@imath

imath Jan 4, 2018

Contributor

I agree, thanks a lot 👍

@@ -114,6 +121,10 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
return;
}
response.json().then( ( obj ) => {
if ( obj.html && -1 !== obj.html.indexOf( 'class="wp-embedded-content"' ) ) {

@aduth

aduth Jan 3, 2018

Member

Minor: Complementing the above remark, _.includes also automatically handles falsey values, so this could be simplified to:

if ( includes( obj.html, 'class="wp-embedded-content"' ) ) {

@imath

imath Jan 4, 2018

Contributor

Nice, i will include this for sure. Thanks.

@@ -114,6 +121,10 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
return;
}
response.json().then( ( obj ) => {
if ( obj.html && -1 !== obj.html.indexOf( 'class="wp-embedded-content"' ) ) {
obj.type = 'wp-embed';

@aduth

aduth Jan 3, 2018

Member

The mutation of obj here makes me a bit uneasy, especially since we pass the object around (see below this.getPhotoHtml( obj )). I might instead suggest making this part of the destructuring and reassignment of type:

let { html, type } = obj;
if ( /* ... */ ) {
	type = 'wp-embed';
}

Problem here is that you might run into a prefer-const ESLint warning since html is not reassigned. Generally speaking, I'd be in favor of enabling the "destructuring": "all" option, but otherwise you might either have to avoid the destructuring for html or destructure separately:

const { html } = obj;
let { type } = obj;
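
Put together, the non-mutating variant might look like the sketch below. getEmbedType is a hypothetical helper name, and the response object is sample data; the typeof guard plays the role that lodash's includes plays for falsey values.

```javascript
// Hypothetical helper: derive the embed type without mutating the response.
function getEmbedType( response ) {
	const { html } = response;
	let { type } = response;

	// Guard against a missing html property before searching it.
	if ( typeof html === 'string' && html.includes( 'class="wp-embedded-content"' ) ) {
		type = 'wp-embed';
	}

	return type;
}

// Frozen so that any accidental mutation would be ignored (or throw in strict mode).
const response = Object.freeze( {
	type: 'rich',
	html: '<blockquote class="wp-embedded-content"></blockquote>',
} );

console.log( getEmbedType( response ) ); // 'wp-embed'
console.log( response.type ); // 'rich', the original object is untouched
```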

@imath

imath Jan 4, 2018

Contributor

Ok, i understand your concern, i'll try to "destructure separately".

return $data;
}
$data['html'] = wp_filter_oembed_result( $data['html'], (object) $data, $_GET['url'] );

@aduth

aduth Jan 3, 2018

Member

Minor: This function could be made more concise and have fewer return paths if we just moved this single line into the negation of the previous condition:

<?php

function gutenberg_filter_oembed_result( $data ) {
	if ( defined( 'REST_REQUEST' ) && false !== REST_REQUEST && ! empty( $data['html'] ) && ! empty( $_GET['url'] ) ) {
		$data['html'] = wp_filter_oembed_result( $data['html'], (object) $data, $_GET['url'] );
	}

	return $data;
}

@imath

imath Jan 4, 2018

Contributor

Great, thanks a lot.

return $data;
}
add_filter( 'oembed_response_data', 'gutenberg_filter_oembed_result', 14, 1 );

@aduth

aduth Jan 3, 2018

Member

The fourth argument here is redundant, since 1 is the default value if omitted.

https://developer.wordpress.org/reference/functions/add_filter/

@aduth

aduth Jan 3, 2018

Member

How did you arrive at the priority of 14?

@imath

imath Jan 4, 2018

Contributor

I'll remove the fourth argument. About the priority: wp_filter_oembed_result() must run after get_oembed_response_data_rich(), which already hooks into oembed_response_data at 10. So 11 or higher works. 14 leaves some room for others to filter in between.
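
To make the ordering argument concrete, here is a toy model of priority-ordered filters. registerFilter and runFilters are local functions invented for this sketch; they are not WordPress's add_filter()/apply_filters() implementation, only an approximation of how lower priorities run first.

```javascript
// Toy registry: hook name -> list of { priority, callback }.
const registry = {};

function registerFilter( hook, callback, priority = 10 ) {
	( registry[ hook ] = registry[ hook ] || [] ).push( { priority, callback } );
}

function runFilters( hook, value ) {
	return ( registry[ hook ] || [] )
		.slice()
		// Array.prototype.sort is stable, so equal priorities keep registration order.
		.sort( ( a, b ) => a.priority - b.priority )
		.reduce( ( current, { callback } ) => callback( current ), value );
}

// Registered out of order on purpose: the priority decides, not the registration order.
registerFilter( 'oembed_response_data', ( data ) => ( { steps: [ ...data.steps, 'security filter (14)' ] } ), 14 );
registerFilter( 'oembed_response_data', ( data ) => ( { steps: [ ...data.steps, 'rich data (10)' ] } ) );

const result = runFilters( 'oembed_response_data', { steps: [] } );
console.log( result.steps ); // [ 'rich data (10)', 'security filter (14)' ]
```

Any callback registered at 11, 12 or 13 would slot in between the two, which is the room that a priority of 14 leaves.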

@imath

Contributor

imath commented Jan 4, 2018

Thanks a lot for your recommendations and your explanations about the URL thing. I'll update the PR asap.

@imath

Contributor

imath commented Jan 4, 2018

@aduth I've applied the changes, but I guess I made a mess trying to rebase my branch, sorry! I'll try to see how to fix this so that only my 3 latest commits are shown.

@aduth

I'm curious... why isn't this script included in the markup generated from the embed endpoint? Can or should we consider updating / filtering the endpoint response to include it? (Maybe context-specific?)

@imath

Contributor

imath commented Jan 6, 2018

@aduth I think the wp-embed.js script is not included in the markup generated from the embed endpoint, because it can deal with all the embedded WordPress posts of the page. In the following example:

4226patch

I'm using changes added by f17f6ed

I think we should probably do it this way, instead of including wp-embed.js in each WordPress embed block.

I still need to improve this latest commit to avoid the nested ternary.

const { html } = obj;
let { type } = obj;
if ( includes( html, 'class="wp-embedded-content" data-secret' ) ) {

@imath

imath Jan 6, 2018

Contributor

If the external WordPress site returns a response that does not contain the 'data-secret' attribute (e.g. an older version of WordPress, once the code we temporarily added in lib/compat.php has been included in WordPress core), then the response is treated like any other embed content.

@aduth

Embeds from other WordPress sites are still failing because, imho, the WordPress site does not include one of the functions of this PR: gutenberg_filter_oembed_result().

Can you clarify what you mean by this?

I think we should probably do it this way, instead of including wp-embed.js in each WordPress embed block.

Personally I don't have a huge issue with wp-embed.js being loaded independently in each iframe, but I'm also overly cautious about dangerouslySetInnerHTML 😄 Noting that this is the same markup that we'd expect to be shown on the front of the site anyways, I think it's reasonable.

An initial thought I had was not being sure how well the embed script handles dynamically-added content (since the markup is added after the initial page load). From my testing this appears to work well though.

@@ -181,6 +189,21 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
const parsedUrl = parse( url );
const cannotPreview = includes( HOSTS_NO_PREVIEWS, parsedUrl.host.replace( /^www\./, '' ) );
const iframeTitle = sprintf( __( 'Embedded content from %s' ), parsedUrl.host );
const embedWrapper = 'wp-embed' !== type ? (

@aduth

aduth Jan 16, 2018

Member

Minor: In a condition or ternary with both if and else parts, I tend to encourage the positive form as the first case, since "if positive else negative" reads more naturally than "if negative else positive".

@@ -228,6 +228,25 @@ function gutenberg_get_post_type_capabilities( $user, $name, $request ) {
return $value;
}
/**
* Make sure oEmbed REST Requests apply the WP Embed security mechanism for WordPress embeds.

@aduth

aduth Jan 16, 2018

Member

This comment is a bit light on rationale for why the security mechanism should be applied. Might be worth elaborating or adding a link to context.

@imath

imath Jan 17, 2018

Contributor

In 6dd9e9a I've added a link to the related Core ticket.

*
* TODO: This is a temporary solution. This code should be included in WordPress Core.
*
* @since ?

@aduth

aduth Jan 16, 2018

Member

While you're here you can go ahead and set this to the next release version number (2.1.0).

@imath

Contributor

imath commented Jan 17, 2018

Thanks a lot for your review @aduth

I've applied your recommendations in the latest commit.

Can you clarify what you mean by this?

I'll try!

  1. Let's say Site A is using a Gutenberg version including this PR. When you embed a post from Site A into Site A, the WP Embed will work and the link at the top will be removed by wp-embed.js because it finds the data-secret attribute of the blockquote.

  2. Now, on Site A, if I try to embed content from a WordPress Site B running a Gutenberg version that does not include this PR, the data-secret attribute will be missing because gutenberg_filter_oembed_result() is not on Site B yet. That's why, in this case, the link above the embed HTML will still be there. As soon as Site B updates Gutenberg to include this PR, the link should be removed just like in case 1.
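
The two cases above can be modelled with plain objects. This is a simplified illustration and not the wp-embed.js source: it only assumes, as described above, that the script matches the fallback blockquote through its data-secret value. The element list, the secret value and handleEmbedMessage are all invented for the example.

```javascript
// Simplified model of the elements in the embedding page.
const elements = [
	{ tag: 'blockquote', secret: 'abc123', hidden: false }, // case 1: response went through the security filter
	{ tag: 'iframe', secret: 'abc123', hidden: false },
	{ tag: 'blockquote', secret: null, hidden: false }, // case 2: unfiltered response, no data-secret
	{ tag: 'iframe', secret: null, hidden: false },
];

// Hide every fallback blockquote whose secret matches the one sent by the embedded iframe.
function handleEmbedMessage( pageElements, message ) {
	return pageElements.map( ( element ) =>
		element.tag === 'blockquote' && element.secret !== null && element.secret === message.secret
			? { ...element, hidden: true }
			: element
	);
}

const afterMessage = handleEmbedMessage( elements, { secret: 'abc123' } );
console.log( afterMessage[ 0 ].hidden ); // true: the link above the embed disappears
console.log( afterMessage[ 2 ].hidden ); // false: without data-secret the link stays
```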

@aduth

aduth approved these changes Jan 20, 2018

Could you resolve the merge conflict?

Will also wait to see how the build runs; I'm guessing the last one ran during Travis's downtime this past week.

Otherwise looks good 👍

/**
* Make sure oEmbed REST Requests apply the WP Embed security mechanism for WordPress embeds.
*
* @see https://core.trac.wordpress.org/ticket/32522

@aduth

aduth Jan 20, 2018

Member

Should a comment be made to this ticket (or a new ticket) describing what's being added here and why?

@imath

imath Jan 21, 2018

Contributor

Sure, I just opened this ticket in WordPress Core Trac.

@imath

Contributor

imath commented Jan 21, 2018

@aduth I just updated the PR to solve the merge conflict.

@swissspidy

Member

swissspidy commented Jan 23, 2018

Out of curiosity, are you using rebase (vs. merge) to resolve merge conflicts? I'm getting a notification for e84b88d over and over again.

Regarding the changes, I left a comment on that Trac ticket as I think this could be fixed in JS alone.

@pento

pento requested changes Jan 24, 2018 edited

Per my comment on the core ticket, changing the oembed API result isn't an option.

However, on digging into this some more, I believe that Core's WP_oEmbed_Controller::get_proxy_item() is not correct here. $data is retrieved from WP_oEmbed::get_data(), but the HTML should be filtered, to match existing behaviour.

So, we'll need a new ticket on core.trac to fix the WP_oEmbed_Controller::get_proxy_item() behaviour.

The workaround in compat.php should look something like this:

function gutenberg_filter_oembed_result( $response, $handler, $request ) {
	if ( 'GET' === $request->get_method() && '/oembed/1.0/proxy' === $request->get_route() ) {
		$response->html = apply_filters( 'oembed_result', _wp_oembed_get_object()->data2html( $response, $_GET['url'] ), $_GET['url'], array() );
	}

	return $response;
}
add_filter( 'rest_request_after_callbacks', 'gutenberg_filter_oembed_result', 10, 3 );

And so that the resize message from the embed is recognised, the Sandbox htmlDoc should include:

<script type="text/javascript" src="/wp-includes/js/wp-embed.min.js" />
@imath

Contributor

imath commented Jan 24, 2018

Thanks a lot @swissspidy for your feedback & @pento for your review, I’ll update the PR as required asap.

@imath

Contributor

imath commented Jan 25, 2018

@pento I've applied your patch in 080ca75, but I had to add another condition block for self embeds (posts embedded in the same WordPress site), which was the subject of this PR initially.

@aduth The good news is that, thanks to Pento, the blockquote above self WordPress site embeds or external WordPress site embeds is now removed in both cases!

4226

@pento

pento requested changes Jan 25, 2018 edited

This needs some testing, but it allows us to drop the switching from embed/index.js. Everything goes through the proxy endpoint.

function gutenberg_filter_oembed_result( $response, $handler, $request ) {
	if ( 'GET' !== $request->get_method() ) {
		return $response;
	}

	if ( is_wp_error( $response ) && 'oembed_invalid_url' !== $response->get_error_code() ) {
		return $response;
	}

	// External embeds.
	if ( '/oembed/1.0/proxy' === $request->get_route() ) {
		if ( is_wp_error( $response ) ) {
			// It's possible a local post, so lets try and retrieve it that way.
			$post_id = url_to_postid( $_GET['url'] );
			$data = get_oembed_response_data( $post_id, apply_filters( 'oembed_default_width', 600 ) );

			if ( ! $data ) {
				// Not a local post, return the original error.
				return $response;
			}
			$response = (object) $data;
		}

		// Make sure the HTML is run through the oembed sanitisation routines.
		$response->html = wp_oembed_get( $_GET['url'], $_GET );
	}

	return $response;
}
// Internal embeds.
} elseif ( '/oembed/1.0/embed' === $request->get_route() ) {
$response['html'] = apply_filters( 'oembed_result', _wp_oembed_get_object()->data2html( (object) $response, $_GET['url'] ), $_GET['url'], array() );

@pento

pento Jan 25, 2018

Member

This is removing the <script> from the public embed response, which is necessary for non-WordPress sites to be able to handle WordPress embeds. We shouldn't be filtering the response for this endpoint.

@imath

Contributor

imath commented Jan 26, 2018

Hi @pento, I confirm. I just updated the PR with your code and tested it, and all WordPress embedded posts (self or external) are displayed the right way. Thanks a lot for your help.

About this failing test, i have no idea about how to fix the checksum integrity thing 😢

@swissspidy

Member

swissspidy commented Jan 26, 2018

Probably can be fixed by clearing the cache on Travis and re-running the tests.

// External embeds.
if ( '/oembed/1.0/proxy' === $request->get_route() ) {
if ( is_wp_error( $response ) ) {
// It's possible a local post, so lets try and retrieve it that way.

@swissspidy

swissspidy Jan 26, 2018

Member

*possibly

@imath

Contributor

imath commented Jan 27, 2018

Thanks for the feedback @swissspidy i've fixed the typo issue, and tests are now ok 😃

@imath

Contributor

imath commented Feb 4, 2018

Hi @pento
As you haven't edited your review since my last PR update, does it mean you think there are still changes/improvements to be made?

@swissspidy swissspidy requested a review from pento Feb 4, 2018

@swissspidy

Member

swissspidy commented Feb 4, 2018

I just requested a new review to make sure :-)

@imath

Contributor

imath commented Feb 8, 2018

I've just updated the PR to resolve conflicts, just in case you find some time to progress on this.

imath added a commit to imath/gutenblocks that referenced this pull request Feb 13, 2018

Improve the WordPress Embed GutenBlock
Until WordPress/gutenberg#4226 is fixed, this will make sure the data-secret mechanism is also applied to WordPress embeds in Gutenberg.
@pento

pento approved these changes Feb 21, 2018

@pento

Member

pento commented Feb 21, 2018

@imath: I was on vacation for the last few weeks. 🙂

If you could update your branch for the merge conflict, let's go ahead and get this merged.

Make sure the oembed proxy route applies the WP Embed security mechanism.

- Include the wp-embed.js script only once into the WP Editor.
- Add a temporary filter to make sure the HTML is run through the oembed sanitisation routines.
- Use a specific wrapper for WordPress Embeds.
@imath

Contributor

imath commented Feb 21, 2018

Hi @pento, thanks a lot for your update (@aduth told me you were afk for a little while yes 😉 ). I've just updated my branch to fix the merge conflict.

@swissspidy swissspidy merged commit 38e9503 into WordPress:master Feb 21, 2018

2 checks passed

codecov/project 34.2% (-0.02%) compared to 25de703
continuous-integration/travis-ci/pr The Travis CI build passed

@aduth aduth referenced this pull request Mar 1, 2018

Closed

Embedded Content Block #3497

0 of 2 tasks complete
@swissspidy

Member

swissspidy commented Oct 22, 2018
