Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#34115: Cache oEmbed responses in a custom post type #254

Closed
wants to merge 17 commits into from

Conversation

Projects
None yet
3 participants
@swissspidy
Copy link
Collaborator

commented Sep 12, 2017

This PR explores caching embed responses in a custom post instead of post meta or a transient.

To do:

  • Leverage in WP_oEmbed_Controller::get_proxy_item() and add tests for it.
  • Probably rename that post type (currently named oembed)
  • Adjust post type properties (caps, etc.)
  • Think about what cache expiration should look like

See #34115 on Trac for more information.

@swissspidy swissspidy requested a review from westonruter Sep 12, 2017

@westonruter
Copy link
Collaborator

left a comment

In regards to:

Think about what cache expiration should look like

Could just be looking at the oEmbed cache post's modified time, and if it is older than $cached_recently then try to fetch it again and then do wp_update_post(). If the request fails to get the new oEmbed data, it could still just touch the post to bump it's modified time, and continue serving back the stale oEmbed in the case of request failures. This could also be done for the postmeta caching approach.

// Use oEmbed to get the HTML
$html = wp_oembed_get( $url, $attr );
wp_insert_post( array(

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 12, 2017

Collaborator

The array here needs to be wrapped by wp_slash(), sadly.

$oembed_post_query = new WP_Query( array(
'post_type' => 'oembed',
'post_status' => get_post_stati(),

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 12, 2017

Collaborator

This could be just publish since they're only ever getting inserted with that status.

@@ -195,6 +195,21 @@ function create_initial_post_types() {
),
) );
register_post_type( 'oembed', array(

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 12, 2017

Collaborator

I think oembed_cache would be better perhaps.

$cache = $cached_post->post_content;
} elseif ( $post_ID ) {
$cache = get_post_meta( $post_ID, $cachekey, true );
$cache_time = get_post_meta( $post_ID, $cachekey_time, true );

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 12, 2017

Collaborator

Eliminate this postmeta call.

if ( $cached_post_id ) {
$cached_post = get_post( $cached_post_id );
$cache = $cached_post->post_content;

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 12, 2017

Collaborator

The $cache_time here can simply be strtotime( $cached_post->post_modified_gmt )

swissspidy added some commits Sep 13, 2017

$cache_group = 'oembed_cache_post';
$oembed_post_id = wp_cache_get( $cache_key, $cache_group );
if ( $oembed_post_id && 'oembed' === get_post_type( $oembed_post_id ) ) {

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 13, 2017

Collaborator

s/oembed/oembed_cache/

$cached_post_id = $this->find_oembed_post_id( $key_suffix );
if ( $cached_post_id ) {

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 13, 2017

Collaborator

Maybe the condition here should be reversed. The postmeta caches should already be primed so it would be better to look first at postmeta to see if there is a match there and then if not, fallback to find_oembed_post_id. So like:

if ( $post_ID ) {
    $cache = get_post_meta( $post_ID, $cachekey, true );
    // ...
} elseif ( $cached_post_id = $this->find_oembed_post_id( $key_suffix ) ) {
    $cache = get_post( $cached_post_id )->post_content;
    // ...
}
$cache = get_post_meta( $post_ID, $cachekey, true );
$cache_time = get_post_meta( $post_ID, $cachekey_time, true );

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 13, 2017

Collaborator

Why is the postmeta for the $cachekey_time being removed? I don't see $cache_time ever getting set in the case of a postmeta cache now.

This comment has been minimized.

Copy link
@swissspidy

swissspidy Sep 14, 2017

Author Collaborator

Sorry, I guess I misunderstood your comment in #254 (comment) then. Perhaps you can elaborate a bit more on it and I'll revert & fix this.

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 14, 2017

Collaborator

I think when I wrote that comment I was thinking that all the caching would be handled by the new oembed_cache post without there being any postmeta caches remaining. However, I think there is some value now in keeping the postmeta cache around because it will be faster to get the data due to the postmeta caches being primed with the WP_Query. So if we retain caching in postmeta, then we'll need to retain the $cache_time in order to know when it has expired.

swissspidy added some commits Sep 14, 2017

@swissspidy

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 14, 2017

Could just be looking at the oEmbed cache post's modified time, and if it is older than $cached_recently then try to fetch it again and then do wp_update_post().

Right now, cached responses are never updated. If we do this, we should increase $ttl (currently DAY_IN_SECONDS). Otherwise there will be way too many HTTP requests.

If the request fails to get the new oEmbed data, it could still just touch the post to bump it's modified time, and continue serving back the stale oEmbed in the case of request failures.

👍

This could also be done for the postmeta caching approach.

It would be great if we could migrate post meta caches over to the post type instead of bumping the cache time there.

@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 14, 2017

Right now, cached responses are never updated. If we do this, we should increase $ttl (currently DAY_IN_SECONDS). Otherwise there will be way too many HTTP requests.

Why would there by way more requests than there are currently in WP for the existing postmeta cache? How is this different?

@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 15, 2017

It would be great if we could migrate post meta caches over to the post type instead of bumping the cache time there.

I think it's going to be more performant to continue using the postmeta cache by default. Consider having an index page for videos where you are showing 30 posts, each of which has an oEmbed. When postmeta caches are used, when the WP_Query is made, WordPress will update post caches of the postmeta for all of the queried posts at once via update_post_caches().

If, however, oEmbed caches are stored in a custom post type, then these will not be able to be fetched en-mass and a separate DB query will result for every oEmbed in every post.

@swissspidy

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 19, 2017

Why would there by way more requests than there are currently in WP for the existing postmeta cache? How is this different?

What I meant is that WordPress currently does not attempt to update cached embeds ever. If we do that, the TTL should be higher than 24h.

swissspidy added some commits Sep 19, 2017

@swissspidy

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 19, 2017

@westonruter Were you thinking of something like that?

@@ -229,7 +229,7 @@ class WP_Embed {
 
 		$cached_recently = ( time() - $cache_time ) < $ttl;
 
-		if ( $this->usecache || $cached_recently ) {
+		if ( $this->usecache && $cached_recently ) {
 			// Failures are cached. Serve one if we're using the cache.
 			if ( '{{unknown}}' === $cache ) {
 				return $this->maybe_make_link( $url );
@@ -269,11 +269,15 @@ class WP_Embed {
 
 		if ( $post_ID ) {
 			if ( $html ) {
-				update_post_meta( $post_ID, $cachekey, $html );
-				update_post_meta( $post_ID, $cachekey_time, time() );
+				update_post_meta( $post_ID, $cachekey, $html  );
 			} elseif ( ! $cache ) {
 				update_post_meta( $post_ID, $cachekey, '{{unknown}}' );
 			}
+
+			update_post_meta( $post_ID, $cachekey_time, time() );
+		} elseif( $cached_post ) {
+			$cached_post->post_content = $html ? $html : $cached_post->post_content;
+			wp_update_post( $cached_post );
 		} else {
 			wp_insert_post( wp_slash( array(
 				'post_name'    => $key_suffix,
@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 19, 2017

@swissspidy

What I meant is that WordPress currently does not attempt to update cached embeds ever. If we do that, the TTL should be higher than 24h.

😲 Wow. I can see you're right. This OR condition here $this->usecache || $cached_recently indeed will mean that a cached oEmbed will never be updated. Doesn't that seem wrong to you?

But, if that is how it has been working for however long, maybe this isn't a good time to change it. The issue of keeping a cached oEmbed fresh could be tackled in a separate ticket. Sorry my assumptions about how it worked distracted from the main thing you're working on here.

@swissspidy

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 19, 2017

It does seem wrong on first glance and there have been a few trac tickets about that particular line in the past. There's a good explanation by Helen on https://core.trac.wordpress.org/ticket/37597. It's why I pointed out there would be way too many HTTP requests for outdated embeds.

I think what we can do — and we should indeed do this in a separate ticket — is to add a new cron action in that case. Cron would then handle the updating instead of having potentially dozens of external HTTP requests when viewing a post.

@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 19, 2017

Thanks for pointing me to that ticket. 🙇

@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 19, 2017

@swissspidy anything WIP left in this PR as far as you know? Ready for wider testing?

@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 19, 2017

Oh, probably these two which you already list in the description:

  • Leverage in WP_oEmbed_Controller::get_proxy_item() and add tests for it.
  • Adjust post type properties (caps, etc.)
@swissspidy

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 20, 2017

@westonruter Yeah I'm not sure about the WP_oEmbed_Controller part because currently we store the resulting HTML as a post or post meta, but the proxy needs the original data object, not just the HTML.

} else {
wp_insert_post( wp_slash( array(
'post_name' => $key_suffix,
'post_content' => $html ? $html : '{{unknown}}',

This comment has been minimized.

Copy link
@westonruter

westonruter Sep 26, 2017

Collaborator

This could be liable for Kses corruption. You may need to disable Kses before and restore after, as is done for customize_changeset and custom_css post types when inserting a post. Add a test for an oEmbed that contains an iframe but for which the logged-in user is just an author.

Disable KSES before inserting oEmbed post
To do: add test for this.

@westonruter westonruter changed the title [WIP] #34115: Cache oEmbed responses in a custom post type #34115: Cache oEmbed responses in a custom post type Sep 29, 2017

@westonruter

This comment has been minimized.

Copy link
Collaborator

commented Sep 30, 2017

Committed to trunk in r41651.

@westonruter westonruter deleted the 34115-oembed-post-type branch Sep 30, 2017

@JJJ

This comment has been minimized.

Copy link

commented Sep 30, 2017

Tangentially related: https://core.trac.wordpress.org/ticket/31245#comment:45

If we had wp_optionmeta we could associate option-based object oEmbed renders in the same meta-data way other components do, vs. introducing a bespoke approach for widgets.

FWIW, BuddyPress went all-in with the same meta-data approach years ago, at it's worked well (aside from the same points about updates/invalidation noted by Helen above.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.