Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filter to set custom cache timeouts for podcasts #24966

Merged
merged 16 commits into from
Jul 28, 2022

Conversation

daledupreez
Copy link
Contributor

@daledupreez daledupreez commented Jul 5, 2022

Related to Automattic/wp-calypso#50740

Changes proposed in this Pull Request:

  • This PR adds a new jetpack_podcast_feed_cache_timeout filter to allow for custom cache timeouts for podcasts instead of the default of (12 hours in WP_Feed_Cache_Transient).
  • The default podcast cache timeout is undefined so we don't trigger unexpected changes in RSS feed timeouts if sites have non-default cache timeouts for RSS feeds. We only hook into the WordPress Core wp_feed_cache_transient_lifetime filter when we have a podcast cache timeout value returned from the filter.
    • My concern here is that if someone has hooked into the wp_feed_cache_transient_lifetime filter and is setting the RSS cache timeout to 14 hours, us introducing a new default could result in podcasts now having a 12 hour cache timeout (depending on the specifics about how the filters were added).
    • As noted in the filter documentation, the code only works if the site is using WP_Feed_Cache_Transient for caching, or their cache implementation uses the wp_feed_cache_transient_lifetime filter.

Other information:

  • Have you written new tests for your changes, if applicable?
    • No, the logic here falls around the WordPress Core fetch_feed() function, and I don't know how to stub that code out effectively.
  • Have you checked the E2E test CI results, and verified that your changes do not break them?

Jetpack product discussion

This is a small enhancement to make it possible for podcast cache timeouts to be reduced so updates to podcasts can be picked up sooner.

Does this pull request change what data or activity we track or use?

The PR does not include tracking or data changes.

Testing instructions:

Testing on a self-hosted Jetpack site
  • Ensure that you have the Jetpack plugin installed, connected, and activated
  • Ensure that you have the Jetpack Beta plugin installed and activated
  • Ensure that you have the Gutenberg plugin installed and activated
  • Navigate to Jetpack -> Jetpack Beta in the left nav
  • Enter add/podcast-player-custom-cache-timeout into the "Search for a Jetpack Feature Branch" search field, and click on the Activate button next to the branch when it appears
  • Add the following code to your site, either manually or using a tool like Code Snippets:
add_filter( 'jetpack_podcast_feed_cache_timeout', function ( $cache_timeout, $feed_url ) {
	error_log( "jetpack_podcast_feed_cache_timeout filter; \$feed_url=$feed_url; \$cache_timeout=$cache_timeout" );
	return 3600;
} );

add_action( 'setted_transient', function ( $transient_name, $transient_value, $expiration ) {
	// We want to log sets to `feed_*` and `feed_mod_*`
	if ( 0 !== strpos( $transient_name, 'feed_' ) ) {
		return;
	}

	error_log( "Setting RSS feed transient $transient_name with expiration $expiration" );
}, 10, 3 );
  • Ensure you have a way to view your PHP error log, either via file system access, or via a tool like Error Log Monitor.
  • Navigate to Posts -> Add New
  • In the editor, add a new podcast player block
  • In the input field, enter the URL of a podcast RSS feed that you haven't used previously. You can use a URL like https://distributed.blog/category/podcast/feed/?somerandomsuffix873287 to ensure the feed hasn't been cached.
  • View your PHP error log, and verify that you see three distinct entries like:
jetpack_podcast_feed_cache_timeout; $feed_url=https://distributed.blog/category/podcast/feed/?abcyxzedf; $cache_timeout=
Setting RSS feed transient feed_[hash] with expiration 3600
Setting RSS feed transient feed_mod_[hash] with expiration 3600
  • Update the jetpack_podcast_feed_cache_timeout function above to return $cache_timeout instead of 3600
  • Go back to edit your post and enter a new feed URL (possibly with a mangled suffix)
  • Verify that you see the following entries in the PHP error log:
jetpack_podcast_feed_cache_timeout; $feed_url=https://distributed.blog/category/podcast/feed/?abcyxzedf; $cache_timeout=
Setting RSS feed transient feed_[hash] with expiration 43200
Setting RSS feed transient feed_mod_[hash] with expiration 43200
Testing steps on a WordPress.com Simple site

I will document these steps as part of D83749-code, as this depends on a number of internal tools and a different RSS caching infrastructure.

@matticbot
Copy link
Contributor

Caution: This PR has changes that must be merged to WordPress.com
Hello daledupreez! These changes need to be synced to WordPress.com - If you 're an a11n, please commandeer and confirm D83749-code works as expected before merging this PR. Once this PR is merged, please commit the changes to WP.com. Thank you!
This revision will be updated with each commit to this PR

@github-actions github-actions bot added the [Plugin] Jetpack Issues about the Jetpack plugin. https://wordpress.org/plugins/jetpack/ label Jul 5, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Jul 5, 2022

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ⚠️ All commits were linted before commit.
  • ✅ Add a "[Status]" label (In Progress, Needs Team Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


The e2e test report can be found here. Please note that it can take a few minutes after the e2e tests checks are complete for the report to be available.


Once your PR is ready for review, check one last time that all required checks (other than "Required review") appearing at the bottom of this PR are passing or skipped.
Then, add the "[Status] Needs Team review" label and ask someone from your team review the code.
Once you’ve done so, switch to the "[Status] Needs Review" label; someone from Jetpack Crew will then review this PR and merge it to be included in the next Jetpack release.


Jetpack plugin:

  • Next scheduled release: August 2, 2022.
  • Scheduled code freeze: July 25, 2022.

@mpkelly
Copy link
Contributor

mpkelly commented Jul 8, 2022

@daledupreez I just had a look at the code change so far and it seems ok. I haven't tested anything though.

@daledupreez daledupreez force-pushed the add/podcast-player-custom-cache-timeout branch from 4da40d8 to 3be7ef9 Compare July 12, 2022 19:23
@daledupreez daledupreez changed the title [WIP] Add ability to set custom cache timeouts for the podcast player block Add filter to set custom cache timeouts for podcasts Jul 12, 2022
Copy link
Contributor

@creativecoder creativecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested, but I think that managing the cached feed is complicated by the fact there is both the transient created by WP_Feed_Cache_Transient when fetching the feed, and the transient created by Jetpack_Podcast_Helper when constructing the data from the feed (https://github.com/Automattic/jetpack/blob/add/podcast-player-custom-cache-timeout/projects/plugins/jetpack/_inc/lib/class-jetpack-podcast-helper.php#L155).

Perhaps we should either

  • Remove the transient created by Jetpack_Podcast_Helper... as none of the data collection steps look particularly expensive to me (should track performance to be sure)
  • Tie the Jetpack_Podcast_Helper transient to the filtered cache timeout value, so it expires just a little after the WP_Feed_Cache_Transient transient, and is sure to get a fresh response from the feed.

@@ -9,20 +9,46 @@
* Class Jetpack_Podcast_Helper
*/
class Jetpack_Podcast_Helper {
const MIN_PODCAST_CACHE_TIMEOUT = 30 * MINUTE_IN_SECONDS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a minimum that we enforce? Is there a reason the value shouldn't be 10 or 20 min, if a user wanted?

I think a default makes more sense to me, that can be overridden as the user desires.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue from my perspective is that we've never specified a default value in Jetpack -- this code always used the general WordPress timeout that applies to all RSS feeds. So I started with a minimum to prevent unusually small values here -- we could easily remove the minimum altogether, or make it something like 5-10 minutes too.

So the no-default is really a backwards-compatibility move -- I don't want to suddenly apply a default for all podcast fetches.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the default now is effectively 12 hours, because that's what the WordPress feed caching uses, why not just use that as the default here?

Ultimately anyone can use the underlying wp_feed_cache_transient_lifetime filter and override the minimum anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the minimum to in 1c5d462, though we could also remove this minimum entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now removed the minimum -- I think users capable of implementing these hooks can do far worse!

* @param string $podcast_url The URL of the podcast feed.
* @param int|null $cache_timeout The number of seconds to cache the podcast data. Default value is null.
*/
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->feed, null );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't how I'd expect a filter on the timeout value to be applied, as the default value for the timeout isn't available to any filter callback functions. I'd expect the default timeout to be the second argument to apply_filters, as that is the value being filtered.

Providing the feed url as an additional argument to apply_filters is a great idea, though, as this will allow callback functions to customize the timeout based on the specific feed.

Also, what is the null argument for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value in this case is null - we're using that to indicate we should fall through to the default value returned from the wp_feed_cache_transient_lifetime filter. We could definitely invert this to apply that filter to get our default value, but that feels like it could easily get into recursive territory if we then hook into the same filter to inject our custom value. Maybe this needs additional documentation to clarify why this is unusual?

I would be more than happy to change, but I wanted to try and do less if we don't have a custom value supplied by the environment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, with null as the default timeout, I would expect to see the following, with null as the second argument.

apply_filters( 'jetpack_podcast_feed_cache_timeout', null, $this->feed );

Because the timeout is what's being filtered, not the feed url.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 You're 100% right!

I've updated this in 1c5d462 to switch the argument order (and documentation), as well as using the default value instead of null.

*
* @var int|null
*/
protected $cache_timeout = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that if we discard the min, you could assign the default value right here--I don't think there's really a need for a separate class constant.

Also, this is called the "lifetime" in the WP_Feed_Cache_Transient class (what we're ultimately filtering), in case you want to follow that convention (https://github.com/WordPress/wordpress-develop/blob/e49f452f65b309a20434a87d363ae339016cdec9/src/wp-includes/class-wp-feed-cache-transient.php#L41).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code to set the default to 12 * HOUR_IN_SECONDS in 1c5d462.

I've kept this called cache_timeout so its purpose for readers is a bit clearer. Without knowing more about the guts of WP_Feed_Cache_Transient, I wouldn't have any expectation of lifetime being in the value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed my mind on this one -- I am now defaulting to null, and have updated the PR description to note why.

The core problem is that if someone has already hooked into the wp_feed_cache_transient_lifetime filter and was returning a higher value for all RSS feeds, our default would now make podcast feeds have a different timeout. I think it's simpler to defer to the value from the hook until we get a different value returned from the new jetpack_podcast_feed_cache_timeout filter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our default would now make podcast feeds have a different timeout.

This is a great point--I think not interfering with the cache timeout unless there's a filter explicitly set makes sense.

@daledupreez daledupreez requested a review from a team July 20, 2022 14:20
@daledupreez daledupreez marked this pull request as ready for review July 20, 2022 14:21
 * Default timeout to 12 hours
 * Fix parameter order for the new filter, and use the non-null default value
 * Reduce the minimum cache time to 10 minutes
public function filter_podcast_cache_timeout( $cache_timeout_in_seconds ) {
if ( $this->cache_timeout !== null ) {
// If the cache timeout has already been reduced, return the smaller value of the two.
return min( $cache_timeout_in_seconds, $this->cache_timeout );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using min here is surprising to me--I would expect the value I provided to the jetpack_podcast_feed_cache_timeout filter to be used if set, not to only be used if it happens to be smaller than the current timeout.

If keeping it this way, the the doc comment should indicate the filter value is conditionally used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along with keeping the null value as default, I have updated the code to always apply the value if it has been specified. I think we need both pieces, because it means we can assume the user specifically wants a different cache timeout for podcast feeds, because we wouldn't be in this code if the user left the podcast cache timeout as null.

creativecoder
creativecoder previously approved these changes Jul 27, 2022
Copy link
Contributor

@creativecoder creativecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good to me.

Left a couple of questions/suggestions, but nothing blocking.

One caveat: I verified the jetpack_podcast_feed_cache_timeout filter is working as expected and providing a value that will then affect the wp_feed_cache_transient_lifetime value.

However, I did not explicitly check how quickly new podcast episodes will show up if using this filter. I suspect this is still complicated by the dual caching we have here, that also sets a separate transient for the podcast data after being formatted from the feed.

*
* @var int|null
*/
protected $cache_timeout = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our default would now make podcast feeds have a different timeout.

This is a great point--I think not interfering with the cache timeout unless there's a filter explicitly set makes sense.

@daledupreez daledupreez added [Status] Needs Review To request a review from Crew. Label will be renamed soon. and removed [Status] In Progress DO NOT MERGE don't merge it! labels Jul 27, 2022
@daledupreez
Copy link
Contributor Author

Thanks for the review and great feedback, @creativecoder! 🙇

One caveat: I verified the jetpack_podcast_feed_cache_timeout filter is working as expected and providing a value that will then affect the wp_feed_cache_transient_lifetime value.

However, I did not explicitly check how quickly new podcast episodes will show up if using this filter. I suspect this is still complicated by the dual caching we have here, that also sets a separate transient for the podcast data after being formatted from the feed.

I think you're absolutely right that this won't be fully effective for cache timeouts below 1 hour, and even then we could see podcast metadata cached for an hour longer than we might expect due to the dual caching for the episode and feed data. I don't think it makes sense to tackle that work in this PR, but I am also not 100% sure how best to address that overall, because even adding a filter to the timeout for the episode-level caching can still result in the episode cache outlasting the lower-level feed cache in some edge cases.

Comment on lines 53 to 56
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->cache_timeout, $this->feed );

// Make sure we force new values for $this->cache_timeout to be integers.
if ( is_int( $podcast_cache_timeout ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if someone were to pass a string to the filter:

add_filter( 'jetpack_podcast_feed_cache_timeout', function () { return '42'; } );

Maybe we should take that into account, and cast everything as an int as long as it's not null?

Suggested change
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->cache_timeout, $this->feed );
// Make sure we force new values for $this->cache_timeout to be integers.
if ( is_int( $podcast_cache_timeout ) ) {
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->cache_timeout, $this->feed );
// Make sure we force new values for $this->cache_timeout to be integers.
if ( $podcast_cache_timeout !== null ) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeherve, would it make more sense to keep the logic as-is but check for is_numeric() instead? I don't mind keeping the logic looser and simply casting to an int for any non-null value, but I am happy to defer to the Jetpack team's preference on this one!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, thanks for the ultra-quick review!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_numeric() would work, sure thing! That works for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've switched to is_numeric() in 6138202.

@jeherve jeherve added [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! [Pri] Normal and removed [Status] Needs Review To request a review from Crew. Label will be renamed soon. labels Jul 27, 2022
@jeherve jeherve added this to the jetpack/11.3 milestone Jul 27, 2022
@daledupreez daledupreez added [Status] Needs Review To request a review from Crew. Label will be renamed soon. and removed [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! labels Jul 27, 2022
@jeherve jeherve added [Status] Ready to Merge Go ahead, you can push that green button! and removed [Status] Needs Review To request a review from Crew. Label will be renamed soon. labels Jul 27, 2022
@daledupreez daledupreez merged commit 7d13a9a into trunk Jul 28, 2022
@daledupreez daledupreez deleted the add/podcast-player-custom-cache-timeout branch July 28, 2022 08:50
@github-actions github-actions bot removed the [Status] Ready to Merge Go ahead, you can push that green button! label Jul 28, 2022
@github-actions
Copy link
Contributor

Great news! One last step: head over to your WordPress.com diff, D83749-code, and deploy it.
Once you've done so, come back to this PR and add a comment with your SVN changeset ID (e.g. r12345-wpcom).

Thank you!

@daledupreez
Copy link
Contributor Author

Deployed to WordPress.com in r249734-wpcom.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Block] Podcast Player [Plugin] Jetpack Issues about the Jetpack plugin. https://wordpress.org/plugins/jetpack/ [Pri] Normal Touches WP.com Files [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants