-
Notifications
You must be signed in to change notification settings - Fork 798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add filter to set custom cache timeouts for podcasts #24966
Conversation
Caution: This PR has changes that must be merged to WordPress.com |
Thank you for your PR! When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:
This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖 The e2e test report can be found here. Please note that it can take a few minutes after the e2e tests checks are complete for the report to be available. Once your PR is ready for review, check one last time that all required checks (other than "Required review") appearing at the bottom of this PR are passing or skipped. Jetpack plugin:
|
@daledupreez I just had a look at the code change so far and it seems ok. I haven't tested anything though. |
…feeds used by the podcast player block
4da40d8
to
3be7ef9
Compare
…ustom-cache-timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested, but I think that managing the cached feed is complicated by the fact there is both the transient created by WP_Feed_Cache_Transient
when fetching the feed, and the transient created by Jetpack_Podcast_Helper
when constructing the data from the feed (https://github.com/Automattic/jetpack/blob/add/podcast-player-custom-cache-timeout/projects/plugins/jetpack/_inc/lib/class-jetpack-podcast-helper.php#L155).
Perhaps we should either
- Remove the transient created by
Jetpack_Podcast_Helper
... as none of the data collection steps look particularly expensive to me (should track performance to be sure) - Tie the
Jetpack_Podcast_Helper
transient to the filtered cache timeout value, so it expires just a little after theWP_Feed_Cache_Transient
transient, and is sure to get a fresh response from the feed.
@@ -9,20 +9,46 @@ | |||
* Class Jetpack_Podcast_Helper | |||
*/ | |||
class Jetpack_Podcast_Helper { | |||
const MIN_PODCAST_CACHE_TIMEOUT = 30 * MINUTE_IN_SECONDS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a minimum that we enforce? Is there a reason the value shouldn't be 10 or 20 min, if a user wanted?
I think a default makes more sense to me, that can be overridden as the user desires.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the issue from my perspective is that we've never specified a default value in Jetpack -- this code always used the general WordPress timeout that applies to all RSS feeds. So I started with a minimum to prevent unusually small values here -- we could easily remove the minimum altogether, or make it something like 5-10 minutes too.
So the no-default is really a backwards-compatibility move -- I don't want to suddenly apply a default for all podcast fetches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the default now is effectively 12 hours, because that's what the WordPress feed caching uses, why not just use that as the default here?
Ultimately anyone can use the underlying wp_feed_cache_transient_lifetime
filter and override the minimum anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the minimum to in 1c5d462, though we could also remove this minimum entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've now removed the minimum -- I think users capable of implementing these hooks can do far worse!
* @param string $podcast_url The URL of the podcast feed. | ||
* @param int|null $cache_timeout The number of seconds to cache the podcast data. Default value is null. | ||
*/ | ||
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->feed, null ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't how I'd expect a filter on the timeout value to be applied, as the default value for the timeout isn't available to any filter callback functions. I'd expect the default timeout to be the second argument to apply_filters
, as that is the value being filtered.
Providing the feed url as an additional argument to apply_filters
is a great idea, though, as this will allow callback functions to customize the timeout based on the specific feed.
Also, what is the null
argument for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value in this case is null
- we're using that to indicate we should fall through to the default value returned from the wp_feed_cache_transient_lifetime
filter. We could definitely invert this to apply that filter to get our default value, but that feels like it could easily get into recursive territory if we then hook into the same filter to inject our custom value. Maybe this needs additional documentation to clarify why this is unusual?
I would be more than happy to change, but I wanted to try and do less if we don't have a custom value supplied by the environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, with null
as the default timeout, I would expect to see the following, with null
as the second argument.
apply_filters( 'jetpack_podcast_feed_cache_timeout', null, $this->feed );
Because the timeout is what's being filtered, not the feed url.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 You're 100% right!
I've updated this in 1c5d462 to switch the argument order (and documentation), as well as using the default value instead of null
.
* | ||
* @var int|null | ||
*/ | ||
protected $cache_timeout = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that if we discard the min, you could assign the default value right here--I don't think there's really a need for a separate class constant.
Also, this is called the "lifetime" in the WP_Feed_Cache_Transient
class (what we're ultimately filtering), in case you want to follow that convention (https://github.com/WordPress/wordpress-develop/blob/e49f452f65b309a20434a87d363ae339016cdec9/src/wp-includes/class-wp-feed-cache-transient.php#L41).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code to set the default to 12 * HOUR_IN_SECONDS
in 1c5d462.
I've kept this called cache_timeout
so its purpose for readers is a bit clearer. Without knowing more about the guts of WP_Feed_Cache_Transient
, I wouldn't have any expectation of lifetime
being in the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed my mind on this one -- I am now defaulting to null, and have updated the PR description to note why.
The core problem is that if someone has already hooked into the wp_feed_cache_transient_lifetime
filter and was returning a higher value for all RSS feeds, our default would now make podcast feeds have a different timeout. I think it's simpler to defer to the value from the hook until we get a different value returned from the new jetpack_podcast_feed_cache_timeout
filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our default would now make podcast feeds have a different timeout.
This is a great point--I think not interfering with the cache timeout unless there's a filter explicitly set makes sense.
* Default timeout to 12 hours * Fix parameter order for the new filter, and use the non-null default value * Reduce the minimum cache time to 10 minutes
public function filter_podcast_cache_timeout( $cache_timeout_in_seconds ) { | ||
if ( $this->cache_timeout !== null ) { | ||
// If the cache timeout has already been reduced, return the smaller value of the two. | ||
return min( $cache_timeout_in_seconds, $this->cache_timeout ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using min
here is surprising to me--I would expect the value I provided to the jetpack_podcast_feed_cache_timeout
filter to be used if set, not to only be used if it happens to be smaller than the current timeout.
If keeping it this way, the the doc comment should indicate the filter value is conditionally used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Along with keeping the null
value as default, I have updated the code to always apply the value if it has been specified. I think we need both pieces, because it means we can assume the user specifically wants a different cache timeout for podcast feeds, because we wouldn't be in this code if the user left the podcast cache timeout as null.
…d/podcast-player-custom-cache-timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good to me.
Left a couple of questions/suggestions, but nothing blocking.
One caveat: I verified the jetpack_podcast_feed_cache_timeout
filter is working as expected and providing a value that will then affect the wp_feed_cache_transient_lifetime
value.
However, I did not explicitly check how quickly new podcast episodes will show up if using this filter. I suspect this is still complicated by the dual caching we have here, that also sets a separate transient for the podcast data after being formatted from the feed.
* | ||
* @var int|null | ||
*/ | ||
protected $cache_timeout = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our default would now make podcast feeds have a different timeout.
This is a great point--I think not interfering with the cache timeout unless there's a filter explicitly set makes sense.
projects/plugins/jetpack/_inc/lib/class-jetpack-podcast-helper.php
Outdated
Show resolved
Hide resolved
projects/plugins/jetpack/_inc/lib/class-jetpack-podcast-helper.php
Outdated
Show resolved
Hide resolved
projects/plugins/jetpack/_inc/lib/class-jetpack-podcast-helper.php
Outdated
Show resolved
Hide resolved
Thanks for the review and great feedback, @creativecoder! 🙇
I think you're absolutely right that this won't be fully effective for cache timeouts below 1 hour, and even then we could see podcast metadata cached for an hour longer than we might expect due to the dual caching for the episode and feed data. I don't think it makes sense to tackle that work in this PR, but I am also not 100% sure how best to address that overall, because even adding a filter to the timeout for the episode-level caching can still result in the episode cache outlasting the lower-level feed cache in some edge cases. |
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->cache_timeout, $this->feed ); | ||
|
||
// Make sure we force new values for $this->cache_timeout to be integers. | ||
if ( is_int( $podcast_cache_timeout ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if someone were to pass a string to the filter:
add_filter( 'jetpack_podcast_feed_cache_timeout', function () { return '42'; } );
Maybe we should take that into account, and cast everything as an int as long as it's not null?
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->cache_timeout, $this->feed ); | |
// Make sure we force new values for $this->cache_timeout to be integers. | |
if ( is_int( $podcast_cache_timeout ) ) { | |
$podcast_cache_timeout = apply_filters( 'jetpack_podcast_feed_cache_timeout', $this->cache_timeout, $this->feed ); | |
// Make sure we force new values for $this->cache_timeout to be integers. | |
if ( $podcast_cache_timeout !== null ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeherve, would it make more sense to keep the logic as-is but check for is_numeric()
instead? I don't mind keeping the logic looser and simply casting to an int for any non-null value, but I am happy to defer to the Jetpack team's preference on this one!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, thanks for the ultra-quick review!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_numeric()
would work, sure thing! That works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've switched to is_numeric()
in 6138202.
Great news! One last step: head over to your WordPress.com diff, D83749-code, and deploy it. Thank you! |
Deployed to WordPress.com in r249734-wpcom. |
Related to Automattic/wp-calypso#50740
Changes proposed in this Pull Request:
jetpack_podcast_feed_cache_timeout
filter to allow for custom cache timeouts for podcasts instead of the default of (12 hours inWP_Feed_Cache_Transient
).wp_feed_cache_transient_lifetime
filter when we have a podcast cache timeout value returned from the filter.wp_feed_cache_transient_lifetime
filter and is setting the RSS cache timeout to 14 hours, us introducing a new default could result in podcasts now having a 12 hour cache timeout (depending on the specifics about how the filters were added).WP_Feed_Cache_Transient
for caching, or their cache implementation uses thewp_feed_cache_transient_lifetime
filter.Other information:
fetch_feed()
function, and I don't know how to stub that code out effectively.Jetpack product discussion
This is a small enhancement to make it possible for podcast cache timeouts to be reduced so updates to podcasts can be picked up sooner.
Does this pull request change what data or activity we track or use?
The PR does not include tracking or data changes.
Testing instructions:
Testing on a self-hosted Jetpack site
add/podcast-player-custom-cache-timeout
into the "Search for a Jetpack Feature Branch" search field, and click on the Activate button next to the branch when it appearshttps://distributed.blog/category/podcast/feed/?somerandomsuffix873287
to ensure the feed hasn't been cached.jetpack_podcast_feed_cache_timeout
function above to return$cache_timeout
instead of3600
Testing steps on a WordPress.com Simple site
I will document these steps as part of D83749-code, as this depends on a number of internal tools and a different RSS caching infrastructure.