Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve post processor response cache #1325

Merged
merged 26 commits into from Aug 23, 2018

Conversation

Projects
None yet
2 participants
@hellofromtonya
Copy link
Contributor

hellofromtonya commented Aug 7, 2018

Randomly generated content can occur on a web page. When that happens, the response will differ for each request causing cache misses for the plugin's AMP response cache. As @westonruter notes in Issue #1239:

the object cache will get polluted with values that never get used.

This PR detects continual cache misses by caching the page's URL and then comparing the number of cached URLs to a threshold. When that threshold is exceeded, it disables the response cache and then notifies the admin.

TODO

  • Detect continual cache misses
  • Disable caching
  • Notify
  • Provide a UI to reset the option's value and re-enable response caching.

Closes #1239.

hellofromtonya added some commits Aug 7, 2018

Add post-processor cache effectiveness.
We are capturing each response's URL and storing it into the cache.

@todo turn off response caching when the number of caches for the response's URL exceeds the cache miss threshold.
Turn off response caching when exceeding cache miss threshold.
When the number of cached URLs for the given response exceeds the cache miss threshold, disable the response caching.  It also disables continuing to cache the response's URLs.

@todo add getter for response cache to test the response.
Expose keys and add tests to validate caching conditions.
Added the caching keys as public properties.  Why?

1. Eliminates passing it into the caching closure.
2. Exposes both of them for testing.

Then we use these keys to valid the caching conditions before and after we hit the threshold.

@hellofromtonya hellofromtonya requested a review from westonruter Aug 7, 2018

@hellofromtonya

This comment has been minimized.

Copy link
Contributor Author

hellofromtonya commented Aug 7, 2018

@westonruter Commits through to 1d1f399 are a prototype per our discussions here. It's implementing:

  1. Cache the URLs.
  2. Then determine if we should turn off the response caching.
  3. If caching is enabled, then inside of the caching closure, we'll cache the new URL.

Take a look and see what you think. BTW Notices are not included yet.

self::$post_processor_cache_key = null;
if ( true === $args['enable_response_caching'] ) {
self::$post_processor_cache_key = md5( $current_url );
$caches_for_url = wp_cache_get( self::$post_processor_cache_key, self::POST_PROCESSOR_CACHE_EFFECTIVENESS );

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 8, 2018

Member

Sone of the subsequent checks could perhaps be simplified by checking if this is an array here, and if not, just set it to be an empty array.

*
* @var string
*/
public static $response_cache_key;

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 8, 2018

Member

Is this necessary? Aren't references limited to a single method? Since this is a static class it would be better of we limited the state that is persisted in it's class variables IMO.

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 9, 2018

Author Contributor

@westonruter I added this public property for the following purposes:

  1. to give us the ability to deeply test by knowing what the cache key is and then checking for it in the cache.
  2. to simplify access to the logic within the caching closure, i.e. meaning that we don't have to pass it via use.

It serves no other external purpose as part of an API.

Let's think about it.

From a testing point of view, we can infer that we disabled the cache by the state of the post-processor cache. While that's not a direct test on the response cache, both are within the closure. Therefore, it's a minimal risk to test by inference.

Hmm, I like that approach better actually. We get the benefit only exposing what is needed externally while avoided the extra work of resetting persistent state.

I'll make that change.

if ( empty( $caches_for_url ) ) {
$caches_for_url = array( AMP_Theme_Support::$response_cache_key );
} else {
$caches_for_url[] = AMP_Theme_Support::$response_cache_key;

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 8, 2018

Member

Per above, this could be the sole line instead of the if/else.

hellofromtonya added some commits Aug 9, 2018

Init $caches_for_url to an empty array.
When there are no URLs cached, we get `false` back.  To simplify the code, if we do not get an array back, the variable is initialized to an empty array.

@westonruter westonruter changed the title [WIP] Improve post processor response cache Improve post processor response cache Aug 9, 2018

@westonruter westonruter changed the title Improve post processor response cache [WIP] Improve post processor response cache Aug 9, 2018

@westonruter

This comment has been minimized.

Copy link
Member

westonruter commented Aug 9, 2018

For the notice, are you thinking there would be a new key added to the amp-options option which would flag whether or not the response cache is enabled?

@hellofromtonya

This comment has been minimized.

Copy link
Contributor Author

hellofromtonya commented Aug 9, 2018

For the notice, are you thinking there would be a new key added to the amp-options option which would flag whether or not the response cache is enabled?

Well that's a good question @westonruter. I think to find the solution, we need to explore what we want to report back to the developer/admin.

We are disabling the cache on a page-by-page basis. Do we want to give a list of URLs (i.e. that have cache disabled)? Or do we want a general notice? Or both?

It's a more broad discussion for the DX/UX.

@westonruter

This comment has been minimized.

Copy link
Member

westonruter commented Aug 9, 2018

@hellofromtonya humm, how about showing the notice on the general settings page, similar to how we are showing a notice for not having the object cache active.

I didn't think about letting the cache be disabled on a page-by-page basis. I realize that is how it is implemented now. I think that once we get to the point where the uncached count is greater than CACHE_MISS_THRESHOLD we can just set an option to stop the caching. In the future we could go to a page-by-page basis, but I think that could be overkill for now.

@westonruter

This comment has been minimized.

Copy link
Member

westonruter commented Aug 9, 2018

Also, currently the $post_processor_cache_key is specific to the current URL, so there would be no way to query for all URLs that are surpassing the CACHE_MISS_THRESHOLD, since the object cache cannot be queried. Maybe a compromise is that we could store the URL that causes the CACHE_MISS_THRESHOLD to be surpassed, and then use that URL in the notice.

Something else: if there are logged-in users and the REST API is being used on the frontend, it is likely that the API nonce is going to be output. This nonce will be unique for each user. This being the case, it is very likely that the CACHE_MISS_THRESHOLD will be very quickly passed, since each user landing on the homepage would then cause the increment to go up. Maybe this is expected and is proper behavior, but it's also something to be aware of as it could make the cache very prone to being turned off.

@westonruter

This comment has been minimized.

Copy link
Member

westonruter commented Aug 9, 2018

Maybe such caching should just be disabled entirely if the user is logged in.

Also, I just realized something else, when we do:

$caches_for_url[] = $response_cache_key;

We could easily cause for the homepage, for example, to trigger the cache to get disabled once you've published 3 posts, each causing the homepage to change. So what I think we're missing here is a timestamp for when the cache MISS happened. And then what we'd want to do is only consider such entries that have happened within a recent time period (e.g. 10 minutes) for whether the CACHE_MISS_THRESHOLD has been surpassed.

@hellofromtonya

This comment has been minimized.

Copy link
Contributor Author

hellofromtonya commented Aug 9, 2018

Also, currently the $post_processor_cache_key is specific to the current URL, so there would be no way to query for all URLs that are surpassing the CACHE_MISS_THRESHOLD, since the object cache cannot be queried.

Correct.

Maybe a compromise is that we could store the URL that causes the CACHE_MISS_THRESHOLD to be surpassed, and then use that URL in the notice.

I agree. That's the next step as we think about how to notify.

We could easily cause for the homepage, for example, to trigger the cache to get disabled once you've published 3 posts, each causing the homepage to change. So what I think we're missing here is a timestamp for when the cache MISS happened. And then what we'd want to do is only consider such entries that have happened within a recent time period (e.g. 10 minutes) for whether the CACHE_MISS_THRESHOLD has been surpassed.

How about instead if we set the cache expiration to say 10 minutes?

@westonruter

This comment has been minimized.

Copy link
Member

westonruter commented Aug 9, 2018

How about instead if we set the cache expiration to say 10 minutes?

Yeah, that sounds good.

@hellofromtonya

This comment has been minimized.

Copy link
Contributor Author

hellofromtonya commented Aug 9, 2018

I didn't think about letting the cache be disabled on a page-by-page basis. I realize that is how it is implemented now. I think that once we get to the point where the uncached count is greater than CACHE_MISS_THRESHOLD we can just set an option to stop the caching. In the future we could go to a page-by-page basis, but I think that could be overkill for now.

Maybe a compromise is that we could store the URL that causes the CACHE_MISS_THRESHOLD to be surpassed, and then use that URL in the notice.

@westonruter okay so instead of page-by-page, once we hit the threshold, we:

  1. turn off the cache for all pages/URLs
  2. store that URL in an option
  3. display a notice with a link to that URL

That's a simple solution. And as you said, page-by-page is likely overkill right now.

hellofromtonya added some commits Aug 9, 2018

Convert post-processor cache to occurrence instead of page-by-page.
Instead of storing cache miss URLs page-by-page, this commit stores it on occurrence, using a key and group constant.
Store the cache miss URL in an option. Then disable.
1. Refactored into a separate method as there's a lot going on.
2. Check if cache is turned off.
3. If no, check if we've exceeded the threshold.
4. If yes, store the URL.
printf(
'<div class="notice notice-warning"><p>%s <a href="%s">%s</a></p></div>',
esc_html__( 'Response caching was disabled due to exceeding the cache miss threshold.', 'amp' ),

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 10, 2018

Member

We'll probably need to figure out a way to explain what this means to non-developers. Maybe there should be a link to the Wiki with more information about what causes this?

Also, should the notice be dismissible?

Should there be a checkbox to re-enable? If so, then the notice could be displayed inline with the checkbox.

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 13, 2018

Author Contributor

We'll probably need to figure out a way to explain what this means to non-developers. Maybe there should be a link to the Wiki with more information about what causes this?

Agreed. We need more work on the messaging. I've noted it as a @todo.

should the notice be dismissible?

I think so. I think it's okay that the developer or admin dismisses this warning.

Thinking through it, I'd like to use this top-level warning as an alert: "Hey, we automatically turned off caching. Go to the AMP > General page for more information." Use it to alert them that we did something automatically and it needs their attention.

Should there be a checkbox to re-enable? If so, then the notice could be displayed inline with the checkbox.

Agreed. This is the next step, i.e. adding a re-enable checkbox that appears when caching is disabled.

hellofromtonya added some commits Aug 13, 2018

Add caching section to the AMP General sub page.
@todo Change the messaging/description to be more user-friendly and helpful.
printf(
'<div class="notice notice-warning is-dismissible"><p>%s <a href="%s">%s</a></p></div>',
esc_html__( "The AMP plugin's response cache disabled due to detecting randomly generated content.", 'amp' ),
esc_url( 'https://github.com/Automattic/amp-wp/wiki/Response-cache#automatically-disabling-of-the-response-cache' ),

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 13, 2018

Author Contributor

@westonruter I opened a new Wiki page and started the process of explaining the cache. Once we get the UI done, then screenshots can be added to the page to provide users/developers more clarity.

@@ -48,6 +49,7 @@ public static function register_settings() {
add_action( 'update_option_' . self::OPTION_NAME, array( __CLASS__, 'maybe_flush_rewrite_rules' ), 10, 2 );
add_action( 'admin_notices', array( __CLASS__, 'persistent_object_caching_notice' ) );
add_action( 'admin_notices', array( __CLASS__, 'render_cache_miss_notice' ) );

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 13, 2018

Author Contributor

@westonruter As we are automatically turning off the response cache, I'm thinking we may want to notify not only on this settings page, but on other pages too. Why? We can't be certain the admin/developer will come to this page. At the very least, I'm thinking this notice should be on all AMP settings' pages. What do you think?

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 15, 2018

Member

What if we limit it to the general settings page but add an indicator to the admin menu item so that the user knows there has something changed on this page?

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 23, 2018

Author Contributor

@westonruter I like that approach in general. I think it spans more than just this PR and edge case. Thinking about, it seems we could have a general indicator to alert of something that needs attention. That indicator could link to the plugin's settings page where more information lives.

If we go for that approach, I think that's a separate feature, issue, and PR.

// Handle the caching option.
$options['enable_response_caching'] = ! empty( $new_options['enable_response_caching'] );
if ( $options['enable_response_caching'] ) {
do_action( 'amp_reenable_response_cache' );

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 15, 2018

Member

Wouldn't it not be better to call the method here rather than indirectly trigger the action that the method is is connected to? Is there any utility for the amp_reenable_response_cache action generally?

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 22, 2018

Author Contributor

Upon review, you're right. There is not need for a separation action here. Switching to call direct.

@@ -109,6 +109,17 @@ public function add_menu_items() {
)
);
add_settings_field(

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 15, 2018

Member

This setting should only be added if wp_using_ext_object_cache() returns true, as we should only do response caching if we know that the cached value will persist across requests.

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 22, 2018

Author Contributor

True. Good point.

@@ -31,6 +31,7 @@ class AMP_Options_Manager {
'disable_admin_bar' => false,
'all_templates_supported' => true,
'supported_templates' => array( 'is_singular' ),
'enable_response_caching' => true,

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 15, 2018

Member

The default value here can be the return value of wp_using_ext_object_cache() (which wouldn't work as-is right here in this static context)

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 22, 2018

Author Contributor

One way to accomplish this is to set the default state within get_options():

self::$defaults['enable_response_caching'] = wp_using_ext_object_cache();
return array_merge( self::$defaults, $options );

This change is implemented in commit 752de78.

@@ -198,6 +200,12 @@ public static function validate_options( $new_options ) {
// Store the current version with the options so we know the format.
$options['version'] = AMP__VERSION;
// Handle the caching option.
$options['enable_response_caching'] = ! empty( $new_options['enable_response_caching'] );

This comment has been minimized.

Copy link
@westonruter

westonruter Aug 15, 2018

Member

The condition here should probably be wp_using_ext_object_cache() && ! empty( $new_options['enable_response_caching'] )

hellofromtonya added some commits Aug 22, 2018

Use wp_using_ext_object_cache() as the default value of 'enable_respo…
…nse_caching'.

The $defaults array is a static property.  Therefore, we set the default value within the `get_options()` method just before we merge defaults and the stored options.

@hellofromtonya hellofromtonya changed the title [WIP] Improve post processor response cache Improve post processor response cache Aug 23, 2018

@hellofromtonya hellofromtonya added this to the v1.0 milestone Aug 23, 2018

hellofromtonya added some commits Aug 23, 2018

@hellofromtonya

This comment has been minimized.

Copy link
Contributor Author

hellofromtonya commented Aug 23, 2018

@westonruter When we changed the conditions by using wp_using_ext_object_cache(), the notices were displaying incorrectly compared to the state. In the commit e575cde, I fixed it so that the notice appears when:

  1. wp_using_ext_object_cache() is enabled.
  2. The enable_response_caching option is disabled.
  3. The cache miss threshold was exceeded.

westonruter added some commits Aug 23, 2018

@@ -355,8 +355,8 @@ public static function render_cache_miss_notice() {
printf(
'<div class="notice notice-warning is-dismissible"><p>%s <a href="%s">%s</a></p></div>',
esc_html__( "The AMP plugin's response cache disabled due to detecting randomly generated content.", 'amp' ),
esc_url( 'https://github.com/Automattic/amp-wp/wiki/Response-cache#automatically-disabling-of-the-response-cache' ),
esc_html__( "The AMP plugin's post-processor cache disabled due to the detection of highly-variable content.", 'amp' ),

This comment has been minimized.

Copy link
@hellofromtonya

hellofromtonya Aug 23, 2018

Author Contributor

Much better @westonruter. Thank you 👍

@westonruter westonruter merged commit 234b722 into develop Aug 23, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@westonruter westonruter deleted the improve/post-processor-response-cache branch Aug 23, 2018

@westonruter

This comment has been minimized.

Copy link
Member

westonruter commented Aug 23, 2018

@hellofromtonya Great work. This was tough!

@hellofromtonya

This comment has been minimized.

Copy link
Contributor Author

hellofromtonya commented Aug 23, 2018

Collaborative effort between both of us. Thanks for your help and for first identifying this as an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.