Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement]: Enable forcing a refresh of cached analytics data #33221

Closed
coreymckrill opened this issue May 26, 2022 · 10 comments · Fixed by #33325
Closed

[Enhancement]: Enable forcing a refresh of cached analytics data #33221

coreymckrill opened this issue May 26, 2022 · 10 comments · Fixed by #33325
Labels
focus: analytics focus: rest api Issues related to WooCommerce REST API. plugin: woocommerce Issues related to the WooCommerce Core plugin. priority: normal The issue/PR is of normal priority—not many people are affected or there’s a workaround, etc. type: enhancement The issue is a request for an enhancement.

Comments

@coreymckrill
Copy link
Collaborator

coreymckrill commented May 26, 2022

Describe the solution you'd like

The WooCommerce mobile app displays order-related analytics data on the home screen for the current day, current week, current month, and current year. While viewing this data, you can "pull to refresh", causing the app to send a new request to the API to retrieve the latest version of the data. If nothing has changed on the store since the last request, the API will theoretically grab the data from the cache instead of re-running the queries, and this is how we usually want it to work. However, if for some reason the cached data is stale, it would be useful to have a way to force a refresh, meaning re-run the queries and then replace the stale cached data with the new data.

This would require adding a parameter to each analytics API endpoint, something like force_cache_refresh=true, which would trigger some kind of mechanism for skipping the retrieval of the cached data.

The force_cache_refresh flag needs to find its way from the API endpoint controller to the get_data method in the related DataStore class (example). Ideally, since this flag doesn't actually affect the query itself, it would be separate from the $query_args array. However, that would require changing the method signature, and thus DataStoreInterface, which would necessitate changing the method signature in every class that extends that interface. Instead, I think we could either:

  1. Include the flag in the $query_args array, and potentially unset it before it actually gets fed into the query object (to avoid it becoming part of the cache key).
  2. Create a new interface for the reports datastores, something like ReportsDataStoreInterface and define get_data with an expanded signature.

Once the flag has made its way into get_data it's just a matter of checking for its presence before running the get_cached_data method. If that gets skipped, we just continue on to run the necessary queries, compile the data, save it to the cache, and return it, thus "refreshing" the data.

Describe alternatives you've considered

  • I looked into using the woocommerce_analytics_report_should_use_cache filter hook to prevent pulling the data from the cache. This gives us a simple way to toggle the cache off, but the problem is that there wouldn't be an opportunity to toggle the cache back on in time to save the re-queried data. So this would allow us to skip the cache, but the data in the cache would continue to be stale.
  • In the past, the mobile apps have had the need to bust the cache because of stale data, and they've resorted to submitting a random number in an unneeded query parameter (i.e. per_page={random int}), which generated a new cache key. While this worked in a pinch, it's certainly a non-ideal hack, and shouldn't be necessary.
  • Instead of passing the force_cache_refresh flag from the API controller down into the datastore, we could simply use it to conditionally call Cache::invalidate(), which would wipe the entire reports cache. However, this seems like an efficiency killer, as not all cached data becomes stale at the same time.

Additional context

  • p91TBi-8Ro-p2
@coreymckrill coreymckrill added type: enhancement The issue is a request for an enhancement. focus: rest api Issues related to WooCommerce REST API. plugin: woocommerce Issues related to the WooCommerce Core plugin. status: awaiting triage This is a newly created issue waiting for triage. focus: analytics labels May 26, 2022
@psealock psealock added status: prioritization and removed status: awaiting triage This is a newly created issue waiting for triage. labels May 27, 2022
@joshbetz
Copy link
Contributor

joshbetz commented Jun 3, 2022

Include the flag in the $query_args array, and potentially unset it before it actually gets fed into the query object (to avoid it becoming part of the cache key).

This seems easier, so I would probably lean toward this option.

@louwie17
Copy link
Contributor

louwie17 commented Jun 3, 2022

Include the flag in the $query_args array, and potentially unset it before it actually gets fed into the query object (to avoid it becoming part of the cache key).

I am not a huge fan of this one, which makes me lean towards number two, but that does feel like a big gradual overhaul.

For simplicity sake could we actually partly use this point:

Instead of passing the force_cache_refresh flag from the API controller down into the datastore, we could simply use it to conditionally call Cache::invalidate(), which would wipe the entire reports cache. However, this seems like an efficiency killer, as not all cached data becomes stale at the same time.

But instead of calling Cache::invalidate(), we update the Cache to allow us to invalidate/remove items by key. Then when force_cache_refresh is used, we can just call Cache::remove( $key ), making use of get_cached_key.

We will probably need this logic at some point anyway, so it won't cause to much tech debt. If we aren't going the DataStore restructure route.

@coreymckrill
Copy link
Collaborator Author

But instead of calling Cache::invalidate(), we update the Cache to allow us to invalidate/remove items by key.

I like this! Although, the place where the query args are normalized and the cache key is generated is still down in the datastore, and get_cache_key is a protected method, so we'd still need to change the shape of the datastore class in some way...

@joshbetz
Copy link
Contributor

joshbetz commented Jun 3, 2022

Can we just bypass the cache retrieval here?

So we force the report to be generated even if the cache exists, which will then update the cache entry.

@coreymckrill
Copy link
Collaborator Author

Yep. The question is, how do we get the boolean value specifying that bypass from the API controller to that point in the logic?

@coreymckrill
Copy link
Collaborator Author

Thinking about this more, I suspect option 1 is the safest approach. If we start changing interfaces or the visibility of methods, and some plugin is extending the datastore class, that would potentially cause fatal errors (see #33039).

@louwie17
Copy link
Contributor

louwie17 commented Jun 6, 2022

Although, the place where the query args are normalized and the cache key is generated is still down in the datastore, and get_cache_key is a protected method, so we'd still need to change the shape of the datastore class in some way...

I was thinking we could add the get_cache_key as an static function to the DataStore, or on another class as a static function. So then we can call it from within controller without having to modify get_data structure.

Thinking about this more, I suspect option 1 is the safest approach. If we start changing interfaces or the visibility of methods, and some plugin is extending the datastore class, that would potentially cause fatal errors (see #33039).

I would be fine with this, as it is the safest option. But I get a sense we might end up changing this down the road, as we will always have to make sure to unset it for it to actually work. (especially if our plan is to do more cache related work and optimizations).

@joshbetz
Copy link
Contributor

joshbetz commented Jun 6, 2022

I think that's a good call @coreymckrill, especially since part of the goal here is to avoid doing too much refactoring. We can always plan out a larger refactor later if we need to.

@joshuatf
Copy link
Contributor

joshuatf commented Jun 7, 2022

I think we can modify the method get_cache_key to check for the existence of the force_cache_refresh and simply return false if it exists.

This way we're not introducing brittle logic around unsetting query params.

I'm not sure if get_cached_data handles a request for false specifically, but we could introduce this as well if needed.

@coreymckrill
Copy link
Collaborator Author

@joshuatf I like that idea, and I've updated #33325 to use get_cache_key in a way similar to what you describe. This allowed me to keep all of the changed logic within the parent \Automattic\WooCommerce\Admin\API\Reports\DataStore class, without having to spill over into each of the child classes 💯

@Konamiman Konamiman added priority: normal The issue/PR is of normal priority—not many people are affected or there’s a workaround, etc. and removed status: prioritization labels Jun 9, 2022
coreymckrill added a commit that referenced this issue Jun 14, 2022
…3325)

Adds a new collection parameter to all Reports API endpoints that utilize caching, `force_cache_refresh`, which will cause the current request to bypass the cache, re-run the queries for the requested data, and overwrite the previous cache entry with the new results.

Note that this doesn't invalidate the entire cache, only the entry for the particular set of collection parameters and values specified in the request.

This also adds a way to include debugging information related to the cache in the API response. Modeled after the way the Query Monitor plugin adds such information, you can get this by including an `_envelope` parameter in your API request. The debugging info includes whether the cache has been disabled via filter (`should_use_cache`), whether the `force_cache_refresh` parameter was used, whether the returned data was a `cache_hit` or not, and an array of the query parameters that were actually used to create the cache key.

Closes #33221
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
focus: analytics focus: rest api Issues related to WooCommerce REST API. plugin: woocommerce Issues related to the WooCommerce Core plugin. priority: normal The issue/PR is of normal priority—not many people are affected or there’s a workaround, etc. type: enhancement The issue is a request for an enhancement.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants