New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue with Redis-Cache & Cache Invalidation / Update #58
Comments
Can you narrow down the code that's setting/reading the user meta that's stale? |
@tillkruss We're actually seeing something similar here. Our cache grows to about 3.5M keys when it's been running for a couple of days. On a "fresh" cache, everything works as expected. But at some point, the cache begins returning stale results, even though the DB has been updated with different values. The most obvious place we see it is on the admin console when setting options or meta values. I've tracked a few different code paths in the admin console, and it is simple calls to update_option or update_post_meta that somehow aren't replacing the Redis key. In a couple cases, I could see via Query Monitor that two plugins were trying to update_option on every pageload. Those plugins would check the value of a particular option, then update it if needed, but the DB saves were never being replaced in Redis, so it kept trying over and over. I'm trying to do more research here to pinpoint it, but it is not plugin related, as I've seen it across a variety of plugins using update_option and update_post_meta. (and add_post_meta/delete_post_meta) Could this be related to TTL by chance? We do not have anything set there. |
I assume the outdated cache data is plugin/theme related. In that case your issues is most likely caused by the theme or plugin(s) itself, because they don't handle their cached data properly. Invalidating cache is one of the hardest things in computer science. You can set a MaxTTL for all keys to keep the outdated cache data at bay, but it doesn't solve the problem of having outdated data. I personally prefer solving it by updating the cache when the data changes, because I dislike stale cache data. |
Actually, the update_option issue I'm seeing could be related to how wordpress stores, caches and loads the alloptions array. Related bug in trac here: https://core.trac.wordpress.org/ticket/31245 I'm looking at multiple plugins, and they are simply calling functions like update_option, delete_option, add_option and relying on core to do the wp_cache_set/wp_cache_delete functions which puts stuff into the object cache and removes or updates it as needed. For whatever reason, those updates propagate to the DB, but not the cache. These functions work fine without an object cache and write to the DB properly, and based on the cache_get/cache_delete functions in core, they should be handled correctly by an object cache, but are not. I have run these plugins for years on memcached object cache, and just switched to Redis last week, and began seeing this. Here is a simple example You'll see how they get_option, check for the presence of a key and flush rewrite rules if the key isn't present. In our case, the cached version of the option gets stuck and the flush rewrite rules runs on every pageload. |
Interesting. If you find anything, please feel free to make a PR. One thing I forgot to mention was, you can always ignore certain cache groups that are troubling you (at least until WordPress fixes their stuff). |
Here is a good workaround on the race condition on https://github.com/Automattic/vip-mu-plugins-public/blob/master/misc.php#L66-L83 My suggestion, whether for Redis or other object cache, is to point users to this workaround until the issue in core is fixed. Otherwise you get strange/stale results when working with any autoloaded option, of which many can be quite important. |
Great! It probably makes sense to add this to this plugin until WordPress fixes the bug, eh? |
Hi @tillkruss and @kevinlisota, can you guys help me get started to implement this fix before WP core team decides on a solution? @kevinlisota Reading https://core.trac.wordpress.org/ticket/31245 led me to https://wordpress.org/plugins/wp-lcache/ which seems to circumvent this issue. What object cache solution would you currently recommend for high traffic website? |
@tillkruss For some reason I did not recieve notifications on the GH issue response.... That said, here's a ticket that I've opened with wordpress core. https://core.trac.wordpress.org/ticket/40052 The issue isn't related to the theme, but rather how the core handles object caching with stale data. If the data is stale in the cache back end, but is correct in the DB ( due to a failed update to the cache store ) it will never update and will always return stale data..... |
@cookiesowns I tried out lcache with all sorts of problems. The development team responded to me with this: https://wordpress.org/support/topic/heavy-iops-and-database-use/ It looks good in concept, but it is not ready. We use Redis with this plugin with good success. There is a known race condition with object caching, unrelated to the cache, but a problem with WP core that still needs fixing. You can see it when you make settings changes that may save in the database, but don't take effect on the site because of stale cache data, which can make it nearly impossible to configure a site. The workaround I mentioned above is working for us at the moment. Here it is again: https://github.com/Automattic/vip-mu-plugins-public/blob/master/misc.php#L66-L83 |
Hi @kevinlisota I see you around Is it correct that you guys currently use Has Lastly, what about https://github.com/pantheon-systems/wp-redis, have you tried it? Many thanks. |
Don't add the race condition fix in WordPress, put it in a MU-plugin. |
No wonder I kept on having issue with stale data. Thanks everyone :) |
I have the max_ttl set at 10 minutes. Haven't seen this issue yet... |
So we have seen this kind of thing in code that incorrectly checks for the get_option() value before running update_option() to update it (which checks the value of the option before updating it). This triggers a race condition because another request can be updating the option as it is being requested. If the check is done as follows this is no longer an issue:
This is exasperbated by slow disks. |
I've noticed this same issue on multiple sites: users update some settings, that under the hood run With the tips above I wrote this snipped into my mu-plugin to forcefully delete old keys and it works great:
I recommend @tillkruss you insert some extra check like this into your object-cache.php since I don't think WordPress core will fix that race condition any soon. Thanks for a great plugin! PS. Shameless promotion: if you want to automatically sync your plugin from Github to WP.org, check out https://deployer.seravo.com/ |
@ottok possible to share the whole snippet? or mu-plugin file? Thank you 😀 |
@ottok: I'm open to a PR! |
@ottok please |
https://core.trac.wordpress.org/ticket/31245 was moved for 5.3.1 consideration. |
Fixed in WP 5.3.1 |
Sorry @tillkruss, I didn't notice your follow-up question. Here is the whole snippet in context: Seravo/seravo-plugin@6195003 |
No problem. WP core fixed it finally. |
Hi,
Currently we're running redis-cache in production. It appears there is sometimes an edge-case once our redis store has many keys ( 7mill+ ) where once a key is set, eg
user_meta
the key even when updated via update_user_meta, it is never updated in the redis-cache. This is no bueno as even when the user_meta is changed in DB... the redis-cache continues serving stale values.It's only when we forcefully DELETE the key where everything works fine. However this only happens when the redis DB has been operational for a while, and we slam it with requests to get & update_user_meta.
It seems there is some logic when updating user_meta where, if the value isn't updated in the database ( values are the same ), the redis cache is not invalidated, or updated.
The text was updated successfully, but these errors were encountered: