Issue with Redis-Cache & Cache Invalidation / Update #58

Closed
cookiesowns opened this issue Mar 4, 2017 · 23 comments

@cookiesowns

Hi,

Currently we're running redis-cache in production. It appears there is an occasional edge case once our Redis store has many keys (7 mill+): once a key is set, e.g. a user_meta key, it is never updated in the Redis cache, even when the value is changed via update_user_meta. This is no bueno, as even when the user_meta is changed in the DB, the Redis cache continues serving stale values.

It's only when we forcefully DELETE the key that everything works fine again. However, this only happens once the Redis DB has been operational for a while and we slam it with requests that get and update user_meta.

It seems there is some logic when updating user_meta where, if the value isn't actually updated in the database (the values are the same), the Redis cache is never invalidated or updated.
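
Here's a rough sketch of how we check for the mismatch; the user ID, meta key and value below are placeholders, not our real data:

// Compare what the object cache returns with what is actually stored in the
// database for the same key. User ID, meta key and value are placeholders.
$user_id  = 123;
$meta_key = 'some_meta_key';

update_user_meta( $user_id, $meta_key, 'new-value' );

// Read through the object cache (this is where we see the stale Redis value).
$cached = get_user_meta( $user_id, $meta_key, true );

// Read straight from the database, bypassing the cache.
global $wpdb;
$in_db = $wpdb->get_var( $wpdb->prepare(
    "SELECT meta_value FROM {$wpdb->usermeta} WHERE user_id = %d AND meta_key = %s",
    $user_id,
    $meta_key
) );

if ( $cached !== $in_db ) {
    error_log( "Stale cache for {$meta_key}: cache={$cached} db={$in_db}" );
}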

@tillkruss
Member

Can you narrow down the code that's setting/reading the user meta that's stale?

@kevinlisota

@tillkruss We're actually seeing something similar here. Our cache grows to about 3.5M keys when it's been running for a couple of days.

On a "fresh" cache, everything works as expected. But at some point, the cache begins returning stale results, even though the DB has been updated with different values.

The most obvious place we see it is in the admin console when setting options or meta values. I've tracked a few different code paths there, and it is simple calls to update_option or update_post_meta that somehow aren't replacing the Redis key.

In a couple of cases, I could see via Query Monitor that two plugins were trying to update_option on every pageload. Those plugins would check the value of a particular option, then update it if needed, but the values saved to the DB were never replaced in Redis, so they kept trying over and over.

I'm trying to do more research to pinpoint it, but it is not plugin related, as I've seen it across a variety of plugins using update_option and update_post_meta (and add_post_meta/delete_post_meta).
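
For illustration, the check-then-update pattern those plugins run on every pageload looks roughly like this; the option name and version value are placeholders, not taken from any specific plugin:

// Simplified sketch of the pattern described above (placeholder names).
$current_version = '2.0'; // the value the plugin wants stored
$stored = get_option( 'my_plugin_version' ); // served from the object cache

if ( $stored !== $current_version ) {
    // The database row gets the new value, but the stale Redis key is never
    // replaced, so this branch fires again on the next pageload.
    update_option( 'my_plugin_version', $current_version );
    // ...one-time upgrade work would run here...
}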

Could this be related to TTL, by chance? We do not have anything set there.

@tillkruss
Member

I assume the outdated cache data is plugin/theme related. In that case your issue is most likely caused by the theme or plugin(s) themselves, because they don't handle their cached data properly. Cache invalidation is one of the hardest problems in computer science.

You can set a MaxTTL for all keys to keep the outdated cache data at bay, but it doesn't solve the problem of having outdated data.

I personally prefer solving it by updating the cache when the data changes, because I dislike stale cache data.
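
If you want to try the MaxTTL route, a single constant in wp-config.php is enough; the 600 seconds below is just an example value:

// In wp-config.php: cap the lifetime of every key this plugin writes to Redis.
// 600 seconds (10 minutes) is only an example value.
define( 'WP_REDIS_MAXTTL', 600 );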

@kevinlisota

Actually, the update_option issue I'm seeing could be related to how WordPress stores, caches and loads the alloptions array. Related bug in Trac here:

https://core.trac.wordpress.org/ticket/31245

I'm looking at multiple plugins, and they are simply calling functions like update_option, delete_option and add_option, relying on core's wp_cache_set/wp_cache_delete calls to put values into the object cache and remove or update them as needed. For whatever reason, those updates propagate to the DB, but not to the cache.

These functions work fine without an object cache and write to the DB properly, and based on the cache_get/cache_delete calls in core, they should be handled correctly by an object cache, but they are not. I ran these plugins for years on the memcached object cache, switched to Redis last week, and began seeing this.
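
As far as I understand it, the fragile part is that all autoloaded options are cached as one blob. Roughly, and this is a simplified illustration rather than core's actual code, the flow is:

// Simplified illustration of the alloptions race (not core's actual code):
// all autoloaded options are cached together under a single key.
$alloptions = wp_cache_get( 'alloptions', 'options' );

// update_option() modifies one entry in the array...
$alloptions['some_option'] = 'new value';

// ...and writes the whole array back. If another request loaded the array
// earlier and writes its own copy back after this, the change above is lost
// from the cache even though the database row was updated.
wp_cache_set( 'alloptions', $alloptions, 'options' );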

Here is a simple example
http://foliovision.com/seo-tools/wordpress/plugins/fv-top-level-categories

You'll see how they call get_option, check for the presence of a key and flush the rewrite rules if the key isn't present. In our case, the cached version of the option gets stuck, so the rewrite rules get flushed on every pageload.

@tillkruss
Member

Interesting. If you find anything, please feel free to make a PR.

One thing I forgot to mention: you can always ignore the cache groups that are troubling you (at least until WordPress fixes their stuff).
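
If anyone wants to go that route, the ignored groups are configured via a constant in wp-config.php; the group names below are only examples:

// In wp-config.php: cache groups listed here are not persisted in Redis.
// The group names are only examples.
define( 'WP_REDIS_IGNORED_GROUPS', array( 'options', 'transient' ) );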

@kevinlisota

Here is a good workaround for the race condition on alloptions:

https://github.com/Automattic/vip-mu-plugins-public/blob/master/misc.php#L66-L83

My suggestion, whether for Redis or another object cache, is to point users to this workaround until the issue in core is fixed. Otherwise you get strange/stale results when working with any autoloaded option, many of which can be quite important.
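
For anyone who doesn't want to click through, the linked workaround boils down to something like the sketch below. This is paraphrased rather than copied verbatim, and the function name is mine, not VIP's:

// Paraphrased sketch of the linked VIP workaround: whenever any option is
// added, updated or deleted, drop the cached alloptions blob so the next
// request rebuilds it from the database. The function name is made up.
function flush_alloptions_cache_on_change( $option ) {
    if ( wp_installing() ) {
        return;
    }

    $alloptions = wp_load_alloptions();

    // Only autoloaded options live in the alloptions blob.
    if ( isset( $alloptions[ $option ] ) ) {
        wp_cache_delete( 'alloptions', 'options' );
    }
}
add_action( 'added_option',   'flush_alloptions_cache_on_change' );
add_action( 'updated_option', 'flush_alloptions_cache_on_change' );
add_action( 'deleted_option', 'flush_alloptions_cache_on_change' );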

@tillkruss
Member

Great! It probably makes sense to add this to this plugin until WordPress fixes the bug, eh?

tillkruss reopened this May 16, 2017
@vincent-lu

Hi @tillkruss and @kevinlisota, can you help me get started implementing this fix before the WP core team decides on a solution?

@kevinlisota Reading https://core.trac.wordpress.org/ticket/31245 led me to https://wordpress.org/plugins/wp-lcache/, which seems to circumvent this issue. What object cache solution would you currently recommend for a high-traffic website?

@cookiesowns
Author

@tillkruss For some reason I did not receive notifications for the responses on this GH issue.

That said, here's a ticket that I've opened with wordpress core.

https://core.trac.wordpress.org/ticket/40052

The issue isn't related to the theme, but rather to how core handles object caching with stale data. If the data is stale in the cache backend but correct in the DB (due to a failed update to the cache store), it will never update and will always return stale data.
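
Roughly speaking, the failing path behaves like the sketch below. This is only an illustration of the behaviour described in the ticket, not core's actual source:

// Illustration only (not core's actual source): the cache is refreshed only
// when the database write reports that a row actually changed.
function sketch_update_user_meta( $rows_affected, $user_id ) {
    if ( ! $rows_affected ) {
        // The DB already held the new value, so the update path bails out
        // here, before the cache is touched, and the stale Redis entry
        // survives.
        return false;
    }

    // Only reached when the database row actually changed.
    wp_cache_delete( $user_id, 'user_meta' );
    return true;
}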

@kevinlisota

@cookiesowns I tried out lcache and ran into all sorts of problems. The development team responded to me with this:

https://wordpress.org/support/topic/heavy-iops-and-database-use/

It looks good in concept, but it is not ready. We use Redis with this plugin with good success.

There is a known race condition with object caching, unrelated to this cache in particular, but a problem in WP core that still needs fixing. You can see it when you make settings changes that save to the database but don't take effect on the site because of stale cache data, which can make it nearly impossible to configure a site.

The workaround I mentioned above is working for us at the moment. Here it is again:

https://github.com/Automattic/vip-mu-plugins-public/blob/master/misc.php#L66-L83

@vincent-lu

Hi @kevinlisota, I see you around the lcache and WP forums as I'm chasing a solution to the same problem.

Is it correct that you currently use tillkruss/redis-cache as your object cache solution in production, with these lines added somewhere in your WP install (theme, plugin, whatever)? https://github.com/Automattic/vip-mu-plugins-public/blob/master/misc.php#L66-L83

Has lcache ever worked for you? My problem with lcache seems to be similar to your experience: lcache/wp-lcache#137

Lastly, what about https://github.com/pantheon-systems/wp-redis? Have you tried it?

Many thanks.

@tillkruss
Member

Don't add the race condition fix to WordPress core; put it in an MU-plugin.
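
For example, a single file under wp-content/mu-plugins/ is enough; the filename and header below are just an example:

<?php
/**
 * Plugin Name: Fix alloptions race condition
 * Description: Clears the cached alloptions blob whenever an option changes.
 *
 * Example location: wp-content/mu-plugins/fix-alloptions.php
 * MU-plugins load automatically, no activation needed. Paste the workaround
 * hooks from the VIP snippet linked above into this file.
 */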

@abdusfauzi

No wonder I kept having issues with stale data. Thanks everyone :)

@bdtech

bdtech commented Jul 17, 2017

I have the max_ttl set at 10 minutes. Haven't seen this issue yet...

@svandragt

svandragt commented Mar 30, 2018

So we have seen this kind of thing in code that calls get_option() to check the value before running update_option() to update it. The extra get_option() check is redundant, because update_option() already compares against the current value before updating, and it triggers a race condition because another request can be updating the option while it is being read. If the check is done as follows, this is no longer an issue:

if ( update_option( 'my_option', MY_VERSION ) ) {
        // do whatever needs done on update
}

This is exacerbated by slow disks.

@ottok

ottok commented Apr 19, 2018

I've noticed this same issue on multiple sites: users update some settings which, under the hood, run update_option. The value changes in the database, but not in the Redis cache managed by this object-cache.php.

Using the tips above I wrote this snippet into my mu-plugin to forcefully delete stale keys, and it works great:

    /**
     * Additional hooks on option updates to ensure they get refreshed in the
     * Redis object cache when they change. (These add_action calls live in the
     * init method of my mu-plugin class, hence the __CLASS__ callbacks.)
     */
    add_action( 'added_option',   array( __CLASS__, 'maybe_clear_alloptions_cache' ) );
    add_action( 'updated_option', array( __CLASS__, 'maybe_clear_alloptions_cache' ) );
    add_action( 'deleted_option', array( __CLASS__, 'maybe_clear_alloptions_cache' ) );

    /**
     * Fix a race condition in options caching.
     *
     * See https://core.trac.wordpress.org/ticket/31245
     * and https://github.com/tillkruss/redis-cache/issues/58
     */
    public static function maybe_clear_alloptions_cache( $option ) {

      // error_log("added/updated/deleted option: $option");

      if ( wp_installing() === false ) {
        // alloptions should be cached at this point.
        $alloptions = wp_load_alloptions();

        // If the option is part of the alloptions collection, then clear it.
        if ( array_key_exists( $option, $alloptions ) ) {
          wp_cache_delete( $option, 'options' );
          // error_log("deleted from cache group 'options': $option");
        }
      }
    }
@tillkruss, I recommend you add an extra check like this to your object-cache.php, since I don't think WordPress core will fix that race condition any time soon.

Thanks for a great plugin!

PS. Shameless promotion: if you want to automatically sync your plugin from Github to WP.org, check out https://deployer.seravo.com/

@abdusfauzi

@ottok possible to share the whole snippet? or mu-plugin file?

Thank you 😀

@tillkruss
Member

@ottok: I'm open to a PR!

@partounian

@ottok please

@tillkruss
Member

https://core.trac.wordpress.org/ticket/31245 was moved for 5.3.1 consideration.

@tillkruss
Member

Fixed in WP 5.3.1

@ottok

ottok commented Jan 22, 2020

@ottok possible to share the whole snippet? or mu-plugin file?

Thank you

Sorry @tillkruss, I didn't notice your follow-up question. Here is the whole snippet in context: Seravo/seravo-plugin@6195003

@tillkruss
Member

No problem. WP core fixed it finally.
