Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use persistent cache. #93

Open
peterwilsoncc opened this issue Jan 23, 2020 · 11 comments
Open

Use persistent cache. #93

peterwilsoncc opened this issue Jan 23, 2020 · 11 comments

Comments

@peterwilsoncc
Copy link
Contributor

Follow up to discussion in PR #91

Cavalcade uses a non-persistent cache for the cavalcade-jobs group. As the runner triggers WP CLI tasks within the plugin, invalidating on UPDATE, INSERT and DELETE should be a relatively simple task.

Proposal:

  • Remove cavalcade-jobs from the non-persistent groups
  • Consider whether cavalcade-jobs should be a global or per site
@roborourke
Copy link
Contributor

Re. making it a global group I think given that the site ID is part of the SQL query it makes sense. In the rare event anyone would need to query crons for a different site it won't have any issues then.

@peterwilsoncc
Copy link
Contributor Author

The reason I am in two minds about global or single site groups has to do with group invalidation.

If the group is global, any time a site on multisite updates a cron job, the queries and jobs for the other sites would be cleared too.

If the multisite runs a lot of sites, for example a large company with a presence in multiple countries, then the cache could be lost frequently. For the edge case of cross-site searching, I think it's potentially better as a single site cache.

@roborourke
Copy link
Contributor

Makes sense to me, thanks Peter 👍

@roborourke
Copy link
Contributor

Per @rmccue's comment on #91

When it's used and updated in Cavalcade Runner, it doesn't have proper access to the object cache; in particular, we don't know (IIRC) the cache key generation algorithm. Because of this, the cache can't be authoritative.

So, we should keep the cache as non-persistent however we could add a constant that if defined does add it as a persistent global group. This will benefit users not using the runner but just improving cron storage.

The database updates in #95 should give a good enough performance boost for those using the runner.

@peterwilsoncc
Copy link
Contributor Author

In order to use the cache in the runner, is it possible to do any of these:

  • define a PHP constant indicating the cron runner supports persistent caching/needs the cache cleared
  • update the cavalcade last updated cache key to be introduced in #75 WP 5.1 Pre-flight filters #91 via the runner?
  • #75 WP 5.1 Pre-flight filters #91 will change the model of wp_reschedule_event() to use the same model as the runner, so use that to reschedule after running a job?

And do this in a backward compatible way to avoid ruining everything if the plugin is updated without the runner or vice versa?

@rmccue
Copy link
Member

rmccue commented Jan 29, 2020

We'd essentially have to load all of WordPress in to be able to use the object cache, which is highly likely to cause severe memory leaks and undefined behaviour, since WordPress isn't designed for long-running processes.

What we could potentially do is move these operations from the runner into the run wp-cli command, but we'll likely run into problems with detecting fatal errors etc with that, as the job is now monitoring itself rather than being overwatched.

@peterwilsoncc
Copy link
Contributor Author

Could Cavalcade the plugin use the runner's Runner.check_workers.job_completed hook to clear a persistent cache, similar to the method Altis Cloud uses for logging jobs?

https://github.com/humanmade/altis-cloud/blob/baf9587948b77040af58ff6d823fcc932150bd32/inc/cavalcade_runner_to_cloudwatch/namespace.php#L55-L69

@rmccue
Copy link
Member

rmccue commented Feb 1, 2020

Altis Cloud is loaded differently; it's loaded as part of Altis during the very early stages of the bootstrap setup, and crucially, as part of the wp-config loading. Regular plugins (like Cavalcade) are loaded by WordPress, so aren't loaded into the Runner at all.

We could provide a special mode for Cavalcade that says "yes, it's loaded in early and the cache should be enabled", but that seems like a very messy solution that doesn't gain us much.

Curious to know if you're seeing actual perf problems from the lack of cache. I'm yet to see it take any significant amount of time, even on sites with a lot of jobs.

@peterwilsoncc
Copy link
Contributor Author

peterwilsoncc commented Feb 2, 2020

Curious to know if you're seeing actual perf problems from the lack of cache. I'm yet to see it take any significant amount of time, even on sites with a lot of jobs.

We're not seeing any at the moment, but I'm concerned about the affect of the increased number of DB calls by using the new filters. Work has a few hundred journalists in the admin all day so we don't get the benefit of full page caching.

How about introducing a filter in register_cache_groups() to allow for setups similar to Altis's?

wp-config.php:

use HM\Cavalcade\Runner\Runner;

require_once __DIR__ . '/wordpress/wp-includes/plugin.php';

Runner::instance()->hooks->register( 'Runner.check_workers.job_completed', function() {
	wp_cache_set( 'last_changed', microtime(), 'cavalcade-jobs' );
} );

add_filter( 'hm.cavalcade.use-persistent-cache', function() {
	return false;
} );

cavalcade/inc/namespace.php:

/**
 * Register the cache groups
 */
function register_cache_groups() {
	wp_cache_add_global_groups( [ 'cavalcade' ] );

	/**
	 * Docblock with clear warning about how to use this.
	 */
	$persistent_cache = apply_filters( 'hm.cavalcade.use-persistent-cache', true );
	if ( ! $persistent_cache ) {
		return;
	}
	wp_cache_add_non_persistent_groups( [ 'cavalcade-jobs' ] );
}

@peterwilsoncc
Copy link
Contributor Author

WordPress.org was having a particularly high number of database hits recently.

While rubber ducking a fix/work around with Dion I was wondering if a persistent cache could be cleared on the shutdown hook if DOING_CRON === true? That would allow the runner to continue with rescheduling, fatal monitoring, etc.

At this stage, this is a pretty random brain expulsion: do y'all reckon it's worth exploring?

@dd32
Copy link
Contributor

dd32 commented Feb 19, 2021

In the case of WordPress.org, it was a surge of uncached hits that caused issues which resulted in loading of WordPress and causing many cronjobs to be checked for existence on each request.

I've recently added some super simple caching to get_scheduled_event() so that wp_next_scheduled() wouldn't require a DB hit for every call. It's not super significant enough by itself, but it certainly adds to issues when things snowball.

In the case of WordPress.org, caching the wp_next_scheduled() calls between pageloads can be the difference between 10 DB queries or zero (and as a result, no DB connection required) as the majority of page data is available entirely cached via the object cache (For the posts query, we use https://github.com/Automattic/advanced-post-cache )

For reference, here's a bit of code I've deployed on WordPress.org with a super short 10s cache, it's not designed to be super efficient or cover all bases, just to limit the number of repetitive queries for jobs that have been running fine for in some cases literally years.

// Cache wp_next_scheduled() for a short time
add_filter( 'pre_get_scheduled_event', function( $filter, $hook, $args, $timestamp ) {
	static $stack = [];

	// Abort if another plugin has already done something to this filter
	if ( null !== $filter ) {
		return $filter;
	}

	$key = 'job:' . sha1( $hook . ':' . serialize( $args ) . ':' . $timestamp );
	$next = wp_cache_get( $key, 'cron-jobs-cache' );
	if ( $next && $next->timestamp > time() ) {
		return $next;
	}

	// Don't loop.
	if ( isset( $stack[ $key ] ) ) {
		return $filter;
	}

	$stack[ $key ] = true;

	$next = \wp_get_scheduled_event( $hook, $args, $timestamp );

	$cache_time = 10;
	// Don't cache if the event is in the next few seconds.
	if ( $next && $next->timestamp > ( time() + $cache_time ) ) {
		wp_cache_set( $key, $next, 'cron-jobs-cache', $cache_time );
	}

	unset( $stack[ $key ] );

	return $next;
}, 9, 4 );

// Clear the cache before it's unscheduled (TODO: This should potentially be after clearing, on priority 11)
add_filter( 'pre_unschedule_event', function( $filter, $timestamp, $hook, $args ) {
	$key = 'job:' . sha1( $hook . ':' . serialize( $args ) . ':' . $timestamp );
	wp_cache_delete( $key, 'cron-jobs-cache' );
	return $filter;
}, 9, 4 );

// Clear the cache after an event is scheduled
add_filter( 'pre_schedule_event', function( $filter, $event ) {
	$key = 'job:' . sha1( $event->hook . ':' . serialize( $event->args ) . ':' . $event->timestamp );
	wp_cache_delete( $key, 'cron-jobs-cache' );
	return $filter;
}, 11, 2 );

It seemed to do the trick and hasn't caused any visible effects yet (such as duplicate scheduled tasks etc)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants