Default to network-first caching strategy #265

Closed
westonruter opened this issue Mar 29, 2020 · 3 comments · Fixed by #338

westonruter commented Mar 29, 2020

The Basic Site Caching plugin contains what has turned out to be the most common baseline caching strategy for sites. In particular, a network-first caching strategy ensures that the freshest content is served when connected to the network, with a fallback to the cache when the network fails.

This is the required code:

<?php
// Enable network-first caching strategy for navigation requests (i.e. clicking around the site).
add_filter(
	'wp_service_worker_navigation_caching_strategy',
	function () {
		return \WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST;
	}
);

// Hold on to a certain number of navigated pages in the cache.
add_filter(
	'wp_service_worker_navigation_caching_strategy_args',
	function ( $args ) {
		$args['cacheName'] = 'pages';

		$args['plugins']['expiration']['maxEntries'] = 20;

		return $args;
	}
);

A network-first caching strategy is then also needed for the scripts and styles that a site depends on (coming from core, themes, and plugins). See #264.
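
As a rough sketch of what that asset caching could look like (not necessarily what #264 ends up doing), the same `caching_routes()->register()` call used for uploads could be pointed at the core, theme, and plugin directories. The helper name `example_asset_route_pattern()` and the cache names are assumptions made up for illustration; `includes_url()`, `get_theme_root_uri()`, and `plugins_url()` are the standard WordPress functions for those base URLs.

```php
<?php
/**
 * Build a route pattern matching script and style URLs under a base URL.
 * Hypothetical helper for illustration; not part of the PWA plugin.
 */
function example_asset_route_pattern( string $base_url ): string {
	// Normalize to exactly one trailing slash before quoting.
	$base = rtrim( $base_url, '/' ) . '/';

	return '^' . preg_quote( $base, '/' ) . '.*\.(js|css)(\?.*)?$';
}

// Only register the routes when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action(
		'wp_front_service_worker',
		function ( \WP_Service_Worker_Scripts $scripts ) {
			// Cache names are assumptions; one cache per asset source.
			$bases = array(
				'core'    => includes_url(),
				'themes'  => get_theme_root_uri(),
				'plugins' => plugins_url(),
			);

			foreach ( $bases as $cache_name => $base_url ) {
				$scripts->caching_routes()->register(
					example_asset_route_pattern( $base_url ),
					array(
						'strategy'  => \WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST,
						'cacheName' => $cache_name,
					)
				);
			}
		}
	);
}
```

Keeping a separate cache per source would let each one get its own expiration settings later.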

A cache-first strategy would also make sense for uploaded files:

<?php
// Add caching for uploaded images.
add_action(
	'wp_front_service_worker',
	function ( \WP_Service_Worker_Scripts $scripts ) {
		$upload_dir = wp_get_upload_dir();
		$scripts->caching_routes()->register(
			'^(' . preg_quote( $upload_dir['baseurl'], '/' ) . ').*\.(png|gif|jpg|jpeg|svg|webp)(\?.*)?$',
			array(
				'strategy'  => \WP_Service_Worker_Caching_Routes::STRATEGY_CACHE_FIRST,
				'cacheName' => 'uploads',
				'plugins'   => array(
					'expiration' => array(
						'maxAgeSeconds' => MONTH_IN_SECONDS,
					),
				),
			)
		);
	}
);

Note that this does not currently account for uploaded files that are hosted on an external CDN, so that should be accounted for as well.
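
One possible way to account for a CDN, sketched under the assumption that the CDN base URL is known to the site: build the route pattern from a list of base URLs instead of only `$upload_dir['baseurl']`. The helper name `example_uploads_route_pattern()` and the URL `https://cdn.example.com/uploads` are hypothetical.

```php
<?php
/**
 * Build an uploads route pattern that matches any of several base URLs.
 * Hypothetical helper for illustration; not part of the PWA plugin.
 */
function example_uploads_route_pattern( array $base_urls ): string {
	$quoted = array_map(
		function ( $base_url ) {
			// Drop any trailing slash so every alternative has the same shape.
			return preg_quote( rtrim( $base_url, '/' ), '/' );
		},
		$base_urls
	);

	return '^(' . implode( '|', $quoted ) . ')\/.*\.(png|gif|jpg|jpeg|svg|webp)(\?.*)?$';
}

// Only register the route when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action(
		'wp_front_service_worker',
		function ( \WP_Service_Worker_Scripts $scripts ) {
			$upload_dir = wp_get_upload_dir();
			$scripts->caching_routes()->register(
				example_uploads_route_pattern(
					array(
						$upload_dir['baseurl'],
						'https://cdn.example.com/uploads', // Assumed CDN base URL.
					)
				),
				array(
					'strategy'  => \WP_Service_Worker_Caching_Routes::STRATEGY_CACHE_FIRST,
					'cacheName' => 'uploads',
					'plugins'   => array(
						'expiration' => array(
							'maxAgeSeconds' => MONTH_IN_SECONDS,
						),
					),
				)
			);
		}
	);
}
```

Cross-origin responses from a CDN may need extra handling in the service worker (for example, CORS headers on the CDN), which this sketch does not cover.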

I think this should be the default behavior of the plugin, of course allowing a site to override the default behavior with the filters. Then the PWA plugin would provide for more than just an offline page out of the box. This will in part make #176 obsolete.


@westonruter westonruter modified the milestones: 0.5, 0.6 May 14, 2020
@westonruter westonruter changed the title Default to network-first caching strategy for navigation requests Default to network-first caching strategy Oct 10, 2020

westonruter commented Oct 10, 2020

Thanks to @rviscomi, we have some data from HTTP Archive on how many assets are added to WP pages from themes, plugins, and core:

Here's a distribution of the number of requests per page for each of the path names:

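| percentile | plugins | themes | wp-includes |
|-----------:|--------:|-------:|------------:|
|         10 |       4 |      4 |           4 |
|         25 |       8 |     10 |           6 |
|         50 |      22 |     18 |          10 |
|         75 |      44 |     34 |          14 |
|         90 |      74 |     58 |          22 |
|        100 |    1948 |   1244 |        1314 |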

So the median page would load 22 "plugins" resources, 18 "themes" resources, and 10 "wp-includes" resources. For fun I added the 100th percentile (max) values, and you can see how out of control they get.

Here's the source query in case you want to run it yourself or make any changes:

SELECT
  percentile,
  path,
  APPROX_QUANTILES(freq, 1000)[OFFSET(percentile * 10)] AS freq
FROM (
  SELECT
    page,
    REGEXP_EXTRACT(url, r'/(themes|plugins|wp-includes)/') AS path,
    COUNT(0) AS freq
  FROM
    (SELECT url AS page FROM `httparchive.technologies.2020_09_01_mobile` WHERE app = 'WordPress')
  JOIN
    (SELECT pageid, url AS page FROM `httparchive.summary_pages.2020_09_01_mobile`)
  USING
    (page)
  JOIN
    (SELECT pageid, url FROM `httparchive.summary_requests.2020_09_01_mobile`)
  USING
    (pageid)
  GROUP BY
    page,
    path
  HAVING
    path IS NOT NULL),
  UNNEST([10, 25, 50, 75, 90, 100]) AS percentile
GROUP BY
  percentile,
  path
ORDER BY
  percentile,
  path

That being said, if we target the 75th percentile, the default maxEntries values we should use are:

  • core: 14
  • themes: 34
  • plugins: 44
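
A minimal sketch of how those counts could be turned into caching-route arguments, using the same `plugins` → `expiration` → `maxEntries` shape as the navigation filter in the issue description. The constant `EXAMPLE_ASSET_MAX_ENTRIES` and the function `example_asset_cache_args()` are hypothetical names, and using the category as the cache name is an assumption.

```php
<?php
// 75th-percentile request counts per page, taken from the HTTP Archive data.
const EXAMPLE_ASSET_MAX_ENTRIES = array(
	'core'    => 14,
	'themes'  => 34,
	'plugins' => 44,
);

/**
 * Build caching-route args for one asset category.
 * Hypothetical helper for illustration; not part of the PWA plugin.
 */
function example_asset_cache_args( string $category, string $strategy ): array {
	return array(
		'strategy'  => $strategy,
		'cacheName' => $category, // Assumed naming: one cache per category.
		'plugins'   => array(
			'expiration' => array(
				'maxEntries' => EXAMPLE_ASSET_MAX_ENTRIES[ $category ],
			),
		),
	);
}
```

Inside WordPress, the strategy argument would be something like `\WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST`, and the returned array would be passed as the second argument to `caching_routes()->register()`.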

@westonruter

Whereas the above indicates the asset counts for plugins, themes, and core, what follows is the byte size of the assets in those categories (in KB):

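| percentile | plugins (KB) | themes (KB) | wp-includes (KB) |
|-----------:|-------------:|------------:|-----------------:|
|         10 |            9 |          35 |               29 |
|         25 |           36 |         131 |               84 |
|         50 |          156 |         309 |              104 |
|         75 |          449 |         622 |              178 |
|         90 |          898 |        1404 |              324 |
|        100 |        39063 |       54667 |            37636 |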

This is also courtesy of @rviscomi.

SELECT
  percentile,
  path,
  APPROX_QUANTILES(kbytes, 1000)[OFFSET(percentile * 10)] AS kbytes
FROM (
  SELECT
    page,
    REGEXP_EXTRACT(url, r'/(themes|plugins|wp-includes)/') AS path,
    SUM(respSize) / 1024 AS kbytes
  FROM
    (SELECT url AS page FROM `httparchive.technologies.2020_09_01_mobile` WHERE app = 'WordPress')
  JOIN
    (SELECT pageid, url AS page FROM `httparchive.summary_pages.2020_09_01_mobile`)
  USING
    (page)
  JOIN
    (SELECT pageid, url, respSize FROM `httparchive.summary_requests.2020_09_01_mobile`)
  USING
    (pageid)
  GROUP BY
    page,
    path
  HAVING
    path IS NOT NULL),
  UNNEST([10, 25, 50, 75, 90, 100]) AS percentile
GROUP BY
  percentile,
  path
ORDER BY
  percentile,
  path

So if we go with the 75th percentile for maxEntries asset caching (above), then roughly 75% of sites would use at most:

  • core: 178KB
  • themes: 622KB
  • plugins: 449KB
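
For a rough sense of the combined footprint, the 75th-percentile rows of the table can simply be added up. This is only a ballpark: percentiles of separate distributions do not strictly sum to the percentile of the total, and maxEntries caps the number of entries rather than bytes. The function name `example_total_kb()` is hypothetical.

```php
<?php
/**
 * Sum per-category sizes in KB.
 * Hypothetical helper for illustration only.
 */
function example_total_kb( array $kb_by_category ): int {
	return (int) array_sum( $kb_by_category );
}

// 75th-percentile sizes in KB from the table (plugins: 449, themes: 622, wp-includes: 178).
$total_kb = example_total_kb(
	array(
		'core'    => 178,
		'themes'  => 622,
		'plugins' => 449,
	)
);

// Prints "1249 KB (about 1.22 MB)".
printf( "%d KB (about %.2f MB)\n", $total_kb, $total_kb / 1024 );
```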
