add scheme-based caching #94

Merged 1 commit on Aug 14, 2020

@coreykn coreykn commented Aug 14, 2020

Cache HTTP and HTTPS pages separately (#20). This builds upon PR #50. This fixes HTTP resources being requested by a page that is loaded over HTTPS, which occurs when a page is loaded over HTTP, cached, and then later loaded over HTTPS. This also fixes the reverse case, HTTPS resources being requested by a page that is loaded over HTTP, where the requests would fail if the website does not have a valid SSL/TLS certificate. Either issue can arise when a single scheme (HTTP or HTTPS) is not being forced.
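
To illustrate the idea of scheme-based caching, here is a minimal sketch in Python (not the plugin's PHP code). The function name `cache_file_path` and the directory layout are assumptions made for illustration; only the `http-`/`https-` file name prefix reflects this PR.

```python
from pathlib import PurePosixPath


def cache_file_path(cache_dir: str, host: str, url_path: str, is_https: bool) -> str:
    """Return a cache file path for a page, keyed by the request scheme."""
    scheme = "https" if is_https else "http"
    # Hypothetical layout: <cache_dir>/<host>/<url path>/<scheme>-index.html
    page_dir = PurePosixPath(cache_dir) / host / url_path.strip("/")
    return str(page_dir / f"{scheme}-index.html")


# The same page requested over HTTP and HTTPS maps to two different files,
# so a page cached over one scheme is never delivered for the other.
print(cache_file_path("/cache", "example.com", "/blog/", is_https=False))
print(cache_file_path("/cache", "example.com", "/blog/", is_https=True))
```

Because the scheme is part of the file name, HTTP and HTTPS versions of a page can sit side by side in the same directory.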

Update the Cache Enabler HTML comment to state more clearly which cached page is being delivered.

Update clear_home() method to unlink based on a glob pattern. This builds upon part of PR #64. This is a better way to clear the home page cache because we no longer attempt to unlink files that do not exist. It also makes clearing the home page cache simpler when introducing scheme-based caching.
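
A rough sketch of glob-based clearing, written in Python rather than the plugin's PHP. The `*index*` pattern and the assumption that home page cache files sit directly in the site cache directory (with other pages in subdirectories) are illustrative guesses, not taken from the PR.

```python
import glob
import os
import tempfile
from typing import List


def clear_home(home_cache_dir: str) -> List[str]:
    """Delete the cached home page files and return the removed file names."""
    removed = []
    # Only paths that glob actually finds are unlinked, so nothing is
    # attempted for a variant that was never cached.
    for path in glob.glob(os.path.join(home_cache_dir, "*index*")):
        if os.path.isfile(path):  # leave subdirectories (other pages) alone
            os.remove(path)
            removed.append(os.path.basename(path))
    return sorted(removed)


with tempfile.TemporaryDirectory() as site_dir:
    for name in ("http-index.html", "https-index.html"):
        open(os.path.join(site_dir, name), "w").close()
    os.mkdir(os.path.join(site_dir, "blog"))
    open(os.path.join(site_dir, "blog", "https-index.html"), "w").close()
    print(clear_home(site_dir))                         # both home page files
    print(os.listdir(os.path.join(site_dir, "blog")))   # the blog page is kept
```

Since the pattern matches whatever variants exist, no per-scheme list of file names has to be maintained.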

Fixes #20

@coreykn coreykn requested a review from svenba August 14, 2020 04:25
@svenba svenba merged commit be62c3d into keycdn:master Aug 14, 2020
@coreykn coreykn deleted the add-scheme-based-caching branch August 14, 2020 15:15
coreykn added a commit to coreykn/cache-enabler that referenced this pull request Aug 19, 2020
Update `Cache_Enabler::process_clear_request()` to only clear the home
page cache when using the "Clear URL Cache" admin bar button on the
home page. Previously the complete cache would be cleared.

Fix advanced cache path to cached variants. This allows the advanced
cache to deliver the cached page and bypass PHP as it did in version
1.3.5. This should have been done in PR keycdn#94.

Delete `CE_SETTINGS_PATH` constant that was added in PR keycdn#96. The
advanced cache is not always able to read the value of this constant.
coreykn added a commit to coreykn/cache-enabler that referenced this pull request Aug 19, 2020
Update `Cache_Enabler::process_clear_request()` to only clear the home
page cache when using the "Clear URL Cache" admin bar button on the
home page. Previously the complete cache would be cleared.

Update how `Cache_Enabler_Disk::_file_scheme()` returns `https` or
`http`. This is required to return the correct scheme if the WordPress
installation is behind a proxy.

Fix advanced cache path to cached variants. This allows the advanced
cache to deliver the cached page and bypass PHP as it did in version
1.3.5. This should have been done in PR keycdn#94.

Delete `CE_SETTINGS_PATH` constant that was added in PR keycdn#96. The
advanced cache is not always able to read the value of this constant.
svenba pushed a commit that referenced this pull request Aug 19, 2020
centminmod added a commit to centminmod/centminmod that referenced this pull request Aug 27, 2020
…123.09beta01

- Cache Enabler plugin v1.4.0 introduced a breaking change to its caching routine: it switched to scheme-based caching, so cached files are prefixed with http- or https- depending on how the page is requested (keycdn/cache-enabler#94). Prior to v1.4.0, Nginx-served caching looked for cached files named index.html or index-webp.html. With v1.4.0 the cached files are named http-index.html, https-index.html, etc.
- This update applies to new installs made via centmin.sh menu option 22 with Cache Enabler selected, not to existing installs. For existing installs, a further update will be made in 123.09beta01 to auto-detect existing generated /usr/local/nginx/conf/wpincludes/${vhostname}/wpcacheenabler_${vhostname}.conf include files, where ${vhostname} is your WordPress domain name, and then auto-apply the changes.
coreykn added a commit to coreykn/cache-enabler that referenced this pull request Mar 12, 2021
Update cache handling to get the cache file based on the request,
mostly looking at the Cache Enabler settings and request headers. This
updates the cache handling for both new and potentially cached pages by
generating the required cache file through cache keys. This will be
much better going forward as it simplifies handling current cache
variants and adding additional cache variants in the future. The
previous way of handling this would have required many methods and
convoluted logic to be added for each additional cache variant, which
would only have led to more unnecessary complexity.

The cache keys are obtained in the following ways:

* `scheme`: Scheme-based caching was introduced in PR keycdn#94 and released
in version 1.4.0. This was then later updated in PR keycdn#98, PR keycdn#109, and
PR keycdn#141. Change this to use what the WordPress `is_ssl()` function uses
with the added ability to also check the `X-Forwarded-Proto` or
`X-Forwarded-Scheme` request headers.

* `device`: Device-based caching will be introduced in this PR and
released in version 1.7.0. This will allow a separate mobile cache to
be made based on the `User-Agent` request header. The values checked
have been taken from the WordPress `wp_is_mobile()` function (keycdn#160).

* `webp`: WebP-based caching was released in version 1.0.2. This still
checks the `Accept` request header, but will now look for `image/webp`
instead of just `webp`.

* `compression`: Compression-based caching for Gzip has been supported
since the initial release. This still checks the `Accept-Encoding`
request header for `gzip`. This setup will make it easy for other
compression algorithms to be introduced in the future.
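
The four cache keys above can be sketched as follows. This is a Python model, not the plugin's PHP. The settings names (`mobile_cache`, `webp`, `compress`), the `-mobile` file name token, and the exact-case header lookup are assumptions for illustration. The `https-index-webp.html.gz` shape matches the file name quoted later in this message, and the mobile strings are a subset of what `wp_is_mobile()` checks.

```python
from typing import Mapping

# Subset of the User-Agent substrings that wp_is_mobile() looks for.
MOBILE_TOKENS = ("Mobile", "Android", "Kindle", "BlackBerry", "Opera Mini", "Opera Mobi")


def scheme_key(server: Mapping[str, str], headers: Mapping[str, str]) -> str:
    """Mirror is_ssl(), then also consult the forwarded scheme headers."""
    https = str(server.get("HTTPS", "")).lower()
    if https in ("on", "1") or str(server.get("SERVER_PORT", "")) == "443":
        return "https"
    forwarded = headers.get("X-Forwarded-Proto") or headers.get("X-Forwarded-Scheme") or ""
    return "https" if forwarded.lower() == "https" else "http"


def cache_file_name(server: Mapping[str, str], headers: Mapping[str, str],
                    settings: Mapping[str, bool]) -> str:
    """Build the single cache file name that matches this request."""
    user_agent = headers.get("User-Agent", "")
    is_mobile = any(token in user_agent for token in MOBILE_TOKENS)
    device = "-mobile" if settings.get("mobile_cache") and is_mobile else ""
    webp = "-webp" if settings.get("webp") and "image/webp" in headers.get("Accept", "") else ""
    gzip = ".gz" if settings.get("compress") and "gzip" in headers.get("Accept-Encoding", "") else ""
    return f"{scheme_key(server, headers)}-index{device}{webp}.html{gzip}"


request_headers = {
    "X-Forwarded-Proto": "https",  # e.g. TLS terminated at a proxy
    "Accept": "text/html,image/webp",
    "Accept-Encoding": "gzip, deflate",
}
print(cache_file_name({"SERVER_PORT": "80"}, request_headers,
                      {"webp": True, "compress": True}))
```

Each key contributes at most one token to the file name, which is why adding a future variant would mean adding one more key rather than another set of methods.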

This change means that only the requested cache file is created if it
does not exist yet instead of all variants at once. This will reduce
unnecessary memory usage, especially if many cache variants are used,
and will reduce storing unnecessary cache variants that may never even
be requested. (This will impact anyone using cache warming techniques
because a request for each variant to cache will now have to be made.)
This also ensures the client gets what it has requested instead of
falling back to the default HTML file if the requested variant does not
exist yet but the default HTML file did. (That was a pretty rare edge
case as it would previously only occur if a cache variant setting was
enabled after cached pages were already generated and the cache was not
cleared.)

Getting the request headers has been updated because in some cases
`apache_request_headers()` existed as a function but returned an
empty array or `false`. This led to some cache variants not being
delivered even when accepted by the client. The new way will try to
get the request header from `apache_request_headers()` and, if it is
not set, fall back to the `$_SERVER` variable. If neither is set, an
empty string is used so that callers do not always have to check
whether it is set. Checking only `$_SERVER` variables was not chosen
because some environments do not provide all of them. This way should
get the specific request header values we need, if they are available,
across many different configurations.
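
The fallback order described above might look roughly like this. It is a Python model in which `apache_headers` stands in for the return value of PHP's `apache_request_headers()` and `server` stands in for `$_SERVER`. The function name and the case-insensitive match are assumptions.

```python
from typing import Mapping, Optional, Union


def get_request_header(name: str,
                       apache_headers: Union[Mapping[str, str], bool, None],
                       server: Mapping[str, str]) -> str:
    """Return the header value, or an empty string if it is not available."""
    # 1. Try the apache_request_headers() result. It may be False or empty
    #    even when the function exists.
    if apache_headers:
        for key, value in apache_headers.items():
            if key.lower() == name.lower():
                return value
    # 2. Fall back to the $_SERVER form: Accept-Encoding -> HTTP_ACCEPT_ENCODING.
    server_key = "HTTP_" + name.upper().replace("-", "_")
    # 3. Default to an empty string so callers never need an is-set check.
    return server.get(server_key, "")


print(get_request_header("Accept-Encoding", False, {"HTTP_ACCEPT_ENCODING": "gzip"}))
print(repr(get_request_header("Accept", {}, {})))
```

Returning an empty string keeps later checks, such as looking for `gzip` in the value, to a plain substring test.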

The cache signature will now include the file name, such as
`https-index-webp.html.gz`, instead of `https webp gzip` because of
this new handling. We can filter this and update it to the old format
if preferred. (I do not have a preference, this way just reduces the
overhead.) The cache signature will now also include the generated date
in HTTP-date format (e.g. `D, d M Y H:i:s GMT`) instead of `d.m.Y
H:i:s`.
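
For comparison, the two date formats can be reproduced in Python as below. The timestamp is an arbitrary example, and the surrounding signature text is deliberately left out because it is not quoted here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Arbitrary example timestamp in UTC.
generated = datetime(2021, 3, 12, 14, 30, 5, tzinfo=timezone.utc)

# Old style, equivalent to PHP's `d.m.Y H:i:s`.
old_format = generated.strftime("%d.%m.%Y %H:%M:%S")

# New style, HTTP-date, equivalent to PHP's `D, d M Y H:i:s` followed by GMT.
http_date = format_datetime(generated, usegmt=True)

print(old_format)  # 12.03.2021 14:30:05
print(http_date)   # Fri, 12 Mar 2021 14:30:05 GMT
```

The HTTP-date form states its time zone explicitly, whereas the old format left it implicit.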

Update `Cache_Enabler_Disk::get_settings_file_name()` to parse the
`$_SERVER['REQUEST_URI']` value and get the `PHP_URL_PATH` when in the
subdirectory network fallback to prevent a warning from occurring in
the following `preg_grep()` function when the path contained a query
string. The `?` from the query string would cause a regular expression
error if the path had a trailing slash (preceding token is not
quantifiable, like in `\.(path|to|page|?param=value)`).
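
The failure mode can be reproduced outside PHP. In this Python sketch the helper `pattern_from` is hypothetical and only mimics building an alternation from path pieces. Python reports the problem as "nothing to repeat", whereas PCRE reports that the preceding token is not quantifiable.

```python
import re
from urllib.parse import urlsplit


def pattern_from(uri_or_path: str) -> str:
    """Join the non-empty path pieces into an alternation, as in the example."""
    pieces = [piece for piece in uri_or_path.split("/") if piece]
    return r"\.(" + "|".join(pieces) + ")"


request_uri = "/path/to/page/?param=value"

# With the query string left in, `?` directly follows `|` and has nothing
# to quantify, so compiling the pattern fails.
try:
    re.compile(pattern_from(request_uri))
except re.error as error:
    print("invalid pattern:", error)

# Keeping only the path component (what PHP_URL_PATH selects) avoids it.
path_only = urlsplit(request_uri).path
print(pattern_from(path_only))
re.compile(pattern_from(path_only))
```

Without the trailing slash the `?` would follow `page` and quantify its last character, which compiles without error. That is why the warning only appeared for paths ending in a slash.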

As a note for the future, error handling when creating directories and
compressing page contents needs to be improved. `wp_die()` for failed
cache file directory creation has been removed as it did not work and
would have caused an output buffer error instead. As far as I can tell
this would have always occurred in all past versions. This would not
occur when trying to create the settings file, but it has been removed
there as well as it is not a graceful way to handle this type of
error. If Gzip compression fails the page will not be created. These
failures rarely occur, if ever, as no issues have been opened around
these scenarios, but in the event they do the user should receive
additional insight into what is occurring so it can be resolved.

Closes keycdn#160
@coreykn coreykn mentioned this pull request Mar 12, 2021
svenba pushed a commit that referenced this pull request Mar 15, 2021
Successfully merging this pull request may close these issues.

cache HTTP and HTTPS pages separately