New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mangle cache invalidation for file purges #1029

Merged
merged 8 commits into from Dec 6, 2018

Conversation

Projects
None yet
3 participants
@darthhexx
Copy link
Member

darthhexx commented Nov 21, 2018

Description

If the PURGE_SERVER_TYPE is configured for mangle, use it for files purges instead of direct calls to Varnish. Also, don't fire off deletes to the files service for every image size as this slows down the request needlessly.

Checklist

Please make sure the items below have been covered before requesting a review:

  • This change works and has been tested locally.
  • This change works and has been tested on a Go sandbox.
  • This change works and has been tested in production.

Steps to Test

  1. Set the purge server type constant on you sandbox to 'mangle'.
  2. Delete a file.
  3. Ask systems to check the DC LB host logs to ensure only a single DELETE hit the files service and only a single PURGE took place to the mangle cache invalidation service.

darthhexx added some commits Nov 21, 2018

Don't fire off deletes to the files service for every image size
Use a static variable to record whether we've deleted a file before.

Of course there could be cases when more than one file is deleted in a single request. The assumption here is that all variations of the same image would be deleted in a row, thereby only allowing one through with the next image resetting the static variable on it's first variation.

It should be noted that it is not an issue to let extra deletes through, this is just to reduce unnecessary calls and speed up the transaction.

@darthhexx darthhexx requested a review from sboisvert Nov 21, 2018

@sboisvert
Copy link
Contributor

sboisvert left a comment

I have a few questions that may be simple / irrelevant and are mostly based on the fact that I'm not confident in my ability to give proper feedback on this code based on my knowledge of the VIP Go platform.
While the code looks good to me (with a few questions) I think having someone else from platform also review it could be beneficial.

$url_parts = parse_url( $file_name );
if ( false !== stripos( $url_parts['path'], constant( 'LOCAL_UPLOADS' ) ) )
$file_uri = substr( $url_parts['path'], stripos( $url_parts['path'], constant( 'LOCAL_UPLOADS' ) ) + strlen( constant( 'LOCAL_UPLOADS' ) ) );
else
$file_uri = '/' . $url_parts['path'];
if ( in_array( $file_uri, $deleted_uris ) ) {
// This file has already been successfully deleted from the file service in this request

This comment has been minimized.

@sboisvert

sboisvert Nov 21, 2018

Contributor

Is there a scenario where we'd want the end user to be alerted of this? Would we want to log this somewhere?
I don't see how it could not work but if let's say there's a bug somewhere, would we want to know this early return was triggered?
Another potential benefit is that we may not be aware that we're requesting the same file to be deleted over and over again. Having it logged might be able to surface it to make other scripts potentially faster (skip images that are the same).
What does this path look like? For a multisite, if one were to want to delete /2018/01/01/image.jpg for 5 subsites, would the check here skip the images of subsites after the first one is deleted? From my reading of L469 I think it might, but I'm not 100% sure of this.

This comment has been minimized.

@darthhexx

darthhexx Nov 22, 2018

Author Member

Good catch. It's not possible to do from the media library, but custom calls could definitely do it. I have amended the code to add the multisite path to the deleted file path, to ensure we would delete the files for all sites that deleted the same file path in the same request.

@@ -473,10 +482,13 @@ function delete_file( $file_name ) {
curl_close( $ch );
if ( 200 != $http_code ) {
error_log( sprintf( __( 'Error deleting the file from the remote servers: Code %d' ), $http_code ) );
error_log( sprintf( __( 'Error deleting the file %s from the remote servers: Code %d' ), $file_uri, $http_code ) );

This comment has been minimized.

@sboisvert

sboisvert Nov 21, 2018

Contributor

Do we get an error message that has more description than just the code? Or is the code enough to tell us what went wrong? As in file not found vs unauthorized vs the server is down?

This comment has been minimized.

@darthhexx

darthhexx Nov 22, 2018

Author Member

The only signal we get is the error code. Either 200, 404 (file is already deleted), or 500s for service related issues. At least now we log which file it was :)

$uri = '/';
if ( isset( $parsed['path'] ) )
$uri = $parsed['path'];
if ( isset( $parsed['query'] ) )

This comment has been minimized.

@sboisvert

sboisvert Nov 21, 2018

Contributor

This is more a question based on my ignorance more than anything but would this purge all variations of the file?
If we delete /2018/01/01/image.jpg?q=80&w=100, it seems like if purge_file_cache() was called by something other than delete_file() then it's possible that '/2018/01/01/image.jpg?q=80&w=100' is purged instead of '/2018/01/01/image.jpg'. I don't see another way to call purge_file_cache() but I guess I wonder about the potential different behaviour.

This comment has been minimized.

@darthhexx

darthhexx Nov 22, 2018

Author Member

We only purge the file at /2018/01/01/image.jpg and all variants are purged in Varnish at the same time. This is a key feature of Varnish for us.

return;
}
if ( ! isset( $file_cache_servers ) || empty( $file_cache_servers ) )

This comment has been minimized.

@sboisvert

sboisvert Nov 21, 2018

Contributor

Should this trigger an error visible to the user / developer?

This comment has been minimized.

@darthhexx

darthhexx Nov 22, 2018

Author Member

This would only execute for sandboxes, but it would still make sense to log as an error. I added it.

darthhexx added some commits Nov 22, 2018

Multisite with exactly the same file paths being deleted in one call
Although not possibel through the media manager, there could possibly be custom calls that have exactly the same file path for 2 or more subsites that are being deleted in the same request. To overcome this edge case, prefix the `deleted_uris` with any relevant multisite paths.
Log an error if no file cache servers are defined for cache purging
This should only ever execute on sandboxes, since all production sites use mangle to purge the caches.

@darthhexx darthhexx requested a review from mjangda Nov 22, 2018

@mjangda
Copy link
Member

mjangda left a comment

Overall looks good. Left some minor suggestions (e.g. to fix logs indexing) and a couple other small things.

The other we discussed and is worth noting is that in future PRs, we need to look into consolidating this code with the existing cache manager functionality to reduce the duplication. I'll track a separate issue for that.

Show resolved Hide resolved a8c-files.php Outdated
Show resolved Hide resolved a8c-files.php Outdated
Show resolved Hide resolved a8c-files.php Outdated
}
$uri = '/';
if ( isset( $parsed['path'] ) )

This comment has been minimized.

@mjangda

mjangda Dec 4, 2018

Member

Minor coding standards thing: we should always use braces, even for single line if clauses. Without them, we open up room for easy-to-make mistakes.

This comment has been minimized.

@darthhexx

darthhexx Dec 4, 2018

Author Member

Yeah, I just moved existing code and never changed it. I'll fix it.

return $requests;
}
$uri = '/';

This comment has been minimized.

@mjangda

mjangda Dec 4, 2018

Member

Minor: I don't think there's any major concerns with with this default since this is a private function, but we could end up sending a purge for example.com/ if a non-file URL is passed through. Probably not worth changing but figured I'd flag it just in case.

This comment has been minimized.

@darthhexx

darthhexx Dec 4, 2018

Author Member

It wouldn't cause any issues if we did, so I think it's okay.

mjangda and others added some commits Dec 4, 2018

Switch from `error_log` to `trigger_error`
Co-Authored-By: darthhexx <dnewman@automattic.com>
@mjangda

mjangda approved these changes Dec 5, 2018

Copy link
Member

mjangda left a comment

👍

@darthhexx darthhexx merged commit ae5b927 into master Dec 6, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@mjangda mjangda deleted the add/files-mangle-cache-purges branch Dec 6, 2018

@mjangda

This comment has been minimized.

Copy link
Member

mjangda commented Dec 6, 2018

r123942-deploy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment