Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add garbage collection cleanup admin and cron job #227

Merged

Conversation

markkelnar
Copy link
Contributor

@markkelnar markkelnar commented Jul 11, 2023

Fixes #217

Add admin settings interface to toggle (checkbox) enable/disable a daily wp cron event which when run, delete saved queries meeting criteria of age.

Add a number field in admin settings representing the number of days old, once reached by a saved query, it is deleted.

Screenshot 2023-07-20 at 12 50 03 PM

Add 'Groups' for saved documents. These groups are ignored by garbage collection.

Screenshot 2023-07-20 at 11 11 06 AM
Screenshot 2023-07-20 at 11 11 18 AM

src/Utils.php Outdated Show resolved Hide resolved
@jasonbahl
Copy link
Collaborator

I think my preference for this to be MVP should be:

  • a way for individual queries to opt out of garbage collection
  • batch the garbage collection (instead of posts_per_page -1)

@markkelnar
Copy link
Contributor Author

Some screen shots of adding the 'Garbage Collection' option in the admin editor.
Screenshot 2023-07-18 at 3 24 09 PM

Screenshot 2023-07-18 at 3 24 20 PM

.env.dist Show resolved Hide resolved
src/Admin/Settings.php Outdated Show resolved Hide resolved
src/Admin/Settings.php Outdated Show resolved Hide resolved
src/Admin/Editor.php Outdated Show resolved Hide resolved
src/Admin/Settings.php Outdated Show resolved Hide resolved
src/Admin/Settings.php Outdated Show resolved Hide resolved
src/Document/GarbageCollection.php Outdated Show resolved Hide resolved
src/Document/GarbageCollection.php Show resolved Hide resolved
src/Document/GarbageCollection.php Show resolved Hide resolved
src/Document/Group.php Outdated Show resolved Hide resolved
src/Document/Group.php Outdated Show resolved Hide resolved
wp-graphql-smart-cache.php Show resolved Hide resolved
markkelnar and others added 3 commits July 24, 2023 09:10
Co-authored-by: Jason Bahl <jasonbahl@mac.com>
text description change.
composer.json Outdated Show resolved Hide resolved
Copy link
Collaborator

@jasonbahl jasonbahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled this down to walk through it and check out the code.

I wanted to make sure it played nice with all the PHPStan changes, so I merged main into this branch and ran composer check-cs and composer phpstan.

PHPStan

A couple changes required. Also see: https://github.com/wp-graphql/wp-graphql-smart-cache/actions/runs/5804611542

CleanShot 2023-08-08 at 22 14 55

PHPCS

Passes 🙌🏻

CleanShot 2023-08-08 at 22 16 09


Unexpected Behavior

I went to test some things, and it looks like maybe we have some gaps in logic.

It appears that old posts, not associated with any "group" are not being deleted as expected.

Steps to reproduce:

  • In GraphQL > Settings > Saved Queries, enable garbage collection and set the number of days to 1:

CleanShot 2023-08-08 at 21 47 55

  • Publish 2 posts of the "graphql_document" post type. Add a "group" taxonomy term to 1, but not the other. Set the dates in the past more than 1 day (I set the dates to June 9 and April 9, both far beyond the 1-day setting value)

CleanShot 2023-08-08 at 21 44 16

  • Using WP-CLI, Run the cron event wp cron event run wpgraphql_smart_cache_query_garbage_collect

  • See that neither document is deleted (I expect the one without a group to be deleted)

NOTE: if I generate many posts of the graphql_document type with the following command wp post generate --post_type=graphql_document --post_date="2022-11-14 20:00:00" and then run the cron event, those generated documents delete. But the manually created document with no association to a "group" is still hanging around.

I also generated some persisted queries by pointing wpgraphql.com's Faust codebase with the Persisted Queries Link at my localwp instance with this version of WPGraphQL Smart Cache active, and I got it to generate a query via APQ.

I modified the date of the query to June 9:

CleanShot 2023-08-08 at 22 09 49

Then I ran wp cron event run wpgraphql_smart_cache_query_garbage_collect and the persisted query created by APQ was NOT deleted. I expect this APQ generated document, which has no association with a "Group" tag, and is 2 months old, to be deleted.

So, there's some bugs in the logic here. I've not yet dug into it, but the first place I would probably look is the WP_Query that determines which posts to delete.

(fwiw, I also checked out sha 3a7f1413dd080e5b4761b6510e6a872437b1943d from July 28, the state the PR was in before I merged the phpstan changes to this PR, and the same behavior occurs. I wanted to rule out any regressions that might have happened due to the phpstan changes, and I've confirmed the behavior documented above is the same with or without the phpstan changes merged in)

@markkelnar
Copy link
Contributor Author

@jasonbahl try changing the post_modified_gmt date. This is what determines the age of the saved query, not the publish date. Per the discussion in the original issue.

@jasonbahl
Copy link
Collaborator

@markkelnar ah ha! Updating the modified date does indeed do the trick. 🤦🏻‍♂️

I tested by manually updating the post_modified date in the database for a persisted query that was generated by APQ and running wp cron event run wpgraphql_smart_cache_query_garbage_collect again.

I think it would be good at some point, to have some UIs that show when a persisted query will "expire". 🤔 Will think about it and open a new issue for future discussion.

@jasonbahl
Copy link
Collaborator

I opened this issue as a follow-up to work on at some point in the future: #245

@jasonbahl jasonbahl self-requested a review August 9, 2023 17:24
@markkelnar markkelnar merged commit 61c4d7c into wp-graphql:main Aug 9, 2023
8 checks passed
@markkelnar markkelnar deleted the feature/garbage-collect-aged-queries branch August 9, 2023 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Garbage Collect old persisted queries
2 participants