-
Notifications
You must be signed in to change notification settings - Fork 166
Utilize BucketObjectEntry in GC logic #8238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| /// This allows us to detect files that are present, but not being updated | ||
| /// anymore. We classify such files as _stray files_ and write alerts to logs. | ||
| const _updateValidatedAfter = Duration(days: 1); | ||
| const _updateValidatedAfter = Duration(days: 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idea being that we don't want to update meta-data all the time.
| /// is always indicative of a bug. Nevertheless, we write alerts to logs, so | ||
| /// that these inconsistencies can be detected. | ||
| const _unvalidatedStrayFileAfter = Duration(days: 7); | ||
| const _unvalidatedStrayFileAfter = Duration(days: 12); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wait a LONG time before we consider something a stray file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering: will we look at the logs or have alerts that fire within this period?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, after 12 days we'll consider a file stray, and then we'll see it in logs.
| /// This will remove all packages from the `<prefix>/` where: | ||
| /// * The name of the package is not in [allPackageNames], and, | ||
| /// * The file is more than one day old. | ||
| Future<void> _gcPrefix(String prefix, Set<String> allPackageNames) async { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change here simplified a bit, just list all files, extract package name from the object name and if the package doesn't exist, we delete it if it's older than _minGarbageAge
| /// It's entirely possible that one process is writing files, while another | ||
| /// process is running garbage collection. | ||
| const _minGarbageAge = Duration(days: 1); | ||
| const _minGarbageAge = Duration(hours: 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be high, if something is published, it'll be at-least 3 hours before we can delete it.
I think that's perhaps somewhat reasonable. But we don't want this to be too high.
However, 3 hour should be more than enough for any eventual consistency issues between datastore and GCS to have been resolved.
| /// is always indicative of a bug. Nevertheless, we write alerts to logs, so | ||
| /// that these inconsistencies can be detected. | ||
| const _unvalidatedStrayFileAfter = Duration(days: 7); | ||
| const _unvalidatedStrayFileAfter = Duration(days: 12); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering: will we look at the logs or have alerts that fire within this period?
No description provided.