[TTL] Historical data may never get the opportunity to get garbage collected #5438
Comments
In the current design, it is indeed up to the user to trigger a compact job to garbage-collect expired data in the bottommost level.
As far as I can see, in the current design the only way a user can trigger compaction of the bottommost level is to submit a compact job, which then triggers a full compaction. In practice, full compaction is almost unacceptable in a production environment.
If you have any ideas to improve this, contributions are welcome.
My idea is simple; see #5447.
This is not a bug. Removed the bug label.
Please check the FAQ documentation before raising an issue
Describe the bug (required)
By default, custom_filter_interval_secs is set to 24 * 3600, which means that in any 24-hour window only one minor compaction gets to run with the custom compaction filter. Every other minor compaction falls back to the default minor compaction.
For historical data, most of the data resides in the bottommost level. Expired data in the bottommost level has few chances to be garbage collected other than periodic compaction. However, during normal operation, the single custom-compaction chance will most likely be used by an upper-level compaction, such as level0 => level1. As a result, the default 30-day periodic compaction goes through the default minor compaction without passing through the custom compaction filter, so the expired data stays there indefinitely.
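The starvation described above can be illustrated with a small model. This is only a sketch based on the description in this issue, not the actual storage code: `FilterGate`, `CompactionEvent`, and `countFilteredBottommost` are hypothetical names, and the only assumption is that a minor compaction gets the custom filter when at least `custom_filter_interval_secs` have passed since the last one that did.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of the behavior described in this issue.
// None of these names come from the real code base.
struct CompactionEvent {
    int64_t timeSecs;  // when the minor compaction starts
    bool bottommost;   // whether it rewrites the bottommost level
};

class FilterGate {
public:
    explicit FilterGate(int64_t intervalSecs) : intervalSecs_(intervalSecs) {}

    // Returns true if this compaction is allowed to run the custom
    // (TTL-aware) filter; the first caller in each interval wins.
    bool shouldUseCustomFilter(int64_t nowSecs) {
        if (lastCustomSecs_ < 0 || nowSecs - lastCustomSecs_ >= intervalSecs_) {
            lastCustomSecs_ = nowSecs;
            return true;
        }
        return false;
    }

private:
    int64_t intervalSecs_;
    int64_t lastCustomSecs_ = -1;  // -1 means the filter has never run
};

// Counts how many bottommost compactions actually ran with the custom filter.
int countFilteredBottommost(const std::vector<CompactionEvent>& events,
                            int64_t intervalSecs) {
    FilterGate gate(intervalSecs);
    int filtered = 0;
    for (const auto& e : events) {
        // The gate is consulted for every compaction, whatever its level.
        const bool useCustom = gate.shouldUseCustomFilter(e.timeSecs);
        if (useCustom && e.bottommost) {
            ++filtered;
        }
    }
    return filtered;
}
```

In this model, if a level0 => level1 compaction runs at time 0 and a bottommost compaction runs an hour later, the bottommost one is not filtered, because the upper-level compaction already consumed the slot for that 24-hour window.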
Here is some log output I printed from StorageIterator.h:
As you can see, it reads a large number of expired edges during edge traversal.
After I fixed the compaction logic, performance improved dramatically.
Your Environments (required)
uname -a
g++ --version
or clang++ --version
lscpu
a3ffc7d8
How To Reproduce (required)
Steps to reproduce the behavior:
Expected behavior
Additional context