Optimize randomkey on expired keys #13089
Open
Currently, we have two problems in `dbRandomKey`.

1. When the server is paused (e.g. by `client pause`), `dbRandomKey` can't delete expired keys, so it loops forever if the db is full of expired keys. I think this is a bug.
2. The performance of `dbRandomKey` may be very bad. `kvstoreDictGetFairRandomKey` returns only one key per call, and that key is most likely expired, so only one key can be deleted each time `kvstoreDictGetFairRandomKey` is called. In the worst case, the server has to call `kvstoreDictGetFairRandomKey` as many times as there are expired keys, which may be enormous.

I think if we allow `dbRandomKey` to return expired keys, as we already do on replicas, both problems are easy to resolve. We can set a `maxTry` limit and make a best effort to return a key that has not expired. After `maxTry` attempts, we return the expired key. I think this is a breaking change.

If we still return only non-expired keys, I think the first problem (the infinite loop) is hard to resolve, and we can only alleviate the second problem (performance).
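To make the `maxTry` idea concrete, here is a minimal sketch in plain C. It is not Redis code: `toy_key`, `toy_is_expired`, and `toy_random_key` are hypothetical names, the keyspace is a plain array, and `rand()` stands in for the real sampling. It only illustrates that a bounded retry loop terminates even when every key is expired.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a keyspace entry; NOT the Redis data structure. */
typedef struct {
    const char *name;
    long long expire_at_ms; /* -1 means the key has no TTL */
} toy_key;

/* A key counts as expired once the current time has passed its deadline. */
static int toy_is_expired(const toy_key *k, long long now_ms) {
    return k->expire_at_ms >= 0 && now_ms > k->expire_at_ms;
}

/* Pick a random key, preferring a live one. If max_try samples in a row
 * are all expired, give up and return the last sample anyway, so the
 * loop is bounded even when the whole keyspace is expired. */
static const toy_key *toy_random_key(const toy_key *keys, size_t n,
                                     long long now_ms, int max_try) {
    assert(n > 0 && max_try > 0);
    const toy_key *picked = NULL;
    for (int i = 0; i < max_try; i++) {
        picked = &keys[(size_t)rand() % n];
        if (!toy_is_expired(picked, now_ms)) return picked; /* live key */
    }
    return picked; /* possibly expired: best effort exhausted */
}
```

With this shape the caller may receive an expired key, which is exactly why the proposal would be a breaking change for `randomkey`.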
In this PR, I only address the performance problem.

Before, in `kvstoreDictGetFairRandomKey`, we first called `dictGetSomeKeys` to get a set of keys and chose one key from it. If the chosen key was expired, we had to repeat the whole process, so a key could be visited many times before being selected. Now, I get the whole set of keys by calling `kvstoreDictGetSomeKeys` and try to ensure that each key is selected only once (`kvstoreDictGetSomeKeys` may return duplicate entries).

I ran a simple performance test. In a db with 10 million expired keys, `randomkey` takes about 28s on `unstable` and about 9s with this PR.

This PR only alleviates the performance problem. To resolve both problems above, I think we may need a breaking change that allows `dbRandomKey` to return expired keys.
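The batch-scanning idea in this PR (sample a set of keys once, then examine each distinct key at most once) can be sketched roughly as follows. This is an illustration under assumptions, not the code in the diff: `batch_key`, `batch_key_expired`, and `pick_from_batch` are hypothetical names, the batch is a plain array of pointers, duplicates are detected by a simple quadratic scan, and the sketch returns the first live key instead of choosing randomly or deleting expired ones.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a sampled entry; NOT the Redis dictEntry type. */
typedef struct {
    const char *name;
    long long expire_at_ms; /* -1 means the key has no TTL */
} batch_key;

static int batch_key_expired(const batch_key *k, long long now_ms) {
    return k->expire_at_ms >= 0 && now_ms > k->expire_at_ms;
}

/* Scan one sampled batch and return the first live key, or NULL if every
 * distinct key in the batch is expired. Entries that already appeared
 * earlier in the batch are skipped, so each key's expiry is checked at
 * most once. *checked reports how many distinct keys were examined. */
static const batch_key *pick_from_batch(const batch_key **batch, size_t count,
                                        long long now_ms, size_t *checked) {
    assert(batch != NULL && checked != NULL);
    *checked = 0;
    for (size_t i = 0; i < count; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (batch[j] == batch[i]) { seen = 1; break; } /* duplicate */
        }
        if (seen) continue;
        (*checked)++;
        if (!batch_key_expired(batch[i], now_ms)) return batch[i];
    }
    return NULL; /* whole batch expired: caller would sample again */
}
```

The quadratic duplicate check is only reasonable for small batches. A real implementation would need a cheaper way to avoid revisiting keys.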