Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize randomkey on expired keys #13089

Open
wants to merge 1 commit into
base: unstable
Choose a base branch
from

Conversation

lyq2333
Copy link
Contributor

@lyq2333 lyq2333 commented Feb 23, 2024

Currently, we have two problems in dbRandomKey.

  • When we pause expiration (i.e. call client pause), dbRandomKey can't delete expired keys and will cause an infinite loop if the db is full of expired keys. I think this is a bug.
  • If the db has lots of expired keys, the performance of dbRandomKey may be very bad. This is because kvstoreDictGetFairRandomKey returns only one key every time and the returned key is most likely expired, which means only one key can be deleted each time kvstoreDictGetFairRandomKey is called. In the worst case, server has to call kvstoreDictGetFairRandomKey as many times as the number of expired keys, which may be enormous.

I think if we allow returning expired keys in dbRandomKey like what we did in replica, the two problems are easy to resolve. We can set a maxTry to make the best effort to provide a key not already expired. After maxTry, we can return the expired key. I think this is a breaking change.

If we still return the non-expired key, I think it's hard to resolve the first problem(infinite loop). Besides, we can only alleviate the second problems(performance problem).

In this PR, I just optimize the performance problem.

  • Infinite loop(not solved)
  • Performance problem(just alleviate)

Before, in kvstoreDictGetFairRandomKey, we first call dictGetSomeKeys to get a set of keys and choose one key from it. If the chose key is expired, we have to repeat this process. Therefore, one key may be visited many times before being selected. Now, I get the whole set of keys by calling kvstoreDictGetSomeKeys and make effort to guarantee each key is selected only once (kvstoreDictGetSomeKeys may return duplicated entries).

I make a simple performance test. In the db with 10 million expired keys, randomkey costs about 28s in unstable and it costs about 9s in this PR.

This PR can only alleviate the performance problems. If we want to resolve the two problems above, I think maybe we need a breaking change that allow returning expired keys in dbRandomKey.

src/db.c Dismissed Show dismissed Hide dismissed
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants