
Add Redis-compatible cursors for SCAN commands #1489

Merged
merged 56 commits into from Jun 25, 2023

Conversation

@jihuayu (Member) commented Jun 10, 2023

Hello everyone, I have completed this PR and below is my explanation of the changes made.

We have added the following steps to the processing of the SCAN, ZSCAN, SSCAN, and HSCAN commands:

  • Before processing the command, convert the numeric cursor sent by the client into the corresponding keyname cursor.
  • After processing the command, convert the resulting keyname cursor back into a numeric cursor, return it to the client, and store the mapping in Server#cursor_dict_.

Since these steps are non-intrusive to the internal implementation of the SCAN command, we can ensure that the behavior of ZSCAN, SSCAN, and HSCAN remains consistent with that of SCAN.
In the following, I will use only the SCAN command as an example to describe the new design.

Cursor design

We call the new cursor the numeric cursor and the old one the keyname cursor.
The numeric cursor is composed of three parts:

  • bits 1-16: the counter (its first bit is always 1, so a valid cursor is never 0)
  • bits 17-32: a timestamp
  • bits 33-64: a hash of the keyname

The counter is a 16-bit unsigned integer that is incremented by 1 each time a cursor is created; when it overflows, it wraps back to 0 because it is unsigned. Since our cursor_dict_size is a power of 2, the counter remains contiguous modulo cursor_dict_size.
The timestamp is a 16-bit timestamp in seconds (2^16 seconds, roughly 18 hours, before wrapping).
The hash is a 32-bit hash value of the keyname.
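As a rough illustration, the bit layout above could be packed like this (a hypothetical sketch, not the PR's actual code; the hash function and the exact counter bit kept set are assumptions):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical sketch: pack the three fields into one 64-bit cursor.
// Field positions follow the layout described above: counter in the low
// 16 bits, timestamp in bits 17-32, keyname hash in the high 32 bits.
uint64_t MakeNumericCursor(uint16_t counter, uint16_t timestamp_s,
                           const std::string &key_name) {
  // Keep one counter bit always set so a valid cursor is never 0
  // (which bit is kept set is an assumption in this sketch).
  counter |= 0x8000;
  // std::hash stands in for whatever 32-bit keyname hash the server uses.
  auto hash = static_cast<uint32_t>(std::hash<std::string>{}(key_name));
  return (static_cast<uint64_t>(hash) << 32) |
         (static_cast<uint64_t>(timestamp_s) << 16) |
         static_cast<uint64_t>(counter);
}
```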

Cursor dictionary(Server#cursor_dict_)

The cursor dictionary is an array of 16384 (1024*16) entries, a length fixed at compile time, occupying about 640 KB of memory. Including the referenced keyname strings, its total size is roughly 1-2 MB.

When converting a keyname cursor into a numeric cursor, a new cursor is generated according to the rules above, and the slot used to store it in the dictionary is derived from the counter value (index = counter % DICT_SIZE).

When converting a numeric cursor back into a keyname cursor, we extract the counter from the cursor and compute its index in cursor_dict_ from that counter. We then only need to compare the cursor value stored at that index with the input cursor to determine whether they match.
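A minimal sketch of the two conversions (illustrative names, not the PR's actual code; assumes the counter occupies the low 16 bits as described above):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

constexpr size_t kCursorDictSize = 1024 * 16;  // power of two, fixed at compile time

struct CursorDictElement {
  uint64_t cursor = 0;   // numeric cursor; 0 means the slot is empty
  std::string key_name;  // the keyname cursor it maps to
};

std::array<CursorDictElement, kCursorDictSize> cursor_dict_;

// keyname -> numeric: store the mapping in the slot chosen by the counter.
void StoreCursor(uint64_t numeric_cursor, const std::string &key_name) {
  size_t index = (numeric_cursor & 0xFFFF) % kCursorDictSize;
  cursor_dict_[index] = {numeric_cursor, key_name};
}

// numeric -> keyname: one comparison decides validity; a stale or unknown
// cursor misses and is treated like cursor 0 (empty string here).
std::string LookupCursor(uint64_t numeric_cursor) {
  size_t index = (numeric_cursor & 0xFFFF) % kCursorDictSize;
  const CursorDictElement &e = cursor_dict_[index];
  return e.cursor == numeric_cursor ? e.key_name : "";
}
```

Because the counter is contiguous modulo the power-of-two dictionary size, new cursors naturally overwrite the oldest slots, which is what bounds validity to the most recent 16384 cursors.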

Other information about the cursor

This design guarantees the validity of the most recent 16384 (1024*16) cursors; cursors that are older, or that were not generated by this server, are considered invalid. An invalid cursor is treated as cursor 0, meaning iteration starts over from the beginning of the collection.

Our cursor is globally visible, and we store index information in the cursor. As long as the cursor remains valid, using the same cursor in different connections will produce the same results.

We prevent other users from guessing the data traversed by adjacent cursors by including the hash of the keyname in the cursor. If a user tries to discover an adjacent cursor by enumerating hash values, the cursor will become invalid before the enumeration completes, because the 32-bit hash space is far larger than the length of cursor_dict_.

We add a timestamp to the cursor to ensure that the same cursor value does not reappear within a short period before and after a server restart.

Other behaviors are consistent with the original SCAN implementation.

Configuration file

Added the redis-cursor-compatible configuration item.
If enabled, the cursor is an unsigned 64-bit integer.
If disabled, the cursor is a string.
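For example (a sketch of the corresponding kvrocks.conf entry; the option name comes from this PR, while the comment wording is illustrative):

```
# Return SCAN cursors as Redis-style unsigned 64-bit integers
# instead of keyname strings.
redis-cursor-compatible yes
```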

Test file

Added tests for the redis-cli --bigkeys and redis-cli --memkeys commands. We only need to ensure that these commands run successfully, because their correctness is guaranteed by redis-cli, provided the SCAN command itself is correct.
Modified the SCAN command tests to cover both settings of redis-cursor-compatible (yes and no).

Other changes:

Fixed a bug where the returned cursor was not 0 when a SCAN command returned fewer elements than the requested count.

fixed #1402
fixed #877

@git-hulk (Member) commented Jun 10, 2023

Hi @jihuayu, thanks for your great efforts.

Can you explain a bit about the numeric cursor design, and what happens if the wrong cursor is used for a hash/set/zset scan (should we store and compare the prefix as well when fetching by cursor)? Also, would it be good to use a random value or timestamp to avoid conflicts when restarting the server?

@PragmaTwice (Member)

Thanks for your contribution!

Several points after a short glance:

  • It seems the ring buffer is shared between all connections, so I'm just curious: will the functionality of SCAN break if we have lots of client connections simultaneously? And could we duplicate it per connection, or make the buffer size adjustable dynamically?
  • If some keys are deleted while the SCAN operation is executing, will the key properties of SCAN be preserved? (refer to "Scan guarantees" in https://redis.io/commands/scan/#scan%20guarantees)
  • A self-incrementing cursor may cause information leaks: one client can retrieve some information about the commands currently executed by another client (partially the same as the first point).
  • It seems the mutex can be replaced by an rwlock or atomic variables.

cc @mapleFU @torwig

@git-hulk (Member)

> It seems the ring buffer is shared between all connections, so I'm just curious: will the functionality of SCAN break if we have lots of client connections simultaneously? And could we duplicate it per connection, or make the buffer size adjustable dynamically?

@PragmaTwice It cannot be per connection, since we need to allow using the cursor across different connections. That is to say, a cursor obtained on connection A should be visible to connection B from the client side.

@jihuayu (Member, Author) commented Jun 13, 2023

I will redesign the cursor.
I'm very busy these days, so I will come back in a few days. Wait for me.

@PragmaTwice (Member) commented Jun 13, 2023

> It seems the ring buffer is shared between all connections, so I'm just curious: will the functionality of SCAN break if we have lots of client connections simultaneously? And could we duplicate it per connection, or make the buffer size adjustable dynamically?

> @PragmaTwice It cannot be per connection, since we need to allow using the cursor across different connections. That is to say, a cursor obtained on connection A should be visible to connection B from the client side.

Is this property used by some client libraries?

Actually, I cannot find in the official documentation how and when these cursors become invalid (e.g. will different connections lead to an invalid cursor? will adding and deleting elements lead to an invalid cursor?), which suggests they just do not give any guarantee about this. It is sad.

But I did find this statement:

> Multiple parallel iterations
> It is possible for an infinite number of clients to iterate the same collection at the same time, as the full state of the iterator is in the cursor, that is obtained and returned to the client at every call. No server side state is taken at all.

It may make the implementation of Redis-compatible cursors harder if we need to follow it.

@git-hulk (Member)

> Is this property used by some client libraries?

Most libraries should use the same connection to scan, but the cursor needs to remain valid if the connection is reconnected. And since the scan command is a long-running operation, it'd be better to allow reconnecting.

Yes, agreed that it's impossible to follow Redis exactly, since the underlying storage is different. What we can do is stay as compatible as possible.

@jihuayu (Member, Author) commented Jun 14, 2023

@PragmaTwice @git-hulk If the user runs redis-cli scan 0 and then redis-cli scan 10, the two scans will be on different connections. I think this usage may be common, so it is necessary to ensure the cursor remains usable across different connections.

We can promise how long, or how many, recent cursors are guaranteed to be valid: for example, cursors created within the last hour, or the last 100 cursors.

In the Redis docs:

> A full iteration always retrieves all the elements that were present in the collection from the start to the end of a full iteration. This means that if a given element is inside the collection when an iteration is started, and is still there when an iteration terminates, then at some point SCAN returned it to the user.
> A full iteration never returns any element that was NOT present in the collection from the start to the end of a full iteration. So if an element was removed before the start of an iteration, and is never added back to the collection for all the time an iteration lasts, SCAN ensures that this element will never be returned.

I think these two guarantees are necessary, while the others are optional. Our current implementation ensures both.

> Yes, agreed that it's impossible to follow Redis exactly, since the underlying storage is different. What we can do is stay as compatible as possible.

I agree as well. We just need to follow Redis in common behavior.

@git-hulk (Member)

Thanks, @jihuayu.

Keeping the recent N cursors solution is good for me, since those cursors won't occupy too much memory. Let's see if @PragmaTwice has any comments.

@jihuayu (Member, Author) commented Jun 14, 2023

@git-hulk We may need to know how the value of N is determined in actual deployments. If N=1024 and the average key name length is 40 bytes, the memory usage would be about 1024*(40+40) = 80 KB.

@git-hulk (Member)

> @git-hulk We may need to know how the value of N is determined in actual deployments. If N=1024 and the average key name length is 40 bytes, the memory usage would be about 1024*(40+40) = 80 KB.

It's hard to determine a suitable value, but I think we can pick a relatively large one, like 10240 or even more. That's fine, since the memory usage is a fixed amount.

@jihuayu (Member, Author) commented Jun 14, 2023

> It's hard to determine a suitable value, but I think we can pick a relatively large one, like 10240 or even more. That's fine, since the memory usage is a fixed amount.

Do we need to provide a configuration option for this value?

(Review thread on src/server/server.h — outdated, resolved)
@mapleFU (Member) left a comment
LGTM

@jihuayu (Member, Author) commented Jun 22, 2023

Thanks for the review. Does anyone have any other suggestions?

git-hulk previously approved these changes Jun 24, 2023

@git-hulk (Member) left a comment

LGTM, thanks for your contribution and patience.

@git-hulk (Member)

@torwig @PragmaTwice Can you take another look when you're free?

(Review thread on src/config/config.h — outdated, resolved)
torwig
torwig previously approved these changes Jun 25, 2023
@jihuayu jihuayu dismissed stale reviews from torwig and git-hulk via 4411894 June 25, 2023 07:02
@git-hulk (Member)

Thanks all, merging...

@git-hulk git-hulk merged commit 431277c into apache:unstable Jun 25, 2023
23 checks passed
@jihuayu jihuayu deleted the issues/1402 branch June 26, 2023 03:09
@qingping209
Very neat design!

Labels: new feature, release notes

Merging this pull request may close these issues: scan and scan_iter stuck to first iteration; bigkeys analyze support

8 participants