feat (hset): Support arguments (count, withvalues) in HRANDFIELD #1804

theyueli · 2023-09-05T10:13:00Z

Fixes: #858

It might also fix: #1707

Algorithms supporting both encodings (string map and listpack) have been implemented and tested. Exercising the string map encoding requires a larger hash (my tests used 500+ entries).

The random selection algorithms implemented for the string map class are reimplementations of the algorithms Redis uses for listpack, and therefore have the same time complexities.

Without this patch:

127.0.0.1:6379> HSET coin heads obverse tails reverse edge null
(integer) 3
127.0.0.1:6379> HRANDFIELD coin 3 WITHVALUES
(error) ERR wrong number of arguments for 'hrandfield' command
127.0.0.1:6379> HRANDFIELD coin -1
(error) ERR wrong number of arguments for 'hrandfield' command
127.0.0.1:6379>

After this patch:

127.0.0.1:6379> HSET coin heads obverse tails reverse edge null
(integer) 3
127.0.0.1:6379> HRANDFIELD coin -5 WITHVALUES
 1) "tails"
 2) "reverse"
 3) "tails"
 4) "reverse"
 5) "heads"
 6) "obverse"
 7) "tails"
 8) "reverse"
 9) "edge"
10) "null"
127.0.0.1:6379> HRANDFIELD coin 3
1) "heads"
2) "tails"
3) "edge"
127.0.0.1:6379> HRANDFIELD coin 3 WITHVALUES
1) "heads"
2) "obverse"
3) "tails"
4) "reverse"
5) "edge"
6) "null"
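
The sign of the count argument changes the semantics shown in the transcript above: a positive count returns distinct fields (at most the size of the hash), while a negative count returns exactly |count| entries and may repeat fields. A minimal sketch of that behavior follows. `RandFieldSketch` and the `Field` alias are hypothetical names for illustration and are not taken from the PR; the hash is modeled as a plain vector of pairs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <random>
#include <string>
#include <utility>
#include <vector>

using Field = std::pair<std::string, std::string>;  // (field, value); hypothetical alias

// Hypothetical helper modeling HRANDFIELD <key> <count> WITHVALUES.
// This is not Dragonfly's implementation, only a model of the semantics.
std::vector<Field> RandFieldSketch(const std::vector<Field>& hash, int64_t count,
                                   std::mt19937& rng) {
  std::vector<Field> out;
  if (hash.empty() || count == 0) return out;

  if (count < 0) {
    // Negative count: return exactly |count| entries, repeats allowed.
    assert(count > INT64_MIN);  // negating INT64_MIN would overflow
    const uint64_t n = static_cast<uint64_t>(-count);
    std::uniform_int_distribution<size_t> dist(0, hash.size() - 1);
    out.reserve(n);
    for (uint64_t i = 0; i < n; ++i) out.push_back(hash[dist(rng)]);
    return out;
  }

  // Positive count: distinct entries, capped at the size of the hash.
  const size_t n = std::min<uint64_t>(static_cast<uint64_t>(count), hash.size());
  std::sample(hash.begin(), hash.end(), std::back_inserter(out), n, rng);
  return out;
}
```

With the three-entry `coin` hash, a count of -5 yields five pairs with repeats, and a count of 3 or more yields all three pairs exactly once.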

royjacobson · 2023-09-05T12:51:45Z

Redis claims a complexity of O(N), with N being the number of items returned. AFAICT RandomPairsUnique and RandomPairs in this PR are O(M), with M = size(map). Am I missing something?

dranikpg · 2023-09-05T13:48:26Z

@royjacobson

I might be wrong, but it seems like the listpack implementation is also O(M) 😮 It uses the same algorithms as here.

Actually, we can make our implementation O(N) by harnessing the string map's internals. But I don't think it's mandatory, as I already answered in the first PR on this issue.
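
To make the O(M) point concrete: one classic way to pick distinct elements from a container that only supports sequential iteration is selection sampling, which visits elements in order and keeps each one with probability (picks still needed) / (elements still unseen). The sketch below illustrates that technique under the assumption that it resembles what the discussion refers to; `PickUniqueSinglePass` is a hypothetical name, and the code is not copied from the PR or from Redis.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch of selection sampling over a sequentially iterable container.
// It may walk the whole container, so the cost is O(M) in the container size,
// regardless of how few elements are requested.
template <typename Container>
std::vector<typename Container::value_type> PickUniqueSinglePass(const Container& items,
                                                                 size_t count,
                                                                 std::mt19937& rng) {
  const size_t total = items.size();
  count = std::min(count, total);  // cannot return more distinct items than exist

  std::vector<typename Container::value_type> picked;
  picked.reserve(count);

  size_t index = 0;
  for (const auto& item : items) {
    if (picked.size() == count) break;
    const size_t needed = count - picked.size();
    const size_t unseen = total - index;
    // Keep this item with probability needed / unseen.
    std::uniform_int_distribution<size_t> dist(0, unseen - 1);
    if (dist(rng) < needed) picked.push_back(item);
    ++index;
  }
  assert(picked.size() == count);  // the last unseen items are always taken if needed
  return picked;
}
```

The picks come out in container order, and the loop can stop early once enough elements are chosen, but in the worst case it still scans all M elements.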

kostasrim

Good work 👨‍🍳, some minor comments.

kostasrim · 2023-09-05T17:04:51Z

src/core/string_map.cc

+  std::vector<RandomPick> picks;
+  unsigned int total_size = Size();
+
+  for (unsigned int i = 0; i < count; ++i) {


You can use a std::map<uint32_t, uint32_t>. The key represents the index, and the value represents the number of times you encountered it (we need a map, since it's an ordered container). On each loop iteration, you check whether the index rand() % total_size already exists. If it does, you increment its occurrence count (the value) by 1. Otherwise, you insert it with an associated value of 1 (since it's the first occurrence).

That way you will:

Get rid of std::sort.

Simplify the code below, since you no longer need two while loops. You only need a single for loop that outputs each element as many times as it occurred.
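
The suggestion above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `PickWithRepeats` is a hypothetical function, a `std::list` stands in for a container without random access, and `std::uniform_int_distribution` replaces `rand() % total_size`.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical sketch: draw `count` random indices (repeats allowed), tally them
// in an ordered map, then emit the matching elements in one pass over the container.
std::vector<std::string> PickWithRepeats(const std::list<std::string>& items, uint32_t count,
                                         std::mt19937& rng) {
  std::vector<std::string> out;
  if (items.empty()) return out;

  // index -> number of times that index was drawn; std::map keeps keys sorted,
  // so no explicit std::sort is needed.
  std::map<uint32_t, uint32_t> picks;
  std::uniform_int_distribution<uint32_t> dist(0, static_cast<uint32_t>(items.size() - 1));
  for (uint32_t i = 0; i < count; ++i) ++picks[dist(rng)];

  out.reserve(count);
  uint32_t index = 0;
  auto pick = picks.begin();
  for (auto it = items.begin(); it != items.end() && pick != picks.end(); ++it, ++index) {
    if (index != pick->first) continue;        // this element was never drawn
    out.insert(out.end(), pick->second, *it);  // emit it once per draw
    ++pick;
  }
  assert(out.size() == count);
  return out;
}
```

Note that the output is grouped by position in the container rather than shuffled, which is the same ordering a sorted vector of picks would produce.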

Good suggestion, but I don't think it matters that much. Sort only gets slow for a really large number of elements, and in that case the I/O time is much larger anyway.

No no, I don't care about the performance here. I don't expect std::sort to have any measurable impact, since the count will always be relatively small. But using a map will simplify the implementation of this function and make it more readable.

I prefer to keep it as is. With std::map the sorting is implicit. Personally, I find the current version more readable because it makes the sort obvious.

So a) I think STL ordered associative containers are meant for exactly this, so I would argue that vector + sort is an antipattern. b) How is what you wrote simpler than these 3 lines? No pair accesses, no nested while loops, no nothing:

for (auto it = begin(); it != end() && picks_sz < count; ++it) {
  auto [key, frequency] = *it;
  keys.insert(keys.end(), frequency, key);
  picks_sz += frequency;
}

I might have a mistake in the above code, but that's the gist of it.

theyueli · 2023-09-06T01:46:11Z

@dranikpg @kostasrim I've addressed all the comments, please take another look. Thanks!

dranikpg

But I still encourage you to use a unique_ptr instead of raw arrays
About RandomPairs(): we usually take output arguments by pointer (and returning values is even better), but I don't wanna be picky.

theyueli · 2023-09-06T07:18:01Z

But I still encourage you to use a unique_ptr instead of raw arrays

Thanks @dranikpg, I just switched to unique_ptr. Please take a second look.
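
For readers unfamiliar with the pattern being requested, here is a rough sketch of replacing a raw `new[]`/`delete[]` buffer with `std::unique_ptr<T[]>`. The `FieldValue` struct and both function names are hypothetical and exist only for this illustration; the actual types in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical element type standing in for whatever the PR stores per pick.
struct FieldValue {
  std::string field;
  std::string value;
};

// Allocates the buffer; ownership is explicit and delete[] runs automatically
// when the unique_ptr goes out of scope, including on early returns.
std::unique_ptr<FieldValue[]> MakePairBuffer(size_t n) {
  assert(n > 0);
  return std::make_unique<FieldValue[]>(n);
}

// Hypothetical C-style filler that takes the output buffer by pointer.
size_t FillPairs(FieldValue* out, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    out[i].field = "field" + std::to_string(i);
    out[i].value = "value" + std::to_string(i);
  }
  return n;
}
```

A caller would write `auto buf = MakePairBuffer(3); FillPairs(buf.get(), 3);` and never call `delete[]` by hand.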

dranikpg

👍🏻

fixes dragonflydb#858

romange mentioned this pull request Sep 5, 2023

feat(server): Support for COUNT, WITHVALUES arguments in HRANDFIELD command #1801

Closed

dranikpg reviewed Sep 5, 2023

kostasrim reviewed Sep 5, 2023

dranikpg reviewed Sep 6, 2023

dranikpg previously approved these changes Sep 6, 2023

kostasrim previously approved these changes Sep 6, 2023

kostasrim reviewed Sep 6, 2023

theyueli dismissed stale reviews from dranikpg and kostasrim via 26ef194 September 6, 2023 07:16

theyueli force-pushed the main branch from f68221b to 26ef194 Compare September 6, 2023 07:16

dranikpg approved these changes Sep 6, 2023

kostasrim approved these changes Sep 6, 2023

feat (hse): Support arguments (count, withvalues) in HRANDFIELD

9a8e520

fixes dragonflydb#858

theyueli force-pushed the main branch from 863a009 to 9a8e520 Compare September 6, 2023 08:50

dranikpg enabled auto-merge (squash) September 6, 2023 08:55

dranikpg merged commit a8e4beb into dragonflydb:main Sep 6, 2023
7 checks passed

BagritsevichStepan mentioned this pull request May 2, 2024

fix(zset): fix random in ZRANDMEMBER command #2994

Merged
