Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introducing new configuration option auto-convert-intset-encoding #354

Closed
wants to merge 0 commits into from

Conversation

steamboatid
Copy link

Hi,

I'm introducing new boolean configuration option named as "auto-convert-intset-encoding" to prevent auto encoding sets from intset into hashtable. by default, it's yes to retain the old behaviour.
when auto-convert-intset-encoding set to no, set with intset encoding will retain its encoding and ignore "set-max-intset-entries".

also I add SADDINT command to always save sets as intset

thanks

src/t_set.cpp Outdated
if (checkType(c,set,OBJ_SET)) return;

if (set == NULL) {
//set = setTypeCreate(szFromObj(c->argv[2]));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code

Copy link
Author

@steamboatid steamboatid Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kow it's hard, but forcing Werror and O3 while compiling, it keep our code clean.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re Werror, KeyDB's policy is to have no warnings during build, however because we ship source code we get compiled on all sorts of compilers and different versions of GCC all with their own warnings. Therefore we don't enable this flag as it causes problems for downstream users. If you look in our CI file however you'll notice we do use this flag where its in a controlled situation.

Re -O3 in benchmarks it makes no perceptible difference and introduces a higher risk for bugs.

@@ -12,7 +12,7 @@
*
* Compile with:
*
* cc -O2 ./tracking_collisions.c ../src/crc64.c ../src/sha1.c
* cc -O3 ./tracking_collisions.c ../src/crc64.c ../src/sha1.c
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated

@JohnSully
Copy link
Collaborator

JohnSully commented Sep 29, 2021

Hi @steamboatid

We had a little internal debate about this. The question that came up was why do we need both a configuration option and a specific command? Shouldn't just the auto_convert_intset_encoding setting be sufficient? We have agreed we should take one or the other but can't reach agreement on having both, hopefully your experience will be instructive in resolving this debate.

Also it seems there are quite a few unrelated changes here, would you be able to scope the PR to just the changes necessary for this feature?

@steamboatid
Copy link
Author

steamboatid commented Oct 1, 2021

Hi @JohnSully ,

of course I agreed,
please drop any changes that might be unrelated to the team.

fyi,
I use keydb at production servers, mainly for backend processes. all program written in php8 using phpredis as interface to access keydb that run on another server. The set not so huge, it only consists of around 10K items, but the active concurrent connections may reach 300 in normal cases, with roughly thousand operation per minute. In this condition, keydb along with memcached, consume my CPU quite alot, which is 32 core 64 thread AMD Ryzen Threadripper 3970X with 256GB RAM.
For me, it's quite powerful machine but looks pretty dumb while handling integer set and concurrent connections.

For the new config option named as "auto-convert-intset-encoding"
it's really simple addition to the code, that tackle any automatic response from the software. IMHO, linux offer plenty room for any user with special requirement and condition, to tune their machines to fulfill any special task needed.
Of course, when we can deliver plenty default auto-tuned options, it will help for most users who does not have special requirement and time to tune their machine. But in my experience, default options can not be used for mission critical software.
I love linux because it give me freedom and easiness to tune my machine.

For the SADDINT command.
I only use phpredis to access keydb (and of course redis before I installed keydb). SADDINT offer communication effectiveness between client and server. It may have smaller size in data packets and dropping the need to type casting, both in server and client side. SADDINT mainly delivered for all of language interface creator, so they can provide an effective interface for client to handle integer sets.
For small and medium user, this won't affect much. But for client with plenty concurent active connection along with almost thousand operations per minute, I think it's a big hit.

Despite of having concrete blue print on how to develop keydb further,
my PR is based on daily routine conditions. I just want to cool down my CPU little bit, so I can use it for other things.

best regards,
dwi

@JohnSully
Copy link
Collaborator

But the question is, if you disable auto-convert-intset-encoding then SADDINT is redundant is it not? Do you have a situation where you want some sets converted and some not?

@steamboatid
Copy link
Author

But the question is, if you disable auto-convert-intset-encoding then SADDINT is redundant is it not? Do you have a situation where you want some sets converted and some not?

yes. especially if the sets size reach set-max-intset-entries.
it's hard to predict huge number set-max-intset-entries. i've tried to set set-max-intset-entries with huge number, but some sets still converted to hash

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants