Change skip list P value to 1/e, which improves search times #3889
Conversation
It's been a few months, is there anything I can do to have this PR considered? I've tried IRC already and got some positive feedback but nobody with write access noticed.
Hello, very interesting! This is the first time I've seen this; sorry, and thanks for pinging me about it. I'll review soon, thanks.
I think when p is 1/4 the space is about 1.33n, while when p is 1/e the space goes up to about 1.58n. When adjusting that, you should also keep memory in mind.
Yep, that was my wonder indeed: p is a tradeoff between space and time, and since sorted sets are already extremely heavy I wonder if changing the value is a good tradeoff for us. Another thing that worries me is the graph above: why is the red line so far off the theoretical trajectory? I wonder if the test was affected by enough error that it's hard to tell exactly how much faster the new implementation is. However it looks quite a bit faster, and we should keep in mind that the sorted set is a skiplist plus a hash table, so the memory footprint of the skiplist itself is not so dominant. TLDR: this could be a good change but I need memory numbers to evaluate it.
I ran another round of tests. Here are the results. Let me know if I can clarify anything or look deeper into the tradeoffs.

Configuration

Hardware is a 2016 MacBook Pro 15" with Touch Bar. CPU: 2.7 GHz Intel Core i7. Memory: 16 GB 2133 MHz LPDDR3. OS: macOS Sierra 10.12.6. I compiled 4 versions from …

Benchmark

I wrote another bash script that keeps adding to a sorted set via ZADD.

Results

First, some averages to see how much faster and more memory-hungry the new P value is:

The final amount of memory allocated after inserting all 10 million items in all 4 tests:

Here are some charts for … Next up are the charts for …

Conclusions
I think the tradeoff is worth it. It doesn't introduce any additional unpredictability in speed, and memory use is very steady with a fixed additional overhead. I also think it's worth exposing this type of tuning as a user-controlled setting.
@filipecosta90 I think this one is worth your review. It might be good to verify the results again. And then finally make a call regarding the performance/memory tradeoff this presents. The bottom line here is that given the relatively modest numbers (for the jemalloc case), I don't think it's critical either way. But we should definitely merge or close this already.
@yoav-steinberg I'm working on seeing if this change can be measured/detected by the current use-case on https://github.com/redis/redis-benchmarks-specification. |
@filipecosta90 Any chance you have an update?
Planning to focus on this tomorrow, @madolson. Sorry for the late reply.
@madolson @sean-public From the simple ZADD test (described here: https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/redis-benchmark-full-suite-1Mkeys-100B.yml), which only focuses on achievable throughput and overall client p50 latency, there is no visible change on an oss-standalone setup. However, this is a very simple test and may be too simple to catch the changes/improvements @sean-public worked on. I'll extend the tests to include at least the charts shared in this PR description, so that a clean improvement %change is calculated automatically, and get back with further data.
That seems correct, as the earlier benchmarks using the scripts I provided show a visually appreciable difference in latency well after 1M keys are inserted (the magnitude varying based on memory allocator used). I think the speed/memory tradeoff is solid and only improves performance as measured in requests per second, especially with pipelining.
Not signing the CLA. |
The probability of adding new nodes at each level of a skip list (the node's height) is determined by successively "rolling the dice": the node is promoted to the next level while each roll succeeds with probability P, up to a maximum level.
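In C, this dice-rolling loop might look like the following sketch (a generic illustration rather than the exact Redis source; the constant names mirror Redis's ZSKIPLIST_P and ZSKIPLIST_MAXLEVEL):

```c
#include <stdlib.h>

#define SKIPLIST_P 0.25      /* promotion probability, the "dice" */
#define SKIPLIST_MAXLEVEL 32 /* hard cap on a node's height */

/* Roll the dice once per level: keep promoting the new node while a
 * uniform draw lands below P, stopping at the maximum level. */
static int random_level(void) {
    int level = 1;
    while ((rand() / (double)(RAND_MAX + 1.0)) < SKIPLIST_P &&
           level < SKIPLIST_MAXLEVEL)
        level++;
    return level;
}
```

With P = 0.25 the average height comes out at 1/(1-P) ≈ 1.33 levels per node.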
I have observed default P values for skip lists "in the wild" ranging from 0.25 to 0.5. In Redis, ZSKIPLIST_P is 0.25 with a ZSKIPLIST_MAXLEVEL of 32.

The optimal value for P in a general-purpose skip list is 1/e, where e is Euler's number. For a detailed proof, see "Analysis of an optimized search algorithm for skip lists" by Kirschenhofer et al. (1995).
To investigate, I benchmarked before and after changing the P value by running a server with redis-server --save "" --appendonly no, then running a bash script to benchmark ZSET operations. Here is a chart showing the resulting improvement in ZADD operations per second (higher is better):

This change should improve the speed of all operations that require searching a skip list of any length. Typical benchmark results show a 4-8% improvement, with a tradeoff of slightly more memory used to store the data structure.