
Change skip list P value to 1/e, which improves search times #3889

Closed
wants to merge 2 commits

Conversation

@sean-public commented Mar 20, 2017

The probability of adding new nodes to each level of a skip list (the node's height) is determined by successively "rolling the dice" at each level until a roll fails to meet probability P or the maximum level is reached.

I have observed default P values for skip lists "in the wild" ranging from 0.25 to 0.5. In Redis, ZSKIPLIST_P is 0.25 with a ZSKIPLIST_MAXLEVEL of 32.

The optimal value for P in a general-purpose skip list is 1/e, where e is Euler's number. For a detailed proof, see "Analysis of an optimized search algorithm for skip lists" by Kirschenhofer et al. (1995).

To investigate, I ran the server before and after changing the P value (`redis-server --save "" --appendonly no`), then ran a bash script to benchmark ZSET operations. Here is a chart showing the resulting improvement in ZADD operations per second (higher is better):

[chart: ZADD requests per second before and after the change; higher is better]

This change should improve the speed of all operations that require searching a skip list of any length. Typical benchmark results show a 4-8% improvement with a tradeoff of slightly more memory used to store the data structure.

@sean-public (Author)

It's been a few months; is there anything I can do to have this PR considered? I've tried IRC already and got some positive feedback, but nobody with write access noticed.

@antirez (Contributor) commented Jul 14, 2017

Hello, very interesting! First time I see this, sorry and thanks for pinging me about it. I'll review soon, thanks.

@abclirun

I think when P is 1/4 the space is about 1.33n pointers, while with P = 1/e it goes up to about 1.58n. When adjusting that, you should also keep memory in mind.

@antirez (Contributor) commented Aug 31, 2017

Yep, that was my wonder indeed: P is a tradeoff between space and time, and since sorted sets are already extremely heavy I wonder if changing the value is a good tradeoff for us. Another thing that worries me is the graph above: why is the red line so far off the theoretical trajectory? I wonder if the test was affected by enough error that it's hard to tell exactly how much faster the new implementation is. However, it looks quite a bit faster, and we should keep in mind that a sorted set is a skiplist plus a hash table, so the memory footprint of the skiplist itself is not so dominant. TL;DR: this could be a good change, but I need memory numbers to evaluate it.

@sean-public (Author) commented Sep 20, 2017

I ran another round of tests. Here are the results. Let me know if I can clarify anything or look deeper into the tradeoffs.

Configuration

Hardware is a 2016 MacBook Pro 15" with Touch Bar. CPU: 2.7 GHz Intel Core i7. Memory: 16 GB 2133 MHz LPDDR3. OS: macOS Sierra 10.12.6.

I compiled 4 versions from the unstable branch:

  1. ZSKIPLIST_P=0.25 and MALLOC=libc
  2. ZSKIPLIST_P=1/M_E and MALLOC=libc
  3. ZSKIPLIST_P=0.25 and MALLOC=jemalloc
  4. ZSKIPLIST_P=1/M_E and MALLOC=jemalloc

Benchmark

I wrote another bash script that keeps adding to a sorted set via ZADD up to 10 million entries with more granularity than the previous test. It outputs CSV data that I charted afterwards.

Results

First, some averages to see how much faster and how much more memory-hungry P=1/e is.

| allocator | % faster (requests/sec) | % more memory |
| --- | --- | --- |
| libc malloc | 9.55% | 3.28% |
| jemalloc | 3.69% | 4.46% |

The final amount of memory allocated after inserting all 10 million items in all 4 tests:

| allocator | P=0.25 (bytes) | P=1/e (bytes) |
| --- | --- | --- |
| libc malloc | 1,276,409,376 | 1,316,910,928 |
| jemalloc | 1,037,229,432 | 1,081,102,856 |

Here are some charts for libc malloc detailing requests per second (RPS) and memory use as the set is expanded:

[charts: requests per second and memory use vs. set size, libc malloc]

Next up are the charts for jemalloc with the same tests:

[charts: requests per second and memory use vs. set size, jemalloc]

Conclusions

  • jemalloc is much faster (duh)
  • the P=1/e speed benefits are more pronounced with libc malloc (9.55% vs 3.69%)
  • the additional memory used is very consistent but differs for each allocator (3.28% and 4.46%)
  • the jemalloc tests very consistently show an RPS spike at 1,750,000 entries

I think the tradeoff is worth it. It doesn't introduce any additional unpredictability in speed and memory use is very steady with a fixed additional overhead. I also think that it's worth exposing this type of tuning as a user-controlled setting.

@yoav-steinberg (Contributor)

@filipecosta90 I think this one is worth your review. It might be good to verify the results again. And then finally make a call regarding the performance/memory tradeoff this presents. The bottom line here is that given the relatively modest numbers (for the jemalloc case), I don't think it's critical either way. But we should definitely merge or close this already.

@filipecosta90 filipecosta90 added the action:run-benchmark Triggers the benchmark suite for this Pull Request label Nov 2, 2021
@filipecosta90 (Contributor)

> @filipecosta90 I think this one is worth your review. It might be good to verify the results again. And then finally make a call regarding the performance/memory tradeoff this presents. The bottom line here is that given the relatively modest numbers (for the jemalloc case), I don't think it's critical either way. But we should definitely merge or close this already.

@yoav-steinberg I'm working on seeing if this change can be measured/detected by the current use-case on https://github.com/redis/redis-benchmarks-specification.
TL;DR: I'll get the numbers, ideally in an automated manner via the action:run-benchmark tag, and will update further before EOW.

@madolson (Contributor)

@filipecosta90 Any chance you have an update?

@filipecosta90 (Contributor)

> @filipecosta90 Any chance you have an update?

Planning to focus on this tomorrow, @madolson. Sorry for the late reply.

@filipecosta90 filipecosta90 added action:run-benchmark Triggers the benchmark suite for this Pull Request and removed action:run-benchmark Triggers the benchmark suite for this Pull Request labels Dec 7, 2021
@filipecosta90 (Contributor)

> @filipecosta90 Any chance you have an update?

@madolson @sean-public From the simple ZADD test (described here: https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/redis-benchmark-full-suite-1Mkeys-100B.yml), which only focuses on achievable throughput and overall client p50 latency, there is no visible change on an oss-standalone setup:
[chart: ZADD throughput and p50 latency, oss-standalone, both variants]

Madelyn you can consult it here:
https://benchmarksredisio.grafana.net/d/1fWbtb7nz/experimental-oss-spec-benchmarks?orgId=1&var-platforms=amd64-ubuntu18.04-m6i.8xlarge-1&var-test_suite=redis-benchmark-full-suite-1Mkeys-100B&var-metric_context_path=ZADD&var-build_variants=gcc%3A8.5.0-amd64-debian-buster-default&from=1638748800000&to=1638921599000

However, this is a very simple test for catching the changes/improvements @sean-public worked on. I'll extend the tests to include at least the charts shared in this PR description, so that we have an automated, clean improvement %change calculation, and will get back with further data.

@sean-public (Author)

> there is no visible change on an oss-standalone setup

That seems correct: the earlier benchmarks using the scripts I provided show a visually appreciable difference in latency only well after 1M keys are inserted (with the magnitude varying based on the memory allocator used).

I think the speed/memory tradeoff is solid and only improves performance as measured in requests per second, especially with pipelining.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


sean-public seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Have you signed the CLA already but the status is still pending? Let us recheck it.

@sean-public (Author)

Not signing the CLA.

Labels
action:run-benchmark Triggers the benchmark suite for this Pull Request
7 participants