Performance degradation 7.0.3 vs 6.2.7 #10981
Hi there @academe-01, trying to reproduce as follows:
I was able to measure ~3.7% overhead, which was described/justified before in #10460 due to the following features:
Here's the benchmark table from a run on my local lab (note that ideally we should run this for longer and multiple times):
One interesting thing is that the LRANGE tests are showing a larger drop. I'll look into this in more detail during the following days and report back. |
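For anyone wanting to reproduce a comparison table like the one referenced above, a minimal sketch follows (ports, request counts and the no-persistence flags are assumptions, not the exact setup used):
# start each build on its own port, persistence disabled to keep runs comparable
./redis-6.2.7/src/redis-server --port 6379 --save '' --appendonly no --daemonize yes
./redis-7.0.3/src/redis-server --port 6380 --save '' --appendonly no --daemonize yes
# run the default redis-benchmark test suite against each build and collect CSV output
redis-benchmark -p 6379 -n 100000 -c 50 --csv > 6.2.7.csv
redis-benchmark -p 6380 -n 100000 -c 50 --csv > 7.0.3.csv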
Hi @filipecosta90, |
hi @filipecosta90 @academe-01
6.2.7
|
@sundb notice that on unstable we have #10697, which is not present on v7.0.3 if I'm not mistaken. |
@filipecosta90 #10697 does exist in v7.0.3. |
@filipecosta90 one thing i'm still puzzled about is how it is related to #10987. IIUC it's just that while you investigated this issue you came across that opportunity. |
yes. Given I focused on LRANGE benchmarks due to the higher diff, I bumped into #10987
The following 7.0 features are known to take some CPU:
but there is still a GAP to understand. I'm working on it |
I ran redis-benchmark with -t lrange for 6.2.7 and 7.0.3 with a customized gprof. It samples variables while profiling. I noticed the main difference is in the function lookupKey in db.c. The samples on the variable key are dramatically different. All samples from 7.0.3 are zeroed. I attached the files here. Maybe they could help explain the performance degradation. |
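For context, the LRANGE-only part of such a comparison can be run with something like the following (a sketch; the ports, request count and client count are assumptions):
# run only the LRANGE tests against each server and compare ops/sec
redis-benchmark -p 6379 -t lrange -n 100000 -c 50 -q
redis-benchmark -p 6380 -t lrange -n 100000 -c 50 -q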
@wenglingmei i don't understand what you mean by "customized gprof", and what are "samples on the variable key" |
Hi @oranagra, thank you for the reply. I should have mentioned the customized gprof is my research project, implemented on top of gprof. Simply put, if you check the file, I recorded the value of the variable "key", which is the parameter of the function "lookupKey" defined in src/db.c line 66 in redis-6.2.7. In this test, many more variables are sampled, including the parameter "key". My tool marks its values as a major difference between the two versions. If you want more detail, I would be happy to post it here. |
@wenglingmei i'm still not sure i understand what it does, |
Just checked the benchmark source code and realized the keys are not set with an expiration time. Sorry for the 'stupid' guess. I will explore more components other than db.c. |
Hello, I also observed a performance drop when comparing Redis 6.2.7 with Redis 7.0.5 on macOS (M1, arm64). I am running my Go application benchmarks:
So the overall operation latency regression is significant.
The benchmarking code is the same; the only difference is the Redis server version the benchmarks are launched with. In the case above Redis CPU usage was up to 100% during the load. I also decreased the load from the app to keep Redis under 100% CPU – and results are like this:
Also worse in the Redis 7 case. Redis was launched like this (6.2.7):
And like this (7.0.5):
|
@FZambia that's a pretty big difference which we didn't see in our own tests, can you maybe help us narrow it down? |
Since I am using Lua scripts, possibly the latency degradation is somehow related to them? Running this:
On Redis 6:
On Redis 7:
Both Redis versions installed from |
@sundb i guess you mean that you already used bisect and found the offending commit? |
I tried to run my app benchmarks with Redis before #10220 and after, building Redis from source. Results are:
So it definitely affected latency, but it's not the only source of the degradation I suppose, as for Redis 6.2.7 I observed 3.6µs of latency in my benchmarks. Unfortunately, I can not build Redis from source from
|
For
Before #10220:
After:
So it's slower, but again – it seems the overall difference between |
@FZambia the reason you failed to build 6.2.7 was that when switching between branches, you often have to do a full cleanup of the previous build first. I was also able to reproduce a big performance difference between 7.0 and 6.2 using your benchmark line (on x86_64 Linux). |
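A sketch of that rebuild-and-retest flow (the exact cleanup command was stripped from the comment above; make distclean is the usual step when switching Redis branches, and the tag, port and benchmark line here are assumptions):
cd redis
git checkout 6.2.7                 # or any tag/commit under test, e.g. before/after the change being bisected
make distclean && make -j          # distclean avoids stale objects from the previous branch
./src/redis-server --port 6379 --save '' --appendonly no &
redis-benchmark -p 6379 -n 100000 -c 50 -q   # re-run the same benchmark line against each build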
This helped, thanks!
For my Go app benchmark:
6.2.7: 3.6µs latency
For redis-benchmark:
For 6.2.7:
before #10220:
after #10220:
7.0.5:
As we can see, latency just keeps getting worse from version to version, and #10220 is not the only reason for it. |
How does that sit with the previous statement that the majority of the degradation from 6.2 was before #10220? And the benchmark in the PR that shows the optimization doesn't contribute much? |
@sundb it seems that some of the degradation (half of it) is being caused by us calling evalCalcFunctionName twice when not required. Fixing it bumped the numbers on unstable by 10%.
Indeed it might be worth reintroducing that optimization if possible @oranagra / @yoav-steinberg |
@FZambia here is a status update on what was improved since 7.0.5 until now WRT EVAL/EVALSHA:
by using the following two benchmarks: setup
benchmark commands
we get the following results:
can you confirm on your side that unstable will get you to 6.2.7 levels? |
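The setup and benchmark commands above did not survive extraction; purely as an illustration of the kind of EVAL vs EVALSHA benchmark being discussed (the script body, key name and request counts below are placeholders, not the ones actually used):
# cache a trivial script and keep its SHA1
SHA=$(redis-cli script load "return redis.call('set', KEYS[1], ARGV[1])")
# EVAL ships the full script body on every request
redis-benchmark -n 1000000 -c 50 eval "return redis.call('set', KEYS[1], ARGV[1])" 1 bench:key value
# EVALSHA refers to the cached script by SHA1 only
redis-benchmark -n 1000000 -c 50 evalsha "$SHA" 1 bench:key value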
Hello @filipecosta90, thanks for the update. I just checked my numbers, they do not correspond to yours unfortunately, though the improvement is notable (I am on arm M1 Mac btw). Redis-benchmark:
6.2.7: 318572 rps
My app bench which executes Lua (with EVALSHA):
6.2.7: 3527 ns/op |
let me confirm the gap you're seeing is now related to the EXPIRE command and that the rest is ok now. Will update back 👍 |
6.2.7: 428082.19 |
@academe-01 / @oranagra / @FZambia as a follow-up to the initial PR comment I've re-run the default redis-benchmark test suite comparing the current unstable branch vs 6.2.7.
Looking at the achievable ops/sec we can confirm that there is no regression on unstable vs 6.2. Furthermore, we can confirm the opposite, given the work that has gone into optimizing the commands/replies/etc.
I suggest we keep digging into the missing explanation for the EXPIRE overhead and then (as soon as we resolve that EXPIRE regression raised on the EVAL benchmark) close this open issue. |
…me (redis#11521) As being discussed in redis#10981 we see a degradation in performance between v6.2 and v7.0 of Redis on the EVAL command. After profiling the current unstable branch we can see that we call the expensive function evalCalcFunctionName twice. The current "fix" is to basically avoid calling evalCalcFunctionName and even dictFind(lua_scripts) twice for the same command. Instead we cache the current script's dictEntry (for both Eval and Functions) in the current client so we don't have to repeat these calls. The exception would be when doing an EVAL on a new script that's not yet in the script cache. in that case we will call evalCalcFunctionName (and even evalExtractShebangFlags) twice. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit 7dfd7b9)
Tried unstable (from 20854cb, which is the latest commit at this point); again a notable improvement, thanks a lot for your work on it. I can't say it's now comparable to 6.2.7 though:
Redis-benchmark:
6.2.7: 313185 rps
My app bench which executes Lua (with EVALSHA):
6.2.7: 3527 ns/op
For the reference, the Lua script in my app is this one:
redis.call("zadd", KEYS[1], ARGV[2], ARGV[3])
redis.call("hset", KEYS[2], ARGV[3], ARGV[4])
redis.call("expire", KEYS[1], ARGV[1])
redis.call("expire", KEYS[2], ARGV[1]) |
The two versions have heavy code refactoring. I investigated more functions ranked by our comparison tool. We find that the difference in the function clientHasPendingReplies contributes to the performance regression.
[redis-benchmark CSV results for both versions were attached here; only the header row ("test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms") survived extraction.]
I am new to the code base of Redis and hope you are willing to verify the findings. |
@wenglingmei Please note that all the optimizations for this issue are in 7.0.6, so you should compare 6.2.7 with 7.0.6 (or unstable).
6.2.7
|
@sundb I noticed the improvement in the unstable branch. My purpose is to figure out the performance gaps. There could also be potential to optimize more in new versions. Just a quick test of the same patch on unstable on my machine.
unstable:
unstable with patch: |
@wenglingmei But the patch doesn't make sense, it just forces it out, but this piece of code is necessary unless we can speed up |
@sundb it is clear to me. The patch is not meant as a fix, just for verification. |
Can we have fresh benchmarks with 7.0.11 to be sure the regression has been fixed? |
@Cherry Can you confirm from which version you have upgraded to 7.0.11 please? |
@filipecosta90 hello! In my comment I pointed out that while things were drastically improved the regression is still there - something like 15%. I am running benches on M1 ARM though. And it looks like there were some additional optimizations since then. Could you please run the following on a Linux machine with 6.2.7 and 7.0.11?
If the difference is negligible then it's good to close from my point of view. Many thanks to all the team for working on this. |
look forward to the results from a Linux machine |
Finally I was able to launch the above benchmark on a Linux machine. I.e.:
On Debian 12:
Compared 6.2.14 with 7.2.3. Redis 6.2.14:
Redis 7.2.3:
|
@FZambia, so compared to your previous post, I suppose the difference is that you compare to 7.2 rather than 7.0 (not the M1 architecture or other macOS-related build differences). |
@oranagra no concerns from my side anymore, thanks a lot to the team for all the improvements |
Hi,
I tested on different systems, but every time the results are the same: 7.x performs worse than 6.x.
(config attached)
redis_conf.txt