
[BUG] Inability to write when reached maxmemory in FLASH mode #645

Open
jianjun126 opened this issue Apr 26, 2023 · 25 comments

Comments

@jianjun126

jianjun126 commented Apr 26, 2023

Describe the bug
Hi,
I have tested on CentOS 7 with memtier_benchmark (v1.4.0). When memory reaches its limit, writes sometimes fail entirely or write performance drops very low, for example:

[screenshot]

After the memory limit is reached, write performance also degrades to a very low value once keys start to expire or when a query-performance test is started, for example:

[screenshots]

To reproduce

keydb command:
./keydb-server ./keydb.conf --storage-provider flash /data1/6333/ --storage-provider-options "use_direct_reads=true;allow_mmap_reads=false;use_direct_writes=true;allow_mmap_writes=false"
keydb config:
keydb.conf.zip

memtier_benchmark command:
taskset -c 26-29,78-84 memtier_benchmark -s 127.0.0.1 -p 6333 -t 4 -c 20 -n 2000000 --distinct-client-seed --command="set __key__ __data__ ex 66000" --key-prefix="testkey_v3_" --key-minimum=100000000 --key-maximum=999000000 -R -d 800

Expected behavior
When maximum memory is reached in FLASH mode, KeyDB should continue to accept writes and maintain good performance.

Additional information
I have tried modifying several KeyDB parameters, but the behavior described above persists.

@jianjun126 jianjun126 changed the title [BUG] [BUG] Inability to write when reached maxmemory in FLASH mode Apr 26, 2023
@paulmchen
Contributor

It sounds like you are hitting the replication output buffer hard limit, which triggers a fast full sync (check your log for confirmation). If that is the case, increasing the replication hard limit in your conf will help:

client-output-buffer-limit replica 2gb 2gb 60

In the SSD case, you can also adjust the following parameter to a value larger than 1, to better handle large write loads:

maxmemory-eviction-tenacity 35
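
If you would rather not restart the server, both settings can usually also be applied at runtime with CONFIG SET (a sketch only, assuming keydb-cli and the repro port 6333; confirm against your version's config handling):

keydb-cli -p 6333 config set client-output-buffer-limit "replica 2gb 2gb 60"
keydb-cli -p 6333 config set maxmemory-eviction-tenacity 35
keydb-cli -p 6333 config rewrite   # optionally persist the change back to keydb.conf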

@jianjun126
Author

@paulmchen Thank you very much for your suggestion. After modifying the configuration as you suggested, the same issues remain: writes still fail and write performance is low. Here are two screenshots of the test results.

[screenshots]

@jianjun126
Author

@paulmchen
Hi paulmchen,
I tried modifying some other parameters, but still couldn't solve the problem. Could you please give me some more advice on this issue? Thanks for your kind attention; I look forward to your prompt reply.

@msotheeswaran-sc
Collaborator

Hi @jianjun126, are the writes being rejected or hanging? It could be a similar problem to #646, but with the expireset taking up all the memory instead of the slots_to_keys map.

@paulmchen
Contributor

@jianjun126 since it looks like you have a single-master configuration (without a slave), it won't be a replication backlog issue or a fast full sync issue. I suggest running FlameGraph to determine where the bottleneck causing the low CPU usage is.

Follow these instructions to set up and run FlameGraph to identify your system's performance issue:

  1. Place the FlameGraph tool in an empty folder (e.g. /FlameGraph):
    git clone https://github.com/brendangregg/FlameGraph # or download it from github

  2. Start your KeyDB server and run your client workload

  3. Shortly before (or as soon as) you see low or zero QPS, do the following:
    cd /FlameGraph
    perf record --call-graph=dwarf -p [process_id] # where process_id is the process id of the KeyDB server running

    Record for about 30 seconds (while seeing 0 QPS, or until the end of the slow period if it lasts less than a minute), then stop the recording.

    perf script > out.perf # send result to out.perf file
    ./stackcollapse-perf.pl out.perf > out.folded
    ./flamegraph.pl out.folded > keydb-fg-result.svg

  4. It would be helpful if you could share the svg file so we can check where the CPU time is going
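
For convenience, the steps above can be combined into a single run (a sketch; the 30-second window and output file names simply follow the steps above, and [process_id] is the PID of the running keydb-server):

    cd /FlameGraph
    perf record --call-graph=dwarf -p [process_id] -- sleep 30   # sample the server for ~30 s while QPS is low/zero
    perf script > out.perf                                       # dump the samples as text
    ./stackcollapse-perf.pl out.perf > out.folded                # collapse the stacks
    ./flamegraph.pl out.folded > keydb-fg-result.svg             # render the flame graph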

@jianjun126
Author

jianjun126 commented May 6, 2023

@paulmchen
Hi paulmchen,
I have got two SVG results with FlameGraph; each perf run takes about 50 seconds. Here are two relevant screenshots of memtier_benchmark.
"LOW OPS"
image
"ZERO OPS"
image

"FlameGraph result while LOW OPS"
keydb-fg-slow

"FlameGraph result while ZERO OPS"
keydb-fg-0ops

@paulmchen
Contributor

paulmchen commented May 6, 2023

According to the FG diagram, more than 73% of CPU cycles are spent performing evictions, and EvictionPoolPopulate alone consumes more than 55% of CPU cycles (see evict.cpp). My suspicion is that your volatile-ttl setting for maxmemory-policy may not work well with the ex 66000 in your benchmark command:

memtier_benchmark -s 127.0.0.1 -p 6333 -t 4 -c 20 -n 2000000 --distinct-client-seed --command="set __key__ __data__ ex 66000" --key-prefix="testkey_v3_" --key-minimum=100000000 --key-maximum=999000000 -R -d 800

You may try the following:

  1. Run the benchmark from a separate client machine. The benchmark tool is currently running on the same machine as the KeyDB server, so under heavy workload the client and the server compete for CPU/memory. (bind 127.0.0.1 -::1 in the configuration listens only on the loopback interface; change it to your private IP so you can drive the heavy workload from another client machine on the same subnet without the client and server competing for resources.)

  2. Try with this maxmemory_policy instead:
    maxmemory-policy allkeys-lru

  3. If you have more than 1 core on your keydb server, giving it a bit more threads to handle the workload will help a lot. For example:

    server-threads 4
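
Put together, the keydb.conf changes from points 1-3 might look like the excerpt below (a sketch; 10.0.0.5 is a placeholder for your server's private IP):

    bind 10.0.0.5                   # placeholder private IP; lets a remote client run the benchmark
    maxmemory-policy allkeys-lru    # evict any key by LRU instead of volatile-ttl
    server-threads 4                # use more threads if cores are available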

@jianjun126
Author

@paulmchen
Hi paulmchen,

My test environment is a dual-socket server with 8269CY CPUs and six memory channels, and almost no other tasks run on it during the test. So it is almost certainly not a hardware resource issue.
I have also tested the other parameters you suggested many times, e.g. server-threads 4, server-threads 5, maxmemory-policy allkeys-lru, and allkeys-lfu.

If necessary, I will retest as soon as possible with your suggested parameters and environment.

@quwu0820

Platinum 8269CY: 26 cores / 52 threads, 2.5 GHz base frequency.

@jianjun126
Author

@paulmchen @msotheeswaran

I wrote a script to test KeyDB's performance at a low write rate, but I found that even at a low write rate the inability-to-write problem still occurs.

The test method: within each 300-second window, write randomly 5000-8000 times per second for 50-70 seconds, and 250 times per second for the rest of the window; each write is 8000 bytes (a sketch of this pattern is shown below).

During the test I varied maxmemory; whether it was 1 GB, 4 GB, or 24 GB, the problem occurred, typically about 1.5 hours or 3.5 hours into the test.
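
The original script was not posted; the following is only a minimal Python sketch of the write pattern described above, assuming the redis-py client and the same port, key prefix, and 66000 s expiry as the memtier command (all of these are illustrative):

import random
import time

import redis  # assumption: redis-py talking to the KeyDB server

r = redis.Redis(host="127.0.0.1", port=6333)
VALUE = b"x" * 8000  # 8000-byte payload per write

def write_at(rate, seconds):
    """Issue roughly `rate` SET commands per second for `seconds` seconds."""
    end = time.time() + seconds
    while time.time() < end:
        slot_start = time.time()
        for _ in range(rate):
            key = "testkey_v3_%d" % random.randint(100000000, 999000000)
            r.set(key, VALUE, ex=66000)  # same expiry as the benchmark command
        # sleep off whatever remains of this one-second slot
        time.sleep(max(0.0, 1.0 - (time.time() - slot_start)))

while True:  # one 300-second window per iteration
    burst_seconds = random.randint(50, 70)                # 50-70 s of high-rate writes
    write_at(random.randint(5000, 8000), burst_seconds)
    write_at(250, 300 - burst_seconds)                    # low rate for the rest of the window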

@paulmchen
Contributor

That is a bug; the following commit addresses the 0 and low QPS issue (the 0 QPS is caused by eviction kicking in right after maxmemory is reached). However, it may have a side effect: after reaching maxmemory, memory usage may continue to grow. @JohnSully, John, could the following commit be added to main as well, and are there any side effects, for example memory growing without being evicted effectively and on time?

#439

@jianjun126 you can try this commit and see if it helps.

@jianjun126
Author

jianjun126 commented May 20, 2023

@paulmchen
Hi paulmchen,
I tried the code modified in #439 with memtier_benchmark. It does address the 0 OPS issue, but the low OPS issue remains.
At the same time, as you guessed, memory keeps growing without being evicted effectively.

[screenshots]

@paulmchen
Contributor

@JohnSully @msotheeswaran John and Malavan, this seems to be a pretty serious problem; I am able to reproduce it as well. Note: with the fixes in #439 the zero QPS problem is gone, but after reaching maximum memory the memory continues to grow. Could it be something related to the GC not taking effect?

@msotheeswaran-sc
Collaborator

msotheeswaran-sc commented May 24, 2023

I believe it is actually from the expireset: currently, when a key is evicted to storage, its expire entry stays in memory in the expireset. There is no mechanism to expire keys in the storage provider, so without keeping the entry in the expireset the key would stay in rocksdb without ever being expired until it is accessed again. I am working on a bigger change to add support for expiring from rocksdb; in the meantime you can try this commit: 6eb595d, but I have not tested it.

Edit: There was a mistake, so you will also need this commit: 6a32023

@hengku
Contributor

hengku commented May 24, 2023

Hi @jianjun126, not sure if the following (suggested by @JohnSully) can help your case; I tried it and it greatly reduced the occurrences of 0 QPS, at least in my testing environment.

  1. still use the original code without applying the fix in Enable eviction tenacity feature for storage providers #439
  2. set maxmemory-samples to 5 in keydb.conf. Its default value is 16 in the code if you don't specify it in the conf
  3. keep maxmemory-eviction-tenacity at its default value of 10

@jianjun126
Author

jianjun126 commented May 25, 2023

@msotheeswaran Hi Malavan, I tried the code modified in 6a32023 with memtier_benchmark. It also addresses the 0 OPS issue and avoids the memory growth, but the low OPS issue remains. When the test program first started running, the write rate could reach 40,000 OPS, but after two hours it was only 1000-2000 OPS.
[screenshots]

@jianjun126
Author

jianjun126 commented May 25, 2023

@hengku Hi hengku, thanks very much for your attention and suggestions. I tried the parameters you suggested with KeyDB v6.3.3. However, during the test there were still 0 OPS periods, one of which lasted 235 seconds.

[screenshots]

Here is the config file and test command.
keydb.zip
memtier_benchmark -s 127.0.0.1 -p 6600 -t 4 -c 20 -n 1000000 --distinct-client-seed --command="set __key__ __data__ ex 66000" --key-prefix="testkey_v1_" --key-minimum=100000000 --key-maximum=999000000 -R -d 800

From my test results, this seems to be quite different from yours. Could you help me check the config file, or share your testing process?

@hengku
Contributor

hengku commented May 25, 2023

Oh, I am using version 6.2.1 with some in-house code changes. I also observed the 0 QPS issue recently, and it seems to be fixed by setting those two parameters. Below are my conf and memtier command:

port 6379
bind 192.168.0.2
protected-mode no
daemonize no
timeout 0
server-threads 3
maxmemory 1gb
storage-provider flash /test
maxmemory-policy allkeys-lru
maxmemory-samples 5
maxmemory-eviction-tenacity 10
save ""
appendonly no
min-clients-per-thread 0

memtier_benchmark -s 192.168.0.2 -p 6379 -t 10 -c 100 -n 10000000000 -d 256 --key-minimum=1 --key-maximum=10000000000 --ratio 1:0 --key-pattern=P:P

@hengku
Contributor

hengku commented May 26, 2023

@jianjun126 I tried another approach using the same testing environment and memtier command as above, and with it I did not observe 0/very low QPS or continuously growing used memory. Not sure if you still want to give it a try and see if it works for your case.

  1. apply fix Enable eviction tenacity feature for storage providers #439 on top of your original code
  2. set maxmemory-eviction-tenacity to 35
  3. regarding maxmemory-samples, either 5 or 16 worked in my case
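
For reference, one way to apply #439 (step 1) on top of an existing checkout is via GitHub's pull-request refs (a sketch; the local branch name is illustrative, origin is assumed to point at the KeyDB GitHub repository, and you may need to resolve conflicts):

    git fetch origin pull/439/head:pr-439   # fetch the PR as a local branch
    git merge pr-439                        # merge it onto your current code
    make                                    # rebuild keydb-server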

@jianjun126
Author

@hengku Hi hengku, thanks again for sharing your setup and suggestions. I tried the parameters and command you suggested. The test results are summarized below.
a. KeyDB v6.3.3 + your conf + your memtier command:
----There was still a 0 OPS issue; it dropped to less than 1000 OPS after 360 seconds, and memory did not grow.
b. KeyDB v6.3.3 + code in #439 + your conf + your memtier command:
----I tested twice; memory kept growing and reached 240 GB after about 2 hours.
c. KeyDB v6.3.3 + code in #439 + your conf + your memtier command (with "-d 256" changed to "-d 800"):
----I tested three times; there were OOM errors once maxmemory was reached, and then it dropped to less than 100 OPS. I am not sure whether memory would keep growing, because I stopped the test since the write speed was so low.
d. KeyDB v6.3.3 + code in #439 + your conf + my memtier command:
----I tested twice; there were still low OPS periods lasting some seconds, and memory kept growing.
e. v6.2.1:
----I don't have a license for KeyDB Pro, so I can't test it.

@msotheeswaran-sc
Collaborator

@msotheeswaran Hi Malavan, I tried the code modified in 6a32023 with memtier_benchmark. It also addresses the 0 OPS issue and avoids the memory growth, but the low OPS issue remains. When the test program first started running, the write rate could reach 40,000 OPS, but after two hours it was only 1000-2000 OPS.

@jianjun126 what memtier command did you use for this? Eventually memory will be full and every new key will require evicting existing keys to FLASH first, which results in much lower QPS.

@jianjun126
Author

@msotheeswaran Hi Malavan
The memtier command is:
memtier_benchmark -s 127.0.0.1 -p 6600 -t 4 -c 20 -n 1000000 --distinct-client-seed --command="set __key__ __data__ ex 66000" --key-prefix="testkey_v1_" --key-minimum=100000000 --key-maximum=999000000 -R -d 800

My application scenario is the same as what you describe: memory is always full, new data is continuously written to KeyDB at high OPS, existing data has to be continuously evicted, and older data is deleted from disk after some time.

Could you give some suggestions for this application scenario?

@JohnSully
Collaborator

@jianjun126 Why did you disable mmap? This results in large buffers being created and freed, which can exacerbate this problem.

@jianjun126
Author

@JohnSully we want to use direct I/O, so mmap cannot be enabled.

[screenshot]

@jianjun126
Author

jianjun126 commented Nov 20, 2023

@msotheeswaran-sc From the release notes, this issue seems to have been fixed, so I tested again with the same configuration as before (with maxmemory changed to 8 GB). The performance of the new version has indeed improved, but similar issues to the earlier ones still occur.
[screenshot]

Changing the test data length from "-d 800" to "-d 8" also avoids the 0 OPS problem.
[screenshot]

If maxstorage is configured, there are a large number of OOM errors.
[screenshot]
