
Strange behavior of TiKV: performance does not increase when adding more load-test clients while the TiKV server is idle #3566

Closed
romiguan opened this issue Sep 4, 2018 · 8 comments
Labels
type/question Type: Issue - Question

Comments

@romiguan commented on Sep 4, 2018:

Env: 6-node TiKV cluster (64 cores, 256 GB memory, 4 TB NVMe disk per node) holding 3 billion records (key size: 8 bytes, value size < 50 bytes), using TiKVRawClient to do BatchGet (300 keys per batch).
Strange behavior found when adding more clients: TiKV server throughput does not increase as more load-test clients (more TiKVRawClient instances) are added, even though the TiKV servers are below 50% CPU usage and many threads are idle.

Is there any traffic control/throttling in the TiKV server? Or are there parameters that should be tuned?

Thanks.
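
For reference, a minimal sketch of this load pattern, assuming the tikv/client-java raw API (org.tikv.raw.RawKVClient); class and package names may differ across client versions (e.g. ByteString may be shaded), and the PD address is a placeholder:

import java.util.ArrayList;
import java.util.List;

import com.google.protobuf.ByteString;
import org.tikv.common.TiConfiguration;
import org.tikv.common.TiSession;
import org.tikv.kvproto.Kvrpcpb;
import org.tikv.raw.RawKVClient;

public class BatchGetLoadClient {
    public static void main(String[] args) throws Exception {
        // "pd0:2379" is a placeholder PD address for this sketch.
        TiConfiguration conf = TiConfiguration.createRawDefault("pd0:2379");
        TiSession session = TiSession.create(conf);
        RawKVClient client = session.createRawClient();

        // One batch of 300 short keys, mirroring the workload in this issue
        // (8-byte keys, values under 50 bytes).
        List<ByteString> keys = new ArrayList<>(300);
        for (int i = 0; i < 300; i++) {
            keys.add(ByteString.copyFromUtf8(String.format("%08d", i)));
        }

        List<Kvrpcpb.KvPair> pairs = client.batchGet(keys);
        System.out.println("fetched " + pairs.size() + " pairs");

        client.close();
        session.close();
    }
}

Running several instances of a loop like this in parallel is the "more load-test clients" scenario the issue describes.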

romiguan changed the title from "Strange behavior of TiKV: performance downgrade when increasing load test client" to "Strange behavior of TiKV: performance not increase when adding more load test client while TiKV server is idle" on Sep 4, 2018
@Connor1996 (Member) commented:

There may be a bottleneck in some threads; you can use top -H -p {tikv-pid} to see per-thread CPU usage. If the grpc-server threads are close to 100%, you can increase grpc-concurrency in the config.
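
For example, this setting lives under the [server] section of the TiKV config file (the value below is illustrative, not a recommendation):

[server]
# Number of gRPC server threads; raise this if grpc-server threads are pegged near 100%.
grpc-concurrency = 8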

@Connor1996 added the type/question label on Sep 5, 2018
@romiguan (Author) commented on Sep 6, 2018:

The following config changes have been made:

grpc-concurrency = 64
normal-concurrency = 16
grpc-stream-initial-window-size = "64MB"

The grpc and normal-concurrency threads' CPU usage is about 40-60%, and no end-point thread shows up in the top output.
Any suggestions for further tuning?

@breezewish (Member) commented:

What kind of payload is it? Pure write? Pure read? Or mixed?

@romiguan (Author) commented on Sep 6, 2018:

Pure batchGet operations, details as below:

Env: 6-node TiKV cluster (64 cores, 256 GB memory, 4 TB NVMe disk per node) holding 3 billion records (key size: 8 bytes, value size < 50 bytes), using TiKVRawClient to do BatchGet (300 keys per batch).

@Connor1996 (Member) commented:

The end-point thread has been renamed to readpool-low/normal/high. So you mean you have already set the configs but the performance didn't increase? Which thread has the highest usage, and how high is it?

@romiguan (Author) commented on Sep 7, 2018:

Which section should the readpool config go in, readpool.storage or readpool.coprocessor?
The changes below have been made, and performance has increased.
Could you please indicate which configuration is the key point?

[readpool.storage]
high-concurrency = 4
normal-concurrency = 16
low-concurrency = 4
max-tasks-per-worker-high = 10000
max-tasks-per-worker-normal = 10000
max-tasks-per-worker-low = 10000
stack-size = "10MB"

[readpool.coprocessor]
high-concurrency = 60
normal-concurrency = 60
low-concurrency = 60

@Connor1996 (Member) commented:

readpool.storage is related to the kv API;
readpool.coprocessor is related to the coprocessor API.

Which one is the key point depends on where your former bottleneck was. So which thread had the highest usage before?
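
In other words (a sketch summarizing the mapping above; the concurrency values are only examples): a raw BatchGet workload like this one is served by the readpool.storage threads, while readpool.coprocessor only serves coprocessor (TiDB pushdown) requests:

# Serves kv API requests, e.g. raw BatchGet as in this workload.
[readpool.storage]
normal-concurrency = 16

# Serves coprocessor API requests (TiDB pushdown); not exercised by a raw client.
[readpool.coprocessor]
normal-concurrency = 60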

@breezewish (Member) commented:

It seems that you are requesting via the KV API. Would you like to share your top -H output with us?

I guess the performance is limited by the "xxx-concurrency" configurations in the readpool.storage section. Your output will help us confirm the cause.
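
One way to capture a one-shot per-thread snapshot is with standard top batch-mode flags (assuming the TiKV process is named tikv-server on this deployment):

# -b batch mode, -H show threads, -n 1 single iteration
top -b -H -n 1 -p "$(pidof tikv-server)" | head -n 40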

@solotzg closed this as completed on Oct 25, 2018