Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use batch_raft RPC and batch channel to reduce gRPC CPU usage. #3913

Merged
merged 15 commits into from Jan 10, 2019

Conversation

@hicqu
Copy link
Contributor

commented Dec 11, 2018

Signed-off-by: qupeng qupeng@pingcap.com

What have you changed? (mandatory)

Add a batch_raft RPC for TiKV. Use it with a batch channel to reduce gRPC CPU usage.

What are the type of the changes? (mandatory)

Improvement.

How has this PR been tested? (mandatory)

make dev.

Does this PR affect tidb-ansible update? (mandatory)

No.

Refer to a related PR or issue link (optional)

Please review pingcap/kvproto#291 fisrt

TODO list:

  • RFC
  • benchmark
  • test case for fallback
Signed-off-by: qupeng <qupeng@pingcap.com>
@siddontang

This comment has been minimized.

Copy link
Contributor

commented Dec 11, 2018

please add test

hicqu added 6 commits Dec 12, 2018
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
@zhangjinpeng1987

This comment has been minimized.

Copy link
Member

commented Jan 6, 2019

Please fix conflicts.

@ice1000 ice1000 added the C: gRPC label Jan 6, 2019
Signed-off-by: qupeng <qupeng@pingcap.com>
src/server/config.rs Outdated Show resolved Hide resolved
Signed-off-by: qupeng <qupeng@pingcap.com>
@@ -145,6 +146,8 @@ impl Default for Config {
// 100 means gRPC threads are under heavy load if their total CPU usage
// is greater than 100%.
heavy_load_threshold: 100,

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Jan 8, 2019

Member

From you benchmark, 100 is not a good default value.

@@ -79,6 +79,11 @@ lazy_static! {
"tikv_server_raft_message_recv_total",
"Total number of raft messages received"
).unwrap();
pub static ref RAFT_MESSAGE_BATCH_SIZE: Histogram = register_histogram!(

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Jan 8, 2019

Member

Please confirm if this global histogram can become a bottle neck.

const PRESERVED_MSG_BUFFER_COUNT: usize = 1024;

const RAFT_MSG_MAX_BATCH_SIZE: usize = 128;
const RAFT_MSG_NOTIFY_SIZE: usize = 8;

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Jan 8, 2019

Member

Have you tried different values for the above two variables?

This comment has been minimized.

Copy link
@hicqu

hicqu Jan 8, 2019

Author Contributor

I have tried 8/64, 16/64 and 16/128, not found obvious difference between them.

src/server/raft_client.rs Outdated Show resolved Hide resolved
src/server/raft_client.rs Show resolved Hide resolved
.send(msg)
{
error!("RaftClient send fail");
let index = msg.region_id as usize % self.cfg.grpc_raft_conn_num;

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Jan 8, 2019

Member

Seems you forget to change the default value of grpc_raft_conn_num.

hicqu and others added 5 commits Jan 8, 2019
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
Signed-off-by: qupeng <qupeng@pingcap.com>
@zhangjinpeng1987

This comment has been minimized.

Copy link
Member

commented Jan 9, 2019

Please provide benchmark result.

Copy link
Member

left a comment

LGTM

Copy link
Member

left a comment

LGTM

@hicqu hicqu merged commit b7b548a into tikv:master Jan 10, 2019
3 checks passed
3 checks passed
DCO All commits are signed off!
Details
idc-jenkins-ci/test Jenkins job succeeded.
Details
jenkins-ci-tikv/build Jenkins job succeeded.
Details
@hicqu hicqu deleted the hicqu:batch-raft-msg branch Jan 10, 2019
@ngaut ngaut referenced this pull request Jan 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.