
use tokio threadpool and thread local metrics for readpool #4486

Conversation

@fredchenbj
Contributor

fredchenbj commented Apr 5, 2019

What have you changed? (mandatory)

Previously, ReadPool was implemented with futures-cpupool, but tokio-threadpool is faster and more stable under high contention and heavy workload, so this PR switches to it to improve the performance of storage reads and coprocessor requests. Meanwhile, thread-local variables replace the context struct for metrics.
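Conceptually, the change moves from returning a `CpuFuture` to spawning on a pool and handing the caller a receiver for the result. A minimal sketch, with `std::thread` and `mpsc` standing in for tokio-threadpool and the real channel types (names are illustrative, not the actual TiKV API):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical sketch of the spawn-and-receive pattern this PR moves to:
// the pool spawns the closure and hands the caller a receiver for the
// result, instead of returning a CpuFuture from futures-cpupool.
// std::thread stands in for tokio_threadpool::ThreadPool here.
fn spawn_handle<F, T>(f: F) -> mpsc::Receiver<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error: the caller may have dropped the receiver.
        let _ = tx.send(f());
    });
    rx
}

fn main() {
    let rx = spawn_handle(|| 2 + 5);
    assert_eq!(rx.recv().unwrap(), 7);
}
```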

What are the type of the changes? (mandatory)

Improvement (change which is an improvement to an existing feature).

How has this PR been tested? (mandatory)

Unit tests, integration tests, and partial manual tests.

Does this PR affect documentation (docs) update? (mandatory)

No.

Does this PR affect tidb-ansible update? (mandatory)

No.

Refer to a related PR or issue link (optional)

No.

Benchmark result if necessary (optional)

[benchmark screenshot]
From the chart above, under heavy workload with stable QPS, read p99 latency dropped by about 14% and p999 latency by about 20%.

Add a few positive/negative examples (optional)

breeswish and others added some commits Apr 1, 2019

*:use tokio-threadpool and thread local metrics in Storage
Signed-off-by: Breezewish <breezewish@pingcap.com>
*:modify coprocessor metrics and tests with tokio-thread
Signed-off-by: fredchenbj <cfworking@163.com>
@sre-bot

Collaborator

sre-bot commented Apr 5, 2019

Hi contributor, thanks for your PR.

This patch needs to be approved by one of the admins. They should reply with "/ok-to-test" to accept this PR for automatic testing.

// Keep running stream producer
cpu_future.forget();
// cpu_future.forget();

@hicqu

hicqu Apr 11, 2019

Contributor

You can remove the line directly. It's OK because spawn has polled it internally. BTW I prefer to write

self.read_pool.spawn(...)?;
Ok(rx.then(|r| r.unwrap()))

to make it clearer that spawn returns a Result<()>.
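The point about spawn returning a Result can be sketched with a toy bounded pool (`Full` and `BoundedPool` are hypothetical stand-ins for the real pool's error and type, not TiKV code):

```rust
// Toy sketch of hicqu's point: spawn returns a Result so callers can
// propagate a pool-full error with `?` instead of swallowing it.
// `Full` and `BoundedPool` are illustrative, not the TiKV types.
#[derive(Debug, PartialEq)]
struct Full;

struct BoundedPool {
    running: usize,
    max: usize,
}

impl BoundedPool {
    fn spawn<F: FnOnce()>(&mut self, f: F) -> Result<(), Full> {
        if self.running >= self.max {
            return Err(Full); // caller surfaces this with `?`
        }
        self.running += 1;
        f(); // the sketch runs the task inline
        Ok(())
    }
}

fn main() {
    let mut pool = BoundedPool { running: 0, max: 1 };
    assert!(pool.spawn(|| ()).is_ok());
    assert_eq!(pool.spawn(|| ()), Err(Full));
}
```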

assert_eq!(rx.recv().unwrap(), Ok(7));
assert_eq!(rx.recv().unwrap(), Ok(4));
// the recv order maybe: "Ok(2)Ok(4)Ok(7)Ok(3)" or “Ok(2)Ok(3)Ok(4)Ok(7)” or “Ok(2)Ok(4)Ok(3)Ok(7)”
print!("{:?}", rx.recv().unwrap());

@siddontang

siddontang Apr 5, 2019

Contributor

I think we should use assert_eq, not print

@fredchenbj

fredchenbj Apr 5, 2019

Author Contributor

Previously the recv order was always Ok(2)Ok(3)Ok(7)Ok(4), but now the order changes on every run, so I am not sure whether this is a problem.

@breeswish

breeswish Apr 5, 2019

Member

Then let's not check the recv order any more. Let's only check whether or not full is returned, since futurepool already has complete tests. This is a work-stealing pool and the scheduling order is not as predictable as the previous one.
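An order-insensitive version of such a test might look like the following sketch (plain `std` threads stand in for the work-stealing pool; the values are illustrative):

```rust
use std::collections::BTreeSet;
use std::sync::mpsc;
use std::thread;

fn main() {
    // Completion order is nondeterministic on a work-stealing pool, so
    // assert the *set* of results rather than their sequence.
    let (tx, rx) = mpsc::channel();
    for v in vec![2, 3, 4, 7] {
        let tx = tx.clone();
        thread::spawn(move || tx.send(v).unwrap());
    }
    drop(tx); // close the channel so the iterator below terminates
    let got: BTreeSet<i32> = rx.iter().collect();
    let want: BTreeSet<i32> = vec![2, 3, 4, 7].into_iter().collect();
    assert_eq!(got, want);
}
```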

@@ -35,21 +35,21 @@ mod endpoint;
mod error;
pub mod local_metrics;
mod metrics;
mod readpool_context;
mod read_pool_impl;

@siddontang

siddontang Apr 5, 2019

Contributor

em, I prefer readpool_impl

}

#[inline]
fn thread_local_flush(pd_sender: &FutureScheduler<PdTask>) {

@siddontang

siddontang Apr 5, 2019

Contributor

I think writing thread_local everywhere is long and redundant. Where we need to say thread-local, we can mostly use tls instead.

@breeswish

breeswish Apr 5, 2019

Member

Yeah, tls looks to be a good name!

}
}

/// Tried to trigger a tick in current thread.

@siddontang

siddontang Apr 5, 2019

Contributor

Try to

@breeswish

breeswish Apr 5, 2019

Member

Maybe a typo. It should be Tries

@siddontang

Contributor

siddontang commented Apr 5, 2019

Thanks @fredchenbj

It is a very cool feature.

@siddontang

Contributor

siddontang commented Apr 5, 2019

I think we should do more benchmarks

/cc @breeswish please help @fredchenbj do some

@siddontang

Contributor

siddontang commented Apr 5, 2019

After this, I even think we can remove our other thread pool: we can wrap tasks with future::lazy and thus unify the thread pools.

But we should benchmark that too; IMO the tokio thread pool performs better than our own thread pool @breeswish

Another thing is supporting dynamically changing the number of threads in the pool, but we must be careful here, because we currently collect per-thread metrics and use the thread ID as a label value. Dynamic threads may send too many label values to Prometheus, so for the thread pool we could use the thread name instead of the thread ID. /cc @overvenus

@breeswish
Member

breeswish left a comment

Thanks a lot! Mostly fine. How about the metrics? Have you checked that they are working as intended?

.future_execute(priority, move |ctxd| {
tracker.attach_ctxd(ctxd);
.spawn_handle(priority, move || {
tracker.init_current_stage();

@breeswish

breeswish Apr 5, 2019

Member

We can now mark the state as initialized when the tracker is built, so this line is no longer needed.

ReadPoolContext::new(pd_worker.scheduler())
});
let pool =
coprocessor::ReadPoolImpl::build_read_pool(read_pool_cfg, pd_worker.scheduler(), "cop-fix");

@breeswish

breeswish Apr 5, 2019

Member

Is the name really important? I guess the default name should be enough most of the time, because the rest of the usages are in tests.

@breeswish

breeswish Apr 5, 2019

Member

For this one maybe Builder::build_for_test() is enough

storage::ReadPoolContext::new(pd_worker.scheduler())
});
let pd_worker = FutureWorker::new("test-pd-worker");
let storage_read_pool = storage::ReadPoolImpl::build_read_pool(

@breeswish

breeswish Apr 5, 2019

Member

You may try replacing it with Builder::build_for_test as well. It may work.

let read_pool = ReadPool::new(
"readpool",

let read_pool = ReadPoolImpl::build_read_pool(

@breeswish

breeswish Apr 5, 2019

Member

For this one maybe we can use Builder::from_config(..).build() (because we don't need on_tick or before_stop in this test). Similar for others.

static LOCAL_KV_COMMAND_SCAN_DETAILS: RefCell<LocalIntCounterVec> =
RefCell::new(KV_COMMAND_SCAN_DETAILS.local());

static LOCAL_PD_SENDER: RefCell<Option<FutureScheduler<PdTask>>> =

@breeswish

breeswish Apr 5, 2019

Member

Let's remove it since it is not used.

LOCAL_COPR_EXECUTOR_COUNT.with(|m| m.borrow_mut().flush());
}

pub fn collect(region_id: u64, type_str: &str, metrics: ExecutorMetrics) {

@breeswish

breeswish Apr 5, 2019

Member

Let's rename it to make it clearer, maybe thread_local_collect_executor_metrics?

struct Context;

impl futurepool::Context for Context {}

#[test]
fn test_future_execute() {

@breeswish

breeswish Apr 5, 2019

Member

Let's rename it, since there is no more "future_execute".

fredchenbj added some commits Apr 8, 2019

fixes to make code clear
Signed-off-by: fredchenbj <cfworking@163.com>
little fix
Signed-off-by: fredchenbj <cfworking@163.com>
use crate::coprocessor::dag::executor::ExecutorMetrics;

thread_local! {
pub static LOCAL_COPR_REQ_HISTOGRAM_VEC: RefCell<LocalHistogramVec> =

@siddontang

siddontang Apr 8, 2019

Contributor

Oh, we have so many metrics; is it better to wrap them all in a struct so we only need one thread-local var? /cc @breeswish

@breeswish

breeswish Apr 8, 2019

Member

I'm fine with both, maybe not much difference.

@siddontang

siddontang Apr 9, 2019

Contributor

Em, maybe we can do a benchmark: one local struct vs. multiple local vars.

@fredchenbj

fredchenbj Apr 9, 2019

Author Contributor

OK, I will do a benchmark on this.

fix test thread panic
Signed-off-by: fredchenbj <cfworking@163.com>
@breeswish
Member

breeswish left a comment

Good job! I'm fine with this PR, as long as the metrics are working as intended.

@siddontang

Contributor

siddontang commented Apr 9, 2019

@breeswish

please paste your benchmark results too

@breeswish

Member

breeswish commented Apr 11, 2019

/run-integration-tests

fredchenbj added some commits Apr 11, 2019

use struct of thread local
Signed-off-by: fredchenbj <cfworking@163.com>
use struct for thread local storage metrics
Signed-off-by: fredchenbj <cfworking@163.com>
use prometheus::local::*;

use crate::coprocessor::dag::executor::ExecutorMetrics;
pub struct TlsCop {

@siddontang

siddontang Apr 11, 2019

Contributor

add a blank line here

pub struct ReadPoolImpl;

impl ReadPoolImpl {
#[inline]

@siddontang

siddontang Apr 11, 2019

Contributor

Seems we don't need #[inline] here.

@siddontang

Contributor

siddontang commented Apr 11, 2019

Thanks @fredchenbj

Great work!!!

PTAL @breeswish @hicqu

little fix for review
Signed-off-by: fredchenbj <cfworking@163.com>
@zhouqiang-cl

Contributor

zhouqiang-cl commented Apr 11, 2019

/test

@breeswish

Member

breeswish commented Apr 11, 2019

/run-integration-tests

ReadPool::new("store-read", &cfg.readpool.storage.build_config(), || {
storage::ReadPoolContext::new(pd_sender.clone())
});
let storage_read_pool = storage::ReadPoolImpl::build_read_pool(

@hicqu

hicqu Apr 11, 2019

Contributor

Personally I prefer storage::ReadPool. Impl looks like a private thing.

@fredchenbj

fredchenbj Apr 11, 2019

Author Contributor

ReadPool is already taken; maybe use ReadPoolProducer. Is that OK?

@hicqu

hicqu Apr 11, 2019

Contributor

Or ReadPoolContext? It can only build a ReadPool and handle some metrics; it's not really a ReadPool.

@breeswish

breeswish Apr 11, 2019

Member

It just "derive"s the common ReadPool to create a specialized ReadPool with an attached name and some lifecycle hook functions (like on_tick). That's why it was called ReadPoolImpl. Producer or Builder might not be a very good name, because it would be confusing for functions like Producer::tls_collect_executor_metrics.

@hicqu

hicqu Apr 12, 2019

Contributor

Agreed that Producer and Builder are not good enough. How about removing the struct?

pub local_copr_rocksdb_perf_counter: RefCell<LocalIntCounterVec>,
local_copr_executor_count: RefCell<LocalIntCounterVec>,
local_copr_get_or_scan_count: RefCell<LocalIntCounterVec>,
local_cop_flow_stats: RefCell<HashMap<u64, crate::storage::FlowStatistics>>,

@hicqu

hicqu Apr 11, 2019

Contributor

There are too many RefCells. How about putting the whole struct in a RefCell or Mutex? I think that's clearer.

@breeswish

breeswish Apr 11, 2019

Member

Nice catch. You should arrange them as..

pub struct Xxx {
    field: LocalIntCounter,
    field_2: LocalIntCounter,
    ...
}

thread_local! {
    pub static TLS_COP_METRICS: RefCell<TlsCop> = ...;
}

In this way, we only pay one borrow check when updating multiple fields.
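A runnable miniature of this layout (the struct and field names are illustrative, not the actual TiKV metric types):

```rust
use std::cell::RefCell;

// Group all per-thread metrics in one struct so that updating several
// fields costs a single RefCell borrow. Names are illustrative.
struct TlsCop {
    req_count: u64,
    scan_count: u64,
}

thread_local! {
    static TLS_COP_METRICS: RefCell<TlsCop> = RefCell::new(TlsCop {
        req_count: 0,
        scan_count: 0,
    });
}

fn record_request(scans: u64) {
    TLS_COP_METRICS.with(|m| {
        let mut m = m.borrow_mut(); // one borrow for all fields
        m.req_count += 1;
        m.scan_count += scans;
    });
}

fn main() {
    record_request(3);
    record_request(5);
    TLS_COP_METRICS.with(|m| {
        assert_eq!(m.borrow().req_count, 2);
        assert_eq!(m.borrow().scan_count, 8);
    });
}
```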

@hicqu

Contributor

hicqu commented Apr 11, 2019

Rest LGTM. Thank you very much!

fredchenbj added some commits Apr 12, 2019

merge branch master to resolve conflict
Signed-off-by: fredchenbj <cfworking@163.com>
use struct to reduce refcell
Signed-off-by: fredchenbj <cfworking@163.com>
@fredchenbj

Contributor Author

fredchenbj commented Apr 12, 2019

Friendly ping @siddontang @breeswish @hicqu

fredchenbj added some commits Apr 12, 2019

remove ReadPoolImpl struct
Signed-off-by: fredchenbj <cfworking@163.com>
little fix
Signed-off-by: fredchenbj <cfworking@163.com>
merge master branch to resolve conflict
Signed-off-by: fredchenbj <cfworking@163.com>
@hicqu

Contributor

hicqu commented Apr 12, 2019

LGTM.

fix bug of a test case
Signed-off-by: fredchenbj <cfworking@163.com>
@siddontang

Contributor

siddontang commented Apr 12, 2019

PTAL @breeswish

fredchenbj added some commits Apr 14, 2019

4th merge master to resolve conflicts
Signed-off-by: fredchenbj <cfworking@163.com>
modify new files copyright info
Signed-off-by: fredchenbj <cfworking@163.com>
little fixes
Signed-off-by: fredchenbj <cfworking@163.com>

@fredchenbj fredchenbj force-pushed the fredchenbj:fredchenbj/use-tokio-threadpool-and-thread-local-metrics-for-readpool branch from 90c8280 to 4338a7c Apr 14, 2019

@fredchenbj

Contributor Author

fredchenbj commented Apr 14, 2019

ping @siddontang @breeswish @hicqu , please take a look.

@siddontang
Contributor

siddontang left a comment

LGTM

PTAL @hicqu @breeswish

@siddontang siddontang merged commit 545e479 into tikv:master Apr 14, 2019

2 checks passed

DCO: All commits are signed off!
idc-jenkins-ci/test: Jenkins job succeeded.