raftstore: init half split region #2790

Merged
merged 27 commits into from Mar 27, 2018


@AndreMouche AndreMouche commented Mar 5, 2018

  • dev

  • test

@overvenus @BusyJay PTAL

@AndreMouche AndreMouche changed the title [WIP] raftstore: init half split region raftstore: init half split region Mar 11, 2018
@overvenus
Member

The test failed.

test raftstore_cases::test_split_region::test_node_half_split_region ... key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 49],current_size:74,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 50],current_size:148,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 51],current_size:222,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 52],current_size:296,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 53],current_size:370,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 54],current_size:444,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 55],current_size:518,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 56],current_size:592,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 48, 57],current_size:666,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 48],current_size:740,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 49],current_size:814,region_size:1480
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 49],current_size:74,region_size:666
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 50],current_size:148,region_size:666
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 51],current_size:222,region_size:666
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 52],current_size:296,region_size:666
key:[122, 48, 48, 48, 48, 48, 48, 48, 49, 53],current_size:370,region_size:666
thread 'raftstore_cases::test_split_region::test_node_half_split_region' panicked at 'assertion failed: `(left == right)`
  left: `[48, 48, 48, 48, 48, 48, 48, 49, 53]`,
 right: `[48, 48, 48, 48, 48, 48, 48, 49, 49]`', tests/raftstore_cases/test_split_region.rs:766:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::print
             at libstd/sys_common/backtrace.rs:68
             at libstd/sys_common/backtrace.rs:57
   2: std::panicking::default_hook::{{closure}}
             at libstd/panicking.rs:380
   3: std::panicking::default_hook
             at libstd/panicking.rs:396
   4: tikv::util::panic_hook::track_hook::{{closure}}
             at src/util/panic_hook.rs:59
   5: <std::thread::local::LocalKey<T>>::try_with
             at /checkout/src/libstd/thread/local.rs:377
   6: <std::thread::local::LocalKey<T>>::with
             at /checkout/src/libstd/thread/local.rs:288
   7: tikv::util::panic_hook::track_hook
             at src/util/panic_hook.rs:53
   8: core::ops::function::Fn::call
             at /checkout/src/libcore/ops/function.rs:73
   9: std::panicking::rust_panic_with_hook
             at libstd/panicking.rs:577
  10: std::panicking::begin_panic
             at libstd/panicking.rs:537
  11: std::panicking::begin_panic_fmt
             at libstd/panicking.rs:521
  12: integrations::raftstore_cases::test_split_region::test_half_split_region
             at tests/raftstore_cases/test_split_region.rs:766
  13: integrations::raftstore_cases::test_split_region::test_node_half_split_region
             at tests/raftstore_cases/test_split_region.rs:738
  14: <F as alloc::boxed::FnBox<A>>::call_box
             at libtest/lib.rs:1440
             at /checkout/src/libcore/ops/function.rs:223
             at /checkout/src/liballoc/boxed.rs:817
  15: __rust_maybe_catch_panic
             at libpanic_unwind/lib.rs:102
FAILED

https://circleci.com/gh/pingcap/tikv/8580?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

}

impl HalfStatus {
pub fn on_split_check(&mut self, key: &[u8], value_size: u64) -> bool {
Contributor

should_split seems better.

@@ -123,19 +123,26 @@ impl<'a> MergedIterator<'a> {
/// Split checking task.
pub struct Task {
region: Region,
auto_check: bool,
Contributor

auto_check = false can mean a lot of things; I'd prefer to define an explicit enum, like:

enum SplitType {
    AutoSplit,
    HalfSplit,
}

others => panic!("expect split check result, but got {:?}", others),
}
}
drop(rx);
Contributor

Seems it will be dropped automatically?

let region = &task.region;
let mut split_ctx =
self.coprocessor
.new_split_check_status(region, &self.engine, task.is_auto_split());
Contributor

Why not just pass the split type? It seems pointless to define an enum and convert it to bool again.

Member Author

We'd like to use the enum instead of bool until we need to support more split algorithms

assert_eq!(region.get_start_key(), left.get_start_key());
assert_eq!(right.get_start_key(), left.get_end_key());
assert_eq!(region.get_end_key(), right.get_end_key());
assert_eq!(pd_client.get_region(&max_key).unwrap(), right);
Contributor

It's not clear what you are trying to test in these two lines. Why not just test the start/end keys of left and right one by one?

@AndreMouche
Member Author

@overvenus @BusyJay @huachaohuang PTAL

@@ -27,10 +27,12 @@ pub struct Config {
/// be region_split_size (or a little bit smaller).
pub region_max_size: ReadableSize,
pub region_split_size: ReadableSize,
pub half_split_bucket_size: ReadableSize,
Contributor

what is this config used for?

Member Author

It's used to compute the region's mid_key (the split_key); this value directly influences which split_key is chosen.
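
For readers following the thread, here is a minimal sketch of how a bucket size can drive the choice of a midpoint key. It reuses the field names visible in the quoted snippets (`buckets`, `cur_bucket_size`), but `each_bucket_size`, `push_data`, and the rule for starting a new bucket are assumptions made for illustration; this is not the merged implementation.

```rust
/// Hypothetical bucket tracker: records the first key of every bucket of
/// roughly `each_bucket_size` bytes while the region is scanned in key order.
struct HalfStatus {
    buckets: Vec<Vec<u8>>,
    cur_bucket_size: u64,
    each_bucket_size: u64, // assumed name for the configured bucket size
}

impl HalfStatus {
    fn new(each_bucket_size: u64) -> Self {
        HalfStatus {
            buckets: Vec::new(),
            cur_bucket_size: 0,
            each_bucket_size,
        }
    }

    /// Feed one entry; a new bucket starts once the current one is full.
    fn push_data(&mut self, key: &[u8], value_size: u64) {
        if self.buckets.is_empty() || self.cur_bucket_size >= self.each_bucket_size {
            self.buckets.push(key.to_vec());
            self.cur_bucket_size = 0;
        }
        self.cur_bucket_size += key.len() as u64 + value_size;
    }

    /// The first key of the middle bucket approximates the half-size point.
    fn split_key(mut self) -> Option<Vec<u8>> {
        let mid = self.buckets.len() / 2;
        if mid == 0 {
            None // zero or one bucket: nothing sensible to split on
        } else {
            Some(self.buckets.swap_remove(mid))
        }
    }
}

fn main() {
    let mut status = HalfStatus::new(100);
    for i in 0..20u8 {
        // Each entry counts as 2 key bytes + 48 value bytes = 50 bytes.
        status.push_data(&[b'k', i], 48);
    }
    println!("split key: {:?}", status.split_key());
}
```

Under these assumptions a smaller bucket size yields more candidate keys and a split point closer to the true middle, at the cost of keeping more keys in memory during the scan.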

@AndreMouche
Member Author

We will merge master and fix conflicts after pingcap/kvproto#229 is merged.

@AndreMouche
Member Author

@BusyJay @overvenus @huachaohuang PTAL

region_id,
region_epoch,
} => {
let peer = match self.region_peers.get(&region_id) {
Contributor

Don't handle the message here directly; this is a dispatch function.


if !peer.is_leader() {
// region on this store is no longer leader, skipped.
info!(
Contributor

s/info/warn


let region = peer.region();
if util::is_epoch_stale(&region_epoch, region.get_region_epoch()) {
info!("[region {}] receive a stale halfsplit message", region_id);
Contributor

s/info/warn

}

fn test_half_split_region<T: Simulator>(cluster: &mut Cluster<T>) {
let item_len = 74;
Contributor

Please add a comment for this magic number.

let mut status = Status::default();
status.auto_split = auto_split;
status
}
Contributor

Add a new line below.

#[derive(Default)]
pub struct Status {
// For TableCheckObserver
table: Option<TableStatus>,
// For SizeCheckObserver
size: Option<SizeStatus>,
// For HalfCheckObserver TODO:
Contributor

What is the TODO?

self.cur_bucket_size += current_len;
}

fn check_and_adjust_buckets_num(&mut self) {
Contributor

It took me a while to figure out what this does; maybe a simpler way is to increase the bucket size exponentially.


pub fn split_key(self) -> Option<Vec<u8>> {
let mid = self.buckets.len() / 2;
if mid == 0 {
Contributor

buckets.get() will take care of this.

Member Author

It's used to check if self.buckets.len() <= 1 here

@AndreMouche
Member Author

PTAL

@AndreMouche
Member Author

friendly ping @overvenus @BusyJay @huachaohuang

if mid == 0 {
None
} else {
self.buckets.get(mid).cloned()
Member

How about remove? So we don't need the cloned()

Member Author

It seems to cost more since remove will shift all elements after it to the left?

Member

How about swap_remove ?
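
The trade-off discussed here can be seen with the standard `Vec` methods alone. The sketch below compares `get(mid).cloned()`, `remove(mid)`, and `swap_remove(mid)` on a small vector; the helper names `mid_by_clone` and `mid_by_swap_remove` are hypothetical and exist only for this illustration.

```rust
/// Take the middle bucket by cloning it; the vector is left untouched.
fn mid_by_clone(buckets: &[Vec<u8>]) -> Option<Vec<u8>> {
    let mid = buckets.len() / 2;
    if mid == 0 {
        None
    } else {
        buckets.get(mid).cloned()
    }
}

/// Take the middle bucket by value with `swap_remove`: O(1) and no clone,
/// but the last element is moved into the vacated slot, so order is lost.
fn mid_by_swap_remove(mut buckets: Vec<Vec<u8>>) -> Option<Vec<u8>> {
    let mid = buckets.len() / 2;
    // The guard matters here: `swap_remove(0)` on an empty Vec would panic.
    if mid == 0 {
        None
    } else {
        Some(buckets.swap_remove(mid))
    }
}

fn main() {
    let buckets: Vec<Vec<u8>> = ["a", "b", "c", "d", "e"]
        .iter()
        .map(|s| s.as_bytes().to_vec())
        .collect();

    // `remove` keeps the order but shifts every later element left: O(n).
    let mut shifted = buckets.clone();
    let removed = shifted.remove(2);
    println!("remove      -> {:?}, rest {:?}", removed, shifted);

    // `swap_remove` fills the hole with the last element instead: O(1).
    let mut swapped = buckets.clone();
    let taken = swapped.swap_remove(2);
    println!("swap_remove -> {:?}, rest {:?}", taken, swapped);

    println!("clone       -> {:?}", mid_by_clone(&buckets));
    println!("by value    -> {:?}", mid_by_swap_remove(buckets));
}
```

Because `split_key` in the quoted code takes `self` by value, the vector is discarded right after the call, so the reordering caused by `swap_remove` has no visible effect there.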

}

#[derive(Debug)]
pub enum SplitType {
Member

How about making the Task an enum?

pub enum Task {
  Auto(Region),
  Manual(Region),
}

Member Author

I don't think so since all tasks will share the same Region struct.

Member

Then we only need a bool.

@@ -31,6 +31,7 @@ pub struct Config {

/// Default region split size.
pub const SPLIT_SIZE_MB: u64 = 96;
pub const HALF_SPLIT_BUCKET_SIZE_MB: u64 = 1;
Member

Looks like an unused const.

@@ -522,6 +522,24 @@ impl TestPdClient {
});
}

pub fn half_split_region(&self, src_region: metapb::Region) {
self.set_rule(box move |region: &metapb::Region, _: &metapb::Peer| {
Member

Please fix merge conflicts.

Member Author

It's blocked by pingcap/kvproto#229 @disksing

Contributor

Why is this conflict blocked by kvproto?

if half_split_bucket_size == 0 {
half_split_bucket_size = 1;
} else if half_split_bucket_size > bucket_size_limit {
half_split_bucket_size = bucket_size_limit;
Contributor

It doesn't seem possible that the bucket size will exceed 512MB here.

Member Author

Yes. We check here in case the region's max size increases too much someday.

@AndreMouche
Member Author

@overvenus @BusyJay @huachaohuang PTAL

impl SplitCheckObserver for HalfCheckObserver {
fn new_split_check_status(
&self,
_ctx: &mut ObserverContext,
Member

I think we can get_region_approximate_size and then we can find a split key faster.

Member Author

In our last test, get_region_approximate_size may not be as accurate as expected.

if mid == 0 {
None
} else {
Some(self.buckets.swap_remove(mid))
Contributor

why use swap_remove here?

Member Author

avoid get(mid).clone()

let mut half_split_bucket_size = region_size_limit / BUCKET_NUMBER_LIMIT as u64;
let bucket_size_limit = ReadableSize::mb(BUCKET_SIZE_LIMIT_MB).0;
if half_split_bucket_size == 0 {
half_split_bucket_size = 1;
Contributor

bucket size is 1 byte here? too small?

Contributor

I think it is mainly for tests.

Member Author

It only happens when max_region_size is smaller than 1K
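
The clamping debated in this thread can be sketched as a small pure function. The constant values below are assumptions inferred from the comments ("smaller than 1K" suggests a bucket count limit of 1024, and "512MB" suggests the upper bound); they are not copied from the patch, and the function name `half_split_bucket_size` is hypothetical.

```rust
// Assumed values, inferred from the review comments rather than the code.
const BUCKET_NUMBER_LIMIT: u64 = 1024;
const BUCKET_SIZE_LIMIT_MB: u64 = 512;
const MB: u64 = 1024 * 1024;

/// Derive a bucket size from the region size limit, clamped to [1 byte, 512 MB].
fn half_split_bucket_size(region_size_limit: u64) -> u64 {
    let bucket_size_limit = BUCKET_SIZE_LIMIT_MB * MB;
    let size = region_size_limit / BUCKET_NUMBER_LIMIT;
    // Lower bound: integer division yields 0 for limits below 1024 bytes,
    // which would otherwise make every key start a new bucket of size zero.
    // Upper bound: guards against very large region size limits.
    size.max(1).min(bucket_size_limit)
}

fn main() {
    // A tiny limit, as used in tests, falls back to 1-byte buckets.
    println!("{}", half_split_bucket_size(1000));
    // A 144 MB limit gives 144 KB buckets.
    println!("{}", half_split_bucket_size(144 * MB));
    // An absurdly large limit is capped at 512 MB.
    println!("{}", half_split_bucket_size(u64::MAX));
}
```

With these assumed constants the 1-byte case is only reachable when the region size limit is below 1024 bytes, which matches the explanation that it mainly shows up in tests.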

@siddontang
Contributor

can you unify auto_split, auto_check? are they different?

self.buckets.push(key.to_vec());
self.cur_bucket_size = 0;
}
self.cur_bucket_size += current_len;
Contributor

current_len is a bit confusing here; I think you can just add the length of the key and value.

}
// Approximate size of memtable is inaccurate for small data,
// we flush it to SST so we can use the size properties instead.
engine.flush(true).unwrap();
Contributor

But you don't use approximate size now.

}

impl Operator {
fn make_region_heartbeat_response(
&self,
region_id: u64,
region: &metapb::Region,
Contributor

Why do you need to change this?

@@ -165,6 +170,9 @@ impl Operator {
}
unreachable!()
}
Operator::HalfSplitRegion { ref region_epoch } => {
region.get_region_epoch() != region_epoch
Contributor

What if it is a stale epoch?

Member Author

do not send split command @huachaohuang

@@ -123,19 +123,26 @@ impl<'a> MergedIterator<'a> {
/// Split checking task.
pub struct Task {
region: Region,
auto: bool,
Contributor

auto_split is clearer.

Member

@overvenus overvenus left a comment

LGTM

Contributor

@huachaohuang huachaohuang left a comment

LGTM

overvenus previously approved these changes Mar 26, 2018
@overvenus overvenus merged commit 6784cba into tikv:master Mar 27, 2018
sticnarf pushed a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019