New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize approximate region size for region heartbeat #2375

Merged
merged 4 commits into from Oct 12, 2017

Conversation

Projects
None yet
3 participants
@huachaohuang
Member

huachaohuang commented Oct 11, 2017

get_region_approximate_size() can be slow with large data, which may
block the pd worker.

This commit sends back the approximate region size
from the split check worker to raftstore, and cache the approximate
size in the peer, then use the cached size for region heartbeat
directly.

This commit also remove the region_split_check_after_initialization
configuration, because we will always schedule the split check worker
to check the region size after initialization, which will do split
check if the region size satisfiies the condition.

Optimize approximate region size for region heartbeat
`get_region_approximate_size()` can be slow with large data, which may
block the pd worker.

This commit sends back the approximate region size
from the split check worker to raftstore, and cache the approximate
size in the peer, then use the cached size for region heartbeat
directly.

This commit also remove the `region_split_check_after_initialization`
configuration, because we will always schedule the split check worker
to check the region size after initialization, which will do split
check if the region size satisfiies the condition.

@huachaohuang huachaohuang requested review from siddontang and BusyJay Oct 11, 2017

huachaohuang added some commits Oct 11, 2017

# Set it to true will check regions if they need to spilt right after
# initialization. It is useful when we adjust the region size.
# region-split-check-after-initialization = false
# region-split-check-diff = "6MB"

This comment has been minimized.

@zhangjinpeng1987

zhangjinpeng1987 Oct 12, 2017

Member

For keeping the cached region size more accurate?

@zhangjinpeng1987

zhangjinpeng1987 Oct 12, 2017

Member

For keeping the cached region size more accurate?

This comment has been minimized.

@siddontang

siddontang Oct 12, 2017

Contributor

I guess 6MB and 12MB have no big difference.

@siddontang

siddontang Oct 12, 2017

Contributor

I guess 6MB and 12MB have no big difference.

This comment has been minimized.

@siddontang

siddontang Oct 12, 2017

Contributor

Maybe it is better to still use 12MB here.

@siddontang

siddontang Oct 12, 2017

Contributor

Maybe it is better to still use 12MB here.

This comment has been minimized.

@huachaohuang

huachaohuang Oct 12, 2017

Member

A smaller diff makes the cached region size more accurate. Since the split check will always check the approximate region size before scanning the region, a smaller diff will not lead to much overhead.

@huachaohuang

huachaohuang Oct 12, 2017

Member

A smaller diff makes the cached region size more accurate. Since the split check will always check the approximate region size before scanning the region, a smaller diff will not lead to much overhead.

@siddontang

Rest LGTM

@siddontang

This comment has been minimized.

Show comment
Hide comment
@siddontang

siddontang Oct 12, 2017

Contributor

The split check may be started after the split-check interval, do we need to forcibly call split check when we start?

Contributor

siddontang commented Oct 12, 2017

The split check may be started after the split-check interval, do we need to forcibly call split check when we start?

@huachaohuang

This comment has been minimized.

Show comment
Hide comment
@huachaohuang

huachaohuang Oct 12, 2017

Member

If the region's size is bigger enough, it should split sooner or later. I think it makes no big difference to do it at the start or later.

Member

huachaohuang commented Oct 12, 2017

If the region's size is bigger enough, it should split sooner or later. I think it makes no big difference to do it at the start or later.

@huachaohuang

This comment has been minimized.

Show comment
Hide comment
@huachaohuang

huachaohuang Oct 12, 2017

Member

/run-integration-tests

Member

huachaohuang commented Oct 12, 2017

/run-integration-tests

@zhangjinpeng1987 zhangjinpeng1987 merged commit 1b3f0a8 into master Oct 12, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
jenkins-ci-tikv/build Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details

@zhangjinpeng1987 zhangjinpeng1987 deleted the huachao/region-size branch Oct 12, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment