Generate split region from the distinct value of base table index #1

Merged
merged 1 commit into from
Dec 1, 2020

crazycs520
Contributor

Signed-off-by: crazycs520 <crazycs520@gmail.com>

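Prerequisite: the columns corresponding to those of the base table must also have the same index definition added.
Assume the base table's schema is create table t (a int, b int, index (b));

1. Run a distinct, ordered query (for example: select distinct b from t order by b;) and store the result in an array named array. Assume array holds the values 1 to 50000 ( [1, 2, 3, 4, ..., 50000] ).
2. Compute the number of regions to split. Assume new-table-rows is 10,000,000 rows.
   * Get the total number of regions of index b on base table t: show table t index b regions;  Assume it is 10 regions.
   * Get the total row count of base table t; assume it is 1,000,000 rows.
   * Each region can then hold 100,000 (1,000,000 / 10) index entries.
   * Divide new-table-rows (the total number of rows to be written to the new table) by 100,000 to get the number of regions to split: 100.
3. Compute the split points for the 100 regions: divide the array from step 1 into 100 equal parts and take the boundaries between them.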
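
The arithmetic in the steps above can be sketched in Go as follows. This is only an illustration of the idea, not the code in this PR: the function names regionCount and splitPoints are hypothetical, the index values are assumed to be integers, and the rule of taking the element at position i*len/regions as the i-th boundary is one assumed way to divide the array into equal parts.

```go
package main

import "fmt"

// regionCount estimates how many regions the new index needs, assuming the
// new table has the same number of index entries per region as the base table.
// (Hypothetical helper, not taken from the PR.)
func regionCount(newTableRows, baseTableRows, baseIndexRegions int64) int64 {
	if baseTableRows <= 0 || baseIndexRegions <= 0 {
		return 1
	}
	rowsPerRegion := baseTableRows / baseIndexRegions
	if rowsPerRegion <= 0 {
		return 1
	}
	n := newTableRows / rowsPerRegion
	if n < 1 {
		return 1
	}
	return n
}

// splitPoints picks up to regions-1 boundary values, evenly spaced by position
// in the sorted distinct values. Repeated boundaries are skipped, which can
// happen when there are fewer distinct values than regions.
func splitPoints(sorted []int64, regions int64) []int64 {
	n := int64(len(sorted))
	if regions < 2 || n == 0 {
		return nil
	}
	points := make([]int64, 0, regions-1)
	for i := int64(1); i < regions; i++ {
		v := sorted[i*n/regions]
		if len(points) > 0 && points[len(points)-1] == v {
			continue
		}
		points = append(points, v)
	}
	return points
}

func main() {
	// Stand-in for the result of: select distinct b from t order by b;
	array := make([]int64, 50000)
	for i := range array {
		array[i] = int64(i + 1)
	}

	// 10,000,000 new rows, 1,000,000 base rows, 10 base index regions.
	regions := regionCount(10_000_000, 1_000_000, 10)
	points := splitPoints(array, regions)

	fmt.Println("regions:", regions)
	fmt.Println("split points:", len(points))
	fmt.Println("first:", points[0], "last:", points[len(points)-1])
}
```

With the numbers from the example, each region holds 100,000 index entries, so 100 regions are needed and the 50000 distinct values are cut every 500 positions, giving 99 boundaries.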

usage:

go run main.go split sampling --new-table-row 100000000 --base-db test --base-table t --base-index idx --new-db test --new-table t1 --new-index idxx

cat /tmp/split/split_by_base.sql

@wentaojin wentaojin merged commit e1d43a8 into wentaojin:main Dec 1, 2020