Generate split region from the distinct value of base table index #1

crazycs520 · 2020-12-01T07:26:35Z

Signed-off-by: crazycs520 <crazycs520@gmail.com>

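Prerequisite: the corresponding column of the base table must have the same index definition as the new table.
Assume the base table's schema is create table t (a int, b int, index (b));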

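1. Fetch the distinct, sorted index values (for example: select distinct b from t order by b;) and store them in an array called array. Assume array holds 1 through 50000 ([1, 2, 3, 4, ..., 50000]).
2. Compute the number of regions to split. Assume new-table-rows is 10,000,000 rows.
   * Get the total number of regions for index b on base table t: show table t index b regions; assume the result is 10 regions.
   * Get the total row count of base table t; assume it is 1,000,000 rows.
   * Each region can then hold 100,000 (1,000,000 / 10) index entries.
   * Divide new-table-rows (the total number of rows to be written to the new table) by 100,000 to get the number of regions to split: 100.
3. Compute the boundary points for the 100 regions: divide the array from step 1 into 100 equal parts and take the values at the boundaries.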
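The arithmetic in steps 2 and 3 can be sketched in Go as follows. This is a minimal illustration and not the code in this PR: the function names regionCount and splitPoints are hypothetical, the index values are assumed to be integers, and rounding the region count up and skipping a boundary at the smallest value are assumptions of this sketch.

```go
package main

import "errors"

// regionCount estimates how many regions the new index needs (step 2).
// rowsPerRegion is the average number of index entries per base-table region.
// Hypothetical helper; rounding up is an assumption of this sketch.
func regionCount(baseRows, baseRegions, newTableRows int64) (int64, error) {
	if baseRows <= 0 || baseRegions <= 0 || newTableRows <= 0 {
		return 0, errors.New("row and region counts must be positive")
	}
	rowsPerRegion := baseRows / baseRegions
	if rowsPerRegion == 0 {
		rowsPerRegion = 1
	}
	// Ceiling division so a partial region still gets its own split.
	return (newTableRows + rowsPerRegion - 1) / rowsPerRegion, nil
}

// splitPoints picks up to regions-1 boundary values at evenly spaced
// positions of the sorted distinct values (step 3).
func splitPoints(values []int64, regions int64) []int64 {
	if regions <= 1 || len(values) == 0 {
		return nil
	}
	n := int64(len(values))
	points := make([]int64, 0, regions-1)
	for i := int64(1); i < regions; i++ {
		idx := i * n / regions
		if idx == 0 {
			// A boundary at the smallest value would leave an empty first region.
			continue
		}
		v := values[idx]
		// With fewer distinct values than regions, positions repeat; keep each once.
		if len(points) > 0 && points[len(points)-1] == v {
			continue
		}
		points = append(points, v)
	}
	return points
}
```

With the numbers from the example (1,000,000 base rows, 10 base regions, 10,000,000 new rows), regionCount returns 100, and splitPoints over the values 1 through 50000 returns 99 boundaries spaced 500 values apart.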

usage:

go run main.go split sampling --new-table-row 100000000 --base-db test --base-table t --base-index idx --new-db test --new-table t1 --new-index idxx

cat /tmp/split/split_by_base.sql

Signed-off-by: crazycs520 <crazycs520@gmail.com>

init

e288776

Signed-off-by: crazycs520 <crazycs520@gmail.com>

wentaojin merged commit e1d43a8 into wentaojin:main Dec 1, 2020
