Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: split index region with lower upper syntax #10409

Merged
merged 34 commits into from Jun 6, 2019

Conversation

Projects
None yet
7 participants
@crazycs520
Copy link
Contributor

commented May 9, 2019

What problem does this PR solve?

Add new syntax for split index region.

SPLIT TABLE [table_name] INDEX [index_name] BETWEEN (lower_value...) AND (upper_value...) REGIONS [region_num]

Attention

  • The lower_value and upper_value value count should less than the index column num.
  • The lower_value and upper_value value count should more than 0.
  • The max split index region num currently is 1000.

Eg1: int type.

split table `t` index `idx1` between (0) and (1000) regions 5;

Upper sql will split 5 regions of the table t index idx1 in range 0~1000. Every region range like below:

Region1: -inf ~ 200

Region2: 200 ~ 400

Region3: 400 ~ 600

Region4: 600 ~ 800

Region5: 800 ~ +inf

Eg2: varchar type

split table `t` index `idx1` between ('a') and ('z') regions 26;

Upper sql will split 26 regions.

Region1: -inf ~ b

Region2: b ~ c

...

Region26: y ~ +inf

Eg3: time type.

split table `t` index `idx1` between ('2010-01-01 00:00:00') and ('2020-01-01 00:00:00') regions 10;

Upper sql will split 10 regions.

Region1: -inf ~ 2011-01-01 00:00:00
Region2: 2011-01-01 00:00:00 ~ 2012-01-01 00:00:00

...

Region10: 2011-01-01 00:00:00 ~ +inf

**Eg4: multi-column index. idx1 (col1,col2,col3). **

  • if you only want to split index region by first column col1. You can use SQL like Eg1,Eg2,Eg3. Suppose col1 is int type.

    split table `t` index `idx1` between (-1000) and (1000) regions 5;
  • If you wanto split index regions by first two column. You can use SQL like below.

    split table `t` index `idx1` between (1000, "a") and (1000,"z") regions 26;

What is changed and how it works?

Related Parser PR: pingcap/parser#321

Split num regions between lower and upper value.

Because the key value is byte slice, I cann't do this simple like (max-min) / num .

To solve this problem:

  1. find the longest common prefix of min and max value.
  2. get 8 diff-bytes of min and max from begin, if less than 8, fill with 0xff
  3. convert 8 diff-bytes to uint64. Then use the diff_uint64/num to get the step value.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change

Side effects

Related changes

crazycs520 added some commits May 9, 2019

@zhouqiang-cl

This comment has been minimized.

Copy link
Member

commented May 9, 2019

/run-all-tests

1 similar comment
@zhouqiang-cl

This comment has been minimized.

Copy link
Member

commented May 9, 2019

/run-all-tests

@codecov

This comment has been minimized.

Copy link

commented May 10, 2019

Codecov Report

Merging #10409 into master will decrease coverage by 0.0053%.
The diff coverage is 82.8947%.

@@               Coverage Diff               @@
##             master    #10409        +/-   ##
===============================================
- Coverage   79.5733%   79.568%   -0.0054%     
===============================================
  Files           415       415                
  Lines         87993     88112       +119     
===============================================
+ Hits          70019     70109        +90     
- Misses        12793     12813        +20     
- Partials       5181      5190         +9
@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

/run-all-tests

@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

crazycs520 and others added some commits May 20, 2019

Show resolved Hide resolved executor/split.go Outdated
Show resolved Hide resolved executor/split.go Outdated
Show resolved Hide resolved executor/split.go Outdated
Show resolved Hide resolved executor/split.go Outdated

crazycs520 added some commits Jun 4, 2019

Show resolved Hide resolved executor/split.go Outdated

crazycs520 added some commits Jun 5, 2019

@bb7133
Copy link
Contributor

left a comment

LGTM

@bb7133 bb7133 added the status/LGT2 label Jun 5, 2019

@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented Jun 5, 2019

/run-all-tests

Show resolved Hide resolved planner/core/planbuilder.go Outdated
Show resolved Hide resolved executor/split.go Outdated
Show resolved Hide resolved executor/split.go Outdated
Show resolved Hide resolved executor/split_test.go Outdated
if err != nil {
return nil, err
}
upperIdxKey, _, err := index.GenIndexKey(e.ctx.GetSessionVars().StmtCtx, e.upper, math.MinInt64, nil)

This comment has been minimized.

Copy link
@zimulala

zimulala Jun 5, 2019

Member

Why we don't usemath.MaxInt64? Please add comments.

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Jun 5, 2019

Author Contributor

Sound reasonable, But I just want to keep consistent with lowerIdxKey here ,I don't like the result of (maxHandleID - minHandleID) != 0 to affect the split point. Use math.MaxInt64 here will get error from TestSplitIndex.

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Jun 5, 2019

Author Contributor

Already add a comment here.

Show resolved Hide resolved executor/split_test.go Outdated

crazycs520 added some commits Jun 5, 2019

@zimulala
Copy link
Member

left a comment

LGTM

@zimulala

This comment has been minimized.

Copy link
Member

commented Jun 6, 2019

/run-all-tests

return nil, ErrWrongValueCountOnRow.GenWithStackByArgs(i + 1)
}
valueList := make([]types.Datum, 0, len(valuesItem))
convertValue2ColumnType := func(valuesItem []ast.ExprNode) ([]types.Datum, error) {

This comment has been minimized.

Copy link
@zz-jason

zz-jason Jun 6, 2019

Member

can you split this function outside the definition of buildSplitRegion()? It's a complicated function with 40 lines, which reduces the readability of buildSplitRegion()

@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented Jun 6, 2019

/run-all-tests

@zz-jason zz-jason merged commit 7bf3d69 into pingcap:master Jun 6, 2019

14 checks passed

ci/circleci Your tests passed on CircleCI!
Details
idc-jenkins-ci-tidb/build Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/build_check_race Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/check_dev Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/check_dev_2 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/common-test job succeeded
Details
idc-jenkins-ci-tidb/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/integration-compatibility-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/integration-ddl-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/mybatis-test job succeeded
Details
idc-jenkins-ci-tidb/sqllogic-test-1 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/sqllogic-test-2 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/unit-test Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details

crazycs520 added a commit to crazycs520/tidb that referenced this pull request Jun 11, 2019

crazycs520 added a commit to crazycs520/tidb that referenced this pull request Jun 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.