
Create index on a small table is slow #49477

Closed · tangenta opened this issue Dec 14, 2023 · 2 comments · Fixed by #49479
@tangenta (Contributor) commented:

Enhancement

use test;
drop table if exists t;
create table t (a int);
insert into t values (1);
alter table t add index i(a);

Output:

mysql> alter table t add index i(a);
Query OK, 0 rows affected (2.75 sec)

Log:

[WARN] [region_job.go:531] ["meet error and handle the job later"] ["job stage"=needRescan] [error="[Lightning:KV:EpochNotMatch]EpochNotMatch current epoch of region 22 is conf_ver: 1 version: 66"] [] [start=74800000000000006A5F698000000000000001038000000000000001038000000000000001] [end=74800000000000006A5F69800000000000000103800000000000000103800000000000000100]

You will almost certainly encounter an "EpochNotMatch" error when adding an index, and then wait at least two seconds for the retry backoff:

// max retry backoff time: 2+4+8+16+30*26=810s
sleepSecond := math.Pow(2, float64(job.retryCount))
if sleepSecond > float64(maxRetryBackoffSecond) {
	sleepSecond = float64(maxRetryBackoffSecond)
}
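
For reference, a self-contained sketch of that backoff arithmetic (assuming maxRetryBackoffSecond is 30 and a cap of 30 retries, which is what the 2+4+8+16+30*26=810s comment implies):

package main

import (
	"fmt"
	"math"
)

// Assumed values, inferred from the "2+4+8+16+30*26=810s" comment above.
const (
	maxRetryBackoffSecond = 30
	maxRetries            = 30
)

func main() {
	total := 0.0
	for retryCount := 1; retryCount <= maxRetries; retryCount++ {
		sleepSecond := math.Pow(2, float64(retryCount))
		if sleepSecond > float64(maxRetryBackoffSecond) {
			sleepSecond = float64(maxRetryBackoffSecond)
		}
		total += sleepSecond
	}
	fmt.Printf("worst-case total backoff: %.0fs\n", total) // 810s
}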

Maybe we can improve this by synchronizing the region splitting, i.e. waiting for the split to finish before importing.

@tangenta (Contributor, Author) commented:

TiDB Lightning needs to stop the scheduling of the corresponding region before importing (see PauseSchedulersByKeyRange) to improve stability.

The method is to post a label rule (tidb/br/pkg/pdutil/pd.go, lines 1020 to 1026 at eb69dac):

rule := LabelRule{
	ID: uuid.New().String(),
	Labels: []RegionLabel{{
		Key:   "schedule",
		Value: "deny",
		TTL:   ttl.String(),
	}},

When PD detects a new rule, it generates a labeler-split-region operator. However, this process is asynchronous, which means the region may be split during the import. If the corresponding range is no longer on the original region, the request may encounter an epoch-not-match error.
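
For illustration only (not code from TiDB or TiKV), a minimal sketch of why the stale request is rejected, using the conf_ver/version fields from kvproto's metapb.RegionEpoch: the split bumps the region's version, so a request built against the pre-split epoch no longer matches.

package main

import (
	"fmt"

	"github.com/pingcap/kvproto/pkg/metapb"
)

// epochStale reports whether a cached region epoch is older than the current one,
// mirroring the comparison behind "EpochNotMatch" rejections (illustrative only).
func epochStale(cached, current *metapb.RegionEpoch) bool {
	return cached.GetVersion() < current.GetVersion() ||
		cached.GetConfVer() < current.GetConfVer()
}

func main() {
	// Epoch the importer cached before the label rule was applied (hypothetical values).
	cached := &metapb.RegionEpoch{ConfVer: 1, Version: 65}
	// After the asynchronous labeler-split-region, the version is bumped,
	// e.g. "conf_ver: 1 version: 66" as in the log above.
	current := &metapb.RegionEpoch{ConfVer: 1, Version: 66}

	fmt.Println(epochStale(cached, current)) // true: the request is rejected with EpochNotMatch
}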

@tangenta (Contributor, Author) commented:

A possible solution is to synchronize in the same way as Backend.waitForScatterRegions(): fetch the PD operator periodically; if it is labeler-split-region and its status is Success, the region split is considered complete.

However, there is a corner case: if the region ID is unchanged, we may get an outdated PD operator left over from a previous add-index job. This is not an uncommon situation in integration testing.
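
A rough sketch of what that polling could look like. Everything here is an assumption rather than an existing TiDB API: getRegionOperator stands in for querying PD's operator endpoint for the region, the operator fields are simplified, and filtering by creation time is one possible way to discard the outdated operator mentioned above.

package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"time"
)

// regionOperator is a simplified, assumed view of the PD operator for one region.
type regionOperator struct {
	Desc      string    // e.g. "labeler-split-region"
	Status    string    // e.g. "RUNNING", "SUCCESS" (exact values are an assumption)
	CreatedAt time.Time // used to skip operators left over from earlier jobs
}

// getRegionOperator is a hypothetical helper that would query PD for the
// current operator on regionID (e.g. through PD's HTTP API).
func getRegionOperator(ctx context.Context, regionID uint64) (*regionOperator, error) {
	return nil, errors.New("not implemented in this sketch")
}

// waitForLabelerSplit polls PD until a labeler-split-region operator created after
// jobStart reports success, so the import only starts once the split triggered by
// the label rule has actually finished.
func waitForLabelerSplit(ctx context.Context, regionID uint64, jobStart time.Time) error {
	ticker := time.NewTicker(500 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			op, err := getRegionOperator(ctx, regionID)
			if err != nil || op == nil {
				continue // transient error or no operator yet; keep polling
			}
			// Corner case from above: the region ID may be unchanged, so ignore
			// operators created before this add-index job started.
			if op.CreatedAt.Before(jobStart) {
				continue
			}
			if op.Desc == "labeler-split-region" && strings.EqualFold(op.Status, "success") {
				return nil
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	fmt.Println(waitForLabelerSplit(ctx, 22, time.Now()))
}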
