ddl: support checkpoint for ingest mode #42769

Merged
merged 21 commits into from Apr 12, 2023

@tangenta tangenta commented Apr 3, 2023

What problem does this PR solve?

Issue Number: close #42164

Background:

When creating an index, most of the time is spent reading the table, writing the index, and importing. After the refactoring in #42472 and #42668, the procedure is as follows:

  1. The DDL worker first divides the table data into tasks, each representing a region, and sends them to the reader, which reads the region's row data from the TiKV storage in batches. Each batch read by the reader is called a chunk.
  2. The reader sends each chunk to the writer. The writer extracts the index columns from each row in the chunk, encodes them into an index KV, and writes it to the local engine of TiDB-lightning. In addition:
    • when the writer's memory usage reaches the threshold, a flush is triggered to write the index KVs from the memory buffer to disk.
    • when disk usage reaches the threshold, an unsafe import is triggered to import the index KVs on disk into the TiKV storage.
  3. After the writer finishes writing a chunk, it returns the result to the DDL worker so that the current progress can be updated from time to time.
  4. The DDL worker waits for all results to return, and finally triggers an import to write all remaining index KVs on disk to the TiKV storage.
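
The threshold behaviour in steps 2 and 4 can be illustrated with a minimal sketch. Everything here is hypothetical (the localEngine type, its byte counters, and the limits are not taken from the PR); it only models when a flush or an import would fire, not the real local engine.

```go
package main

import "fmt"

// localEngine is a hypothetical stand-in for the local engine. It only tracks
// how many bytes sit in the memory buffer and on disk.
type localEngine struct {
	memBytes  int
	diskBytes int
	memLimit  int
	diskLimit int
	flushes   int
	imports   int
}

// write buffers one encoded index KV, then applies both thresholds.
func (e *localEngine) write(kvSize int) {
	e.memBytes += kvSize
	if e.memBytes >= e.memLimit {
		e.flush()
	}
	if e.diskBytes >= e.diskLimit {
		e.unsafeImport()
	}
}

// flush moves the memory buffer to disk.
func (e *localEngine) flush() {
	e.diskBytes += e.memBytes
	e.memBytes = 0
	e.flushes++
}

// unsafeImport moves everything on disk to the (imaginary) remote store.
func (e *localEngine) unsafeImport() {
	e.diskBytes = 0
	e.imports++
}

// finish models step 4: flush what is left and import it.
func (e *localEngine) finish() {
	e.flush()
	e.unsafeImport()
}

// simulate writes kvCount KVs of kvSize bytes each and finishes the job.
func simulate(kvCount, kvSize, memLimit, diskLimit int) *localEngine {
	e := &localEngine{memLimit: memLimit, diskLimit: diskLimit}
	for i := 0; i < kvCount; i++ {
		e.write(kvSize)
	}
	e.finish()
	return e
}

func main() {
	e := simulate(10, 10, 30, 60)
	fmt.Println(e.flushes, e.imports) // prints "4 2"
}
```

With these made-up limits, every third write flushes, the second flush fills the disk and triggers an import, and the final step adds one more flush and import.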

If we treat the overall process as a progress bar, the start point is the start key of the table and the end point is the end key of the table. Two keys can represent the current progress:

  • Global Checkpoint: all keys smaller than this have been imported into the TiKV store. Even if all TiDB instances crash, these keys do not need to be re-imported. It is updated by an unsafe import.
  • Local Checkpoint: all keys smaller than this have been written at least to the local disk where the TiDB local engine is located. If TiDB is restarted and the data on the local disk is still accessible, these keys do not need to be read and encoded again. It is updated by a flush.
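
As a rough illustration of how the two keys move, here is a minimal sketch. The checkpoints type and its methods are hypothetical, and it assumes that flushes arrive in key order and that an import covers everything flushed so far; the real bookkeeping in this PR is more involved.

```go
package main

import (
	"bytes"
	"fmt"
)

// checkpoints holds the two progress keys described above (hypothetical type).
type checkpoints struct {
	local  []byte // keys below this are at least on the local disk
	global []byte // keys below this are imported into TiKV
}

// advance returns key if it is larger than cur in byte order, else cur.
// A checkpoint therefore never moves backwards.
func advance(cur, key []byte) []byte {
	if bytes.Compare(key, cur) > 0 {
		return append([]byte(nil), key...)
	}
	return cur
}

// onFlush records that all keys below flushedUpTo are on the local disk.
func (c *checkpoints) onFlush(flushedUpTo []byte) {
	c.local = advance(c.local, flushedUpTo)
}

// onImport records that everything on disk was imported, so the global
// checkpoint catches up with the local one (an assumption of this sketch).
func (c *checkpoints) onImport() {
	c.global = advance(c.global, c.local)
}

func main() {
	c := &checkpoints{}
	c.onFlush([]byte("k10"))
	fmt.Printf("local=%q global=%q\n", c.local, c.global)
	c.onImport()
	fmt.Printf("local=%q global=%q\n", c.local, c.global)
}
```

Under these assumptions the global checkpoint never exceeds the local one, which matches the intuition that data must reach the disk before it can be imported.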

As long as the Global/Local Checkpoint is persisted, we can compare the end key of a task with the checkpoint before the reader starts reading, to determine whether the task can be skipped. Therefore, we need a component, called the Checkpoint Manager, to manage checkpoints, including adding, deleting, and modifying them.
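
The skip decision can be sketched as a single key comparison. The task type and canSkip helper below are hypothetical; the sketch assumes a task covers the half-open range [start, end) and that an empty checkpoint means no progress has been persisted yet.

```go
package main

import (
	"bytes"
	"fmt"
)

// task is a hypothetical description of one region-sized unit of work
// covering the key range [start, end).
type task struct {
	id    int
	start []byte
	end   []byte
}

// canSkip reports whether every key of the task lies below the checkpoint.
// Since all keys smaller than the checkpoint are already done, the task can
// be skipped when its end key does not exceed the checkpoint.
func canSkip(t task, checkpoint []byte) bool {
	return len(checkpoint) > 0 && bytes.Compare(t.end, checkpoint) <= 0
}

func main() {
	checkpoint := []byte("m")
	tasks := []task{
		{id: 1, start: []byte("a"), end: []byte("f")},
		{id: 2, start: []byte("f"), end: []byte("m")},
		{id: 3, start: []byte("m"), end: []byte("t")},
	}
	for _, t := range tasks {
		fmt.Printf("task %d skip=%v\n", t.id, canSkip(t, checkpoint))
	}
}
```

Tasks 1 and 2 end at or before the checkpoint and are skipped; task 3 starts at the checkpoint and still has to be processed.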

What is changed and how it works?

Based on the read and write process above, we can abstract the following interface for the Checkpoint Manager:

type CheckpointManager interface {
   IsComplete(taskID int, start, end kv.Key) bool
   UpdateTotal(taskID int, added int, last bool)
   UpdateCurrent(taskID int, added int) error
}

  • IsComplete() is called before the reader reads the data, to decide whether to skip the current task.
  • UpdateTotal() is called by the reader after reading the data, to add the number of rows contained in the current chunk.
  • UpdateCurrent() is called by the writer after writing to the local engine, to add the number of rows just written.
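
To make the row-count bookkeeping concrete, here is a minimal in-memory sketch. It is not the implementation in this PR: the memCheckpointManager type, the taskProgress fields, and the taskDone helper are hypothetical, and it assumes a task is finished once the reader has sent its last chunk and the written row count has caught up with the read row count.

```go
package main

import (
	"fmt"
	"sync"
)

// taskProgress tracks rows read and rows written for one task (hypothetical).
type taskProgress struct {
	total    int  // rows the reader has read so far
	current  int  // rows the writer has written so far
	lastRead bool // the reader has sent the final chunk of the task
}

// memCheckpointManager is an in-memory sketch of the row-count bookkeeping.
type memCheckpointManager struct {
	mu    sync.Mutex
	tasks map[int]*taskProgress
}

func newMemCheckpointManager() *memCheckpointManager {
	return &memCheckpointManager{tasks: make(map[int]*taskProgress)}
}

// UpdateTotal is called by the reader after reading a chunk.
func (m *memCheckpointManager) UpdateTotal(taskID int, added int, last bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	p, ok := m.tasks[taskID]
	if !ok {
		p = &taskProgress{}
		m.tasks[taskID] = p
	}
	p.total += added
	p.lastRead = p.lastRead || last
}

// UpdateCurrent is called by the writer after writing a chunk.
func (m *memCheckpointManager) UpdateCurrent(taskID int, added int) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	p, ok := m.tasks[taskID]
	if !ok {
		return fmt.Errorf("task %d has no recorded chunks", taskID)
	}
	p.current += added
	return nil
}

// taskDone reports whether all rows read for the task have been written.
func (m *memCheckpointManager) taskDone(taskID int) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	p, ok := m.tasks[taskID]
	return ok && p.lastRead && p.current == p.total
}

func main() {
	m := newMemCheckpointManager()
	m.UpdateTotal(1, 100, false)
	m.UpdateTotal(1, 50, true)
	_ = m.UpdateCurrent(1, 100)
	fmt.Println(m.taskDone(1)) // false
	_ = m.UpdateCurrent(1, 50)
	fmt.Println(m.taskDone(1)) // true
}
```

The last flag matters because the total keeps growing while the reader is still producing chunks; without it, a task could look finished between two chunks.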

The checkpoint manager spawns a background goroutine that periodically persists the checkpoint info to the system table mysql.tidb_ddl_reorg.
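
The periodic update can be sketched with a ticker loop. The runPeriodic and persistCountAfter functions are hypothetical, and the persist callback stands in for the write to mysql.tidb_ddl_reorg; the final persist on shutdown is a design choice of this sketch, not something stated in the PR.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runPeriodic calls persist every interval until ctx is cancelled, then calls
// it one last time so the latest state is not lost. interval must be positive.
func runPeriodic(ctx context.Context, interval time.Duration, persist func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			persist() // final update before exiting
			return
		case <-ticker.C:
			persist()
		}
	}
}

// persistCountAfter runs the loop in a background goroutine for runMillis
// milliseconds and returns how many times persist was called.
func persistCountAfter(runMillis, intervalMillis int) int {
	count := 0
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		runPeriodic(ctx, time.Duration(intervalMillis)*time.Millisecond, func() {
			count++ // stands in for updating the system table
		})
	}()
	time.Sleep(time.Duration(runMillis) * time.Millisecond)
	cancel()
	<-done // wait for the goroutine, so reading count is race-free
	return count
}

func main() {
	fmt.Println(persistCountAfter(30, 10) >= 1) // prints "true"
}
```

Persisting on a timer rather than on every chunk keeps the write load on the system table bounded, at the cost of redoing at most one interval of work after a restart.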

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Member

ti-chi-bot commented Apr 3, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • Benjamin2037
  • wjhuang2016

@ti-chi-bot
Member

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot added do-not-merge/needs-linked-issue release-note-none do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 3, 2023
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 4, 2023
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 4, 2023
@tangenta tangenta marked this pull request as ready for review April 6, 2023 13:01
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 6, 2023
@hawkingrei
Member

/test all

@tangenta
Contributor Author

/retest

)

// CheckpointManager is an interface to manage checkpoints.
type CheckpointManager interface {
Member

We can remove this interface since it's not used in distributed reorg.

Contributor Author

We can add a no-op implementation for distributed reorg.

Contributor Author

Removed.

Collaborator

@Benjamin2037 Benjamin2037 left a comment

LGTM

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 11, 2023
@@ -110,6 +111,8 @@ func (b *txnBackfillScheduler) setupWorkers() error {
}

func (b *txnBackfillScheduler) sendTask(task *reorgBackfillTask) {
	b.taskMaxID++
Member

Why reallocate the task ID?

Contributor Author

Because it needs to be unique for the lifetime of the DDL job, not just within a task batch.

@@ -288,6 +299,12 @@ func (b *ingestBackfillScheduler) setupWorkers() error {
		return errors.Trace(errors.New("cannot get lightning backend"))
	}
	b.backendCtx = bc
	mgr, err := ingest.NewCheckpointManager(b.ctx, bc, b.sessPool, job.ID,
Member

The manager shouldn't be set in the distributed case.

Contributor Author

Let me resolve the conflict after #42753 is merged.

Contributor Author

Updated.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 12, 2023
@tangenta
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: e535400

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 12, 2023
@tangenta
Contributor Author

/retest

@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Apr 12, 2023
@tangenta
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 9ef599f

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 12, 2023
@tangenta
Contributor Author

/retest

@ti-chi-bot ti-chi-bot merged commit 7aac6ab into pingcap:master Apr 12, 2023
11 checks passed
@ti-chi-bot
Member

@tangenta: The following tests failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                       Commit   Details  Required  Rerun command
idc-jenkins-ci-tidb/unit-test   92dc8b5  link     true      /test unit-test
idc-jenkins-ci-tidb/mysql-test  e535400  link     unknown   /test mysql-test

Labels
release-note-none size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Development

Successfully merging this pull request may close these issues.

Support checkpoint for ingest mode of adding index
5 participants