log-backup: implement the checkpoint V3 #36114

Merged
merged 13 commits into from Jul 13, 2022


@YuJuncen YuJuncen commented Jul 12, 2022

What problem does this PR solve?

Issue Number: close #35164 (cherry-picked from the feature branch https://github.com/pingcap/tidb/pull/35685/files)

Problem Summary:

This PR implements checkpoint V3, which solves the consistency problem that checkpoint V2 ran into.

For more details about the problems we have met, see tikv/tikv#12715.

We have introduced a central node (currently embedded in TiDB) to solve the problem: this central node pulls the checkpoint of each region and, once it finds that it has collected checkpoints covering the full keyspace, it advances the global checkpoint. (Hence the name "advancer".)
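The rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `RegionCheckpoint` type and the `GlobalCheckpoint` function are hypothetical names, and the sketch assumes the global checkpoint is the minimum of the per-region checkpoints, advanced only when the collected ranges leave no gap in the keyspace.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// RegionCheckpoint is a hypothetical record (not a type from this PR):
// the checkpoint TS reported for the key range [StartKey, EndKey).
// An empty EndKey means "up to the end of the keyspace".
type RegionCheckpoint struct {
	StartKey, EndKey []byte
	TS               uint64
}

// GlobalCheckpoint returns the minimum TS over all collected ranges and true
// when the ranges together cover the whole keyspace. It returns (0, false)
// when there is a gap, so the caller should not advance yet.
func GlobalCheckpoint(cps []RegionCheckpoint) (uint64, bool) {
	if len(cps) == 0 {
		return 0, false
	}
	// Sort a copy by start key so the caller's slice is left untouched.
	sorted := append([]RegionCheckpoint(nil), cps...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i].StartKey, sorted[j].StartKey) < 0
	})

	covered := []byte{} // every key in ["", covered) is known to be covered
	reachedEnd := false // true once some range extends to the end of the keyspace
	minTS := sorted[0].TS
	for _, cp := range sorted {
		if !reachedEnd && bytes.Compare(cp.StartKey, covered) > 0 {
			return 0, false // gap between covered and cp.StartKey
		}
		if cp.TS < minTS {
			minTS = cp.TS
		}
		if len(cp.EndKey) == 0 {
			reachedEnd = true
		} else if bytes.Compare(cp.EndKey, covered) > 0 {
			covered = cp.EndKey
		}
	}
	if !reachedEnd {
		return 0, false
	}
	return minTS, true
}

func main() {
	cps := []RegionCheckpoint{
		{StartKey: []byte(""), EndKey: []byte("g"), TS: 120},
		{StartKey: []byte("g"), EndKey: []byte("t"), TS: 90},
		{StartKey: []byte("t"), EndKey: []byte(""), TS: 105},
	}
	if ts, ok := GlobalCheckpoint(cps); ok {
		fmt.Println("advance global checkpoint to", ts)
	}
	if _, ok := GlobalCheckpoint(cps[:2]); !ok {
		fmt.Println("keyspace not fully covered, keep the old checkpoint")
	}
}
```

Overlapping ranges are tolerated here (for example, when stale and fresh region boundaries are both present); only a gap blocks the advance.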

What is changed and how it works?

This PR (cherry-picked from #35685):

  • Adds a new package streamhelper, which contains the advancer type and the client type.
  • The "advancer" is a central node that polls the gRPC service LogBackup for the latest checkpoint of each region, collects them, and advances the global checkpoint.
  • The "advancer" can be enabled by either:
    • Using the br log advancer command.
    • Setting the TiDB config log-backup.enabled to true.
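
For the config option, a minimal sketch of the TiDB configuration file. It assumes the dotted name log-backup.enabled is written as an enabled key under a [log-backup] table in the TOML config; the exact layout is not shown in this PR description.

```toml
# Assumed layout: "log-backup.enabled" expressed as a TOML table and key.
[log-backup]
enabled = true
```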

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

YuJuncen and others added 2 commits July 12, 2022 11:26
* Squashed commit of the following:

commit a3b65e6cf6d96a56b1c4b6261aa0972555b2d0eb
Merge: fdba35d03 df9b54b
Author: Yu Juncen <yujuncen@pingcap.com>
Date:   Thu Jun 23 11:32:25 2022 +0800

    Merge branch 'master' of https://github.com/pingcap/tidb into checkpoint-v2

    Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

commit fdba35d03c1074e54ab5d811fa923173f22a2c11
Author: Yu Juncen <yujuncen@pingcap.com>
Date:   Wed Jun 22 17:48:45 2022 +0800

    added the tsheap struct, make abstuction over environment

    Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

commit 11e07089775b3deb51db30b627b48684522bd916
Author: Yu Juncen <yujuncen@pingcap.com>
Date:   Wed Jun 22 17:47:31 2022 +0800

    conn: make StoreManager a package

    Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

commit abf840cd8e40c47dae6ecedb4051a63e6e32a5c0
Author: Yu Juncen <yujuncen@pingcap.com>
Date:   Fri Jun 17 17:00:25 2022 +0800

    implement basic get checkpoint

    Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

commit 314015c3c24d8d76950b049195d90bb0e3bfe426
Author: Yu Juncen <yujuncen@pingcap.com>
Date:   Tue Jun 7 14:47:31 2022 +0800

    fix the retry over wrapped errors

    Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

commit 5c8e77a
Author: Yu Juncen <yujuncen@pingcap.com>
Date:   Mon Jun 6 15:33:14 2022 +0800

    adapt new checkpoint model

    Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* move StoreManager to utils package

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* don't make streamhelper requires stream

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* integrated advancer to TiDB

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* added more metrics.

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* clear when task removed

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* add collector

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* fix collapse ranges

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* fix typo && make bucket larger

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* fix stuck when error

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* fix inconsistent ranges

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* added even more comments

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* address comments; added some comments

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* make linter happy

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* address comments

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* make clippy happy

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* guard on close store manager

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* added even more guards

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>

* Update br/pkg/streamhelper/advancer.go

Co-authored-by: 3pointer <qdlc2010@gmail.com>

Co-authored-by: Zak Zhao <57036248+joccau@users.noreply.github.com>
Co-authored-by: 3pointer <qdlc2010@gmail.com>
Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
@ti-chi-bot
Member

ti-chi-bot commented Jul 12, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • 3pointer
  • joccau

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
domain/domain.go (outdated review thread, resolved)
Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
@3pointer
Contributor

Please remove the Bazel file changes from this PR. Let's focus on the cherry-pick of https://github.com/pingcap/tidb/pull/35685/files

Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
@sre-bot
Contributor

sre-bot commented Jul 12, 2022

DEPS.bzl (outdated review thread, resolved)
@3pointer
Contributor

/rebuild

2 similar comments
@3pointer
Contributor

/rebuild

@3pointer
Contributor

/rebuild

YuJuncen and others added 2 commits July 12, 2022 20:56
Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
@ti-chi-bot added the status/LGT1 label (Indicates that a PR has LGTM 1.) on Jul 13, 2022
@ti-chi-bot added the status/LGT2 label (Indicates that a PR has LGTM 2.) and removed the status/LGT1 label (Indicates that a PR has LGTM 1.) on Jul 13, 2022
@3pointer
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: d6ca3ac

@ti-chi-bot added the status/can-merge label (Indicates a PR has been approved by a committer.) on Jul 13, 2022
@ti-chi-bot merged commit fee2a12 into pingcap:master on Jul 13, 2022
@sre-bot
Contributor

sre-bot commented Jul 13, 2022

TiDB MergeCI notify

🔴 Bad News! New failing [1] after this pr merged.
These new failed integration tests seem to be caused by the current PR, please try to fix these new failed integration tests, thanks!

CI Name | Result | Duration | Compare with Parent commit
--- | --- | --- | ---
idc-jenkins-ci/integration-cdc-test | 🟥 failed 2, success 34, total 36 | 1 hr 14 min | New failing
idc-jenkins-ci-tidb/integration-common-test | 🟢 all 11 tests passed | 37 min | Existing passed
idc-jenkins-ci-tidb/common-test | 🟢 all 12 tests passed | 16 min | Existing passed
idc-jenkins-ci-tidb/tics-test | 🟢 all 1 tests passed | 8 min 46 sec | Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2 | 🟢 all 28 tests passed | 7 min 54 sec | Existing passed
idc-jenkins-ci-tidb/integration-ddl-test | 🟢 all 6 tests passed | 7 min 47 sec | Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1 | 🟢 all 26 tests passed | 5 min 43 sec | Existing passed
idc-jenkins-ci-tidb/mybatis-test | 🟢 all 1 tests passed | 3 min 25 sec | Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test | 🟢 all 1 tests passed | 3 min 15 sec | Existing passed
idc-jenkins-ci-tidb/plugin-test | 🟢 build success, plugin test success | 4 min | Existing passed

Labels
  • release-note-none
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
Development

Successfully merging this pull request may close these issues.

log_backup: adapt the new checkpoint model
6 participants