This repository has been archived by the owner on Dec 8, 2021. It is now read-only.

restore: add glue.Glue interface and other function #456

Merged
merged 42 commits into from Nov 16, 2020

Conversation

lance6716

@lance6716 lance6716 commented Nov 6, 2020

What problem does this PR solve?

To embed lightning into TiDB, a glue interface is needed so that lightning can use the host TiDB's connection. This PR moves some of the external TiDB usage behind a glue interface; follow-up PRs will hide all remaining calls to the external TiDB behind it.

What is changed and how it works?

  • add glue.Glue and glue.SQLExecutor; glue.SQLExecutor runs SQL jobs on lightning's behalf
  • add calls to Glue.Record to report information to TiDB
  • change RunOnce so that lightning is easier to use as a library
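
To make the list above concrete, here is a minimal sketch of how a glue layer of this kind might look. Only the names glue.Glue, glue.SQLExecutor and Glue.Record come from the PR description; the method signatures, the recordingGlue fake and the restoreSchema helper are assumptions made for illustration and may not match the merged code.

```go
package main

import (
	"context"
	"fmt"
)

// SQLExecutor is a trimmed-down stand-in for glue.SQLExecutor.
// The method name and signature are assumptions for this sketch.
type SQLExecutor interface {
	ExecuteWithLog(ctx context.Context, query string, purpose string) error
}

// Glue is a stand-in for glue.Glue: the only doorway through which
// lightning reaches the host TiDB. Signatures are assumptions.
type Glue interface {
	GetSQLExecutor() SQLExecutor
	Record(name string, value uint64)
}

// recordingGlue is a hypothetical fake host that stores what it is asked to do.
type recordingGlue struct {
	queries []string
	metrics map[string]uint64
}

func newRecordingGlue() *recordingGlue {
	return &recordingGlue{metrics: map[string]uint64{}}
}

func (g *recordingGlue) GetSQLExecutor() SQLExecutor { return g }

func (g *recordingGlue) ExecuteWithLog(_ context.Context, query, _ string) error {
	g.queries = append(g.queries, query)
	return nil
}

func (g *recordingGlue) Record(name string, value uint64) { g.metrics[name] = value }

// restoreSchema is a hypothetical caller: it never opens its own connection,
// it only talks to whatever Glue it was handed.
func restoreSchema(ctx context.Context, g Glue, db string) error {
	stmt := fmt.Sprintf("CREATE DATABASE IF NOT EXISTS `%s`", db)
	if err := g.GetSQLExecutor().ExecuteWithLog(ctx, stmt, "create database"); err != nil {
		return err
	}
	g.Record("databases_created", 1)
	return nil
}

// demo runs restoreSchema against the fake and returns what the host saw.
func demo() (queries int, recorded uint64) {
	g := newRecordingGlue()
	if err := restoreSchema(context.Background(), g, "test"); err != nil {
		panic(err)
	}
	return len(g.queries), g.metrics["databases_created"]
}

func main() {
	queries, recorded := demo()
	fmt.Println(queries, recorded)
}
```

With this shape, a standalone lightning binary and an embedding TiDB can each supply their own Glue implementation while the restore code stays unchanged.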

Check List

Tests

  • Pass original test

Side effects

  • Possible performance regression
  • Increased code complexity

Related changes

@lance6716 lance6716 added the status/WIP Work in progress label Nov 8, 2020

lance6716 commented Nov 8, 2020

/run-all-tests

Failed connect to 127.0.0.1:9000; Connection refused
Failed to start minio

@lance6716 lance6716 removed the status/WIP Work in progress label Nov 8, 2020
@lance6716

I'm going to let TiDB implement a "glue" CheckpointsDB. Here is the change to the glue interface that follows this PR:
https://github.com/lance6716/tidb-lightning/pull/1/files

PTAL @overvenus

and PTAL for this PR @kennytm @glorv

@lance6716 lance6716 added the status/WIP Work in progress label Nov 9, 2020
@lance6716 lance6716 added status/PTAL This PR is ready for review. Add this label back after committing new changes status/WIP Work in progress and removed status/WIP Work in progress status/PTAL This PR is ready for review. Add this label back after committing new changes labels Nov 12, 2020
@lance6716 lance6716 added status/PTAL This PR is ready for review. Add this label back after committing new changes and removed status/WIP Work in progress labels Nov 13, 2020
@lance6716

/run-all-tests

})

if err := taskCfg.TiDB.Security.RegisterMySQL(); err != nil {
Contributor

Are there any problems if we run multiple lightning tasks within the same TiDB? Or will TiDB refuse to run more than one lightning task?

Contributor Author

Currently TiDB will only run one task, and RunOnce blocks, so it would be unnatural to call RunOnce concurrently on one lightning instance (I hope).

For multiple lightning instances in one process, RegisterMySQL may fail because tlsConfigRegistry is a global variable. I'll be careful about that.
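
The concern about a global registry can be illustrated with a simplified model. The sketch below is not the MySQL driver's code: it assumes a process-wide table keyed by a fixed name in which a later registration replaces an earlier one, and the per-task key shown at the end is only one possible mitigation, not something this PR implements. All names and paths are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// registry models a process-wide TLS config table (an assumption of this
// sketch). Values are CA file paths standing in for real TLS configs.
var (
	registryMu sync.Mutex
	registry   = map[string]string{}
)

// register stores a config under name, replacing any previous entry.
func register(name, caPath string) {
	registryMu.Lock()
	defer registryMu.Unlock()
	registry[name] = caPath
}

// lookup returns the config currently stored under name.
func lookup(name string) string {
	registryMu.Lock()
	defer registryMu.Unlock()
	return registry[name]
}

// taskKey derives a hypothetical per-task key so tasks do not share one slot.
func taskKey(taskID int64) string {
	return fmt.Sprintf("cluster-%d", taskID)
}

// demo shows the clash with a shared key and the isolation with per-task keys.
func demo() (shared, isolated string) {
	// Two tasks in one process register under the same fixed name.
	register("cluster", "/ca/task1.pem")
	register("cluster", "/ca/task2.pem")
	shared = lookup("cluster") // task 1 now sees task 2's CA

	// With a key per task, each task reads back its own entry.
	register(taskKey(1), "/ca/task1.pem")
	register(taskKey(2), "/ca/task2.pem")
	isolated = lookup(taskKey(1))
	return shared, isolated
}

func main() {
	shared, isolated := demo()
	fmt.Println(shared, isolated)
}
```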

@@ -284,6 +312,11 @@ func (l *Lightning) run(taskCfg *config.Config) (err error) {
}

func (l *Lightning) Stop() {
l.cancelLock.Lock()
Contributor

Is this explicit cancel really needed? If lightning runs in server mode, won't the cancel-task action already call this cancel? If lightning runs in once mode, l.shutdown will stop lightning.

Contributor Author

@lance6716 lance6716 Nov 13, 2020

A test expects this behaviour: when lightning shuts down, the task should be canceled to avoid further actions (here, removing the checkpoint).

# Set the failpoint to kill the lightning instance as soon as one chunk is imported, via signal mechanism
# If checkpoint does work, this should only kill $CHUNK_COUNT instances of lightnings.
export GO_FAILPOINTS="$TASKID_FAILPOINTS;github.com/pingcap/tidb-lightning/lightning/restore/KillIfImportedChunk=return($ROW_COUNT)"
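
The behaviour described in this reply can be sketched as follows. This is a minimal model, not the code under review: only the names Lightning, Stop, RunOnce and cancelLock appear in the diff, while the struct layout, the task callback and the demo are assumptions. The point it illustrates is that Stop cancels the running task's context, so a step that follows the import (such as removing the checkpoint) is skipped.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Lightning is a hypothetical, stripped-down stand-in for the real struct.
type Lightning struct {
	cancelLock sync.Mutex
	cancel     context.CancelFunc // cancels the task in flight; nil when idle
}

// RunOnce runs task with a cancelable context and remembers how to cancel it.
func (l *Lightning) RunOnce(ctx context.Context, task func(context.Context) error) error {
	taskCtx, cancel := context.WithCancel(ctx)
	defer cancel()
	l.cancelLock.Lock()
	l.cancel = cancel
	l.cancelLock.Unlock()
	return task(taskCtx)
}

// Stop cancels the task in flight, if any. The lock guards l.cancel against
// a concurrent RunOnce that is installing a new cancel function.
func (l *Lightning) Stop() {
	l.cancelLock.Lock()
	defer l.cancelLock.Unlock()
	if l.cancel != nil {
		l.cancel()
	}
}

// demo starts a task, stops lightning mid-import, and reports whether the
// task saw the cancellation and whether the follow-up step still ran.
func demo() (canceled, removedCheckpoint bool) {
	l := &Lightning{}
	started := make(chan struct{})
	importDone := make(chan struct{}) // never closed: the import is still running
	result := make(chan error, 1)
	go func() {
		result <- l.RunOnce(context.Background(), func(ctx context.Context) error {
			close(started)
			select {
			case <-ctx.Done():
				return ctx.Err() // shutdown: skip everything after the import
			case <-importDone:
			}
			removedCheckpoint = true // the follow-up action the test wants to avoid
			return nil
		})
	}()
	<-started
	l.Stop()
	err := <-result
	return errors.Is(err, context.Canceled), removedCheckpoint
}

func main() {
	canceled, removed := demo()
	fmt.Println(canceled, removed)
}
```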

@@ -106,6 +106,11 @@ func L() Logger {
return appLogger
}

// SetAppLogger replaces the default logger in this package to given one
func SetAppLogger(l *zap.Logger) {
appLogger = Logger{l.WithOptions(zap.AddStacktrace(zap.DPanicLevel))}
Contributor

They are not the same. #96~98 is the default way to build the app logger. But if the caller builds a logger and sets it as the global one, the caller should take responsibility for setting all options, and lightning shouldn't override its config anymore.

if err := cfg.LoadFromGlobal(globalCfg); err != nil {
return err
}
return app.RunOnce(context.Background(), cfg, nil, nil)
Contributor

I have thought about it again. Since there is already a context in the Lightning struct, can the RunOnce method always depend on app.ctx instead of taking another ctx as a parameter?

Contributor Author

@lance6716 lance6716 Nov 13, 2020

In future usage, lightning may be created with New once (controlled by the inner l.ctx), and that instance may call RunOnce many times, like an HTTP server mode. So a per-task context will help.
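
A rough sketch of the two lifetimes described here, assuming an instance context created in New and a separate context passed to each RunOnce call. The field names, the shutdown function and the work-duration parameter are hypothetical; the real RunOnce takes a config and other arguments.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Lightning is a hypothetical, stripped-down stand-in for the real struct.
type Lightning struct {
	ctx      context.Context    // instance lifetime, set once in New
	shutdown context.CancelFunc // ends the instance and any task in flight
}

// New creates one long-lived instance.
func New() *Lightning {
	ctx, cancel := context.WithCancel(context.Background())
	return &Lightning{ctx: ctx, shutdown: cancel}
}

// RunOnce runs one task. The task stops when either the caller's taskCtx or
// the instance-wide l.ctx is canceled; work models how long the import takes.
func (l *Lightning) RunOnce(taskCtx context.Context, work time.Duration) error {
	ctx, cancel := context.WithCancel(taskCtx)
	defer cancel()
	go func() {
		select {
		case <-l.ctx.Done():
			cancel() // instance shutdown also ends this task
		case <-ctx.Done():
		}
	}()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(work):
		return nil
	}
}

// demo runs three tasks on one instance: one whose own context is already
// canceled, one that completes, and one started after instance shutdown.
func demo() (firstCanceled, secondOK, thirdCanceled bool) {
	l := New()

	canceledCtx, cancel := context.WithCancel(context.Background())
	cancel()
	firstCanceled = l.RunOnce(canceledCtx, time.Hour) == context.Canceled

	// Canceling the first task did not affect the instance.
	secondOK = l.RunOnce(context.Background(), time.Millisecond) == nil

	l.shutdown()
	thirdCanceled = l.RunOnce(context.Background(), time.Hour) == context.Canceled
	return firstCanceled, secondOK, thirdCanceled
}

func main() {
	fmt.Println(demo())
}
```

If RunOnce relied on l.ctx alone, canceling the first task would have required shutting down the whole instance, and the second call could not have succeeded.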

Collaborator

@kennytm kennytm left a comment

rest LGTM

@kennytm kennytm added status/LGT1 One reviewer already commented LGTM (LGTM1) and removed status/PTAL This PR is ready for review. Add this label back after committing new changes labels Nov 16, 2020
Contributor

@glorv glorv left a comment

LGTM

@glorv glorv added status/LGT2 Two reviewers already commented LGTM, ready for merge (LGTM2) and removed status/LGT1 One reviewer already commented LGTM (LGTM1) labels Nov 16, 2020
@glorv glorv merged commit a78137f into pingcap:master Nov 16, 2020
lance6716 added a commit to lance6716/tidb-lightning that referenced this pull request Nov 17, 2020
add notes

save work

save work

fix unit test

remove tidbMgr in RestoreController

remove some comments

remove some comments

change logger in SQLWithRetry

revert replace log.Logger to *zap.Logger

dep: update uuid dependency to latest google/uuid (pingcap#452)

* dep: update satori/go.uuid to latest

* fix tests

* change to google/uuid

* fix build

* try fix test

* get familiar with google/uuid

* address comment

tidb-lightning-ctl: change default of -d to 'noop://' (pingcap#453)

also add noop:// to supported storage types (to represent an empty store)

replace tab to space

try another port to fix CI

remove some comment

*: more glue

restore: fix the bug that gc life time ttl does not take effect (pingcap#448)

* fix gc ttl loop

* resolve comment and add tests

fix CI

report info to host TiDB

config: filter out all system schemas by default (pingcap#459)

backend: fix auto random default value for primary key (pingcap#457)

* fix auto generate auto random primary key column

* fix default for auto random primary key

* fix test

* use prev row id for auto random and add a test

* replace chunck with session opt

* fix

* fix

mydumper: fix parquet data parser (pingcap#435)

* fix parquet

* reorder imports

* fix test

* use empty collation

* fix a error and add more test cases

* add pointer type tests

* resolve comments

Co-authored-by: kennytm <kennytm@gmail.com>

address comment

backend/local: use range properties to optimize region range estimate (pingcap#422)

* use range propreties to estimate region range

* post-restore: add optional level for post-restore operations (pingcap#421)

* add optional level for opst-restore operations

* trim leading and suffix '"

* use UnmarshalTOML to unmarshal post restore op level

* resolve comments and fix unit test

* backend/local: do not retry epochNotMatch error when ingest sst (pingcap#419)

* do not retry epochNotMatch error when ingest sst

* add retry ingest for 'Raft raft: proposal dropped' error in ingest

* change some retryable error log level from Error to Warn

* fix nextKey

* add a comment for nextKey

* fix comment and add a unit test

* wrap time.Sleep in select

Co-authored-by: kennytm <kennytm@gmail.com>

* update

* use range properties to optimze region range estimate

* update pebble

* change the default value for batch-size

* add unit tests and reslove comments

* add a comment to range properties test

* add a comment

* add a test for range property with pebble

* rename const variable

Co-authored-by: kennytm <kennytm@gmail.com>

fix pd service id is empty (pingcap#460)

fix s3 parquet reader (pingcap#461)

Co-authored-by: Neil Shen <overvenus@gmail.com>

fix service gc ttl again (pingcap#465)

address comment

mydumper: verify file routing config (pingcap#470)

* fix file routing

* remove useless line

* remove redundant if check

rename a method in interface

save work

try fix CI

could work

change ctx usage

try fix CI

try fix CI

refine function interface

refine some fucntion interface

debug CI

address comment

config: allow four byte-size config to be specified using human-readable units ("100 GiB") (pingcap#471)

* Makefile: add `make finish-prepare` action

* config: accept human-readable size for most byte-related config

e.g. allow `region-split-size = '96M'` in additional to `= 100663296`

(known issue: these values' precisions will be truncated to 53 bits
instead of supporting all 63 bits)

* restore: reduce chance of spurious errors from TestGcTTLManagerSingle

Co-authored-by: glorv <glorvs@163.com>

remove debug log

test: change double type syntax (pingcap#474)

address comment

checkpoint: add glue checkpoint

resolve cycle import

expose Retry

refine

change interface to cope with TiDB

fix SQL string

fix SQL

adjust interface to embedded in TiDB

could import now

reduce TLS

restore: add `glue.Glue` interface and other function (pingcap#456)

* save my work

* add notes

* save work

* save work

* fix unit test

* remove tidbMgr in RestoreController

* remove some comments

* remove some comments

* change logger in SQLWithRetry

* revert replace log.Logger to *zap.Logger

* replace tab to space

* try another port to fix CI

* remove some comment

* *: more glue

* report info to host TiDB

* fix CI

* address comment

* address comment

* rename a method in interface

* save work

* try fix CI

* could work

* change ctx usage

* try fix CI

* try fix CI

* refine function interface

* refine some fucntion interface

* debug CI

* address comment

* remove debug log

* address comment

modify code

add comment

refine some code
glorv pushed a commit that referenced this pull request Nov 23, 2020
* save my work

add notes

save work

save work

fix unit test

remove tidbMgr in RestoreController

remove some comments

remove some comments

change logger in SQLWithRetry

revert replace log.Logger to *zap.Logger

dep: update uuid dependency to latest google/uuid (#452)

* dep: update satori/go.uuid to latest

* fix tests

* change to google/uuid

* fix build

* try fix test

* get familiar with google/uuid

* address comment

tidb-lightning-ctl: change default of -d to 'noop://' (#453)

also add noop:// to supported storage types (to represent an empty store)

replace tab to space

try another port to fix CI

remove some comment

*: more glue

restore: fix the bug that gc life time ttl does not take effect (#448)

* fix gc ttl loop

* resolve comment and add tests

fix CI

report info to host TiDB

config: filter out all system schemas by default (#459)

backend: fix auto random default value for primary key (#457)

* fix auto generate auto random primary key column

* fix default for auto random primary key

* fix test

* use prev row id for auto random and add a test

* replace chunck with session opt

* fix

* fix

mydumper: fix parquet data parser (#435)

* fix parquet

* reorder imports

* fix test

* use empty collation

* fix a error and add more test cases

* add pointer type tests

* resolve comments

Co-authored-by: kennytm <kennytm@gmail.com>

address comment

backend/local: use range properties to optimize region range estimate (#422)

* use range propreties to estimate region range

* post-restore: add optional level for post-restore operations (#421)

* add optional level for opst-restore operations

* trim leading and suffix '"

* use UnmarshalTOML to unmarshal post restore op level

* resolve comments and fix unit test

* backend/local: do not retry epochNotMatch error when ingest sst (#419)

* do not retry epochNotMatch error when ingest sst

* add retry ingest for 'Raft raft: proposal dropped' error in ingest

* change some retryable error log level from Error to Warn

* fix nextKey

* add a comment for nextKey

* fix comment and add a unit test

* wrap time.Sleep in select

Co-authored-by: kennytm <kennytm@gmail.com>

* update

* use range properties to optimze region range estimate

* update pebble

* change the default value for batch-size

* add unit tests and reslove comments

* add a comment to range properties test

* add a comment

* add a test for range property with pebble

* rename const variable

Co-authored-by: kennytm <kennytm@gmail.com>

fix pd service id is empty (#460)

fix s3 parquet reader (#461)

Co-authored-by: Neil Shen <overvenus@gmail.com>

fix service gc ttl again (#465)

address comment

mydumper: verify file routing config (#470)

* fix file routing

* remove useless line

* remove redundant if check

rename a method in interface

save work

try fix CI

could work

change ctx usage

try fix CI

try fix CI

refine function interface

refine some fucntion interface

debug CI

address comment

config: allow four byte-size config to be specified using human-readable units ("100 GiB") (#471)

* Makefile: add `make finish-prepare` action

* config: accept human-readable size for most byte-related config

e.g. allow `region-split-size = '96M'` in additional to `= 100663296`

(known issue: these values' precisions will be truncated to 53 bits
instead of supporting all 63 bits)

* restore: reduce chance of spurious errors from TestGcTTLManagerSingle

Co-authored-by: glorv <glorvs@163.com>

remove debug log

test: change double type syntax (#474)

address comment

checkpoint: add glue checkpoint

resolve cycle import

expose Retry

refine

change interface to cope with TiDB

fix SQL string

fix SQL

adjust interface to embedded in TiDB

could import now

reduce TLS

restore: add `glue.Glue` interface and other function (#456)

* save my work

* add notes

* save work

* save work

* fix unit test

* remove tidbMgr in RestoreController

* remove some comments

* remove some comments

* change logger in SQLWithRetry

* revert replace log.Logger to *zap.Logger

* replace tab to space

* try another port to fix CI

* remove some comment

* *: more glue

* report info to host TiDB

* fix CI

* address comment

* address comment

* rename a method in interface

* save work

* try fix CI

* could work

* change ctx usage

* try fix CI

* try fix CI

* refine function interface

* refine some fucntion interface

* debug CI

* address comment

* remove debug log

* address comment

modify code

add comment

refine some code

* address comment

* add some comments

* fix CI and change CREATE TABLE
@lance6716 lance6716 mentioned this pull request Apr 11, 2023
12 tasks
Labels
status/LGT2 Two reviewers already commented LGTM, ready for merge (LGTM2)
4 participants