Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: add recover for run ddl job #10981

Merged
merged 6 commits into from Jul 2, 2019

Conversation

Projects
None yet
5 participants
@crazycs520
Copy link
Contributor

commented Jun 28, 2019

What problem does this PR solve?

Fix ddl worker run ddl job panic.

[2019/06/28 15:21:54.379 +08:00] [INFO] [ddl_worker.go:474] ["[ddl] run DDL job"] [worker="worker 1, tp general"] [job="ID:48, Type:create table, State:none, SchemaState:none, SchemaID:1, TableID:47, RowCount:0, ArgLen:0, start time: 2019-06-28 15:19:12.653 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2019/06/28 15:21:54.379 +08:00] [ERROR] [ddl.go:441] ["[ddl] DDL worker meet panic"] [worker="worker 1, tp general"] [ID=8c8b0d58-4238-4f21-bd15-6915af71a983] [stack="github.com/pingcap/tidb/ddl.(*ddl).start.func2
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl.go:441
github.com/pingcap/tidb/util.WithRecovery.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/util/misc.go:72
runtime.gopanic
	/usr/local/go1.12/src/runtime/panic.go:522
runtime.panicmem
	/usr/local/go1.12/src/runtime/panic.go:82
runtime.sigpanic
	/usr/local/go1.12/src/runtime/signal_unix.go:390
github.com/pingcap/tidb/infoschema.(*Handle).Get
	/Users/cs/code/goread/src/github.com/pingcap/tidb/infoschema/infoschema.go:302
github.com/pingcap/tidb/ddl.onCreateTable
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/table.go:56
github.com/pingcap/tidb/ddl.(*worker).runDDLJob
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:499
github.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:402
github.com/pingcap/tidb/kv.RunInNewTxn
	/Users/cs/code/goread/src/github.com/pingcap/tidb/kv/txn.go:50
github.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:365
github.com/pingcap/tidb/ddl.(*worker).start
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:137
github.com/pingcap/tidb/ddl.(*ddl).start.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl.go:438
github.com/pingcap/tidb/util.WithRecovery
	/Users/cs/code/goread/src/github.com/pingcap/tidb/util/misc.go:80"]
[2019/06/28 15:21:54.380 +08:00] [ERROR] [misc.go:75] ["panic in the recoverable goroutine"] [r="\"invalid memory address or nil pointer dereference\""] ["stack trace"="github.com/pingcap/tidb/util.WithRecovery.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/util/misc.go:77
runtime.gopanic
	/usr/local/go1.12/src/runtime/panic.go:522
runtime.panicmem
	/usr/local/go1.12/src/runtime/panic.go:82
runtime.sigpanic
	/usr/local/go1.12/src/runtime/signal_unix.go:390
github.com/pingcap/tidb/infoschema.(*Handle).Get
	/Users/cs/code/goread/src/github.com/pingcap/tidb/infoschema/infoschema.go:302
github.com/pingcap/tidb/ddl.onCreateTable
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/table.go:56
github.com/pingcap/tidb/ddl.(*worker).runDDLJob
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:499
github.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:402
github.com/pingcap/tidb/kv.RunInNewTxn
	/Users/cs/code/goread/src/github.com/pingcap/tidb/kv/txn.go:50
github.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:365
github.com/pingcap/tidb/ddl.(*worker).start
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:137
github.com/pingcap/tidb/ddl.(*ddl).start.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl.go:438
github.com/pingcap/tidb/util.WithRecovery
	/Users/cs/code/goread/src/github.com/pingcap/tidb/util/misc.go:80"] [stack="github.com/pingcap/tidb/util.WithRecovery.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/util/misc.go:75
runtime.gopanic
	/usr/local/go1.12/src/runtime/panic.go:522
runtime.panicmem
	/usr/local/go1.12/src/runtime/panic.go:82
runtime.sigpanic
	/usr/local/go1.12/src/runtime/signal_unix.go:390
github.com/pingcap/tidb/infoschema.(*Handle).Get
	/Users/cs/code/goread/src/github.com/pingcap/tidb/infoschema/infoschema.go:302
github.com/pingcap/tidb/ddl.onCreateTable
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/table.go:56
github.com/pingcap/tidb/ddl.(*worker).runDDLJob
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:499
github.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:402
github.com/pingcap/tidb/kv.RunInNewTxn
	/Users/cs/code/goread/src/github.com/pingcap/tidb/kv/txn.go:50
github.com/pingcap/tidb/ddl.(*worker).handleDDLJobQueue
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:365
github.com/pingcap/tidb/ddl.(*worker).start
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl_worker.go:137
github.com/pingcap/tidb/ddl.(*ddl).start.func1
	/Users/cs/code/goread/src/github.com/pingcap/tidb/ddl/ddl.go:438
github.com/pingcap/tidb/util.WithRecovery
	/Users/cs/code/goread/src/github.com/pingcap/tidb/util/misc.go:80"]

What is changed and how it works?

  • Add a recover in runDDLJob function.
  • check nil point.

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change

Side effects

Related changes

  • Need to cherry-pick to the release branch
@codecov

This comment has been minimized.

Copy link

commented Jun 28, 2019

Codecov Report

Merging #10981 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #10981   +/-   ##
===========================================
  Coverage   81.0665%   81.0665%           
===========================================
  Files           421        421           
  Lines         89577      89577           
===========================================
  Hits          72617      72617           
  Misses        11725      11725           
  Partials       5235       5235
Show resolved Hide resolved ddl/table.go Outdated
Show resolved Hide resolved ddl/serial_test.go Outdated
Show resolved Hide resolved ddl/ddl_worker.go Outdated
Show resolved Hide resolved ddl/serial_test.go Outdated

crazycs520 added some commits Jun 28, 2019

@@ -471,6 +472,18 @@ func chooseLeaseTime(t, max time.Duration) time.Duration {

// runDDLJob runs a DDL job. It returns the current schema version in this transaction and the error.
func (w *worker) runDDLJob(d *ddlCtx, t *meta.Meta, job *model.Job) (ver int64, err error) {
defer func() {
r := recover()

This comment has been minimized.

Copy link
@zimulala

zimulala Jun 28, 2019

Member

Could we use WithRecovery

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Jul 1, 2019

Author Contributor

done.

logutil.BgLogger().Error("run ddl job panic",
zap.Reflect("r", r),
zap.Stack("stack trace"))
job.State = model.JobStateCancelling

This comment has been minimized.

Copy link
@winkyao

winkyao Jun 28, 2019

Member

add some comments, why cancel the job here.

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Jul 1, 2019

Author Contributor

done.

@zimulala
Copy link
Member

left a comment

LGTM

@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

/run-all-tests

@winkyao
Copy link
Member

left a comment

LGTM

@bb7133

This comment has been minimized.

Copy link
Contributor

commented Jul 1, 2019

LGTM

@bb7133 bb7133 added the status/LGT2 label Jul 1, 2019

@crazycs520 crazycs520 added status/LGT3 and removed status/LGT2 labels Jul 2, 2019

@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented Jul 2, 2019

/run-all-tests

@crazycs520

This comment has been minimized.

Copy link
Contributor Author

commented Jul 2, 2019

/run-unit-test

@winkyao

winkyao approved these changes Jul 2, 2019

@winkyao winkyao merged commit 6542e4a into pingcap:master Jul 2, 2019

14 checks passed

ci/circleci Your tests passed on CircleCI!
Details
idc-jenkins-ci-tidb/build Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/build_check_race Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/check_dev Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/check_dev_2 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/common-test job succeeded
Details
idc-jenkins-ci-tidb/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/integration-compatibility-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/integration-ddl-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/mybatis-test job succeeded
Details
idc-jenkins-ci-tidb/sqllogic-test-1 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/sqllogic-test-2 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/unit-test Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details

crazycs520 added a commit to crazycs520/tidb that referenced this pull request Jul 2, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.