Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: `kill tidb [session id]` can't stop executors and release resources quickly #9844

Merged
merged 16 commits into from Apr 1, 2019

Conversation

@qw4990
Copy link
Contributor

commented Mar 21, 2019

What problem does this PR solve?

fix #9831

What is changed and how it works?

Introduce execCtxWatcher to watch the context specified in Open and stop the wrapped executor when this context is cancelled.

Check List

Tests

  • Unit test
executor/adapter.go Outdated Show resolved Hide resolved
@qw4990 qw4990 force-pushed the qw4990:fix_9831 branch from be07b43 to 94e64e5 Mar 21, 2019
@codecov

This comment has been minimized.

Copy link

commented Mar 21, 2019

Codecov Report

Merging #9844 into master will increase coverage by 0.9053%.
The diff coverage is 25%.

@@               Coverage Diff                @@
##             master      #9844        +/-   ##
================================================
+ Coverage   77.2161%   78.1214%   +0.9053%     
================================================
  Files           405        405                
  Lines         81641      85330      +3689     
================================================
+ Hits          63040      66661      +3621     
+ Misses        13931      13861        -70     
- Partials       4670       4808       +138
@lysu

This comment has been minimized.

Copy link
Member

commented Mar 22, 2019

@qw4990 maybe we can pull this watcher goroutine "up" to connection level~?

for current code, it will create a new goroutine for each stmt but maybe better just create one watcher goroutine per connection?

ctx1, cancelFunc := context.WithCancel(ctx)
seems be little stranger, does we just cancel a "connection ctx" and then close "current Executor" is enough? and no need to withCancel for each request?

cc @tiancaiamao

@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Mar 24, 2019

Why we need a additional watcher to do that... each executor should handle its ctx.Done, if not, there must be something wrong.

@qw4990

This comment has been minimized.

Copy link
Contributor Author

commented Mar 25, 2019

@tiancaiamao Each executor can also handle this ctx.Done too, which is not conflicted with this global watcher as long as it's Close is reentrant.
But using a global watcher may be clearer and safer, since it's tedious to make all executors handle the ctx.Done and there is always a risk for forgetting to do it when adding new executors.

executor/adapter.go Outdated Show resolved Hide resolved
executor/adapter.go Outdated Show resolved Hide resolved
@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Mar 25, 2019

as long as it's Close is reentrant

I doubt this assumption, could we close the executor multiple times?

But using a global watcher may be clearer and safer

And poor performance if we add a goroutine for each statement

@qw4990

This comment has been minimized.

Copy link
Contributor Author

commented Mar 25, 2019

as long as it's Close is reentrant

I doubt this assumption, could we close the executor multiple times?

But using a global watcher may be clearer and safer

And poor performance if we add a goroutine for each statement

But if all executors watch the context by themselves, there will be more extra goroutines than this one.

So how about we don't let other executors watch it and just use this global goroutine to control them all.

@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Mar 25, 2019

The watcher goroutine is not the key point.
We can let the kill tidb [session id] connection goroutine to serve as the watcher and call the executor.Close method (to kill another session).

The fundamental problem is that CAN WE CALL executor.Close ANY TIME WHILE THE EXECUTOR IS RUNNING...

In the old protocol, executor handle the ctx.Done() is optional.
It means that when a executor doesn't obey the rule, the worst case is just as the title says: "kill tidb [session id] can't stop executors and release resources quickly".
After the change, I'm not sure will executor be close twice, will there be data race when two kill request arrive at the same time.

If executor.Close is reentrant, I think your way works. @qw4990
what's your opinion @zz-jason

@zz-jason

This comment has been minimized.

Copy link
Member

commented Mar 25, 2019

I'm afraid that executor.Close() is not thread-safe. Data race may happen when it is called by two session concurrently.

Close() an executor multiple times in a single thread is safe I think. Besides, the way @qw4990 used can prevent the executor being closed more than once.

@zz-jason

This comment has been minimized.

Copy link
Member

commented Mar 25, 2019

CAN WE CALL executor.Close ANY TIME WHILE THE EXECUTOR IS RUNNING

Sure, we can

@qw4990

This comment has been minimized.

Copy link
Contributor Author

commented Mar 25, 2019

We can let the kill tidb [session id] connection goroutine to serve as the watcher and call the executor.Close method (to kill another session).

It's hard to get other sessions' resources like executors in the kill tidb [session id] connection goroutine.
And it may be danger to manipulate a session's resource directly in another session's goroutine.
Since we have a context, just cancelling it and letting its session release resources by itself is more safe and go-style.
So, a watcher goroutine is essential.

If executor.Close is reentrant

All calls of Close are from top to bottom and the execCtxWatcher is wrapped around the root executor, so it will prevent executors from closing multiple times.

@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Mar 27, 2019

And it may be danger to manipulate a session's resource directly in another session's goroutine.

The watcher is doing the same thing, right?

So, a watcher goroutine is essential.

No, it's easy to set the cancel function (which is changed to executor.Close now) here:

pi.SetProcessInfo(sql, time.Now(), cmd)

And when one session receive the kill statement, it executes the cancelFunc ( actually executor.Close) to close current session's execution.

@qw4990

@qw4990 qw4990 force-pushed the qw4990:fix_9831 branch from f52fa9d to 9325798 Mar 27, 2019
@qw4990 qw4990 requested a review from tiancaiamao Mar 27, 2019
@qw4990

This comment has been minimized.

Copy link
Contributor Author

commented Mar 27, 2019

According to your comments, the way to solve this problem has been changed.
Now it's clearer and lighter and no the extra goroutine.
PTAL~
@lysu @zz-jason @tiancaiamao @XuHuaiyu

server/server.go Outdated Show resolved Hide resolved
@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Rest LGTM
PTAL @zz-jason

server/server.go Show resolved Hide resolved
server/server.go Show resolved Hide resolved
server/conn.go Show resolved Hide resolved
Copy link
Member

left a comment

LGTM

@zz-jason zz-jason added the status/LGT2 label Apr 1, 2019
@zz-jason

This comment has been minimized.

Copy link
Member

commented Apr 1, 2019

/run-all-tests

@zz-jason zz-jason merged commit 77e91d1 into pingcap:master Apr 1, 2019
14 checks passed
14 checks passed
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
idc-jenkins-ci-tidb/build Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/check_dev Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/code_coverage Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/common-test job succeeded
Details
idc-jenkins-ci-tidb/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/integration-compatibility-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/integration-ddl-test Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/mybatis-test job succeeded
Details
idc-jenkins-ci-tidb/sqllogic-test-1 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/sqllogic-test-2 Jenkins job succeeded.
Details
idc-jenkins-ci-tidb/unit-test Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details
@zz-jason

This comment has been minimized.

Copy link
Member

commented Apr 1, 2019

@qw4990 please cherry pick this PR to release 2.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.