Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

domain: close slow query channel after closing session pool #7847

Merged
merged 5 commits into from Oct 11, 2018

Conversation

Projects
None yet
6 participants
@tiancaiamao
Copy link
Contributor

commented Oct 9, 2018

What problem does this PR solve?

If slow query channel is closed before session pool, some session's goroutine may still
writing to the channel. Writing to a closed channel would cause TiDB panic.

image

What is changed and how it works?

  • Change the close order, close slow query channel after closing session pool.
  • Drain the channel data before close the channel.
  • Assume that after domain.Close() is called, there will be no more session executing.

Check List

Related changes

  • Need to cherry-pick to the release branch
domain: close slow query channel after closing session pool
If slow query channel is closed before session pool, some session's goroutine may still
writing to the channel. Writing to a closed channel would cause TiDB panic.
@tiancaiamao

This comment has been minimized.

Copy link
Contributor Author

commented Oct 9, 2018

do.sysSessionPool.Close()
do.slowQuery.Close()

This comment has been minimized.

Copy link
@zimulala

zimulala Oct 9, 2018

Member

Is it enough to put “do.slowQuery.Close()” after line462?

This comment has been minimized.

Copy link
@tiancaiamao

tiancaiamao Oct 9, 2018

Author Contributor

do.slowQuery.Close() will make goroutine exit and WaitGroup count -1.
If this line is moved to line 462,do.wg.Wait() would wait forever.

This comment has been minimized.

Copy link
@zimulala

zimulala Oct 9, 2018

Member

Ignore it.

This comment has been minimized.

Copy link
@zimulala

zimulala Oct 9, 2018

Member

But I think it's useless. After this function is executed, "updateStatsWorker" may still be called by another goroutine. This problem still exists.
Maybe we can replace "ctx"(the updateStatsWorker's argument) with the session created in the sysSessionPool.

This comment has been minimized.

Copy link
@tiancaiamao

tiancaiamao Oct 9, 2018

Author Contributor

Do you mean after session pool is closed, updateStatsWorker will be still executing ? @zimulala

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Oct 9, 2018

Contributor

Ye. Because the session used in statistic bootstrap is not get from session pool. @tiancaiamao

This comment has been minimized.

Copy link
@tiancaiamao

tiancaiamao Oct 10, 2018

Author Contributor

Current session pool implementation will block waiting resource to be put back, so after the pool is closed, there won't be system session running. @crazycs520 @zimulala

done := false
for !done {
select {
case <-q.ch:

This comment has been minimized.

Copy link
@winkyao

winkyao Oct 9, 2018

Member

Why do we need to drain the channel?

This comment has been minimized.

Copy link
@tiancaiamao

tiancaiamao Oct 10, 2018

Author Contributor

Removed.

@winkyao
Copy link
Member

left a comment

LGTM


mu struct {
sync.RWMutex
closed bool

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Oct 11, 2018

Contributor

how about use atomic instead of mutex.

This comment has been minimized.

Copy link
@tiancaiamao

tiancaiamao Oct 11, 2018

Author Contributor

atomic can't protect multiple operations. @crazycs520

modify atomic flag
close(ch)

check atomic flag
ch <-

Take this order for example:

  1. check atomic flag = success
  2. modify atomic flag = closed
  3. close(ch)
  4. <-ch panic!

This comment has been minimized.

Copy link
@crazycs520

crazycs520 Oct 11, 2018

Contributor

great~

@crazycs520
Copy link
Contributor

left a comment

LGTM

@crazycs520

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

@zimulala PTAL

@tiancaiamao

This comment has been minimized.

Copy link
Contributor Author

commented Oct 11, 2018

/run-all-tests

@zz-jason
Copy link
Member

left a comment

LGTM

@ngaut ngaut merged commit 1988307 into pingcap:master Oct 11, 2018

4 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jenkins-ci-tidb/build Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details

@tiancaiamao tiancaiamao deleted the tiancaiamao:close-slow-query branch Oct 15, 2018

tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request Oct 20, 2018

ngaut pushed a commit to tiancaiamao/tidb that referenced this pull request Oct 22, 2018

ngaut pushed a commit to tiancaiamao/tidb that referenced this pull request Oct 22, 2018

ngaut pushed a commit to tiancaiamao/tidb that referenced this pull request Oct 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.