Handle unreliable gocql close function stuck issue #1803

Merged
merged 2 commits into from
Aug 5, 2021

Conversation

wxing1292
Contributor

What changed?

  • Start a new goroutine when closing gocql session

Why?
gocql session.Close is not reliable: it can occasionally deadlock and never return, which blocks the calling goroutine indefinitely. Details below.

  1. Stack trace showing gocql session.Close stuck (blocked on a channel send for 61 minutes):
goroutine 700966 [chan send, 61 minutes]:
github.com/gocql/gocql.(*controlConn).close(0xc0026b8c80)
	/go/pkg/mod/github.com/gocql/gocql@v0.0.0-20210621133426-d83b80dfb480/control.go:488 +0xbb
github.com/gocql/gocql.(*Session).Close(0xc0015c2a80)
	/go/pkg/mod/github.com/gocql/gocql@v0.0.0-20210621133426-d83b80dfb480/session.go:464 +0x15f
go.temporal.io/server/common/persistence/nosql/nosqlplugin/cassandra/gocql.(*session).refresh(0xc001daeb60)
	/go/pkg/mod/go.temporal.io/server@v1.11.1-0.20210801055905-ea5caf1c1ea1/common/persistence/nosql/nosqlplugin/cassandra/gocql/session.go:107 +0x5cd
go.temporal.io/server/common/persistence/nosql/nosqlplugin/cassandra/gocql.(*query).handleError(...)
	/go/pkg/mod/go.temporal.io/server@v1.11.1-0.20210801055905-ea5caf1c1ea1/common/persistence/nosql/nosqlplugin/cassandra/gocql/query.go:133
  2. Call stack that triggered the above deadlock (from caller to callee):
    a. https://github.com/gocql/gocql/blob/d83b80dfb4800e8761487e2a0c86cf6864824a8e/session.go#L464
    b. https://github.com/gocql/gocql/blob/d83b80dfb4800e8761487e2a0c86cf6864824a8e/control.go#L488
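
The change described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `closer` interface, the two fake session types, `closeAsync`, `finishedWithin`, and `probeTimeout` are all hypothetical names introduced here, and the `done` channel is added only so the example can observe what happens. The idea is that `Close` runs in its own goroutine, so a `Close` that never returns can no longer block the caller (for example, a session refresh path).

```go
package main

import (
	"fmt"
	"time"
)

// probeTimeout is an arbitrary wait used only by this example.
const probeTimeout = 200 * time.Millisecond

// closer is a hypothetical stand-in for the one session method this sketch needs.
type closer interface {
	Close()
}

// stuckSession simulates a session whose Close never returns,
// similar in effect to the blocked channel send in the stack trace.
type stuckSession struct{ block chan struct{} }

func (s *stuckSession) Close() { <-s.block } // nothing ever sends or closes block

// healthySession simulates a session whose Close returns immediately.
type healthySession struct{}

func (s *healthySession) Close() {}

// closeAsync starts Close in a new goroutine and returns right away.
// The returned channel is closed once Close returns, if it ever does.
func closeAsync(s closer) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		s.Close()
	}()
	return done
}

// finishedWithin reports whether done was closed before the timeout elapsed.
func finishedWithin(done <-chan struct{}, timeout time.Duration) bool {
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	// The caller is not blocked even though Close is stuck.
	stuck := &stuckSession{block: make(chan struct{})}
	done := closeAsync(stuck)
	fmt.Println("stuck close finished:", finishedWithin(done, probeTimeout))

	healthy := &healthySession{}
	fmt.Println("healthy close finished:", finishedWithin(closeAsync(healthy), probeTimeout))
}
```

The trade-off in this sketch is that a stuck `Close` leaks one goroutine (and whatever the old session holds) instead of stalling the caller; it does not fix the underlying deadlock in the driver.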

How did you test it?
N/A

Potential risks
N/A

Is hotfix candidate?
No

@wxing1292 wxing1292 requested review from yiminc and a team August 5, 2021 21:41
@wxing1292 wxing1292 enabled auto-merge (squash) August 5, 2021 21:43
@wxing1292 wxing1292 merged commit 46a8766 into temporalio:master Aug 5, 2021
@wxing1292 wxing1292 deleted the gocql branch August 5, 2021 22:12
wxing1292 added a commit that referenced this pull request Aug 6, 2021
* Start a new goroutine when closing the gocql session, because gocql session.Close is not reliable and can occasionally deadlock.