Fix write to closed channel panic() in internal/connection during connection close #539

Merged
30 changes: 26 additions & 4 deletions pulsar/internal/connection.go
@@ -89,6 +89,7 @@ type connectionState int32
const (
connectionInit = iota
connectionReady
connectionClosing
connectionClosed
)

@@ -98,6 +99,8 @@ func (s connectionState) String() string {
return "Initializing"
case connectionReady:
return "Ready"
case connectionClosing:
return "Closing"
case connectionClosed:
return "Closed"
default:
@@ -142,6 +145,7 @@ type connection struct {

requestIDGenerator uint64

incomingRequestsWG sync.WaitGroup
incomingRequestsCh chan *request
incomingCmdCh chan *incomingCmd
closeCh chan interface{}
@@ -331,10 +335,15 @@ func (c *connection) waitUntilReady() error {
}

func (c *connection) failLeftRequestsWhenClose() {
// wait for outstanding incoming requests to complete before draining
// and closing the channel
c.incomingRequestsWG.Wait()
Contributor

I'm a little confused: why are we waiting here?

Contributor Author

@cckellogg,
At this point we know that closeCh is closed and that the connection state is either connectionClosing or connectionClosed. We wait so that we can be sure there will be no further writes to incomingRequestsCh before we drain and close it. The sending goroutine may be executing either of the following lines when this function runs:

As mentioned in the motivation for this MR:

When two cases of a select are valid, the case executed is chosen at random; see https://tour.golang.org/concurrency/5

Whenever I see a channel being closed by a reader goroutine, I am suspicious of writers attempting to write to the channel after it is closed.
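To illustrate the select behavior referred to above, here is a minimal standalone sketch. It is not code from this PR: `countSendsAfterClose`, `closeCh`, and `reqCh` are hypothetical names chosen for the example. It shows that a send case can still be picked after the close channel has been closed, as long as the request channel has free buffer space, which is why a state check alone does not rule out a late write.

```go
package main

import "fmt"

// countSendsAfterClose runs a select n times in which both cases are ready:
// closeCh is already closed and reqCh has free buffer space.
// It returns how many times the send case was chosen.
func countSendsAfterClose(n int) int {
	closeCh := make(chan struct{})
	close(closeCh) // a receive on a closed channel is always ready

	sends := 0
	for i := 0; i < n; i++ {
		// A fresh buffered channel per iteration, so the send is always ready too.
		reqCh := make(chan int, 1)
		select {
		case <-closeCh:
			// The close notification won this round.
		case reqCh <- i:
			// The send won even though closeCh was already closed.
			sends++
		}
	}
	return sends
}

func main() {
	n := 10000
	sends := countSendsAfterClose(n)
	fmt.Printf("send case chosen %d of %d times\n", sends, n)
}
```

The exact count varies from run to run because Go selects among ready cases pseudo-randomly, but with this many iterations both cases should be observed.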


reqLen := len(c.incomingRequestsCh)
for i := 0; i < reqLen; i++ {
c.internalSendRequest(<-c.incomingRequestsCh)
}

close(c.incomingRequestsCh)
}

@@ -546,8 +555,13 @@ func (c *connection) Write(data Buffer) {

func (c *connection) SendRequest(requestID uint64, req *pb.BaseCommand,
callback func(command *pb.BaseCommand, err error)) {
if c.getState() == connectionClosed {
c.incomingRequestsWG.Add(1)
Contributor

Why is this WaitGroup needed for both send requests?

Contributor Author

As mentioned in #539 (comment), we need to track all writes to incomingRequestsCh.
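The idea of tracking every writer can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the connection type from this PR: `conn`, `send`, `shutdown`, `errClosed`, and the state constants are hypothetical stand-ins. Every write path registers with the WaitGroup before it checks the state, and the closing side waits for all registered writers before it drains and closes the channel.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// errClosed is a hypothetical stand-in for ErrConnectionClosed.
var errClosed = errors.New("connection closed")

const (
	stateReady int32 = iota
	stateClosing
)

// conn is a hypothetical, stripped-down stand-in for the connection type.
type conn struct {
	state   int32
	wg      sync.WaitGroup // counts writers that may touch reqCh
	reqCh   chan int
	closeCh chan struct{}
}

func newConn() *conn {
	return &conn{reqCh: make(chan int, 8), closeCh: make(chan struct{})}
}

// send registers with the WaitGroup before checking the state, so shutdown
// can wait for any writer that has already passed the state check.
func (c *conn) send(v int) error {
	c.wg.Add(1)
	defer c.wg.Done()

	if atomic.LoadInt32(&c.state) != stateReady {
		return errClosed
	}
	select {
	case <-c.closeCh:
		return errClosed
	case c.reqCh <- v:
		return nil
	}
}

// shutdown marks the connection as closing, broadcasts on closeCh, waits for
// registered writers, then drains and closes reqCh. It returns how many
// queued requests were drained.
func (c *conn) shutdown() int {
	atomic.StoreInt32(&c.state, stateClosing)
	close(c.closeCh)
	c.wg.Wait()

	drained := len(c.reqCh)
	for i := 0; i < drained; i++ {
		<-c.reqCh
	}
	close(c.reqCh)
	return drained
}

func main() {
	c := newConn()
	for i := 0; i < 3; i++ {
		_ = c.send(i)
	}
	drained := c.shutdown()
	err := c.send(99) // rejected by the state check, never touches reqCh
	fmt.Println(drained, err)
}
```

The demo drives the sketch from a single goroutine on purpose: the `sync.WaitGroup` documentation requires that `Add` calls with a positive delta made while the counter is zero happen before `Wait`, so concurrent callers need to respect that ordering.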

defer c.incomingRequestsWG.Done()

state := c.getState()
if state == connectionClosed || state == connectionClosing {
callback(req, ErrConnectionClosed)

} else {
select {
case <-c.closeCh:
@@ -563,7 +577,11 @@ func (c *connection) SendRequest(requestID uint64, req *pb.BaseCommand,
}

func (c *connection) SendRequestNoWait(req *pb.BaseCommand) error {
if c.getState() == connectionClosed {
c.incomingRequestsWG.Add(1)
defer c.incomingRequestsWG.Done()

state := c.getState()
if state == connectionClosed || state == connectionClosing {
return ErrConnectionClosed
}

@@ -586,7 +604,8 @@ func (c *connection) internalSendRequest(req *request) {
c.pendingReqs[*req.id] = req
}
c.pendingLock.Unlock()
if c.getState() == connectionClosed {
state := c.getState()
if state == connectionClosed || state == connectionClosing {
c.log.Warnf("internalSendRequest failed for connectionClosed")
if req.callback != nil {
req.callback(req.cmd, ErrConnectionClosed)
@@ -755,6 +774,8 @@ func (c *connection) UnregisterListener(id uint64) {
// broadcasting the notification on the close channel
func (c *connection) TriggerClose() {
c.closeOnce.Do(func() {
c.setState(connectionClosing)

cnx := c.cnx
if cnx != nil {
cnx.Close()
@@ -775,9 +796,10 @@ func (c *connection) Close() {
}

c.log.Info("Connection closed")
c.TriggerClose()
// do not use changeState() since they share the same lock
c.setState(connectionClosed)
c.TriggerClose()

c.pingTicker.Stop()
c.pingCheckTicker.Stop()
