Fix logic of command for sendError #622

wolfstudy · 2021-09-28T03:56:05Z

Motivation

As shown in the figure above, the ServerError returned by the broker shows up as UnknownError by the time the client receives it. The cause is that we were handling the wrong command here: we should handle CommandSendError instead of CommandError. Correspondingly, we should operate on the listener map that caches the producers instead of the pendingRequest map.

Modifications

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

freeznet

LGTM

wolfstudy · 2021-09-28T05:46:24Z

--- FAIL: TestNamespaceTopicsNamespaceDoesNotExit (54.17s)
    client_impl_test.go:387: 
        	Error Trace:	client_impl_test.go:387
        	Error:      	Expected nil, but got: &errors.errorString{s:"server error: AuthorizationError: Exception occurred while trying to authorize GetTopicsOfNamespace"}
        	Test:       	TestNamespaceTopicsNamespaceDoesNotExit

cckellogg · 2021-09-28T14:00:53Z

pulsar/internal/connection.go

 	case pb.ServerError_TopicTerminatedError:
-		request, ok := c.deletePendingRequest(requestID)
+		_, ok := c.deletePendingProducers(producerID)


I thought all commands sent to the broker have a request id? Do we still need to clean those up from the pending request queue?

As identifiers, CommandSendError carries only ProducerId and SequenceId; there is no request ID:

    type CommandSendError struct {
        ProducerId           *uint64      `protobuf:"varint,1,req,name=producer_id,json=producerId" json:"producer_id,omitempty"`
        SequenceId           *uint64      `protobuf:"varint,2,req,name=sequence_id,json=sequenceId" json:"sequence_id,omitempty"`
        Error                *ServerError `protobuf:"varint,3,req,name=error,enum=pulsar.proto.ServerError" json:"error,omitempty"`
        Message              *string      `protobuf:"bytes,4,req,name=message" json:"message,omitempty"`
        XXX_NoUnkeyedLiteral struct{}     `json:"-"`
        XXX_unrecognized     []byte       `json:"-"`
        XXX_sizecache        int32        `json:"-"`
    }

In fact, we only need to deal with the map of listeners responsible for managing the producer objects.
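To make the point above concrete, here is a minimal sketch of routing a send error by producer ID alone. It is not the client's actual code: `conn`, `sendErrorListener`, `handleSendError`, and `recordingProducer` are hypothetical names invented for illustration, and the real listener interface in `pulsar/internal` may look different. Only the error message format is taken from the diff in this pull request.

```go
package main

import (
	"fmt"
	"sync"
)

// sendErrorListener is a hypothetical stand-in for the producer-side listener.
type sendErrorListener interface {
	OnSendError(sequenceID uint64, err error)
}

// conn is a simplified, hypothetical connection that holds only the
// producer listener map, keyed by producer ID.
type conn struct {
	mu        sync.Mutex
	listeners map[uint64]sendErrorListener
}

// handleSendError routes the error using the producer ID and sequence ID,
// the only identifiers CommandSendError carries. It reports whether a
// producer was found and notified.
func (c *conn) handleSendError(producerID, sequenceID uint64, serverErr, msg string) bool {
	c.mu.Lock()
	l, ok := c.listeners[producerID]
	c.mu.Unlock()
	if !ok {
		return false
	}
	l.OnSendError(sequenceID, fmt.Errorf("server error: %s: %s", serverErr, msg))
	return true
}

// recordingProducer collects the errors it receives, for demonstration only.
type recordingProducer struct{ errs []error }

func (p *recordingProducer) OnSendError(_ uint64, err error) { p.errs = append(p.errs, err) }

func main() {
	p := &recordingProducer{}
	c := &conn{listeners: map[uint64]sendErrorListener{7: p}}

	fmt.Println(c.handleSendError(7, 42, "TopicTerminatedError", "topic was terminated"))
	fmt.Println(p.errs[0])
	fmt.Println(c.handleSendError(8, 1, "UnknownError", "no such producer"))
}
```

In this sketch the pending request map is never touched, which is exactly the property the rest of this thread questions.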

I see. What I'm still confused about is this: each request sent on the connection can get added to the pending request map, right?

https://github.com/apache/pulsar-client-go/blob/master/pulsar/internal/connection.go#L620

    func (c *connection) internalSendRequest(req *request) {
        if c.closed() {
            c.log.Warnf("internalSendRequest failed for connectionClosed")
            if req.callback != nil {
                req.callback(req.cmd, ErrConnectionClosed)
            }
        } else {
            c.pendingLock.Lock()
            if req.id != nil {
                c.pendingReqs[*req.id] = req
            }
            c.pendingLock.Unlock()
            c.writeCommand(req.cmd)
        }
    }

If a command is sent and added to the pending requests map, and we then get pb.ServerError_TopicTerminatedError back from the broker, will we end up leaking commands in the pending requests map? If there is no request id, maybe it can't be avoided. Am I missing something?
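The concern can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: `conn`, `request`, `send`, `handleSendError`, `failAllPending`, and `errConnectionClosed` are simplified stand-ins, not the client's real types, and `failAllPending` is only one possible cleanup, not something this pull request implements.

```go
package main

import (
	"errors"
	"fmt"
)

// errConnectionClosed is a hypothetical stand-in for the client's
// connection-closed error.
var errConnectionClosed = errors.New("connection closed")

// request is a simplified pending request; id is nil for commands
// that carry no request ID.
type request struct {
	id       *uint64
	callback func(err error)
}

// conn is a hypothetical connection that tracks pending requests by ID.
type conn struct {
	pendingReqs map[uint64]*request
}

// send registers the request only when it has an ID, mirroring the
// shape of internalSendRequest quoted above.
func (c *conn) send(req *request) {
	if req.id != nil {
		c.pendingReqs[*req.id] = req
	}
}

// handleSendError receives a producer ID but no request ID, so it has no
// key into pendingReqs. It returns how many entries remain untouched.
func (c *conn) handleSendError(producerID uint64) int {
	_ = producerID // cannot be used to look up a pending request
	return len(c.pendingReqs)
}

// failAllPending is one hypothetical cleanup: fail every pending request,
// for example when the connection is closed. It returns how many it failed.
func (c *conn) failAllPending(err error) int {
	n := 0
	for id, req := range c.pendingReqs {
		delete(c.pendingReqs, id)
		if req.callback != nil {
			req.callback(err)
		}
		n++
	}
	return n
}

func main() {
	c := &conn{pendingReqs: map[uint64]*request{}}
	id := uint64(1)
	c.send(&request{id: &id, callback: func(err error) { fmt.Println("request failed:", err) }})

	fmt.Println("pending after send error:", c.handleSendError(7))
	fmt.Println("failed on cleanup:", c.failAllPending(errConnectionClosed))
	fmt.Println("pending after cleanup:", len(c.pendingReqs))
}
```

The sketch suggests why the two maps cannot be reconciled from a send error alone: without a request ID, the only way to drain the pending map is a coarse cleanup that affects every outstanding request on the connection.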

Yes, I agree.

First: here we really should handle the SendError command, not the Error command. That much is settled. But SendError does not include a requestID.

Second: the requestID has to come from the Protobuf protocol, and processing pendingRequest relies on it, so I am also a bit confused now. After receiving SendError, what should we do here?

cckellogg · 2021-09-28T15:52:46Z

pulsar/internal/connection.go

 	default:
 		// By default, for transient error, let the reconnection logic
 		// to take place and re-establish the produce again
-		c.Close()


Why don't we need to close the connection here anymore?

Following the discussion above: if we need to clean the pendingRequest cache, then we had better close the connection here. If we only need to clean up the listener map of producers, then triggering the reconnection logic should be enough, because other producers may still be using this connection.
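The trade-off between the two options can be sketched as follows. This is an illustration under stated assumptions, not the client's implementation: `conn`, `listener`, `producer`, `closeConn`, and `reconnectOne` are hypothetical names, and the sketch only models who gets told to reconnect.

```go
package main

import "fmt"

// listener is a hypothetical stand-in for a per-producer connection listener.
type listener interface{ ConnectionClosed() }

// producer counts how often it was asked to reconnect.
type producer struct {
	name       string
	reconnects int
}

func (p *producer) ConnectionClosed() { p.reconnects++ }

// conn is a hypothetical connection shared by several producers.
type conn struct{ listeners map[uint64]listener }

// closeConn models closing the shared connection: every registered
// producer is removed and told to reconnect.
func (c *conn) closeConn() {
	for id, l := range c.listeners {
		delete(c.listeners, id)
		l.ConnectionClosed()
	}
}

// reconnectOne models the narrower option: only the producer named in the
// send error is removed and told to reconnect.
func (c *conn) reconnectOne(producerID uint64) bool {
	l, ok := c.listeners[producerID]
	if !ok {
		return false
	}
	delete(c.listeners, producerID)
	l.ConnectionClosed()
	return true
}

func main() {
	a, b := &producer{name: "a"}, &producer{name: "b"}

	narrow := &conn{listeners: map[uint64]listener{1: a, 2: b}}
	narrow.reconnectOne(1)
	fmt.Println("narrow:", a.reconnects, b.reconnects)

	broad := &conn{listeners: map[uint64]listener{1: a, 2: b}}
	broad.closeConn()
	fmt.Println("broad:", a.reconnects, b.reconnects)
}
```

Closing the connection disturbs producers that did not see an error, while the narrower option leaves them alone but does nothing for any pending requests.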

I think the java client closes the connection in this case? What does it do for the other cases above like pb.ServerError_TopicTerminatedError?

OK, will fix this.

cckellogg · 2021-09-28T15:54:30Z

pulsar/internal/connection.go

 			return
 		}

-		errMsg := fmt.Sprintf("server error: %s: %s", cmdError.GetError(), cmdError.GetMessage())
-		request.callback(nil, errors.New(errMsg))


Does the producer still need to be notified somehow?

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

Fixes apache#623

### Motivation

As apache#623 said, when the topic is forcibly deleted, we should give up reconnecting instead of continuing to retry.

### Modifications

- Fix producer reconnection logic
- Fix consumer reconnection logic
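The behavior described in that commit message, giving up once the topic is gone, could look roughly like the sketch below. It is not the code from the commit: `reconnect`, `errTopicGone`, and `errTransient` are hypothetical names, and how the real client detects a deleted topic is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors, introduced only for this illustration.
var (
	errTopicGone = errors.New("topic no longer exists")
	errTransient = errors.New("transient failure")
)

// reconnect calls attempt up to maxAttempts times. It stops early on
// success, and also stops early when the error indicates the topic is
// gone, since retrying cannot succeed. It returns the number of attempts
// made and the last error.
func reconnect(maxAttempts int, attempt func() error) (int, error) {
	var err error
	for i := 1; i <= maxAttempts; i++ {
		err = attempt()
		if err == nil {
			return i, nil
		}
		if errors.Is(err, errTopicGone) {
			return i, err // terminal: give up instead of retrying
		}
	}
	return maxAttempts, err
}

func main() {
	// Terminal error: the loop gives up after the first attempt.
	n, err := reconnect(5, func() error {
		return fmt.Errorf("lookup failed: %w", errTopicGone)
	})
	fmt.Println(n, err)

	// Transient errors: the loop keeps retrying until an attempt succeeds.
	calls := 0
	n, err = reconnect(5, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(n, err)
}
```

Using `errors.Is` lets the check see through wrapped errors, so the caller can add context without defeating the early exit.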

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

wolfstudy · 2021-10-11T03:20:06Z

As @cckellogg said, the current approach may leak pendingRequest entries. We will merge the current pull request first and track the remaining problem in a new issue: #636.

wolfstudy added 2 commits September 27, 2021 16:59

Fix logic of command for sendError

63488f3

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

fix a little

e948c3b

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

wolfstudy added this to the 0.7.0 milestone Sep 28, 2021

wolfstudy requested review from merlimat and cckellogg September 28, 2021 03:56

wolfstudy self-assigned this Sep 28, 2021

wolfstudy requested review from zymap, jiazhai, sijie and freeznet September 28, 2021 03:56

zymap approved these changes Sep 28, 2021

freeznet approved these changes Sep 28, 2021

cckellogg reviewed Sep 28, 2021

Merge branch 'master' into xiaolong/fix-send-error-cmd

4488019

cckellogg reviewed Sep 28, 2021

wolfstudy and others added 5 commits September 29, 2021 17:47

Fix logic of command for sendError

039c879

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

fix a little

01ee360

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

fix comments

029b536

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

fix comments

a8ef72b

Signed-off-by: xiaolongran <xiaolongran@tencent.com>

wolfstudy mentioned this pull request Oct 11, 2021

[Bug] The pendingRequest resource may leak #636

Open

wolfstudy merged commit 791d342 into apache:master Oct 11, 2021

sijie mentioned this pull request Oct 11, 2021

ISSUE-636: [Bug] The pendingRequest resource may leak streamnative/pulsar-client-go#229

Open
