Will handling responses this way cause all subsequent requests to be mishandled? #2918

Closed
AlexanderJLiu opened this issue May 29, 2024 · 1 comment

@AlexanderJLiu
sarama/broker.go

Lines 1164 to 1171 in 3fad210

for response := range b.responses {
	if dead != nil {
		// This was previously incremented in send() and
		// we are not calling updateIncomingCommunicationMetrics()
		b.addRequestInFlightMetrics(-1)
		response.handle(nil, dead)
		continue
	}

In the code above, once handling a response fails on the connection, none of the subsequent responses are handled normally: `dead` is assigned the error and is never reset to nil, so every later response is failed with that same error.

Why is it designed this way, or is my understanding incorrect?

@AlexanderJLiu

FROM: Evan Huus

Make dead brokers die harder

When a broker gets an error trying to receive a response (either from the
network layer, or from failing to parse the minimal global header), it should
just abandon ship and die. Save that error and return it immediately for any
further requests we might have made.

- The vast majority of the time the connection is going to be hosed anyways, if
nothing else by being out-of-sync on correlation IDs (which we don't handle
and which doesn't seem particularly urgent).
- All of Sarama's built-in callers (producer/consumer/offset-manager)
immediately `Close` a broker when they receive one of these errors anyways, so
all this does is speed up that in the common case.
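
The behavior described above can be illustrated with a minimal, self-contained sketch. It is not Sarama's code: `pending`, `readFunc`, `drain`, `errBroken`, and `failOnSecond` are hypothetical names invented for this example, and only the role of the `dead` variable mirrors the quoted loop. Once one read fails, the error is saved and every remaining in-flight response is failed with it, without reading from the connection again.

```go
package main

import (
	"errors"
	"fmt"
)

// pending is a hypothetical stand-in for one in-flight response.
type pending struct {
	id int
}

// readFunc is a hypothetical stand-in for reading one response off the wire.
type readFunc func(p pending) error

// drain mimics the shape of the loop quoted in the issue: after the first
// read error, the error is kept in dead and every later pending response
// is failed with it, without calling read again.
func drain(queue []pending, read readFunc) []error {
	results := make([]error, 0, len(queue))
	var dead error
	for _, p := range queue {
		if dead != nil {
			// The connection is considered unusable: fail fast.
			results = append(results, dead)
			continue
		}
		if err := read(p); err != nil {
			dead = err // sticky: never reset to nil
			results = append(results, err)
			continue
		}
		results = append(results, nil)
	}
	return results
}

var errBroken = errors.New("connection broken")

// failOnSecond simulates a connection whose second read fails.
// It would succeed for id 3, but drain never asks it to.
func failOnSecond(p pending) error {
	if p.id == 2 {
		return errBroken
	}
	return nil
}

func main() {
	results := drain([]pending{{id: 1}, {id: 2}, {id: 3}}, failOnSecond)
	for i, err := range results {
		fmt.Printf("response %d: %v\n", i+1, err)
	}
}
```

In this sketch the third response is reported as failed even though reading it in isolation would have succeeded, which is the trade-off the commit message accepts: after a read or header-parse failure the stream can no longer be trusted to line up with the pending requests.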

*If* one of these errors is recoverable, and *if* there is user-space code
somewhere which actually tries to recover in one of those cases, then that code
would break.
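
For caller code, one way to stay compatible with this behavior is to treat such an error as fatal for that connection: close it and continue on a fresh one rather than retrying on the same object. The sketch below only illustrates that pattern under stated assumptions. `conn`, `fakeConn`, `errDead`, and `sendWithReconnect` are hypothetical and are not part of Sarama's API, and the single retry is an arbitrary choice for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical, minimal view of a broker connection.
type conn interface {
	Send(req string) (string, error)
	Close() error
}

// fakeConn is a test double: once broken, it fails every request.
type fakeConn struct {
	broken bool
	closed bool
}

var errDead = errors.New("broker is dead")

func (c *fakeConn) Send(req string) (string, error) {
	if c.broken {
		return "", errDead
	}
	return "ok:" + req, nil
}

func (c *fakeConn) Close() error {
	c.closed = true
	return nil
}

// sendWithReconnect sends req on c. On any error it closes c and retries
// exactly once on a fresh connection obtained from dial. It returns the
// connection the caller should keep using afterwards.
func sendWithReconnect(c conn, dial func() conn, req string) (conn, string, error) {
	resp, err := c.Send(req)
	if err == nil {
		return c, resp, nil
	}
	_ = c.Close() // the old connection is abandoned, not reused
	fresh := dial()
	resp, err = fresh.Send(req)
	return fresh, resp, err
}

func main() {
	old := &fakeConn{broken: true}
	dial := func() conn { return &fakeConn{} }
	c, resp, err := sendWithReconnect(old, dial, "metadata")
	fmt.Println(resp, err, old.closed, c != conn(old))
}
```

Code that instead keeps calling the same dead connection and expects a later request to succeed is the kind of user-space recovery this change would break.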

This neatly satisfies one of the XXX comments I left in about this issue from
way back in 2013. The TODOs about correlation ID matching are still present.
