Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: add logs when connection count exceeds configured count #12559

Open
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

showuon
Copy link
Contributor

@showuon showuon commented Aug 26, 2022

Currently, when client tried to connect to broker, and got throttled or exceeding connection count (waiting for available connection slot), there will be no logs output in broker. All the client can see is connection timeout exception. This PR adds logs for this case.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Comment on lines 1568 to 1578
private def connectionSlotAvailable(listenerName: ListenerName): Boolean = {
if (listenerCounts(listenerName) >= maxListenerConnections(listenerName))
if (listenerCounts(listenerName) >= maxListenerConnections(listenerName)) {
warn(s"The connection count ${listenerCounts(listenerName)} for listener:$listenerName exceeds max listener connection count ${maxListenerConnections(listenerName)}")
false
else if (protectedListener(listenerName))
} else if (protectedListener(listenerName)) {
true
else
totalCount < brokerMaxConnections
} else if (totalCount >= brokerMaxConnections) {
warn(s"The total connection count $totalCount exceeds broker max connection count $brokerMaxConnections")
false
} else true
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adds log for different cases without available connection slot

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be info or debug level logs? Saying this because a rogue client will get throttled but still have a potential to overwhelm the server with filled up logs. Also, warn is usually used to indicate an error which the system has recovered/ could recover from but needs attention, whereas, getting throttled is a client side error in the same bucket as HTTP 4xx series.

What do you think?

Comment on lines 1548 to 1556
do {
if (remainingThrottleTimeMs == 0) {
warn(s"The connection will wait until there's an available connection slot due to current connection count exceeds configured count.")
} else {
info(s"The connection will be delayed for for ${remainingThrottleTimeMs}ms due to the connection rate exceeds configured rate.")
}
counts.wait(remainingThrottleTimeMs)
remainingThrottleTimeMs = math.max(endThrottleTimeMs - time.milliseconds, 0)
} while (remainingThrottleTimeMs > 0 || !connectionSlotAvailable(listenerName))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if remainingThrottleTimeMs == 0, it means there's no available connection slot, and we'll wait forever until there's connection terminated(Object.wait(0) means wait until notified[1]).

[1] https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#wait-long-

If timeout is zero, however, then real time is not taken into consideration and the thread simply waits until notified.

@showuon
Copy link
Contributor Author

showuon commented Aug 26, 2022

@mimaison @rajinisivaram , please take a look. Thanks.

@dajac
Copy link
Contributor

dajac commented Aug 26, 2022

I am not sure about it. In highly loaded system with a lot of connections, this will result in flooding the logs. This is even worst if there are rogue clients. It would be better to rely on metrics to diagnose such issues. I don't remember, do we have metrics for this?

@showuon
Copy link
Contributor Author

showuon commented Aug 29, 2022

@dajac, thanks for the comment. Yes, I agree with you that this could result in flooding the logs in highly loaded system. But still, if the client got blocked due to exceeding max connection limit, it's bad that there's no log in broker side. Expose the info via metrics is a good suggestion. At quick look, I can't find we have any metrics recording this info. I'll check it again later, if no, I'm going to propose a KIP for it.

But back to this PR, it is still not good when this issue happened, but broker keeps silent, even if we have metric's help. Maybe we can log with DEBUG level like @divijvaidya 's suggestion, and user can see this log when troubleshooting issues. WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants