Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-9893: Configurable TCP connection timeout and improve the initial metadata fetch #8683
KAFKA-9893: Configurable TCP connection timeout and improve the initial metadata fetch #8683
Changes from all commits
f2025d7
b795a82
57661e5
d19e5d7
b9118c9
06bdd5d
6daba41
22b1b4c
de0c5af
b1a62f8
9829579
27fa1cf
3e92a0f
3a945cf
36c9cdd
a160400
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we keep this method to perform reset (perhaps rename the method) and include all types of reset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including all types of reset together is probably not a good choice because the reset of failed attempts and the reset of the connection timeout may happen in different places.
However, I agree we should have some abstraction on the update and reset logic. I'll put the logic in new class methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we ensure that
nodeState
is notnull
here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The caller will ensure that the node is in the connecting state. I'll add an IllegalStateException here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather prefer to handle this like we did in
lastConnectAttemptMs
in order to remain consistent. IfnodeState
isnull
, we can return0
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When
NetworkClient
initializes a connection to a given node (NetworkClient::initiateConnect
), it's guaranteed that thenodeState
will get initialized and won't benull
. I think it's probably not reasonable if the caller wants to get the connection timeout of a given node before the connection initialization, which is the reason I prevent this kind of calling by throwing the exception.However, it might be reasonable for a caller to get the
lastConnectAttemptMs
before initializing the connection. For example, the node provider wants to provide a node with the least recent connection attempt. For those nodes haven't been connected yet, theirNodeConnectionState
does not exist. However, this implies that the node has the highest priority and we may assume theirlastConnectAttemptMs
is 0.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification. In this case, let's reuse the
nodeState
method which check null and throws anIllegalStateException
as you do here. We may be able to use it inisConnectionSetupTimeout
as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Now
isConnectionSetupTimeout
is also using this checker.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we also check that the node is in connecting state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to check if the node is in connecting state because the caller is only applying this test to all the nodes in the connecting state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is indeed true today as the caller only calls with nodes in
connectingNodes
but that may not be true forever. I would add the check as suggested by Rajini here to make the implementation safe.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. Checker added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guozhangwang Thanks for reporting the exception in this code.
@d8tltanc @dajac This code segment is unsafe, we are removing
node
fromconnectingNodes
in pprocessDisconnection()
while iterating over the set. We must be missing a test too (or we have a test with only one connection).