-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-16047: Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers #16151
Conversation
Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers as transaction timeout. No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place. Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
@@ -4569,7 +4569,7 @@ public ListTransactionsResult listTransactions(ListTransactionsOptions options) | |||
public FenceProducersResult fenceProducers(Collection<String> transactionalIds, FenceProducersOptions options) { | |||
AdminApiFuture.SimpleAdminApiFuture<CoordinatorKey, ProducerIdAndEpoch> future = | |||
FenceProducersHandler.newFuture(transactionalIds); | |||
FenceProducersHandler handler = new FenceProducersHandler(logContext); | |||
FenceProducersHandler handler = new FenceProducersHandler(logContext, requestTimeoutMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a requestTimeoutMs in options, can you use that if specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as an override to the AdminClient REQUEST_TIMEOUT_MS_CONFIG ?
The options doc says it's on override on API timeout
but yes we could use that too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at it, and the option timeoutMs is used already as an API timeout - ie to calc a deadline when retrying.
The requestTimeoutMs is instead used for a single append, so I'd actually prefer not to mix the two and leave the PR as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @gharris1727 I'd appreciate if you can take a 2nd look or approve as IMHO without this small fix, exactly once MM2/connect is unusable in production, so I'd like to have this in 3.8.0 and 3.7.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the AdminClient is no longer waiting for the fence request to complete, does it make sense for the underlying append to continue?
If the AdminClient is willing to wait longer for the fence request to complete, should the underlying append be aborted early?
If someone wants to raise the timeout for this one operation, I don't think that we should require them to increase the client-global request.timeout.ms.
I agree to that. Hopefully the commit addresses your concerns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @edoardocomar and @prestona for the fix, and @akaltsikis for finding this originally!
Jenkins PR Jdk8 connect failure retested manually
|
…ers (#16151) Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers, or options.timeoutMs if specified, as transaction timeout. No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place. Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
…ers (#16151) Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers, or options.timeoutMs if specified, as transaction timeout. No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place. Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
merged in trunk, cherry-picked to 3.7.1 and 3.8.0. thanks @gharris1727 |
…ers (apache#16151) Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers, or options.timeoutMs if specified, as transaction timeout. No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place. Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
I think this also fixes https://issues.apache.org/jira/browse/KAFKA-16488 |
…ers (apache#16151) Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers, or options.timeoutMs if specified, as transaction timeout. No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place. Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
…ers (apache#16151) Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers, or options.timeoutMs if specified, as transaction timeout. No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place. Co-Authored-By: Adrian Preston <prestona@uk.ibm.com>
Use REQUEST_TIMEOUT_MS_CONFIG in AdminClient.fenceProducers as transaction timeout.
No transaction will be started with this timeout, but ReplicaManager.appendRecords uses this value as its timeout. Use REQUEST_TIMEOUT_MS_CONFIG like a regular producer append to allow for replication to take place.
Co-Authored-By: Adrian Preston prestona@uk.ibm.com
Committer Checklist (excluded from commit message)