Skip to content

[fix][txn] Set TC client state update and mistake completed future exceptionally.#19273

Closed
liangyepianzhou wants to merge 3 commits intoapache:masterfrom
liangyepianzhou:xiangying/txn/fix/setReadyAfterConnectComplete
Closed

[fix][txn] Set TC client state update and mistake completed future exceptionally.#19273
liangyepianzhou wants to merge 3 commits intoapache:masterfrom
liangyepianzhou:xiangying/txn/fix/setReadyAfterConnectComplete

Conversation

@liangyepianzhou
Copy link
Contributor

@liangyepianzhou liangyepianzhou commented Jan 18, 2023

Motivation

  1. Fix the logic flow in startAsync.
    1. change the state of the TCClient to STARTING
    2. Connecting to the transaction coordinators
    3. Set the state to READY
    4. Connect fails but the state is READY
  2. Fix the future completed logic flow.
    1. connect to the transaction coordinator.
    2. get a retryable exception
    3. complete the connected future with the exception
    4. do retry the operation
    5. complete the future again.

Modification

  1. Update the state after all connections are completed or catch exceptions.
  2. Do not complete the future if the exception is a retryable future.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

### Motivation
logic flow in `startAsync`
 1. change the state of the TCClient to `STARTING`
 2. Connecting to the transaction coordinators
 3. Set the state to `READY`
 4. Connect fails but the state is READY
 ### Modification
 Set the state to ready after all connections are complete.
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 18, 2023
@liangyepianzhou liangyepianzhou changed the title [fix][txn] Set Tc client ready after all connections ready [fix][txn] Set TC client ready after all connections ready Jan 18, 2023
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two questions/points:

  1. what happens if some of the coordinators are not available ?

  2. is this change adding more latency in the creation of the PulsarClient ?

@liangyepianzhou
Copy link
Contributor Author

liangyepianzhou commented Jan 18, 2023

I have two questions/points:

  1. what happens if some of the coordinators are not available ?
  2. is this change adding more latency in the creation of the PulsarClient ?

The TC client is created synchronized, so this change will not add any latency.
If the coordinators are unavailable, the client will try to connect later.
This modification will not bring any effects on these two questions.

if (conf.isEnableTransaction()) {
tcClient = new TransactionCoordinatorClientImpl(this);
try {
tcClient.start();
} catch (Throwable e) {
log.error("Start transactionCoordinatorClient error.", e);
throw new PulsarClientException(e);
}
}

@liangyepianzhou liangyepianzhou changed the title [fix][txn] Set TC client ready after all connections ready [fix][txn] Set TC client state update and mistake completed future exceptionally. Jan 19, 2023
transactionCoordinatorId, exception);
if (!this.connectFuture.isDone()) {
//If the exception is a retry exception, we should not complete exceptionally at this time.
if (!this.connectFuture.isDone() || !isRetryException(exception)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How the connect request can retry?

Copy link
Contributor Author

@liangyepianzhou liangyepianzhou Jan 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will do a retry operation after getting an exception.

}).exceptionally((e) -> {
internalPinnedExecutor.execute(() -> {
LOG.error("Transaction coordinator client connect fail! tcId : {}",
transactionCoordinatorId, e.getCause());
if (getState() == State.Closing || getState() == State.Closed
|| e.getCause() instanceof PulsarClientException.NotAllowedException) {
setState(State.Closed);
cnx.channel().close();
} else {
connectionHandler.reconnectLater(e.getCause());
}
});
return null;
});
} else {
registerToConnection(cnx);
}
});
}
private boolean registerToConnection(ClientCnx cnx) {
if (changeToReadyState()) {
connectionHandler.setClientCnx(cnx);
cnx.registerTransactionMetaStoreHandler(transactionCoordinatorId, this);
connectFuture.complete(null);
return true;
} else {
State state = getState();
cnx.channel().close();
connectFuture.completeExceptionally(
new IllegalStateException("Failed to change the state from " + state + " to Ready"));

@github-actions
Copy link

github-actions bot commented Mar 1, 2023

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Mar 1, 2023
@Technoboy- Technoboy- added this to the 3.2.0 milestone Jul 31, 2023
@liangyepianzhou liangyepianzhou closed this by deleting the head repository Sep 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs Stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments