Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ExecDriver] Launch execution driver with AuthorityState #6690

Merged
merged 3 commits into from
Dec 9, 2022

Conversation

mwtian
Copy link
Member

@mwtian mwtian commented Dec 9, 2022

After Gateway deprecation, since we plan to execute certificates with effects in execution driver, there should not be a case where AuthorityState is launched but execution driver is not needed. Keeping execution driver in activate authority no longer brings in benefits. But merging execution driver into AuthorityState has additional values:

  1. More straight forward to wire TransactionManager to execution driver.
  2. Less boilerplate related to metrics.

Since execution driver requires an Arc<> reference to AuthorityState, AuthorityState::new() is refactored to return Arc<AuthorityState> instead. This removes the need for callers to wrap AuthorityState wit Arc themselves.

This change is a prerequisite for #6652, which uses TransactionManager for handle_certificate_with_effects() by default. Without running execution driver, a number of tests would fail.

An alternative approach is to launch ActiveAuthority when execution driver is needed, which is going to be almost all cases for tests and production code. The additional boilerplate seems unnecessary.

@mwtian mwtian force-pushed the authority-exec branch 2 times, most recently from f7a47a9 to ca10be0 Compare December 9, 2022 08:25
@mwtian mwtian marked this pull request as ready for review December 9, 2022 08:45
@mwtian mwtian requested a review from lxfind December 9, 2022 16:37
// Use owned object certificate for the LimitedPoll test.
// Shared object certificate requires more setup before execution. And if we rely on
// send_consensus() for setup, the certificate will also get executed as output in
// handle_consensus_transaction(), making the LimitedPoll test ineffective.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but the shared object case was added very deliberately in #4579 - I think we need to leave the test as is to prevent a regression.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted the test to use certificate with shared objects. Added a function to run the preparation logic for executing certificates out of consensus, without actual execution.

@lxfind
Copy link
Contributor

lxfind commented Dec 9, 2022

I am concerned to have AuthorityState to depend on execution_driver. AuthorityState is just a state, while execution_driver is a task. Even if we get rid of the concept of authority_active, execution_driver should still be an independent task that gets spawned independently.

@mwtian
Copy link
Member Author

mwtian commented Dec 9, 2022

It seems persisting and scheduling certificates for execution until their inputs are available are preferable to trying to execute certificates and error out if inputs are unavailable, in the majority of places that call AuthorityState::handle_certificate_with_effects() and AuthorityState::handle_certificate() directly today. Also this async execution is necessary for AuthorityState::handle_consensus_transaction() to actually execute the transaction (with node sync / TransactionManager), and it is very weird that the method does not execute with just AuthorityState until a corresponding ActiveAuthority is started.

I see two potential ways to organize the code here.

Support scheduled execution of transactions via AuthorityState

We can commit fully to supporting AuthorityState to schedule certificate execution until input objects are available.

Personally I see the task for execution as just another data structure, not much different from a coroutine, that is ok to be managed by AuthorityState. But we may able to avoid the execution driver task and Arc<AuthorityState>, by extracting out the AuthorityState members needed by AuthorityState::process_certificate() into a new object e.g. Arc<AuthorityCore>, and running the logic in the exec driver loop directly after each certificate is ready. The actual execution that calls AuthorityState::process_certificate() still needs to be in a separate task with a copy of Arc<AuthorityCore>.

Support scheduled execution of transactions via ActiveAuthority

Instead, we can take AuthorityState::handle_consensus_transaction() and other schedule transaction for execution methods to another layer. Let's say it is ActiveAuthority, although we can create yet another component. TransactionManager will also get to move to this component. This component will replace the usage of AuthorityState for callsites that need async execution. There will be a lot of callsites and test helper ultilities that need to be updated if we make AuthorityState::handle_certificate_with_effects() and AuthorityServer::handle_certificate() execute asynchronously.

My preference is the 1st approach because it is more straightforward. AuthorityState already have many methods like commit_cert_and_notify() that are not pure state manipulations. And I cannot think of a use case where AuthorityState is needed but not async execution, so putting async execution in a separate component seems not very worthwhile. But I can give the 2nd approach a try. Let me know what you think.

@lxfind
Copy link
Contributor

lxfind commented Dec 9, 2022

Ok indeed. I am convinced.

@mwtian
Copy link
Member Author

mwtian commented Dec 9, 2022

Great! I understand that we want to keep AuthorityState lean. I think we can revisit refactoring AuthorityState after checkpoint v2 and deprecation of batch service.

/// Depending on the type of the VerifiedSequencedConsensusTransaction wrapper,
/// - Verify and initialize the state to execute the certificate.
/// Returns a VerifiedCertificate only if this succeeds.
/// - Or update the state for checkpoint or epoch change protocol. Returns None.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you comment that this method is only needed for tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nm this is clear enough i suppose

@mwtian mwtian merged commit 3fc6bcb into MystenLabs:main Dec 9, 2022
@mwtian mwtian deleted the authority-exec branch December 10, 2022 00:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants