refactor: consensus state to fix deadlock#491
Merged
Conversation
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
…h state change task as a separate command, similar to state machine implementation
* refactor: set and decide proposal operations move to the separated component proposaler
* refactor: separate a gossip logic on implementation and a standard way of iterative handling a gossip condition * chore: remove unless commented line * refactor: remove custom "peerSyncErr" error * refactor: add Gossiper interface, generate Gossiper mock struct, add unit test to queryMaj23GossipHandler * refactor: revert changes in types.MakeBlock * refactor: remove unused methods/functions, change a way how a vote should be picked for a gossip * refactor: peerGossipWorker implements service.Service * refactor: add synchronization for blockExecutor.committedState
…nto td-115-fix-consensus-deadlock
* refactor: blocksync module is refactored on using worker-pool * test: add unit test for worker-pool * test: replace custom assertError on tmrequire.Error * refactor: add Now function into flowrate package, to obtain the current time * fix: TestHandshakeErrorsIfAppReturnsWrongAppHash in replay_test.go * refactor: move the logic wait-for blockchain sync from reactor to BlockPool * fix: TestReactor_SyncTime * test: change timing for test TestBlockPoolBasic * chore: replace clock on clockwork * refactor: replace reactor's consuming on separate consumer function in Channel, rename SetPeerRange on AddPeer * test: add TestConsumeError * chore: remove unused mustHexToBytes * chore: add examples in promise/example_test.go and docs comments * chore: change a package for MockValidatorSet from types on factory * Update internal/blocksync/channel.go
* refactor: add internal/p2p/client implementation * refactor: change blocksync module to support p2p client
lklimek
previously approved these changes
Mar 15, 2023
* test: fix TestMakeHTTPDialerURL and rename on TestDialParamsFromURL
lklimek
previously approved these changes
Mar 16, 2023
* refactor: introduce p2p client in mempool module
lklimek
approved these changes
Mar 17, 2023
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue being fixed or feature implemented
The consensus state is designed as a component that stores state data and behavior at the same logical component. Since the state shares its data with several goroutines, a mutex must be used to access the data.
Every time the state receives an event, the state locks the data until the operation is completed. This approach has a potential deadlock.
This refactoring is aimed at eliminating the deadlock issue and separating the modification logic across independent handlers (state transitions or state commands)
What was done?
Key design changes:
Previously, to read the events for state-update operation, we used
internalandpeerqueues (actually each queue is a Go channel). This approach was changed by using the “Fan In” concurrency pattern. That means, we read the messages from both channels concurrently and put them into the output queue, which is available for a read in state.msgInfoDispatcherEach received message is dispatched to a proper handler based on the message type.
The dispatcher has handlers for a following messages:
To allow application state data to be independent of state implementation, introduced a few new units:
AppStateStoreis used to store and load data. All operations are safe and use mutex.AppStateis a copy of stored data inAppStateStore. By analogy with initial implementation, AppState has a copy ofconsensus.RoundStateandstate.Statedata. Along with providing a copy of data,AppStatealso has some methods (or behavior) likeupdateToStateorisValidatoretc. Another key point is theversionfield which is the version number of read data.AppStateStoreexpects thatAppState.versionmatches with the same number. Once the data updates,AppStateStoreincrements version number, and the next load ofAppStatewill have that version. Since the application can updateAppStateconcurrently, It is necessary to allow updating only the correctAppStateversion.The biggest part of this refactoring is the introduction of a new state machine component.
This state machine represents of
FMSis a registry of all possible state commands (or state transitions) and dispatcher of an event to a correct state handler. It is used to registry a command by an event type and execute the command for with an event (StateEvent)StateEventis used to execute a state transition. It consists of event type, copy of AppState, and the arbitrary data.StateEventprovides event data with a pointer toStateData,FMS(to be able to dispatch event to the next state) andData.CommandHandleris an interface that must be implemented by a state transition. To execute state transition you must provide StateEvent (desc. below)Supported state transitions:
VoteSignercontains methods to sign a vote and sign and add the vote.Observeris a simple implementation of the Observer design pattern. It uses to inform subscribers or state dependent components, about updating fields at state.Supported subscriptions
blockExecutoris a wrapper overstate.BlockExecutorto provide some functionality to guarantee that prepare or process proposal calls only once.How it works
NewStatefunction.AppStatefromAppStateStore, and handles the message with thatAppState.AppStateStore.Getreturns a copy of data, but theAppStateis passed as a pointer to a dispatcher and commands. This usage of a pointer is safe and necessary to mutate data using the same code as we have currently.msgInfoQueueare dispatched to a message-handler. The dispatcher routes the message based on a message type. The handler execute the state transition.AppStateshould be updated inAppStateStore. Also, for some casesAppStatecan be updated inside the state-transition handler, it is required for unit-tests which are checking theAppStatedata on an event (need to eliminate it and get data for checking from an event).So the mutex is only used to get a copy of the
AppStateand update the current versionAppStateon the new one.AppStateis changed by state commands (state transition), so as consuming messages pass through Go channels, it happens synchronously.How Has This Been Tested?
Unit / E2E tests
Need to add unit tests for the new components
Breaking Changes
The consensus protocol hasn't changed, only how the code is executed has changed.
Checklist:
For repository code-owners and collaborators only