Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load and save event sourced state #1000

Merged
merged 18 commits into from Aug 1, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Expand Up @@ -10,6 +10,10 @@ changes.

## [0.12.0] - UNRELEASED

- **BREAKING** Change persisted state to be a sequence of events instead. This
increases the performance of the `hydra-node` as less data needs to be written
and prepares internal architecture for more event-sourced improvements.

- `Greetings` message now contains also the hydra-node version.

- New HTTP endpoint (`GET /protocol-parameters`) which provides the current protocol parameters
Expand Down
Expand Up @@ -3,53 +3,56 @@ slug: 24
title: |
24. Persist state changes incrementally
authors: [abailly]
tags: [Proposed]
tags: [Accepted]
---

## Status

Proposed
Accepted

## Context

* The state of a Hydra Head is currently persisted as a whole upon each `NewState` _outcome_ from the `update` function: The new state is serialised and the `state` file is overwritten with the corresponding bytes. While this is a straightforward strategy to implement, it has a huge impact on the performance of a Hydra Head as serialising a large data structure like the `HeadState` and completely overwriting a file is costly
* We revisited our benchmarks and [found](https://github.com/input-output-hk/hydra/issues/186#issuecomment-1584292265) that persistence was the major bottleneck when measuring roundtrip confirmation time,e g. the time it takes from a client's perspective to submit a transaction and observe in a `ConfirmedSnapshot`
* Furthermore, the way we currently handle changes to the `HeadState` in the hydra-node, while conceptually being an `Effect` is handled differently from other `Effect`s: The state is updated transactionally through a dedicated `modifyHeadState` function in the core loop of processing events, and _then_ effects are processed.
- The state of a Hydra Head is currently persisted as a whole upon each `NewState` _outcome_ from the `update` function: The new state is serialised and the `state` file is overwritten with the corresponding bytes. While this is a straightforward strategy to implement, it has a huge impact on the performance of a Hydra Head as serialising a large data structure like the `HeadState` and completely overwriting a file is costly
- We revisited our benchmarks and [found](https://github.com/input-output-hk/hydra/issues/186#issuecomment-1584292265) that persistence was the major bottleneck when measuring roundtrip confirmation time,e g. the time it takes from a client's perspective to submit a transaction and observe in a `ConfirmedSnapshot`
- Furthermore, the way we currently handle changes to the `HeadState` in the hydra-node, while conceptually being an `Effect` is handled differently from other `Effect`s: The state is updated transactionally through a dedicated `modifyHeadState` function in the core loop of processing events, and _then_ effects are processed.

## Decision

Implement state persistence using [_Event Sourcing_](https://thinkbeforecoding.com/post/2013/07/28/Event-Sourcing-vs-Command-Sourcing). Practically, this means:

1. Replace the `NewState` outcome with a `StateChanged` _event_ which can be part of the `Outcome` of `HeadLogic`'s `update` function, representing the _change_ to be applied to the current state.
2. Add a dedicated [handle](/adr/4) to manage `StateChanged` events, applying them, and maintaining current `HeadState`
3. Persist `StateChanged`s in an append-only log
4. Upon node startup, reread `StateChanged` events log and reapply those to reset the `HeadState`
3. _(Optional)_: Make `StateChanged` _invertible_
2. Add an `aggregate` function to manage applying `StateChanged` events on top of the current `HeadState` to keep it updated in-memory.
3. Persist `StateChanged`s in an append-only log using a dedicated [handle](/adr/4).
4. Upon node startup, reread `StateChanged` events log and reapply those to reset the `HeadState`.

The following sequence diagram illustrates new event handling in the `HeadLogic`:

```mermaid
sequenceDiagram
EventQ ->> Node : nextEvent
Node ->> State : getState
activate State
State -->> Node : state: HeadState
Node ->> HeadLogic: update(state)
activate HeadLogic
HeadLogic -->> Node : [st:StateChanged, ot:OtherEffect]
Node ->> State: setState(st)
activate State
State ->> State: persist(st)
State -->> Node: ok
Node -) Other: handle(ot)
Node ->> Node : nextEvent : event
critical modifyHeadState : state -> state';
activate Node
Node ->>+ HeadLogic: update(state, event)
HeadLogic -->>- Node : Outcome (sc: StateChanged, oe: OtherEffect)
Node ->>+ HeadLogic: aggregate(state, sc)
HeadLogic -->- Node : state'
end
deactivate Node
Node ->> Persistence: append(sc)
Node ->> Node: processEffect(oe)
```

## Consequences

- :racehorse: The main expected consequence of this change is an increase of the overall performance of Hydra Head network
- While _Effect handlers_ might be asynchronous, `StateChanged` handler _must_ be synchronous to ensure consistency and durability of state changes
- An longer term consequence is the possibilities this change introduces with respect to `ServerOutput` handling and client's access to a head's state:
- Instead of having the `HeadLogic` emits directly a `ClientEffect`, the latter could be the result of a client-centric _interpretation_ of a `StateChanged`
- Pushing this a little further, we could maintain a _Query Model_ for clients with a dedicated [Query API](https://github.com/input-output-hk/hydra/discussions/686) to ease implementation of stateless clients
- Crashing nodes could catch-up with the Head's state by requesting `StateChanged` changes they are missing
- Calling `StateChanged` an _event_ while treating it in the code alongside _effects_ might introduce some confusion as we already use the word [Event](https://github.com/input-output-hk/hydra/blob/45913954eb18ef550a31017daa443cee6720a00c/hydra-node/src/Hydra/HeadLogic.hs#L64) to designate the inputs to the Head logic state machine. We might want at some later point to unify the terminology
- :racehorse: The main expected consequence of this change is an increase of the overall performance of Hydra Head network.

- Need to pattern match twice on the `HeadState`, once in `update` and once in `aggregate`.

- Terms from the specification are distributed over `update` and `aggregate` function. For example, the statements about updating all seen transactions would now be in `aggregate` and not anymore in `update`.

- New possibilities this change introduces with respect to `ServerOutput` handling and client's access to a head's state:

- Instead of having the `HeadLogic` emits directly a `ClientEffect`, the latter could be the result of a client-centric _interpretation_ of a `StateChanged`.
- Pushing this a little further, we could maintain a _Query Model_ for clients with a dedicated [Query API](https://github.com/input-output-hk/hydra/discussions/686) to ease implementation of stateless clients.

- Calling `StateChanged` an _event_ while treating it in the code alongside _effects_ might introduce some confusion as we already use the word [Event](https://github.com/input-output-hk/hydra/blob/45913954eb18ef550a31017daa443cee6720a00c/hydra-node/src/Hydra/HeadLogic.hs#L64) to designate the inputs (a.k.a. commands) to the Head logic state machine. We might want at some later point to unify the terminology.
115 changes: 67 additions & 48 deletions hydra-cluster/src/HydraNode.hs
Expand Up @@ -37,6 +37,7 @@ import System.FilePath ((<.>), (</>))
import System.IO.Temp (withSystemTempDirectory)
import System.Process (
CreateProcess (..),
ProcessHandle,
StdStream (..),
proc,
withCreateProcess,
Expand Down Expand Up @@ -284,52 +285,75 @@ withHydraNode ::
(HydraClient -> IO a) ->
IO a
withHydraNode tracer chainConfig workDir hydraNodeId hydraSKey hydraVKeys allNodeIds hydraScriptsTxId action = do
withLogFile logFilePath $ \out -> do
withSystemTempDirectory "hydra-node" $ \dir -> do
let cardanoLedgerProtocolParametersFile = dir </> "protocol-parameters.json"
readConfigFile "protocol-parameters.json" >>= writeFileBS cardanoLedgerProtocolParametersFile
let hydraSigningKey = dir </> (show hydraNodeId <> ".sk")
void $ writeFileTextEnvelope hydraSigningKey Nothing hydraSKey
hydraVerificationKeys <- forM (zip [1 ..] hydraVKeys) $ \(i :: Int, vKey) -> do
let filepath = dir </> (show i <> ".vk")
filepath <$ writeFileTextEnvelope filepath Nothing vKey
let ledgerConfig =
CardanoLedgerConfig
{ cardanoLedgerProtocolParametersFile
}
let p =
( hydraNodeProcess $
RunOptions
{ verbosity = Verbose "HydraNode"
, nodeId = NodeId $ show hydraNodeId
, host = "127.0.0.1"
, port = fromIntegral $ 5000 + hydraNodeId
, peers
, apiHost = "127.0.0.1"
, apiPort = fromIntegral $ 4000 + hydraNodeId
, monitoringPort = Just $ fromIntegral $ 6000 + hydraNodeId
, hydraSigningKey
, hydraVerificationKeys
, hydraScriptsTxId
, persistenceDir = workDir </> "state-" <> show hydraNodeId
, chainConfig
, ledgerConfig
}
)
{ std_out = UseHandle out
}
withCreateProcess p $
\_stdin _stdout _stderr processHandle -> do
result <-
race
(checkProcessHasNotDied ("hydra-node (" <> show hydraNodeId <> ")") processHandle)
(withConnectionToNode tracer hydraNodeId action)
case result of
Left err -> absurd err
Right a -> pure a
withLogFile logFilePath $ \logFileHandle -> do
withHydraNode' chainConfig workDir hydraNodeId hydraSKey hydraVKeys allNodeIds hydraScriptsTxId (Just logFileHandle) $ do
\_ _err processHandle -> do
result <-
race
(checkProcessHasNotDied ("hydra-node (" <> show hydraNodeId <> ")") processHandle)
(withConnectionToNode tracer hydraNodeId action)
case result of
Left e -> absurd e
Right a -> pure a
where
logFilePath = workDir </> "logs" </> "hydra-node-" <> show hydraNodeId <.> "log"

-- | Run a hydra-node with given 'ChainConfig' and using the config from
-- config/.
withHydraNode' ::
ChainConfig ->
FilePath ->
Int ->
SigningKey HydraKey ->
[VerificationKey HydraKey] ->
[Int] ->
-- | Transaction id at which Hydra scripts should have been published.
TxId ->
-- | If given use this as std out.
Maybe Handle ->
(Handle -> Handle -> ProcessHandle -> IO a) ->
IO a
withHydraNode' chainConfig workDir hydraNodeId hydraSKey hydraVKeys allNodeIds hydraScriptsTxId mGivenStdOut action = do
withSystemTempDirectory "hydra-node" $ \dir -> do
let cardanoLedgerProtocolParametersFile = dir </> "protocol-parameters.json"
readConfigFile "protocol-parameters.json" >>= writeFileBS cardanoLedgerProtocolParametersFile
let hydraSigningKey = dir </> (show hydraNodeId <> ".sk")
void $ writeFileTextEnvelope hydraSigningKey Nothing hydraSKey
hydraVerificationKeys <- forM (zip [1 ..] hydraVKeys) $ \(i :: Int, vKey) -> do
let filepath = dir </> (show i <> ".vk")
filepath <$ writeFileTextEnvelope filepath Nothing vKey
let ledgerConfig =
CardanoLedgerConfig
{ cardanoLedgerProtocolParametersFile
}
let p =
( hydraNodeProcess $
RunOptions
{ verbosity = Verbose "HydraNode"
, nodeId = NodeId $ show hydraNodeId
, host = "127.0.0.1"
, port = fromIntegral $ 5000 + hydraNodeId
, peers
, apiHost = "127.0.0.1"
, apiPort = fromIntegral $ 4000 + hydraNodeId
, monitoringPort = Just $ fromIntegral $ 6000 + hydraNodeId
, hydraSigningKey
, hydraVerificationKeys
, hydraScriptsTxId
, persistenceDir = workDir </> "state-" <> show hydraNodeId
, chainConfig
, ledgerConfig
}
)
{ std_out = maybe CreatePipe UseHandle mGivenStdOut
, std_err = CreatePipe
}
withCreateProcess p $ \_stdin mCreatedHandle mErr processHandle ->
case (mCreatedHandle, mGivenStdOut, mErr) of
(Just out, _, Just err) -> action out err processHandle
(Nothing, Just out, Just err) -> action out err processHandle
(_, _, _) -> error "Should not happen™"
where
peers =
[ Host
{ Network.hostname = "127.0.0.1"
Expand Down Expand Up @@ -357,11 +381,6 @@ withConnectionToNode tracer hydraNodeId action = do
sendClose connection ("Bye" :: Text)
pure res

-- | Runs an action with a new connection to given Hydra node.
withNewClient :: HydraClient -> (HydraClient -> IO a) -> IO a
withNewClient HydraClient{hydraNodeId, tracer} =
withConnectionToNode tracer hydraNodeId

hydraNodeProcess :: RunOptions -> CreateProcess
hydraNodeProcess = proc "hydra-node" . toArgs

Expand Down