When restarting a node, allow the client to restore their state #580

Closed
1 of 2 tasks
pgrange opened this issue Oct 25, 2022 · 2 comments · Fixed by #618
Labels
green 💚 Low complexity or well understood feature 💬 feature A feature on our roadmap

Comments

@pgrange
Contributor

pgrange commented Oct 25, 2022

Why

With the current implementation of the backup and restore feature, a restarted hydra node successfully restores its state but does not re-emit to the client the events that happened before the restart.

This is a problem for clients, which cannot restore their own state when they are restarted themselves.

What

When a new client connects to the node, it receives all the events that happened, for this node, in this head, whether they happened before or after a potential restart of this node.

With the events, we provide a way for the client to figure out whether it has already seen a given event, so that the client can ensure idempotent handling of replayed events. For instance, the client may have already seen an event, performed some side effect, and not want to perform that side effect twice. Or the client may maintain its own persisted state and need to distinguish events already present in that state from events that are new from its point of view.
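
A minimal client-side sketch of that idempotency check, in Python. It assumes each event carries an integer field named `seq`; that name is hypothetical, since the issue only asks for "some sort of monotonically increasing id". The `IdempotentConsumer` class is illustrative and not part of any hydra client library.

```python
from typing import Callable


class IdempotentConsumer:
    """Apply each event at most once, keyed on a monotonically increasing id."""

    def __init__(self, handler: Callable[[dict], None], last_seen: int = -1) -> None:
        self.handler = handler
        # A real client would load this value from its own persisted state.
        self.last_seen = last_seen

    def on_event(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a replayed duplicate."""
        seq = event["seq"]  # hypothetical field name (assumption)
        if seq <= self.last_seen:
            return False
        self.handler(event)
        self.last_seen = seq
        return True


applied = []
consumer = IdempotentConsumer(applied.append)

history = [{"seq": 0, "tag": "HeadIsOpen"}, {"seq": 1, "tag": "TxValid"}]
for event in history:
    consumer.on_event(event)

# The node restarts and replays the full history, followed by one new event.
for event in history + [{"seq": 2, "tag": "SnapshotConfirmed"}]:
    consumer.on_event(event)
```

After the replay, `applied` holds each event exactly once, so a side effect attached to the handler would not run twice.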

How

  • Persist & replay server outputs, such that

    • The network or the chain shall not see any effect of the hydra node replaying the events.
    • If several clients connect to the same hydra node, each of them receives the same events.
    • After restart, clients can determine that no peers are connected anymore.
  • Associate some sort of monotonically increasing id with the events so that the client can notice that it has already seen them.
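
The points above could be sketched roughly as follows, in Python rather than the node's own implementation language. This is only an illustration under stated assumptions: the `OutputLog` class, the JSON-lines file layout, and the `seq` and `output` field names are all hypothetical and do not describe how the hydra node actually persists its outputs.

```python
import json
import os
import tempfile
from typing import Callable, List


class OutputLog:
    """Append-only log of server outputs, each stamped with an increasing id."""

    def __init__(self, path: str) -> None:
        self.path = path
        # Resume numbering after a restart by counting already persisted records.
        self.next_seq = len(self.read_all())

    def read_all(self) -> List[dict]:
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def append(self, output: dict) -> dict:
        record = {"seq": self.next_seq, "output": output}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        self.next_seq += 1
        return record

    def replay_to(self, send: Callable[[dict], None]) -> None:
        """Send the full history to one client only; nothing goes to the network or chain."""
        for record in self.read_all():
            send(record)


path = os.path.join(tempfile.mkdtemp(), "server-output.jsonl")
log = OutputLog(path)
log.append({"tag": "HeadIsOpen"})
log.append({"tag": "TxValid"})

# Simulated restart: a fresh object over the same file continues the numbering.
restarted = OutputLog(path)
restarted.append({"tag": "SnapshotConfirmed"})

# Two clients connecting to the same node receive the same events.
client_a, client_b = [], []
restarted.replay_to(client_a.append)
restarted.replay_to(client_b.append)
```

Replay here is a pure read of the log into a per-client callback, which is one way to keep it invisible to the network and the chain.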

Acceptance Criteria

When using the TUI, if we open a head, then restart the hydra node and then the TUI, the TUI displays the same state as before the restart.

Out of Scope

It can be the case that, after the head has been running for some time, there is a large number of events to replay. What to do about that, and how to deal with problems caused by too many events being replayed, is out of scope for this issue.

@pgrange pgrange changed the title Replay stored events when restarting a hydra node When restarting a node, allow the customer to restore their state Oct 25, 2022
@pgrange pgrange changed the title When restarting a node, allow the customer to restore their state When restarting a node, allow the client to restore their state Oct 25, 2022
@Quantumplation
Contributor

It's worth noting that not all events are equal in this space from a consumer's perspective. That is, there are some events which we cannot recover from missing, and there are others which we effectively just ignore during playback. From our perspective (at first blush, so we may have miscategorized one or two), here are the events we do and don't care about:

Must replay:
 - ReadyToCommit
 - Committed
 - HeadIsOpen
 - HeadIsClosed
 - HeadIsContested
 - ReadyToFanout
 - HeadIsAborted
 - HeadIsFinalized
 - TxValid
 - SnapshotConfirmed

Unimportant:
 - Greetings
 - PeerConnected
 - PeerDisconnected
 - GetUTXOResponse
 - RolledBack
 - InvalidInput
 - PostTxOnChainFailed
 - CommandFailed
 - TxInvalid
 - TxExpired
 - TxSeen

If you replay all events, it's not a problem, but I thought I would provide that flavor just in case.
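
As an illustration of how a consumer might act on this split when the node replays everything, here is a small Python sketch. The event names come from the two lists above; the `tag` field name and the `events_to_apply` helper are assumptions made for the example.

```python
from typing import Iterable, List

# Events a consumer cannot afford to miss, per the "Must replay" list above.
MUST_REPLAY = frozenset({
    "ReadyToCommit",
    "Committed",
    "HeadIsOpen",
    "HeadIsClosed",
    "HeadIsContested",
    "ReadyToFanout",
    "HeadIsAborted",
    "HeadIsFinalized",
    "TxValid",
    "SnapshotConfirmed",
})


def events_to_apply(history: Iterable[dict]) -> List[dict]:
    """Keep only the replayed events this consumer cares about, in order."""
    return [event for event in history if event["tag"] in MUST_REPLAY]


replayed = [
    {"tag": "Greetings"},
    {"tag": "PeerConnected"},
    {"tag": "HeadIsOpen"},
    {"tag": "TxSeen"},
    {"tag": "TxValid"},
]
kept = events_to_apply(replayed)
```

Filtering on the consumer side means the node can replay the full history without having to decide which events matter to which client.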

As for a monotonic timer for idempotence, that works perfectly.

@pgrange
Contributor Author

pgrange commented Nov 15, 2022

You're raising an excellent point @Quantumplation.

Thinking about it, we realize that it's not obvious which events should be sent on restart and which should not. For some it seems obvious, of course, but for others it can be a bit tricky.

Also, we're not sure what the client application does with the events regarding its own state. I mean, a restart of a node might have a subtle impact on the client state that could not be reproduced just by sending it the right events.

Another approach would be to notify the client application about the restart and when it happened, so that it can decide what to do with its current state when it sees the notification. We could do that with the Greetings event which, by the way, carries some data that could change between two restarts.

That's the first approach we will try here: send a Greetings message after each restart in the events history.

For instance, that could look like this:

  1. Greetings
  2. PeerConnected
  3. Greetings -- restart occurred before event 3; clean your state if needed
  4. PeerConnected
  5. ...
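
To make the example history concrete, here is a hedged Python sketch of a client that treats every Greetings in the replayed history as a restart marker and resets its view of connected peers. The `tag` and `peer` field names and the `connected_peers` function are assumptions for illustration, not the node's actual message format.

```python
from typing import Iterable, Set


def connected_peers(history: Iterable[dict]) -> Set[str]:
    """Rebuild the set of connected peers, resetting it at every restart marker."""
    peers: Set[str] = set()
    for event in history:
        tag = event["tag"]  # assumed field name
        if tag == "Greetings":
            # A Greetings in the history means the node (re)started here,
            # so no peer connection from before this point is still valid.
            peers.clear()
        elif tag == "PeerConnected":
            peers.add(event["peer"])  # hypothetical field name
        elif tag == "PeerDisconnected":
            peers.discard(event["peer"])
    return peers


history = [
    {"tag": "Greetings"},
    {"tag": "PeerConnected", "peer": "alice"},
    {"tag": "Greetings"},  # restart occurred before this event
    {"tag": "PeerConnected", "peer": "bob"},
]
peers_now = connected_peers(history)
```

With this history only the peer that connected after the second Greetings remains, which matches the requirement that clients can tell no peers are connected right after a restart.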


3 participants