{6}Restore a node from backup #33

benjamin-actyx · 2021-07-21T07:46:29Z

Job story

When:

An Actyx node has died or broken

I want to:

Restore it from a backup

So that:

I can follow my standard admin operating procedure

->
The problem is that peers may hold a "newer" version of the restored node’s chain than the node itself does after the restore.
The restored node would then re-use offsets that its peers have already seen, potentially writing different data at those offsets.
An easy way to work around this is to switch to a new node-id.
A more complicated way would be to restore the missing state from the peers first, then continue with the old node-id.
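
To make the offset problem concrete, here is a minimal sketch in Python. It models a node's chain as a plain list whose index is the offset; this is an illustrative assumption, not how Actyx stores events, and all function names are hypothetical.

```python
def append(log, payload):
    """Append a payload and return the offset (list index) it was written at."""
    log.append(payload)
    return len(log) - 1


def find_conflicts(local_log, peer_log):
    """Offsets present in both logs where the stored data differs."""
    shared = min(len(local_log), len(peer_log))
    return [i for i in range(shared) if local_log[i] != peer_log[i]]


def catch_up_from_peer(local_log, peer_log):
    """Second workaround: extend the restored log with events only the peer still has."""
    return local_log + peer_log[len(local_log):]


# Peers replicated offsets 0..4 before the node died.
peer_copy = ["e0", "e1", "e2", "e3", "e4"]
# The backup was taken after offset 2.
backup = peer_copy[:3]

# Naive restore: the node continues writing right after the backup.
naive = list(backup)
reused = append(naive, "different data")      # lands on offset 3 again
conflicts = find_conflicts(naive, peer_copy)  # offset 3 now disagrees with the peer

# Restore via peers: catch up first, then continue with the old node-id.
recovered = catch_up_from_peer(list(backup), peer_copy)
next_offset = append(recovered, "new event")  # first unused offset
```

In this toy model the naive restore rewrites offset 3, while catching up from the peer first lets the node continue at offset 5 without a conflict.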

Acceptance criteria

Script or guide or automatic behavior that makes restoring from backups easy and fail-safe

mcamou · 2021-07-21T07:58:16Z

See https://actyx.mantishub.io/view.php?id=494#c3470

mcamou · 2021-07-21T08:00:04Z

The issue with a new node-id is that local subscriptions would break (as would happen in https://actyx.mantishub.io/view.php?id=494#c3470). I think the second solution is better.

I think that this would mean that a node should not start emitting events until it receives at least one root map from the swarm. However, this brings up the problem of how a node would know that it's the first node in the swarm.

rkuhn · 2021-07-21T12:56:57Z

@rklaehn Someone found a way to break your “unbreakable” crypto offset reuse prevention scheme.

Waiting for foreign heartbeats before emitting events sounds reasonable to me, apart from the need for a setting to enable “lonely mode”. If we add that, then previously emitted state can be recovered, including crypto offsets. This is not bulletproof (e.g. against restores deliberately caused by someone who can also control the network), but it would make “restore from backup” quite a bit nicer for the admin.
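
A rough sketch of how such a gate could behave, in Python. The class, the `lonely_mode` flag, and the heartbeat shape (a mapping from node-id to the highest offset the sender has seen) are all assumptions made for illustration; none of this is an existing Actyx API.

```python
from dataclasses import dataclass


@dataclass
class EmitGate:
    """Hypothetical startup gate: hold back new events until a peer has been heard."""
    own_node_id: str
    lonely_mode: bool = False        # assumed setting for a node that starts alone
    highest_known_offset: int = -1   # highest own offset known locally, e.g. from the backup
    seen_foreign_heartbeat: bool = False

    def on_heartbeat(self, sender_id, offsets_by_node):
        """Record a heartbeat; offsets_by_node is the sender's view of each node's offset."""
        if sender_id == self.own_node_id:
            return  # own heartbeats carry no new information
        self.seen_foreign_heartbeat = True
        peer_view = offsets_by_node.get(self.own_node_id, -1)
        # Never fall behind what a peer has already seen from this node.
        self.highest_known_offset = max(self.highest_known_offset, peer_view)

    def may_emit(self):
        return self.lonely_mode or self.seen_foreign_heartbeat

    def next_offset(self):
        """Hand out the next offset, refusing while the gate is still closed."""
        if not self.may_emit():
            raise RuntimeError("waiting for a foreign heartbeat (or enable lonely mode)")
        self.highest_known_offset += 1
        return self.highest_known_offset
```

With a backup ending at offset 2 and a peer reporting offset 4 for this node, the gate would hand out offset 5 next. As noted above, this only helps if the heartbeats that arrive are honest and complete.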

rkuhn · 2023-10-09T10:15:19Z

I think we shouldn’t support this feature: when a node fails, make a new node with the same settings but a new keypair. Retrieving and restoring the settings is already possible with ax in a scripted fashion, so maybe the only thing needed is a documentation page explaining that this is the approach.

mhaushofer changed the title ~~Restore a node from backup~~ {6}Restore a node from backup Jul 30, 2021

mhaushofer transferred this issue from another repository Aug 10, 2021

mhaushofer added the Actyx (This issue leads to a version bump of Actyx.) and Feature labels Aug 10, 2021

benjamin-actyx mentioned this issue Aug 20, 2021

[RFC] Owned Record System #186

Closed

jmg-duarte mentioned this issue Oct 13, 2023

Document the node replacement procedure #550

Open

jmg-duarte closed this as completed Oct 13, 2023
