{6}Restore a node from backup #33

Closed
1 task
benjamin-actyx opened this issue Jul 21, 2021 · 4 comments
Labels
Actyx (This issue leads to a version bump of Actyx.), Feature

Comments

@benjamin-actyx
Contributor

Job story

When:

  • An Actyx node has died or broken

I want to:

  • Restore it from a backup

So that:

  • I can follow my standard admin operating procedure

->
The problem is that peers may have "newer" versions of the restored node’s chain than the node itself.
It would then re-use the same offsets to potentially write different data.
An easy way to work around this is to switch to a new node-id.
A more complicated way would be to try and restore state via the peers, then continue with the old node-id.
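
To make the offset problem concrete, here is a minimal sketch. It assumes a simplified model in which each node owns one append-only stream and the list index is the offset; the `Stream` class and `conflicting_offsets` helper are hypothetical and not part of Actyx.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stream:
    """Append-only event log of one node; the list index is the offset."""
    node_id: str
    events: List[str] = field(default_factory=list)

    def append(self, payload: str) -> int:
        """Store the payload and return the offset it was written at."""
        self.events.append(payload)
        return len(self.events) - 1


def conflicting_offsets(restored: Stream, peer_copy: Stream) -> List[int]:
    """Offsets at which the restored node and a peer's replica disagree."""
    shared = min(len(restored.events), len(peer_copy.events))
    return [i for i in range(shared) if restored.events[i] != peer_copy.events[i]]


# The original node wrote three events and peers replicated all of them.
original = Stream("node-A", ["e0", "e1", "e2"])
peer_copy = Stream("node-A", list(original.events))

# The backup was taken after offset 0, so the restored node has forgotten e1 and e2.
restored = Stream("node-A", original.events[:1])
reused_offset = restored.append("x1")  # reuses offset 1, which peers hold as "e1"

conflicts = conflicting_offsets(restored, peer_copy)
```

In this toy model the restored node and its peers now disagree about offset 1. Starting over under a new node-id avoids the clash because the new stream shares no offsets with the old one.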

Acceptance criteria

  • Script or guide or automatic behavior that makes restoring from backups easy and fail-safe
@mcamou
Contributor

mcamou commented Jul 21, 2021

The issue with a new node-id is that local subscriptions would break (as would happen in https://actyx.mantishub.io/view.php?id=494#c3470). I think the second solution is better.

I think that this would mean that a node should not start emitting events until it receives at least one root map from the swarm. However, this brings up the problem of how a node would know that it's the first node in the swarm.

@rkuhn
Member

rkuhn commented Jul 21, 2021

@rklaehn Someone found a way to break your “unbreakable” crypto offset reuse prevention scheme.

Waiting for foreign heartbeats before emitting events sounds reasonable to me, apart from the need for a setting to enable “lonely mode”. If we add that, then previously emitted state can be recovered, including crypto offsets. This is not bulletproof (e.g. against deliberately caused restores by someone who can also control the network), but it would make “restore from backup” quite a bit nicer for the admin.
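
A rough sketch of such a startup gate, for illustration only. It assumes that a foreign heartbeat reports the highest offset the peer has seen for this node; `StartupGate`, `lonely_mode`, and `on_foreign_heartbeat` are hypothetical names, not existing Actyx settings or APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartupGate:
    """Blocks event emission until the swarm has been heard from (or lonely mode is on)."""
    local_highest_offset: int  # highest own offset in the restored store, -1 if empty
    lonely_mode: bool = False  # hypothetical setting: emit without waiting for peers
    peer_highest_offset: Optional[int] = None  # None until the first foreign heartbeat

    def on_foreign_heartbeat(self, highest_offset_seen_for_us: int) -> None:
        """Record the highest offset any peer reports for this node's stream."""
        if self.peer_highest_offset is None:
            self.peer_highest_offset = highest_offset_seen_for_us
        else:
            self.peer_highest_offset = max(
                self.peer_highest_offset, highest_offset_seen_for_us
            )

    def may_emit(self) -> bool:
        return self.lonely_mode or self.peer_highest_offset is not None

    def next_offset(self) -> int:
        """First offset that neither the local store nor any heard peer has used."""
        if not self.may_emit():
            raise RuntimeError("no foreign heartbeat yet and lonely mode is off")
        peer = -1 if self.peer_highest_offset is None else self.peer_highest_offset
        return max(self.local_highest_offset, peer) + 1


# Restored from a backup that ends at offset 0; a peer already knows offsets up to 2.
gate = StartupGate(local_highest_offset=0)
blocked_before_heartbeat = not gate.may_emit()
gate.on_foreign_heartbeat(2)
resumed_at = gate.next_offset()

# A deliberately lonely node continues from its own store without waiting.
lonely_at = StartupGate(local_highest_offset=0, lonely_mode=True).next_offset()
```

The sketch only decides where to continue writing. Recovering the missing events themselves from peers would be a separate step, and as noted above, the gate does not help if whoever triggers the restore also controls the network.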

@mhaushofer mhaushofer changed the title Restore a node from backup {6}Restore a node from backup Jul 30, 2021
@mhaushofer mhaushofer transferred this issue from another repository Aug 10, 2021
@mhaushofer mhaushofer added Actyx (This issue leads to a version bump of Actyx.) and Feature labels Aug 10, 2021
@rkuhn
Member

rkuhn commented Oct 9, 2023

I think we shouldn’t support this feature: when a node fails, make a new node with the same settings but a new keypair. Retrieving and restoring the settings is already possible with ax in a scripted fashion, so maybe the only thing needed is a documentation page explaining that this is the approach.
