This problem was reproduced by having a node missing the --hydra-verification-key and --cardano-verification-key options for one of its peers.
In the code we check that the number of hydra-verification-keys and cardano-verification-keys matches, but we do not check that they also match the list of --peer options configured.
So here we have 2 different things going on:
- we should verify that the numbers of --peer, hydra-verification-key, and cardano-verification-key options match (sketched below),
- when we restart the network, we need to check that its configuration is consistent with what is persisted in the acknowledgments.
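A minimal sketch of the first check, assuming a hypothetical options record and field names (the actual hydra-node types differ):

```haskell
{-# LANGUAGE NamedFieldPuns #-}

-- Hypothetical subset of the parsed command-line options.
data RunOptions = RunOptions
  { peers :: [String]           -- one entry per --peer
  , hydraVKeys :: [FilePath]    -- one entry per --hydra-verification-key
  , cardanoVKeys :: [FilePath]  -- one entry per --cardano-verification-key
  }

-- Fail fast at startup when the three option lists disagree in length.
validatePeerKeys :: RunOptions -> Either String ()
validatePeerKeys RunOptions{peers, hydraVKeys, cardanoVKeys}
  | length peers == length hydraVKeys
      && length peers == length cardanoVKeys = Right ()
  | otherwise =
      Left $
        "Inconsistent options: "
          <> show (length peers) <> " --peer, "
          <> show (length hydraVKeys) <> " --hydra-verification-key, "
          <> show (length cardanoVKeys) <> " --cardano-verification-key"
```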
ghost changed the title from "Network produces MalformedAcks even after fixing --peer list" to "Restarting a node with different number of peers prevents it to connect to the cluster" on Nov 24, 2023
Context & versions
All recent versions, seen on 9abb099
Analyses
Misbehaviour
This bug was observed through the following sequence of operations:
- a party (alice) inadvertently misconfigures their node by "forgetting" another party's (bob) configuration
- Ping notifications show that alice's node is not seen by bob's
- alice stops her node, reconfigures it, then restarts
- alice is not seen by any other party's node

Troubleshooting
Looking at their logs, alice's peers notice they are seeing the same message repeatedly. This means that alice is still sending Reliability layer messages as if she only had 3 peers. Investigating the issue further, we realised this was caused by the acknowledgments persistence mechanism we put in place as part of #1101:
- in its first run, alice's node saves the acknowledged messages' vector for 4 parties,
- in the second run, alice's node loads the saved vector, which still covers 4 parties, while the number of parties should now be 5.
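A minimal sketch of the consistency check this calls for, assuming the persisted state is a vector with one acknowledgement counter per party as described above (the type and function names are hypothetical):

```haskell
import Data.Vector (Vector)
import qualified Data.Vector as V

-- One acknowledgement counter per party, as persisted by the Reliability layer.
newtype SavedAcks = SavedAcks (Vector Word)

-- Refuse to reuse persisted acknowledgements whose dimension no longer
-- matches the configured number of parties (ourselves plus our peers).
loadSavedAcks :: Int -> SavedAcks -> Either String SavedAcks
loadSavedAcks numberOfParties saved@(SavedAcks acks)
  | V.length acks == numberOfParties = Right saved
  | otherwise =
      Left $
        "Persisted acknowledgements cover " <> show (V.length acks)
          <> " parties but " <> show numberOfParties
          <> " are now configured; refusing to start with inconsistent state."
```

With such a check in place, alice's second run would fail loudly instead of silently sending acknowledgements sized for the old configuration.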
Expected behaviour
The user should be warned that there is an inconsistency between the currently configured network peers and the saved state. It should probably not be possible to start a node in such a situation unknowingly, as it could arise not from a misconfiguration but from an unsuspecting party forming a new head with a different configuration and inadvertently reusing persisted state from a previous run.
In general, this issue highlights the need for a better strategy for how the hydra-node persists its state and what the user can do about it.