Possible lost deposits during a whole network snapshot-restore #7751

Closed
11 tasks
wwestgarth opened this issue Mar 3, 2023 · 1 comment · Fixed by #7774

Comments

@wwestgarth
Contributor

wwestgarth commented Mar 3, 2023

Problem encountered

As mentioned here: https://vegaprotocol.slack.com/archives/CEE07A084/p1677836123037479?thread_ts=1677825165.254859&cid=CEE07A084

there were reports of deposits getting lost around the time of the testnet outage.

There's not much to go on, but my hunch is that when the entire network was started from a snapshot, the EEF didn't restore correctly, so any deposits made during the downtime were not picked up after the restore.

If true, this bug would also present during a protocol upgrade.

The snapshot and some block data from the point at which the network restarted is here: https://jenkins.ops.vega.xyz/blue/organizations/jenkins/private%2FSnapshots%2FFairground/detail/Fairground/15665/pipeline/111/

I've also attached the artefacts to the ticket in case Jenkins cleans them up.

Two deposits that got lost:
https://sepolia.etherscan.io/tx/0x5f7fe077a21db8bc758505e1d8d3c1c00f74efe5f87c4bc6fee444c9e5678d15
10c36272e050adafe1b274a6194939e22d356a0ed8e3bacb7e02510abb9818ee

https://sepolia.etherscan.io/tx/0x9597a9e9af0f002020b01a1d9b3f69ae94dc9541e178a4447bdfbcb3ad3bb16a
58aaf8e18fcc89ad099cbdeaba658830330499e7e6f4ca7b1bfa0820f1daa670

Observed behaviour

When testnet went down, any deposits made during or near the outage were not seen by the Vega chain after it was restored via snapshots.

Expected behaviour

All deposits are always picked up.


Definition of Done

ℹ️ Not every issue will need every item checked, however, every item on this list should be properly considered and actioned to meet the DoD.

Before Merging

  • Code refactored to meet SOLID and other code design principles
  • Code is compilation error, warning, and hint free
  • Carry out a basic happy path end-to-end check of the new code
  • All APIs are documented so auto-generated documentation is created
  • All bug recreation steps can be followed without presenting the original error/bug
  • All Unit, Integration and BVT tests are passing
  • Implementation is peer reviewed (coding standards, meeting acceptance criteria, code/design quality)
  • Create front end or console tickets with feature labels (should be done when starting the work if dependencies known i.e. API changes)

After Merging

  • Move development ticket to Done if there is NO requirement for new system-tests
  • Resolve any issues with broken system-tests
  • Create documentation tickets with feature labels if functionality has changed, or is a new feature
@wwestgarth
Contributor Author

wwestgarth commented Mar 3, 2023

When we restore the EEF from a snapshot, we restore the last-seen block height of the multisig and staking contracts, but we do not do the same for the collateral contract. Instead, it just starts looking from the height of the Ethereum chain at the time the node was restarted.
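
To make the gap concrete, here is a minimal Python sketch of the behaviour described above. It is not the Vega implementation (which is written in Go); `Snapshot`, `restore_start_heights`, `missed_range`, the contract keys, and the block numbers are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical contract names, used only for this illustration.
CONTRACTS = ("multisig", "staking", "collateral")


@dataclass
class Snapshot:
    # Last Ethereum block height seen per contract at snapshot time.
    last_seen: Dict[str, int] = field(default_factory=dict)


def restore_start_heights(snapshot: Snapshot, current_eth_height: int) -> Dict[str, int]:
    """Return the Ethereum height each contract listener resumes from.

    A contract with a saved height resumes from it. A contract with no
    saved height falls back to the current Ethereum height, so events
    emitted while the network was down are never scanned.
    """
    return {name: snapshot.last_seen.get(name, current_eth_height) for name in CONTRACTS}


def missed_range(start_height: int, snapshot_eth_height: int) -> range:
    """Ethereum blocks after the snapshot that lie before the resume point."""
    return range(snapshot_eth_height + 1, start_height)


# Snapshot taken at Ethereum block 100 without a collateral height;
# the node restarts when Ethereum is at block 150.
snap = Snapshot(last_seen={"multisig": 100, "staking": 100})
heights = restore_start_heights(snap, current_eth_height=150)
skipped = missed_range(heights["collateral"], snapshot_eth_height=100)
```

In this sketch the collateral listener resumes at block 150, so any deposit mined in blocks 101 to 149 falls inside `skipped`, while the multisig and staking listeners resume from block 100 and skip nothing. Saving a collateral height in the snapshot would close that window.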

Interestingly the same is probably also true for checkpoints. We have this in the spec:

ERC-20 collateral:

    last block height of a confirmed ERC-20 deposit on the Ethereum chain with number_of_confirmations. [Ethereum bridge](https://github.com/vegaprotocol/specs/blob/master/protocol/0031-ETHB-ethereum_bridge_spec.md#network-parameters)
    all pending ERC-20 deposits (not confirmed before this block) [Ethereum bridge](https://github.com/vegaprotocol/specs/blob/master/protocol/0031-ETHB-ethereum_bridge_spec.md#deposits)

but no AC for it.
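
As a rough illustration of what the quoted spec text asks a checkpoint to carry, the sketch below splits observed deposits into confirmed and pending. The types and `build_collateral_checkpoint` are hypothetical, and the confirmation rule used here (a deposit counts as confirmed once `number_of_confirmations` blocks have been mined on top of it) is an assumption about how the parameter is applied, not something taken from the spec.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Deposit:
    tx_hash: str
    eth_block: int  # Ethereum block in which the deposit was mined


@dataclass
class CollateralCheckpoint:
    # Height of the most recent confirmed deposit, or None if there is none.
    last_confirmed_height: Optional[int]
    # Deposits seen on Ethereum but not yet confirmed.
    pending: List[Deposit]


def build_collateral_checkpoint(
    deposits: List[Deposit],
    current_eth_height: int,
    number_of_confirmations: int,
) -> CollateralCheckpoint:
    """Split deposits into confirmed and pending (assumed confirmation rule)."""
    confirmed = [
        d for d in deposits
        if current_eth_height - d.eth_block >= number_of_confirmations
    ]
    pending = [
        d for d in deposits
        if current_eth_height - d.eth_block < number_of_confirmations
    ]
    last = max((d.eth_block for d in confirmed), default=None)
    return CollateralCheckpoint(last_confirmed_height=last, pending=pending)
```

On restore, a node holding such a checkpoint could resume scanning from `last_confirmed_height` and re-check each entry in `pending`, rather than starting from the current Ethereum height.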

I might spin that off as another ticket, though, after running it by protocol-design, because maybe it's not something we actually want to do?
