Handling reorgs #38

Open
lrettig opened this issue Jun 2, 2020 · 6 comments
lrettig (Member) commented Jun 2, 2020

Up to now, @avive and I have been operating under the assumption that we don't need any special handling of reorgs. The various streams can just re-send all updated data and let the client sort it out.

However, we need to think this through a little more thoroughly, as there are likely some corner cases we're not currently accounting for. It's a little easier to reason about the mesh, but what happens with respect to global state? What if, e.g., an account with a balance disappears after a reorg? Etc.

@talm proposed that, to signal a reorg, a stream sends a special "reorg" token and then terminates. @noamnelke has some ideas for how to handle reorgs that he will share.
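
For concreteness, here's a minimal Go sketch of what such a token could look like. All of these names and types are hypothetical, not part of any existing spacemesh API:

```go
// Hypothetical stream event carrying either a data payload or a reorg marker.
type EventKind int

const (
	EventData EventKind = iota
	EventReorg
)

type StreamEvent struct {
	Kind       EventKind
	Payload    []byte // serialized object (tx, block, ...) when Kind == EventData
	ReorgLayer uint32 // first invalidated layer when Kind == EventReorg
}

// sendReorgAndClose emits one reorg token and then closes the stream,
// matching the "send a token, then die" proposal.
func sendReorgAndClose(out chan<- StreamEvent, firstInvalidLayer uint32) {
	out <- StreamEvent{Kind: EventReorg, ReorgLayer: firstInvalidLayer}
	close(out)
}
```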

avive (Contributor) commented Apr 17, 2021

This might be a good way to do this: when a 'reorg' event is sent, explorer-like clients should probably re-sync from genesis to get old blocks, transactions, and possible new state, while wallet-like clients should query the API for fresh data for all the entities they store and cache locally, e.g., transactions and rewards.
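
To illustrate the wallet-like case, a rough Go sketch of refresh-on-reorg; the `Fetcher` interface and all names here are made up for illustration:

```go
// Fetcher stands in for the regular query API a wallet already uses.
type Fetcher interface {
	Transactions(addr string) ([][]byte, error)
	Rewards(addr string) ([][]byte, error)
}

type WalletCache struct {
	addr         string
	transactions [][]byte
	rewards      [][]byte
}

// OnReorg re-fetches every cached entity rather than re-syncing from genesis.
func (c *WalletCache) OnReorg(f Fetcher) error {
	txs, err := f.Transactions(c.addr)
	if err != nil {
		return err
	}
	rewards, err := f.Rewards(c.addr)
	if err != nil {
		return err
	}
	c.transactions, c.rewards = txs, rewards // drop stale data wholesale
	return nil
}
```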

noamnelke (Member) commented

Not sure re-syncing from genesis is required (it might become very expensive after a while).

For immutable objects (transactions, ATXs, etc.) we just need to tag them with the layer in which they were added to the database. When a re-org happens we should be able to know the depth of the re-org (how many layers back it affects) and then invalidate all objects created after this layer and recreate them as we re-sync from that layer to the current layer.
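
A rough sketch of the tag-and-invalidate step, assuming (hypothetically) a SQL store with a `created_layer` column on each immutable object; the schema and table name are illustrative only:

```go
import "database/sql"

// invalidateAfter drops every immutable object tagged with a layer later
// than the last layer unaffected by the reorg; re-syncing from that layer
// to the current layer then repopulates the table.
func invalidateAfter(db *sql.DB, lastGoodLayer uint32) error {
	_, err := db.Exec(
		`DELETE FROM objects WHERE created_layer > ?`,
		lastGoodLayer,
	)
	return err
}
```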

For mutable objects, like accounts (balance, nonce, etc.) and contracts, we must be able to recreate the updated state. Intuitively, it seems possible to me, but I need to look at the exact data we collect to know for sure.

lrettig (Member, Author) commented May 5, 2021

> For immutable objects (transactions, ATXs, etc.) we just need to tag them with the layer in which they were added to the database. When a re-org happens we should be able to know the depth of the re-org (how many layers back it affects) and then invalidate all objects created after this layer and recreate them as we re-sync from that layer to the current layer.

With respect to the API, though, and especially streams, this would require sending some sort of special "token" on the stream to indicate that a reorg happened, and the depth of the reorg. We then have two options (a rough sketch of both follows the list):

  • restream all updated data since the point of the reorg: this has the downside that it may overwhelm downstream clients that are slow to consume the data
  • instead, continue to stream only new data, and expect downstream clients to use a historical "query" endpoint to resync all data since the point of the reorg
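
A rough sketch of the client side under both options, reusing the hypothetical `StreamEvent` type from the sketch earlier in this thread; `HistoryAPI` and `Store` are likewise made up:

```go
// HistoryAPI stands in for a historical "query" endpoint (option 2).
type HistoryAPI interface {
	ObjectsSince(layer uint32) ([][]byte, error)
}

type Store struct{ objects map[string][]byte }

// Upsert overwrites any previous version of the object, which is what
// makes a restream (option 1) idempotent. Key derivation omitted.
func (s *Store) Upsert(payload []byte) { /* ... */ }

func (s *Store) ResyncFrom(q HistoryAPI, layer uint32) error {
	objs, err := q.ObjectsSince(layer)
	if err != nil {
		return err
	}
	for _, o := range objs {
		s.Upsert(o)
	}
	return nil
}

// consume handles both options: under option 1 restreamed objects simply
// overwrite their stale versions; under option 2 the reorg token triggers
// a backfill from the query endpoint before the stream resumes.
func consume(events <-chan StreamEvent, store *Store, query HistoryAPI) error {
	for ev := range events {
		if ev.Kind == EventReorg {
			if err := store.ResyncFrom(query, ev.ReorgLayer); err != nil {
				return err
			}
			continue
		}
		store.Upsert(ev.Payload)
	}
	return nil
}
```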

noamnelke (Member) commented

Streaming only new data would be considerably harder to implement. The first option is what would happen "automatically".

Correct me if I'm wrong, but as I imagine it, when a re-org happens the node internally invalidates everything back to a certain point and then works its way back to the "present" from there. This means all the normal calls to the processing methods happen and, unless we change anything, all the streams get everything along the way.

Since this isn't instantaneous (the node has to process everything), I don't see how this is different from when the node is syncing.

lrettig (Member, Author) commented May 6, 2021

You're probably right. It depends on how we implement it. That sounds like the most reasonable design to me. In any case, the API design should be isolated from the lower-level implementation. Let's go with this plan for now: we will restream things after a reorg.

Do you think we need the "token" indicating the reorg, and its depth? Or should downstream clients be expected to figure this out for themselves when they see old data being restreamed?

lrettig (Member, Author) commented Apr 26, 2024

Revisiting this as it's come up in the API v2 design and implementation (#319 (comment)), and adding @kacpersaw. I see two potential approaches here:

  1. As discussed above, send a special token in the stream (which streams exactly? just LayerStream?) indicating that a reorg has occurred, and the layer as of which it occurred. Then end the stream. The client can re-establish the stream and re-download/verify content beyond the reorg point.

  2. Make it easy for the client to detect that a reorg has occurred, and the layer as of which it occurred. I think we need this independent of approach (1), i.e., we need both anyway. The issue is that there's no straightforward way to check this today without a cumulative state hash (v2alpha1: Add LayerService with transaction and block definition #319 (comment)).

Perhaps a quick-and-dirty solution here is to implement a cumulative hash that's just a hash of the chain of all previous layer/block hashes (a rough sketch below), i.e., it doesn't include a state hash yet, since we don't yet know how we want to do that (spacemeshos/go-spacemesh#5677) and it may depend upon the VM design anyway. Thoughts?
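
Something like this, perhaps (SHA-256 is just a placeholder for whatever hash the node actually uses):

```go
import "crypto/sha256"

// cumulativeHash chains the previous cumulative value with the current
// layer's block/layer hash; no state hash is folded in yet.
func cumulativeHash(prev, layerHash [32]byte) [32]byte {
	h := sha256.New()
	h.Write(prev[:])
	h.Write(layerHash[:])
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}
```

A client would then detect a reorg by noticing that the cumulative hash it stored for some layer no longer matches what the node now reports for that layer.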

CC @kacpersaw, @dshulyak
