Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpc: rollback #29565

Open
Sjors opened this issue Mar 5, 2024 · 9 comments
Open

rpc: rollback #29565

Sjors opened this issue Mar 5, 2024 · 9 comments
Labels

Comments

@Sjors
Copy link
Member

Sjors commented Mar 5, 2024

Please describe the feature you'd like to see added.

bitcoin-cli rollback

Arguments:

  1. height or hash
  2. delete blocks from storage
  3. delete headers

This should be used with networking turned off. As soon as you connect to a peer, it will sync back to the tip as normal.

It should rollback indexes too.

It should probably warn users if they have a wallet loaded, especially if the rollback is more than 10 (?) block; that's the point where the rollback process stops placing transactions back into the mempool.

Is your feature related to a problem, if so please describe it.

We currently rely on invalidateblock to roll back the chain to an arbitrary height. And then reconsiderblock to go back to the tip. Rolling back the chain is used when creating snapshots for assume utxo.

This works but the RPC methods are not designed for this. It has annoying side-effects like disconnecting peers because they have "invalid" blocks. If there are stale blocks near the height you want to roll back to, you have to make sure to invalidate those too. IIRC there's also no way to abort the process.

For functional tests it can be useful to forget a block and a header, e.g. to test behavior when a header is received multiple times from different peers, when receiving an unsolicited or mutated block. Currently such a test has to carefully sequence events, or use multiple unconnected nodes. This RPC would effectively add "undo" for such tests.

Describe the solution you'd like

See above

Describe any alternatives you've considered

Status quo.

Please leave any additional context

I'm not advocating to drop invalidateblock entirely. For example it's still useful if you want to fully validate a stale block.

@maflcko
Copy link
Member

maflcko commented Mar 12, 2024

Not sure about adding a heavy and complicated feature such as this to real code, when this feature will only be used in tests. AU will likely use #29553, IIUC?

@Sjors
Copy link
Member Author

Sjors commented Mar 12, 2024

My suggestion in #29553 was to use a proper rollback mechanism (for which you might as well add an RPC call). Right now that PR effectively uses invalidateblock and reconsiderblock internally, as the shell scripts do, which is brittle.

@maflcko
Copy link
Member

maflcko commented Mar 13, 2024

Deleting block files and headers from storage (and memory) seems more brittle and unclean than, let's say, simply rolling back a view of the utxo set (similar to validatechain). I guess this requires RAM (not sure how easy it would be to change), but otherwise this seems cleaner.

@Sjors
Copy link
Member Author

Sjors commented Mar 18, 2024

@maflcko maybe "clean" in terms of code, but not in terms of behaviour. Our validation and p2p code behaves different when blocks are stored or headers are known. We also disconnect peers that send us "invalid" blocks. This is why the UTXO snapshot generation script has to turn off the network. But if anything goes wrong, you might end up with blocks marked invalid.

If we want e.g. 50 people to generate and attest their own snapshot, it seems quite likely that at least some of them will kill the node halfway because it takes too long. It seems safer to me if the node can recover from that with regular IBD behavior, and we don't have to tell users to call getchaintips and call reconsiderblock on various blocks.

Also, the only way at the moment to test if a given snapshot is valid, is to start a new node. With a rollback RPC you can test it on an existing node.

I'd like to see if the code for such an RPC is actually messy and complicated. Perhaps it can be done in a straight-forward way. I haven't tried (yet).

@sipa
Copy link
Member

sipa commented Mar 18, 2024

Could this be accomplished using a (test-only) RPC that marks block data of an inactive block deleted, and/or one that deletes the block index entry of an inactive block?

Then you could use invalidateblock + deleteblockdata to simulate the behavior you want?

Because internally, that's how I'd imagine the functionality you're asking for anyway.

@maflcko
Copy link
Member

maflcko commented Mar 18, 2024

Then you could use invalidateblock + deleteblockdata to simulate the behavior you want?

I don't think this works, in the unlikely case, as explained by sjors, where a stale block (a sibling of the invalidated block) happens to exist and is activated.

@maflcko
Copy link
Member

maflcko commented Mar 18, 2024

@maflcko maybe "clean" in terms of code, but not in terms of behaviour. Our validation and p2p code behaves different when blocks are stored or headers are known. We also disconnect peers that send us "invalid" blocks. This is why the UTXO snapshot generation script has to turn off the network. But if anything goes wrong, you might end up with blocks marked invalid.

Why would it be cleaner in terms of behavior to delete blocks when rolling back the utxo set? And the p2p issues you mention would remain present with your suggested solution, unless you also disable the network.

@Sjors
Copy link
Member Author

Sjors commented Mar 18, 2024

It would still be better to disable the network during the rollback, because otherwise you'd immediately get the headers again and start downloading. We would not disconnect peers, though they might replace us for not having recent stuff (or for not delivering blocks at a height that we previously announced).

Why would it be cleaner in terms of behavior to delete blocks when rolling back the utxo set?

For the use case of assume utxo, there might be alternative ways to roll back the UTXO set. Doing it purely in memory (as a cache) - as you suggested above - would have the advantage of making it safe and quick to abort.

For the use case of p2p test code (compact blocks, modified blocks, header spam, etc), you need to delete blocks and headers.

I don't know a priori which approach is easier to implement.

in the unlikely case

I agree it seems unlikely on mainnet, unless someone does it on purpose to frustrate the process. That could happen if we were to pick a predictable interval. That could be useful if we share (partial) snapshots via the p2p network, where nodes know in advance at which height the next snapshot will be.

@maflcko
Copy link
Member

maflcko commented Mar 18, 2024

For the use case of p2p test code (compact blocks, modified blocks, header spam, etc), you need to delete blocks and headers.

For p2p tests, my preference would to just use what has been used previously: Spin up a node and sync the blocks/headers from scratch, when invalidateblock doesn't work. There needs to be a substantial benefit, to warrant writing a complex, new, test-only RPC for an alternative solution to an already solved edge case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants