multi: Migration for utxo set semantics reversal. #1520
As of this PR, the expected behavior is that there is a single migration that takes an hour or two to complete, depending on enabled indexes, after which it will no longer be possible to downgrade.
As the warning above notes, if you try to run an older software version after this migration has completed, you will get an error message similar to
This adds code to migrate the database to the new schema required by the recent utxo set semantics reversal changes.
It involves removing the existing utxo set, spend journal, ticket database, and various indexes, resetting the best chain tip back to the genesis block, and finally performing a full reindex. The process can be interrupted at any point, and future invocations will resume from the point at which it was interrupted.
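To illustrate the resume behavior described above, here is a minimal sketch of a step-based migration that persists a progress marker. It is not the code from this PR: the `db` type, the bucket names, the `stepsDone` marker, and the `stop` callback are all hypothetical, and the sketch assumes each step and its marker update are committed atomically.

```go
package main

import "errors"

// db is a hypothetical in-memory stand-in for the node database.
type db struct {
	buckets   map[string]bool // names of the buckets that currently exist
	tip       string          // label for the current best chain tip
	stepsDone int             // persisted progress marker (hypothetical)
}

// step is one unit of migration work.
type step struct {
	name string
	run  func(*db)
}

var errInterrupted = errors.New("migration interrupted")

// migrationSteps follows the order given in the description: remove the old
// data, reset the tip to genesis, then reindex.
func migrationSteps() []step {
	drop := func(bucket string) func(*db) {
		return func(d *db) { delete(d.buckets, bucket) }
	}
	return []step{
		{"remove utxo set", drop("utxoset")},
		{"remove spend journal", drop("spendjournal")},
		{"remove ticket database", drop("ticketdb")},
		{"remove indexes", drop("indexes")},
		{"reset best chain tip to genesis", func(d *db) { d.tip = "genesis" }},
		{"full reindex", func(d *db) {
			for _, b := range []string{"utxoset", "spendjournal", "ticketdb", "indexes"} {
				d.buckets[b] = true
			}
			d.tip = "reindexed"
		}},
	}
}

// migrate runs the remaining steps, starting from the persisted marker.
// stop is consulted before each step and simulates an interrupt signal.
func migrate(d *db, steps []step, stop func(next string) bool) error {
	for d.stepsDone < len(steps) {
		s := steps[d.stepsDone]
		if stop != nil && stop(s.name) {
			return errInterrupted
		}
		s.run(d)
		d.stepsDone++ // assumed to be committed together with the step
	}
	return nil
}
```

Calling `migrate` again after an interruption skips the steps already counted in `stepsDone` and continues with the next one, which is the property the description relies on.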
In addition, new infrastructure for removing the ticket database and for versioning the indexes is added to support the migration.
This is work towards #1145.
The only error condition I could trigger was interrupting the migration right at the start (after the utxo set is cleared but before the actual migration has started) and then going back to the older version: instead of the migration version error, you get some reorg errors (0bin.net log)
I don't think this needs to be specifically tested against, since it should be a pretty hard state to get into, but if we ever see startup reorg errors in the future, this might be the cause.
Besides this, I tested TERMing/QUITing/KILLing the process a bunch of times throughout the migration but didn't hit any other errors.
Results for mainnet migration:
@matheusd Thanks for the thorough testing. I'm not sure there is really a way to fix the "go back to old version mid upgrade" condition, because the older versions don't look for an "upgrade in progress" flag of any sort. It probably makes sense to introduce something of that sort going forward, but since the code doesn't already exist in the older version, I'm not sure anything can be done there.
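For reference, the kind of "upgrade in progress" check suggested above could look roughly like the sketch below. It is purely illustrative and does not exist in any released version: `dbInfo`, `upgradeTarget`, and `checkStartup` are hypothetical names, and the sketch assumes the migrating software writes the target version before touching any data and clears it on completion.

```go
package main

import "fmt"

// dbInfo is a hypothetical record of the database's versioning state.
type dbInfo struct {
	version       uint32 // last fully applied database version
	upgradeTarget uint32 // version being migrated to; 0 when no upgrade is running
}

// checkStartup decides whether software that supports database versions up
// to and including supported may safely open the database.
func checkStartup(info dbInfo, supported uint32) error {
	if info.version > supported {
		return fmt.Errorf("database version %d is newer than supported version %d",
			info.version, supported)
	}
	if info.upgradeTarget > supported {
		return fmt.Errorf("an upgrade to database version %d is in progress; "+
			"supported version is %d", info.upgradeTarget, supported)
	}
	return nil
}
```

With a check like this, software that only supports version 5 would refuse to open a database marked as mid-upgrade to version 6, while the newer software would still be allowed to open it and resume.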
Nov 12, 2018
The migration took 48 minutes on my server (mainnet) and the entire process went smoothly. The only concern (which may not actually be an issue) is that once the migration completed and peers were being served and blocks connected, the median time appears to have been created using a time sample from before the migration occurred:
This may create issues depending on how long the migration actually took, but in my case, it did not cause any problems on a non-mining node.
Regarding the time offset, I both looked into the code and performed the migration a few times. I've confirmed that no time samples are added before the reindex, and the listeners for peers to connect are not started until after the reindex is complete. It was a time sample from a peer with a bad time.