Pausing and unpausing
Connor Mendenhall edited this page Dec 20, 2021 · 2 revisions
The key principle when pausing and unpausing the running system is that reading storage diffs from the head of a syncing chain is faster than backfilling them. Both procedures are sequenced to avoid disrupting storage diff syncing. To pause the system:
- Shut down Postgraphile and the public API.
- Stop any `vdb-execute` containers (Oasis and MCD). This will pause the data transformation layer.
- Stop the `vdb-header-sync` container. This will stop syncing headers to the synchronization layer.
  - If the node this container is using is not shared infrastructure, it's OK to shut it down as well. (However, this is not currently the case in production: the header sync service uses a shared node.)
- Pause the statediffing geth nodes.
  - I suspect this is already automated through EBS snapshots or similar, but it's wise to ensure there's a recent storage volume backup for these nodes before pausing them. In a failure scenario, it would be possible to restore geth chaindata from an EBS volume snapshot and sync back up to the head of the chain.
- Stop the `vdb-extract-diffs` containers.
- Stop Postgres and save an RDS snapshot.
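
The pause order above can be captured as data so that a script or checklist tool walks through it in sequence. The following is a minimal sketch, not the project's actual tooling: the `Step` type, the `PAUSE_STEPS` list, and `run_steps` are hypothetical names, and the `action` callback stands in for whatever container or cloud tooling is really used to stop each component.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    """One runbook step (hypothetical type): the component and what to do."""
    target: str
    description: str


# Pause order taken from the list above: API first, Postgres last.
PAUSE_STEPS: List[Step] = [
    Step("postgraphile", "Shut down Postgraphile and the public API"),
    Step("vdb-execute", "Stop vdb-execute containers (Oasis and MCD)"),
    Step("vdb-header-sync", "Stop the vdb-header-sync container"),
    Step("geth", "Confirm a recent volume backup, then pause statediffing geth nodes"),
    Step("vdb-extract-diffs", "Stop the vdb-extract-diffs containers"),
    Step("postgres", "Stop Postgres and save an RDS snapshot"),
]


def run_steps(steps: List[Step], action: Callable[[Step], None]) -> List[str]:
    """Apply `action` to each step in order and return the targets visited."""
    visited = []
    for step in steps:
        action(step)
        visited.append(step.target)
    return visited


if __name__ == "__main__":
    # Dry run: print each step instead of touching any infrastructure.
    run_steps(PAUSE_STEPS, lambda s: print(f"[pause] {s.description}"))
```

Keeping the order in one list makes the sequencing reviewable in a single place. The dry run only prints descriptions, so it is safe to run anywhere.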
Unpausing is not quite the reverse of pausing, because transforming storage diffs relies in some cases on event data. To restart the system, first catch up on events and headers, then re-enable diff processing.
- Restore Postgres from the saved RDS snapshot.
- Start any `vdb-execute` containers (Oasis and MCD). These will start listening for events and diffs to transform.
- Start the `vdb-header-sync` container (and the Ethereum node it's reading from, if it's been shut down). This will begin syncing headers from the last synced block up to the head of the chain. The execute processes will transform missing events. Transforming storage diffs relies on information from events to generate storage keys, so it's important to do this step before restarting diff processing, and to allow it to catch up fully to the head of the chain.
  - Note: if new contracts have been added since the system was paused, there are two options. Add the contracts and transformers to the `vdb-execute` configuration before starting the container (just like onboarding new collateral types), or let events sync, add the new contracts/transformers, and run a subsequent event backfill before proceeding.
- For each region with a statediffing geth + extract diffs container:
  - Start the `vdb-extract-diffs` container for this region.
  - Unpause the statediffing geth node for this region.
    - Storage diffs should be stored and transformed as the node syncs to the head of the chain.
- Reenable Postgraphile and the API.
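
The unpause sequence hinges on one gate: diff processing restarts only after header sync and event transformation have caught up to the head of the chain. A minimal sketch of that gate follows. The callables `get_chain_head` and `get_last_synced_block` are hypothetical placeholders for however the two block numbers are actually read, and the `max_lag`, `poll_seconds`, and `timeout_seconds` defaults are assumptions rather than values from this procedure.

```python
import time
from typing import Callable


def wait_until_caught_up(
    get_chain_head: Callable[[], int],
    get_last_synced_block: Callable[[], int],
    max_lag: int = 0,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the last synced block is within `max_lag` of the chain head.

    Returns True once caught up, or False if `timeout_seconds` of polling
    elapses first. `sleep` is injectable so the loop can be tested quickly.
    """
    waited = 0.0
    while True:
        lag = get_chain_head() - get_last_synced_block()
        if lag <= max_lag:
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds


if __name__ == "__main__":
    # Simulated sync progress; real callers would query the node and database.
    progress = iter([90, 95, 100])
    ok = wait_until_caught_up(lambda: 100, lambda: next(progress), sleep=lambda s: None)
    print("safe to start vdb-extract-diffs" if ok else "still catching up")
```

A check like this would sit between starting `vdb-header-sync` and starting the per-region `vdb-extract-diffs` containers, so that storage diffs are not transformed before the events they depend on.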