feat(anvil): add load/dump state options #3730

mattsse · 2022-11-21T19:51:59Z

Motivation

This integrates the loadState and dumpState RPC calls into the config and adds the --load-state <Path> and --dump-state <Path> CLI arguments.

@kahuang @fubhy wdyt?

Solution

kahuang · 2022-11-21T19:57:01Z

Looks good and works for our use-case (being able to gracefully stop/start anvil instances)

Not in scope, but calling out here: if anvil were shut down forcibly (e.g. with kill -9), we would have no state since the shutdown handler doesn't run. A real "db" with a WAL etc. would be great for resilience in a cloud environment. But, reiterating, this is probably not in scope for anvil.

fubhy · 2022-11-21T21:02:26Z

Good stuff! Looks like this would solve our use-case too. Thanks @mattsse!

fubhy · 2022-11-21T22:21:18Z

A couple observations after giving this a quick test spin:

I am not sure if two separate options (--dump-state vs. --load-state) are required. Did you choose to do that because it would allow "snapshot" use-cases that always start from the same (previously recorded) state but without overriding it? It might be useful but I have to admit I don't see where :-)
It would be good to allow --load-state to silently fail if there is no stored state yet (directory is empty). Otherwise, we'd have to write a custom entrypoint script that sets that flag conditionally. So in addition to the point that @kahuang already raised (kill -9), the state would also not be preserved properly if the wrapping shell script, for whatever reason, doesn't properly handle and forward the shutdown signals.
I think it would be better if state was written to disk continuously instead of just on shutdown. Agree with @kahuang that a proper db handler would be great for resilience. I'd personally say though that this would be in-scope for anvil (imho), but maybe not as part of this PR :)
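
For illustration, the conditional entrypoint logic described in the observations above could be sketched roughly as follows. This is only a hedged sketch in Python, not part of the PR: the helper name `build_anvil_args` and the example path are hypothetical, and it assumes the state is stored as a single file. Only the --load-state and --dump-state flags come from this PR.

```python
import os
from typing import List


def build_anvil_args(state_path: str) -> List[str]:
    """Build an anvil command line that loads state only if a dump exists.

    `state_path` is a hypothetical location chosen by the caller; the
    assumption here is that the dumped state is a single file.
    """
    args = ["anvil", "--dump-state", state_path]
    # Pass --load-state only when a previous, non-empty dump is present, so
    # the very first start (no stored state yet) does not fail.
    if os.path.isfile(state_path) and os.path.getsize(state_path) > 0:
        args += ["--load-state", state_path]
    return args
```

A wrapper would then start the node with something like `subprocess.run(build_anvil_args("/data/anvil/state.json"))`, where the path is again just an example.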

fubhy · 2022-11-21T22:45:34Z

Instead of just retaining the latest state, we would also like to get full archive data. Do you see a chance for that to happen? Probably a different request than this but would be useful nonetheless.

mattsse · 2022-11-22T10:28:57Z

thanks for this feedback, will update accordingly.

perhaps we can flush this every couple minutes or so?

fubhy · 2022-11-22T12:30:37Z

Here's some background / context around my use-case:

I'm trying to put together a tightly integrated local development environment for front-to-back smart contract, data pipeline & dapp development needs.

Ideally, I'd like to end up with an AIO (all-in-one) service emulator running locally (similar to the likes of minikube, etc.) that lets you easily spin up and control all infrastructure components required for building a blockchain application end-to-end in a fault tolerant, fully observable, plug&play manner... With minimal configuration requirements & wiring of individual pieces. I'm dreaming of a fully replicable local development environment for all stakeholders across the stack (smart contract developers, data engineers, app developers, etc.) ...

Let's assume you are building a project from the ground up (or are still iterating on all of these layers). In such a scenario, you want to ...

Write your smart contract code
Test it
Generate bindings
Write your substream / subgraph mappings
Spin up a local RPC node and deploy your contracts
Spin up your data services (e.g. firehose, substreams, subgraphs)
Run a couple of scripts to seed some state
Integrate the user interface on the frontend on top of the underlying contracts and the gathered data
Repeat ...

Currently, setting up such an environment requires developers to think about a lot of infrastructure components and keep a lot of things in sync. And operate them locally! Map ports, connect things, know what to do if any of them fails, etc.

It still feels that all these layers aren't as tightly knitted together within this space as I think they could be. The developer experience at the intersection between smart contracts, data and frontend is brutal. Each team out there is still doing all of that custom wiring by themselves with varying degrees of success.

I want a tool stack that allows developers on all of these layers to seamlessly operate all these service dependencies locally with the ability to easily stop & restart them at will. All whilst preserving local state between restarts. Without them then having to deal with corrupted / out-of-sync state etc.

The problem really is the introduction of stateful components to the stack.

For instance, in case of Firehose, Substreams and Subgraphs it wouldn't be enough to periodically flush data from Anvil to disk to tick these boxes.

If we kill and restart the entire stack, and upon restart, there's data in the Firehose instrumentation that's now missing in Anvil, then we've got a broken state.

Furthermore, in order to allow re-indexing data that depends on eth calls on archive state, we'd need to preserve that too.

I know that I'm asking for a lot here and that this is probably out of scope for this first iteration, but I wanted to provide some context regardless :-).

kahuang · 2022-11-22T19:58:55Z

> thanks for this feedback, will update accordingly.
>
> perhaps we can flush this every couple minutes or so?

Configurable flush period feels like a nice middle ground!
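
As a rough illustration of what a configurable flush period involves, here is a minimal Python sketch. It is not anvil's implementation (anvil is written in Rust); `write_atomically`, `PeriodicFlusher`, and the `snapshot` callback are hypothetical names. The sketch writes to a temporary file and renames it over the target, assuming that a half-written state file should never be visible if the process dies mid-flush.

```python
import json
import os
import tempfile
import threading
from typing import Callable


def write_atomically(path: str, data: dict) -> None:
    """Write JSON to a temp file in the target directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


class PeriodicFlusher:
    """Call `snapshot()` every `interval_secs` and persist the result."""

    def __init__(self, path: str, snapshot: Callable[[], dict],
                 interval_secs: float) -> None:
        self._path = path
        self._snapshot = snapshot
        self._interval = interval_secs
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            write_atomically(self._path, self._snapshot())

    def stop(self) -> None:
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()
        # Final flush on graceful shutdown.
        write_atomically(self._path, self._snapshot())
```

With this shape, a forced kill loses at most one interval of changes, which is the trade-off a shorter or longer flush period tunes.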

mattsse · 2022-11-23T13:55:11Z

@fubhy thanks so much for the additional context.

this makes a ton of sense.

Could you please open another topic for 4) as this is beyond the scope of this PR?

mattsse · 2022-11-23T14:08:39Z

after reading this again I still have a question re:

> I am not sure if two separate options (--dump-state vs. --load-state) are required.

the reason why I used two separate flags is so you can load from one location and set a separate one for persisting; otherwise you'd end up overwriting the loaded state on dump.

But maybe it is easier to always use only one argument for load+dump. That would make it easier to load and update the saved state, but harder to load and save it to another location.
So unsure what the best behavior here would be.

kill -9 is SIGKILL iirc, which you can't intercept. But perhaps you can handle this with a combination of trap and kill -- -$$ SIGINT SIGTERM EXIT, but idk trap or set -e in detail
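
To make the signal-forwarding idea concrete, here is a small Python sketch of a wrapper that relays SIGINT and SIGTERM to a child process so the child's own shutdown handler can run. It is an illustration under assumptions, not something from this PR: `run_and_forward_signals` is a hypothetical helper, it assumes a POSIX system, and it must be called from the main thread. SIGKILL (kill -9) still cannot be intercepted by either process.

```python
import signal
import subprocess
from typing import List


def run_and_forward_signals(cmd: List[str]) -> int:
    """Run `cmd` as a child, relay SIGINT/SIGTERM to it, return its exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Pass the signal on so the child can shut down gracefully.
        if child.poll() is None:
            child.send_signal(signum)

    previous = {}
    for sig in (signal.SIGINT, signal.SIGTERM):
        previous[sig] = signal.signal(sig, forward)
    try:
        return child.wait()
    finally:
        # Restore the handlers that were installed before the call.
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)
```

A wrapper script could call `run_and_forward_signals(["anvil", "--dump-state", "/data/state.json"])`, where the path is only an example. Note that on an interactive Ctrl-C the child may receive SIGINT twice, once from the terminal and once from the wrapper.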

fubhy · 2022-11-23T15:05:42Z

> after reading this again I still have a question re:
>
> > I am not sure if two separate options (--dump-state vs. --load-state) are required.
>
> the reason why I used two separate flags is so you can load from one location and set a separate one for persisting; otherwise you'd end up overwriting the loaded state on dump.

Ganache has a --database option for that (https://trufflesuite.com/docs/ganache/reference/cli-options/#database). That was sufficient for me in the past and I've personally never had a use-case where I'd have wanted to specify read & write locations separately. But I guess we could have both? An additional --state option that's simply an alias for both --dump-state and --load-state and, if there's no state store yet at that location, it would create it?

Actually I could see these two options being useful if you want to roll back to a previously stored state snapshot between restarts... ? So you initialize the db from a backup location and afterwards dump it elsewhere or choose to not dump it at all. Could be useful.

Like... A shared developer instance for your team to test stuff with... Hosted somewhere. Resets every 24 hours or so?

mattsse · 2022-11-23T15:09:54Z

> An additional --state option that's simply an alias for both --dump-state and --load-state and, if there's no state store yet at that location, it would create it?

I like this, let's do that.

mslipper · 2022-11-30T03:15:13Z

This would be extremely helpful for us at Optimism. For context, we're in the process of prepping for our Bedrock migration, and we'd like to run a long-lived, public fork of Goerli so folks can test things out before doing the migration for real. Since this fork would be long-lived and will see a bunch of transaction volume, it's important that data be persisted to disk rather than stored in memory.

We need the configurability here so that we can provision Kubernetes PVCs for state to be stored on.

fubhy · 2022-11-30T08:33:21Z

> This would be extremely helpful for us at Optimism. For context, we're in the process of prepping for our Bedrock migration, and we'd like to run a long-lived, public fork of Goerli so folks can test things out before doing the migration for real. Since this fork would be long-lived and will see a bunch of transaction volume, it's important that data be persisted to disk rather than stored in memory.
>
> We need the configurability here so that we can provision Kubernetes PVCs for state to be stored on.

That sounds like you'd also need full persistence though (all data, not just latest state).

mattsse · 2022-11-30T14:38:02Z

@fubhy --state should work now.

I've set the interval to 60s, but perhaps we should make this configurable or decrease it?

Update: added --state-interval option
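
As a hedged usage sketch, a launcher could assemble the combined flags like this. `anvil_state_args` is a hypothetical helper, and the assumption that --state-interval takes a value in seconds is inferred from the 60s default mentioned above rather than stated in this thread.

```python
from typing import List, Optional


def anvil_state_args(state_path: str,
                     interval_secs: Optional[int] = None) -> List[str]:
    """Build an anvil command line using the combined --state option."""
    # --state acts as load + dump against the same location.
    args = ["anvil", "--state", state_path]
    if interval_secs is not None:
        # Assumption: the interval is expressed in seconds.
        args += ["--state-interval", str(interval_secs)]
    return args
```

Leaving `interval_secs` unset keeps whatever default anvil applies.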

gakonst

Ship it and we can iterate. Foundryup will have this tonight, or you can foundryup -b master. I'm triggering a docker build rn which will be avail in ghr

fubhy · 2022-11-30T17:28:42Z

Sweet. Thanks @mattsse!!

feat(anvil): add load/dump state options

417ca09

mattsse added T-feature Type: feature C-anvil Command: anvil labels Nov 21, 2022

fubhy mentioned this pull request Nov 24, 2022

Option to persist all archive data #3760

Open

Merge branch 'master' into matt/add-options-for-load-dump-state

64777cd

feat: periodically flush state

693dec6

mattsse force-pushed the matt/add-options-for-load-dump-state branch from 9f26b7c to 693dec6 Compare November 30, 2022 14:38

feat: make interval configurable

b4e2279

gakonst approved these changes Nov 30, 2022

gakonst merged commit a43313a into foundry-rs:master Nov 30, 2022

Maddiaa0 mentioned this pull request Jan 3, 2023

feat(anvil): dumpState / loadState on a forked anvil #4026

Closed
