
Support mounting remote storage #3

Open
sambacha opened this issue Nov 9, 2023 · 8 comments

Comments

@sambacha commented Nov 9, 2023

There are several ways to do this. The benefit (for me at least) is that I can access geth archive state from our production replication service. Geth requires a beacon client to sync to mainnet, which makes it more cumbersome, at least for the mainnet use case.

I am referring to SSHFS and the like, not S3.

@fxfactorial (Owner)

Agree, this is a good idea. I think the fastest solution would be something like what Emacs TRAMP does: you provide the path to the db. (And I assume you know you can't use this on a running chain, just a snapshot, hence why you mentioned the replication service: you can temporarily stop it while using the db.)

You provide the path to the db starting with ssh, e.g. ssh:///home/repl1/eth-mainnet/geth/chaindata, and from then on I do ssh reads whenever the Go code needs a file read operation. I think that's the fastest, easiest way, because I don't want a second binary/daemon/websocket. Everyone already has ssh access to their servers.
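A minimal sketch of how such an ssh-prefixed db path might be split into a host and a remote path before dispatching reads (the helper name and the user@host form are hypothetical, not from the repository):

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitSSHPath parses a db path of the form ssh://[user@]host/remote/path.
// A triple-slash form (ssh:///remote/path) is treated as localhost.
// Hypothetical helper, shown only to illustrate the proposed path scheme.
func splitSSHPath(raw string) (host, path string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "ssh" {
		return "", "", fmt.Errorf("not an ssh path: %s", raw)
	}
	host = u.Host
	if host == "" {
		host = "localhost"
	}
	if u.User != nil {
		host = u.User.Username() + "@" + host
	}
	path = u.Path
	if !strings.HasPrefix(path, "/") {
		path = "/" + path
	}
	return host, path, nil
}

func main() {
	h, p, _ := splitSSHPath("ssh://repl1@db-host/home/repl1/eth-mainnet/geth/chaindata")
	fmt.Println(h, p) // repl1@db-host /home/repl1/eth-mainnet/geth/chaindata
}
```

Each file read the Go code issues would then be forwarded to `host` over an existing ssh session against `path`, so no extra daemon is needed on the server.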

@fxfactorial (Owner)

Also, are you mainly using Pebble-backed or LevelDB-backed machines? I assume probably LevelDB?

@fxfactorial (Owner)

Actually, I just looked at this again: https://github.com/libfuse/sshfs. Did you try it? Maybe it already works as is?

@fxfactorial (Owner)

@sambacha Okay, so I tried this, and while it does work, it's painfully slow. You'll be waiting over 15 minutes just for the initial directory load (there's a dir scan, and sshfs copies each file into memory), and that was for a snap-synced dir, so I imagine an archive one will be much worse. Chaindata itself contains many hundreds of files, so it's just not feasible.

@sambacha (Author)

What are you using for the remote FS, FUSE?

@fxfactorial (Owner) commented Nov 29, 2023

Yes, sshfs, and this was on my local network too, so I imagine a remote machine on a different network will be even worse:

sshfs -o debug,sshfs_debug,loglevel=debug,allow_other,kill_on_unmount,reconnect,direct_io,auto_cache thelio-archive:/eth-archive/goerli temp-mount-2

where sshfs is running on my MacBook and thelio-archive is a Linux machine on my local network.

@sambacha (Author)

OK OK, how about something similar to this?

ethereum/go-ethereum#26621

This PR allows users to export their chain into an archive format called Era1. It is formulated similarly to the Era format, which is optimized for reading and distributing CL data. The Era and Era1 formats are stricter subsets of a simple type-length-value scheme called e2store, both developed by the Nimbus team.

@fxfactorial (Owner)

Interesting, I didn't know about that. I'll look it up.
