UMT: archive node runner notes re: hard fork documentation/process improvements #15262

Closed
2 tasks done
jrwashburn opened this issue Mar 1, 2024 · 3 comments

Comments

@jrwashburn

Preliminary Checks

Summary

Here are notes on where we can improve the documentation, process, etc., for the next test. I tried to include specific references and succeeded sometimes. I'm sorry they are not as organized or concise as I would like and do not always have helpful referents. I thought it better to get it all down somewhere before I lose or forget it.

It would be helpful to arrange things in one main document, perhaps with links that can be updated in near real-time to point to builds, etc., as they are released. It was quite confusing and even now trying to update these notes, I'm not sure which document I'm referring to because it seems like there were so many different documents based on when/where information was released.

More clarity is needed on which ledger JSON file to use, with links to specific files or specific directories (e.g. mainnet.json vs. a GitHub link vs. /var/lib/config_xyz.json, etc.). Note that files such as /var/lib/config_x are overwritten across builds, so it is important to direct the operator to copy the file, as it is needed across builds. https://discord.com/channels/484437221055922177/1204059560684552253/1210002566599934012 I ended up getting bounced between a few different sources and am still not sure what the best answer should be. I ended up using /var/lib/coda/config_2025a732.json, but then had to downgrade at some point to get back to it because I hadn't kept a copy.
https://discord.com/channels/484437221055922177/1204059560684552253/1210304380683816981
https://discord.com/channels/484437221055922177/1204059560684552253/1210221864157188166
https://discord.com/channels/484437221055922177/1204059560684552253/1210154910629363733
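
A minimal sketch of the copy step the docs could recommend, assuming a POSIX shell. The helper name `preserve_config` and the backup directory are hypothetical; the source path in the example is the one I ended up using.

```shell
# Hypothetical helper: keep a copy of a packaged config outside /var/lib
# so that installing a later build cannot overwrite it.
# $1: config file to preserve, $2: directory to copy it into.
preserve_config() {
  src="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir" && cp -p "$src" "$dest_dir/"
}

# Example (backup directory is an assumption; adjust to your setup):
# preserve_config /var/lib/coda/config_2025a732.json "$HOME/mina-config-backup"
```

Running this before every package upgrade would have saved me the downgrade.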

Incorrect parameter for the node: --mainnet-blocks-bucket should be --blocks-bucket. https://discord.com/channels/484437221055922177/1204059560684552253/1212168290877706280

In node startup instructions:
replace --log-json true with --log-json in the mina-archive command (edited to say replace instead of remove.)
remove --internal-tracing and --file-log-rotations 500 from the node command

On the archive node instructions, I would expect to also include this for the daemon for trustless archive upgrade participants:
--upload-blocks-to-gcloud true
and env vars for GCLOUD_KEYFILE, NETWORK_NAME, and GCLOUD_BLOCK_UPLOAD_BUCKET
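
As a rough sketch, the daemon instructions could include a pre-flight check like the one below. The flag and the three variable names are the ones listed above; the function name `check_upload_env` is hypothetical, and I'm not asserting what values each variable should hold.

```shell
# Hypothetical pre-flight check: fail early if any of the environment
# variables needed for block upload is unset or empty.
check_upload_env() {
  missing=0
  for name in GCLOUD_KEYFILE NETWORK_NAME GCLOUD_BLOCK_UPLOAD_BUCKET; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (other daemon flags elided):
# check_upload_env && mina daemon ... --upload-blocks-to-gcloud true
```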

Document which port rosetta needs to point to on the node assuming defaults (graphql port) -- this should have been obvious to me but for some reason I didn't read the parameter name and was left wondering what port to provide.

Rosetta requires an undocumented environment variable, MINA_ROSETTA_MAX_DB_POOL_SIZE, and the error when it is not set is not obvious.

Directions posted need to explain the parameters for the replayer:
mina-replayer --archive-uri {db_connection_string} --input-file reference_replayer_input.json --output-file reference_replayer_output.json --checkpoint-interval 100
What is input-file and how is it created?

Replayer does not actually create --output-file when specified. #15260

The command on this page (https://docs2-git-major-upgrade-minadocs.vercel.app/berkeley-upgrade/migrating-archive-database-to-berkeley) is invalid (extra closing quote)
jq '.ledger.accounts' mainnet.json | jq '{genesis_ledger: {accounts: .}}' > replayer_input_config.json"
and what is the source for mainnet.json?
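
With the stray trailing quote removed, the command from that page would presumably read as below. The function name `make_replayer_input` is hypothetical and only wraps the page's own jq pipeline; where mainnet.json should come from is still the open question above.

```shell
# Hypothetical wrapper around the jq pipeline from the docs page,
# with the extra closing quote removed.
# $1: ledger JSON file (e.g. mainnet.json), $2: output file.
make_replayer_input() {
  jq '.ledger.accounts' "$1" | jq '{genesis_ledger: {accounts: .}}' > "$2"
}

# Example:
# make_replayer_input mainnet.json replayer_input_config.json
```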

Clarify migration / replayer incremental runs https://discord.com/channels/484437221055922177/1204059560684552253/1212508873089351701

Install dependencies on jq, etc. #15257

version conflict of mina-replayer - need version with --migration-mode https://discord.com/channels/484437221055922177/1204059560684552253/1212845720374214808

https://docs2-git-archivemigration-minadocs.vercel.app/berkeley-upgrade/migrating-archive-database-to-berkeley
mina replayer --output-file option is referred to as --output-config https://docs2-git-archivemigration-minadocs.vercel.app/berkeley-upgrade/migrating-archive-database-to-berkeley#how-to-verify-a-successful-migration

  • Knowing when to transition from stage 2 to stage 3 of migration -- e.g. when to add --fork-state-hash was not very obvious. I kept running stage 2 expecting it to finish. (Reading ahead finally identified the solution for me.) https://discord.com/channels/484437221055922177/1204059560684552253/1212879528955875390
  • Stage/Phase language may be confusing, perhaps Stage / Step or Phase/Step would be more clear?
    Numbering in the examples is off under "6. Stage 2: remainder migration" (it refers to 5.a instead of 6.a), which made it difficult to refer to.

Loading genesis block file on archive can take hours on a remote database #15207

Combine zkapp_tables.sql with create_schema.sql https://discord.com/channels/484437221055922177/1204059560684552253/1212805656638398504
Also note that there can be schema confusion because most release notes include a reference to an archive schema so we had several different "archive schema" links to consider during the migration.

Mainnet issue - will be a problem for mainnet replayer #15211

node stuck in catchup #15206

Separate install package for mina-berkeley-migration https://discord.com/channels/484437221055922177/1204059560684552253/1212477934980440156
trying to overwrite '/usr/local/bin/mina-archive', which is also in package mina-archive 1.0.1umt-stop-slot-992168e

Important to start archive before the node when bringing up new forked network -- if not, archive can miss the first block and it is not stored in GCS. #15261

Steps to Resolve this Issue

n/a

@mrmr1993
Member

mrmr1993 commented Mar 7, 2024

I've tried to break this down into sub-issues; hopefully these are representative.

@jrwashburn
Author

This is much better! Sorry I didn't spend more time to simplify it initially.

One comment re:

Bug: parallel installs clobber the same file

I think the problem is that we needed different components from different installs at the same time - e.g. needed to stay on the pre-fork archiver but run tooling that wasn't in that build. (Or something like that.) So I don't think it's enough to just document not to do that; something needs to change to separate the applications in the builds.

@jrwashburn
Author

[ ] Documentation: bring up archive nodes before new network starts (UMT first block after hard fork is not stored to GCS and may be missed if archive is not started first, #15261).

I think the preferred solution to this would be that the block would be sent to the precomputed blocks storage. There are probably many reasons why that is a better idea, and I think that after k blocks this will be lost otherwise?


4 participants