Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operational swing-store artifact level #9391

Merged
merged 3 commits into from
Jun 3, 2024

Conversation

mhofman
Copy link
Member

@mhofman mhofman commented May 21, 2024

closes: #9388

Description

Switch from replay to operational for state-sync export/restore, genesis export/import, and as the default for cosmic-swingset's export/import of kernel DB command line tool.

Drive by change of the default chain home directory for the import/export tool (this location was effectively changed a couple year ago but not consistently in the rest of the SDK).

Security Considerations

If all nodes prune historical "replay" artifacts, we lose the ability to perform replay based upgrades in the future. We expect archive nodes to not manually prune their DB in this way.

Scaling Considerations

This should have no impact on performance right now. In the future, when exporting the IAVL data is not as slow, this would reduce the time it takes for state-sync snapshots to be created.

Documentation Considerations

Should communicate this change with validators

Testing Considerations

Covered by existing integration tests.

Upgrade Considerations

Chain software change

@mhofman mhofman added the force:integration Force integration tests to run on PR label May 21, 2024
@mhofman mhofman requested a review from gibson042 May 21, 2024 00:23
@mhofman mhofman marked this pull request as ready for review May 21, 2024 17:48
Copy link

cloudflare-workers-and-pages bot commented May 21, 2024

Deploying agoric-sdk with  Cloudflare Pages  Cloudflare Pages

Latest commit: 319b386
Status: ✅  Deploy successful!
Preview URL: https://7e90b1c3.agoric-sdk.pages.dev
Branch Preview URL: https://mhofman-9388-state-sync-oper.agoric-sdk.pages.dev

View logs

Copy link
Member

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I wonder about genesis export because an XS version change may be more likely in that scenario. Not an objection, though, just food for thought.

Comment on lines 100 to 103
swingStoreGenesisEventHandler{exportDir: swingStoreExportDir, snapshotHeight: snapshotHeight},
// The export will fail if the swing-store does not contain all replay artifacts
keeper.SwingStoreExportOptions{
ArtifactMode: keeper.SwingStoreArtifactModeReplay,
ArtifactMode: keeper.SwingStoreArtifactModeOperational,
ExportDataMode: keeper.SwingStoreExportDataModeSkip,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, shouldn't a genesis export require replay-level data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, "operational" includes both XS heap snapshots and the subset (suffix) of transcript entries since the snapshot point.

I think a "genesis export" is meant to immediately reconstruct the chain state vector as of a given block height without retrieving or re-applying any cosmos txns. It could be used as the genesis of a new chain, with a different identity, whose initial state is equal to the most-recent state of the parent. And for that, you only need the current state of the vat workers, which is "operational" in our world.

In particular, it is not trying to start from the genesis state of the parent chain.

Now, the debugging/diagnostic replay-everything-with-a-different-xsnap things we might find ourselves needing to do will require replay-level data, and I suppose the question is whether we intend agd genesis-export to be the tool to fetch the data for that purpose, or if there's a different subcommand (or a --replay flag) that we'd use for that purpose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, at the end of the day, we can always devise a mechanism to re-import missing artifacts, as long as the hashes are correct in the export data. Genesis export does not change that. What may happen is that in a genesis export we'd need to edit artifacts in some fashion, which would require editing the hashes in the genesis file, but at this point I'm skeptical we'll ever be able to or need to do that.

Copy link
Member

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, assuming I'm understanding the how-do-we-get-full-data thing properly

'home',
`${homedir}/.ag-chain-cosmos`,
)}/data/agoric`;
`${processValue.getFlag('home', `${homedir}/.agoric`)}/data/agoric`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment on the previous line need to be updated to match

'home',
`${homedir}/.ag-chain-cosmos`,
)}/data/agoric`;
`${processValue.getFlag('home', `${homedir}/.agoric`)}/data/agoric`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@@ -41,10 +41,10 @@ export const ExportManifestFileName = 'export-manifest.json';
*/
export const getEffectiveArtifactMode = artifactMode => {
switch (artifactMode) {
case undefined:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. and this might be the flag that one would provide to get a replay-everything -type bundle out of an archival node, to address @gibson042 's comment

@mhofman mhofman self-assigned this Jun 3, 2024
@mhofman mhofman added the automerge:rebase Automatically rebase updates, then merge label Jun 3, 2024
@mhofman mhofman force-pushed the mhofman/9388-state-sync-operational-level branch from 8490944 to 319b386 Compare June 3, 2024 18:57
@mergify mergify bot merged commit 8879538 into master Jun 3, 2024
63 checks passed
@mergify mergify bot deleted the mhofman/9388-state-sync-operational-level branch June 3, 2024 19:32
mhofman pushed a commit that referenced this pull request Jun 23, 2024
closes: #9388

## Description

Switch from `replay` to `operational` for state-sync export/restore,
genesis export/import, and as the default for cosmic-swingset's
export/import of kernel DB command line tool.

Drive by change of the default chain home directory for the
import/export tool (this location was effectively changed a couple year
ago but not consistently in the rest of the SDK).

### Security Considerations

If all nodes prune historical "replay" artifacts, we lose the ability to
perform replay based upgrades in the future. We expect archive nodes to
not manually prune their DB in this way.

### Scaling Considerations

This should have no impact on performance right now. In the future, when
exporting the IAVL data is not as slow, this would reduce the time it
takes for state-sync snapshots to be created.

### Documentation Considerations

Should communicate this change with validators

### Testing Considerations

Covered by existing integration tests.

### Upgrade Considerations

Chain software change
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge:rebase Automatically rebase updates, then merge force:integration Force integration tests to run on PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Switch state-sync artifact level to operational
3 participants