Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): in-place testnet creator #19280

Merged
merged 16 commits into from
Feb 12, 2024
Merged

Conversation

czarcas7ic
Copy link
Contributor

@czarcas7ic czarcas7ic commented Jan 29, 2024

Description

Closes: #18706

This PR introduces the sdk side logic required to create testnets from local state. This feature is useful to create testnets that mirror mainnet state. When running on a recent snapshot, this process takes less than 3 seconds on Osmosis. This feature intends to replace the current method of state exported testnets, which currently takes 6 hours + requires 256 GB RAM VMs for Osmosis. Also, state exported testnets requires some esoteric knowledge on how to modify the state properly, whereas this should be much simpler for other chains to implement, with the goal being improved testing environments across all Cosmos chains.

Here is an example of this feature being implemented app side on Osmosis:

Important to note, I tested this throughly on v0.47 and it works as intended. I did, however, need to make a few changes to target the sdk main branch, which I am not able to test against Osmosis with. Here is the PR to the sdk fork for Osmosis, which will also be helpful if we want to back port this (we should):

osmosis-labs#506


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

  • New Features
    • Introduced an in-place testnet command for easier testing and development within the server module.

@czarcas7ic czarcas7ic changed the title feat: in place testnet creator feat(upstream): in-place testnet creator Jan 29, 2024
if opts.AddFlags != nil {
opts.AddFlags(cmd)
}
addStartNodeFlags(cmd, opts)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created helper function to reduce code duplication across the normal start command and the testnet creator command, which also shares the same flags.

Comment on lines +597 to +605
if isTestnet, ok := svrCtx.Viper.Get(KeyIsTestnet).(bool); ok && isTestnet {
app, err = testnetify(svrCtx, home, appCreator, db, traceWriter)
if err != nil {
return app, traceCleanupFn, err
}
} else {
app = appCreator(svrCtx.Logger, db, traceWriter, svrCtx.Viper)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only code that touches "core logic" of the normal startApp flow.

Comment on lines +670 to +674
// Set testnet keys to be used by the application.
// This is done to prevent changes to existing start API.
serverCtx.Viper.Set(KeyIsTestnet, true)
serverCtx.Viper.Set(KeyNewChainID, newChainID)
serverCtx.Viper.Set(KeyNewOpAddr, newOperatorAddress)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We set keys in Viper in order to prevent changing existing APIs. This allows us to share the same flow across start and in-place-testnet commands.

server/start.go Dismissed
Height: state.LastBlockHeight,
Round: 0,
BlockID: state.LastBlockID,
Timestamp: time.Now(),

Check warning

Code scanning / CodeQL

Calling the system time Warning

Calling the system time may be a possible source of non-determinism
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, this is the only validator in the set, set setting the timestamp to local time is a non issue.

@czarcas7ic czarcas7ic marked this pull request as ready for review January 29, 2024 19:34
@czarcas7ic czarcas7ic requested a review from a team as a code owner January 29, 2024 19:34
Copy link
Contributor

coderabbitai bot commented Jan 29, 2024

Warning

Rate Limit Exceeded

@czarcas7ic has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 27 minutes and 22 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.
Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.
Please see our FAQ for further information.

Commits Files that changed from the base of the PR and between 25bee82 and 0c0b744.

Walkthrough

This change introduces a significant feature aimed at enhancing the testing capabilities within the server module by adding an in-place testnet CLI command. This feature is designed to simplify the process of utilizing mainnet state for testing purposes, addressing challenges related to the export and import of state due to resource constraints. By incorporating new functions and flags, the update facilitates state modifications for testing environments, potentially improving efficiency and resource management during the development and testing phases.

Changes

Files Change Summary
CHANGELOG.md Introduced a new feature: in-place testnet CLI command in the server module.
server/start.go Added imports, modified StartCmdWithOptions and startApp, added InPlaceTestnetCreator and testnetify functions, and new flags in addStartNodeFlags.

Assessment against linked issues

Objective Addressed Explanation
Simplify the process of using mainnet state for testing (#18706)
Address resource constraints during state export/import (#18706)
Offer a solution for modifying state for testing environments (#18706)
Implement state transition functions or a tool behind a build flag (#18706) It's unclear if these functions are behind a build flag as specified.
Develop a tool for state transitions including validator and bank address modifications (#18706) The changes suggest in-place modifications rather than a separate tool.
Modify the Cosmos Hub mainnet database to a single validator for testing (#18706)
Consider the impact on CometBFT's database state modification (#18706)

The code changes seem to address most of the key objectives outlined in the linked issue, with a primary focus on simplifying the testing process by enabling in-place modifications of the mainnet state. However, there is some ambiguity regarding the implementation of state transition functions or tools behind a build flag, and it appears that the approach taken leans more towards in-place modifications rather than developing a separate tool for state transitions.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

@czarcas7ic
Copy link
Contributor Author

czarcas7ic commented Jan 29, 2024

The leaking resource CI check is failing across all PRs.

@@ -631,7 +596,15 @@ func startApp(svrCtx *Context, appCreator types.AppCreator, opts StartCmdOptions
return app, traceCleanupFn, err
}

app = appCreator(svrCtx.Logger, db, traceWriter, svrCtx.Viper)
if isTestnet, ok := svrCtx.Viper.Get(KeyIsTestnet).(bool); ok && isTestnet {
Copy link
Contributor

@alexanderbez alexanderbez Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what is the relationship between the start command, which checks KeyIsTestnet, and the new testnet command?

Copy link
Contributor Author

@czarcas7ic czarcas7ic Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The testnet command and the start command are essentially identical except:

  1. The testnet command provides 2 inputs necessary for making the testnet
  2. The testnet command runs the testnetify logic before returning the app from the appCreator
  3. The pubkey and validator address retrieved from testnetify are stored to viper keys which are used at the application level

I am not sure if this answers your question, please lmk if you wanted clarification on something else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure this needs to exist. I may be missing something but this tool should only modify the underlying db.

@julienrbrt julienrbrt self-assigned this Jan 30, 2024
@czarcas7ic
Copy link
Contributor Author

Added a small feat, which allows the upgrade handler to be triggered on the first block.

We ran into the case where, if new stores were introduced, the flow required would be:

  1. Use old daemon to run this testnetify logic
  2. Create upgrade prop
  3. Then the current daemon can be used

This flag allows us to skip all this and just use the current daemon on block one. Besides this, have not had any other issues, and has been working great.

Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks really good, i think its in the wrong place, this code will be gone in a few months. I think it makes more sense to have it separate and imported into the apps cmd file and set, similar to how we do other commands. Would you be open to changing it?

I think i understand the design, you aimed at taking a db and just starting it in testnet mode. The mix of testing code and production code is something we want to avoid. It is common in the sdk but this is because most of the designs were not thought through to keep these things separated. I think taking a db, running a command to migrate then starting is a cleaner separation of the two functionalities

@czarcas7ic
Copy link
Contributor Author

Had a discussion with @tac0turtle offline, and it seems like this implementation is good for merge / backport. Please lmk if there are any further desired changes I should make.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between bad2345 and 4e191a1.
Files selected for processing (3)
  • CHANGELOG.md (1 hunks)
  • server/start.go (5 hunks)
  • server/util.go (1 hunks)
Files skipped from review as they are similar to previous changes (3)
  • CHANGELOG.md
  • server/start.go
  • server/util.go

@czarcas7ic
Copy link
Contributor Author

I just added a skip-confirmation flag to allow our infra/devops to automate testnet creation.

@czarcas7ic
Copy link
Contributor Author

I wont push any more to this. Just wanted to fix a small bug we found after creating many testnets, we saw that stopping a node with the halt height flag causes the application height to match the blockstore height, which needs to be handled slightly differently.

@julienrbrt julienrbrt added the backport/v0.50.x PR scheduled for inclusion in the v0.50's next stable release label Feb 9, 2024
@julienrbrt julienrbrt changed the title feat(upstream): in-place testnet creator feat(server): in-place testnet creator Feb 9, 2024
Copy link
Member

@julienrbrt julienrbrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! one nit about app genesis instead of comet genesis, but we can do that ourselves in a follow up too.

server/start.go Outdated
return nil, err
}

// Modify genesis chain ID and save to genesis file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use an app genesis here, otherwise we may overwrite the app genesis into a comet genesis

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julienrbrt I think this addresses your concern, if you could look it over before merge that would be much appreciated!

25bee82

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, perfect, thank you!

@julienrbrt
Copy link
Member

Could you have a look at the linter complaining?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 7d0e17e and 66355b3.
Files selected for processing (1)
  • server/start.go (5 hunks)
Files skipped from review as they are similar to previous changes (1)
  • server/start.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 66355b3 and 25bee82.
Files selected for processing (1)
  • server/start.go (5 hunks)
Additional comments: 5
server/start.go (5)
  • 4-27: The addition of new imports is correctly done and aligns with the described functionality in the PR objectives. Ensure that each imported package is used within the file to avoid unnecessary imports.
  • 600-608: The modifications in the startApp function to include conditional logic based on the isTestnet flag and the addition of the testnetify function call are correctly implemented. Ensure that the isTestnet flag is properly set and used throughout the application to maintain consistent behavior.
  • 618-710: The InPlaceTestnetCreator function is correctly implemented, providing a command to create and start a testnet from the current local state. Ensure that the user prompts and flag handling are clear and user-friendly. Additionally, validate that the error handling and cleanup processes are robust to prevent state corruption.
  • 712-929: The testnetify function correctly modifies both state and blockStore to allow the provided operator address and local validator key to control the network. Ensure that the modifications to the genesis file, state, and blockStore are correctly implemented and that error handling is comprehensive to prevent partial state modifications.
  • 931-983: The modifications to the addStartNodeFlags function to include new flags are correctly implemented. Ensure that each flag has a clear and concise description and that the flags are used appropriately within the application to control behavior based on user input.

@czarcas7ic czarcas7ic added the backport/v0.47.x PR scheduled for inclusion in the v0.47's next stable release label Feb 9, 2024
@czarcas7ic
Copy link
Contributor Author

I added the 47 backport label, although the logic will need to differ sightly due to obv difference between the versions. It will mirror the implementation on the osmosis fork.

@julienrbrt
Copy link
Member

julienrbrt commented Feb 10, 2024

I added the 47 backport label, although the logic will need to differ sightly due to obv difference between the versions. It will mirror the implementation on the osmosis fork.

v0.47 is bugfix only, so in theory it shouldn't get this.

@czarcas7ic
Copy link
Contributor Author

@julienrbrt how do you all backport state compatible features to previous versions? That's what I meant by adding the tag.

@julienrbrt
Copy link
Member

In theory, we don't backport features or improvements to v0.47. As it is in its bugfix only phase.

@czarcas7ic
Copy link
Contributor Author

Gotcha. I think that there will likely be quite a few chains on v47 for at least the next year, and this feature (if utilized) will improve security across these chains.

But totally respect the decision if thats not what you guys want to do!

@julienrbrt julienrbrt added this pull request to the merge queue Feb 12, 2024
Merged via the queue into main with commit 89df28c Feb 12, 2024
60 of 61 checks passed
@julienrbrt julienrbrt deleted the adam/testnetify-upstream branch February 12, 2024 10:54
mergify bot pushed a commit that referenced this pull request Feb 12, 2024
(cherry picked from commit 89df28c)

# Conflicts:
#	CHANGELOG.md
#	baseapp/options.go
#	server/start.go
mergify bot pushed a commit that referenced this pull request Feb 12, 2024
(cherry picked from commit 89df28c)

# Conflicts:
#	CHANGELOG.md
#	server/start.go
julienrbrt added a commit that referenced this pull request Feb 15, 2024
Co-authored-by: Adam Tucker <adam@osmosis.team>
Co-authored-by: marbar3778 <marbar3778@yahoo.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/v0.47.x PR scheduled for inclusion in the v0.47's next stable release backport/v0.50.x PR scheduled for inclusion in the v0.50's next stable release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: state modifier for testing
4 participants