- 
                Notifications
    
You must be signed in to change notification settings  - Fork 19
 
feat: option to skip ledger replay #460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0986d91    to
    2680886      
    Compare
  
    9a3c8f8    to
    f684bc4      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the requests I had add some test for toml parsing for the new config (whichever way we decide to implement it).
Let's have a chat with @GabrielePicco about introducing that enum, as that makes sure users don't try to provide an invalid config and have different results than expected, i.e some might try to replay the ledger and then reset it for whatever reason, obviously that is not valid.
With an enum it is obvious and only valid configs are allowed.
Also please always include a usage example, i.e. a toml snippet, ENV var setting + arg override. That makes it much easier to see what config was added and how.
        
          
                magicblock-accounts-db/src/lib.rs
              
                Outdated
          
        
      | pub type StWLock = Arc<RwLock<()>>; | ||
| 
               | 
          ||
| pub const ACCOUNTSDB_DIR: &str = "accountsdb"; | ||
| const ACCOUNTSDB_SUB_DIR: &str = "accountsdb/main"; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the above const you introduced here so to concat ACCOUNTSDB_SUB_DIR  we don't change accountsdb substring in one place and forget it in the tother.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in f723e75 but it uses a method instead of a const to avoid creating a macro or doing more complex &str building
        
          
                magicblock-api/src/ledger.rs
              
                Outdated
          
        
      | Slot::default() | ||
| }; | ||
| 
               | 
          ||
| if reset || skip_replay { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we skip_replay we remove the ledger here? How are we gonna be able to restart from that ledger now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only resuming just uses AccountsDB
| &identity_keypair, | ||
| config.validator_config.ledger.reset, | ||
| config.validator_config.ledger.reset | ||
| || config.validator_config.ledger.skip_replay, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we skip replay then the validator keypair can be different?
This might be intended, but it seems odd to me that we allow a different validator to continue with a ledger that was created by another validator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ledger.ledger_path().as_path(), | ||
| config.validator_config.ledger.reset, | ||
| config.validator_config.ledger.reset | ||
| || config.validator_config.ledger.skip_replay, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting repetetive.
Please add a method to LedgerConfig, similar to replayed such that here you could just do
!config.validator_config.ledger.replayed()Also consider adding this as a top level method, so not the entire codebase has to know how things are configured, i.e. it might be better to just be able todo:
if !config.ledger_replayed() { .. }There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 1865b70
| let ledger = ledger::init(ledger_path, reset)?; | ||
| let (ledger, last_slot) = ledger::init( | ||
| ledger_path, | ||
| ledger_config.reset, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting hairy, two booleans right next to each other. Easy to swap them by accident without noticing.
Do we need both or could we derive one from them?
Looking at these two bools all over the place it might be better to introduce an enum instead that is derived from the flags in the config, i.e.:
pub enum LedgerResumeStrategy {
    // implicitly replay is false here
    Reset,
    // do not reset and resume without replay
    ResumeOnly,
    // do not reset, replay and then resume
    ReplayAndResume
}You can pass this down and properly match to cover all cases everywhere. Also you could add convenience methods to the config to just check for a specific case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 1865b70
| fn read(ledger_path: &Path, keypairs: &[Keypair]) -> Child { | ||
| let (_, mut validator, ctx) = | ||
| setup_offline_validator(ledger_path, None, Some(SLOT_MS), false); | ||
| setup_offline_validator(ledger_path, None, Some(SLOT_MS), false, false); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto what's what here?
| Some(programs), | ||
| Some(SLOT_MS), | ||
| true, | ||
| false, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto what's what here?
| Some(programs), | ||
| Some(SLOT_MS), | ||
| false, | ||
| false, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto what's what here?
This goes for all tests that follow as well .... (not putting ditto everywhere :) )
| slot: u64, | ||
| ) -> Child { | ||
| let (_, mut validator, ctx) = | ||
| setup_offline_validator(ledger_path, None, None, false, true); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, no clue what setting you're using here. reset? replay?
| .arg("--") | ||
| .arg(config_path) | ||
| .env("RUST_LOG_STYLE", log_suffix) | ||
| .env("RUST_LOG", "info") | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't override this env var here.
CI is logging on trace and devs wanna log on warn maybe.
Setting this in the code ignores the actual value of that env var.
Just run tests with that env var set i.e. RUST_LOG=warn make test.
Look at the makefile how it is defaulted, but if you set it from the command line that is the proper way to override it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a mistake, addressed in 1865b70
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM aside from the naming Nits.
Please improve them before merging.
| pub type StWLock = Arc<RwLock<()>>; | ||
| 
               | 
          ||
| pub const ACCOUNTSDB_DIR: &str = "accountsdb"; | ||
| const ACCOUNTSDB_SUB_DIR: &str = "accountsdb/main"; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using https://crates.io/crates/const_format since otherwise the method call is doing work every time.
With const_fmt you'd be able to assign to a const and use as you were before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in ad9bdc8
        
          
                magicblock-api/src/ledger.rs
              
                Outdated
          
        
      | ) -> ApiResult<(Ledger, Slot)> { | ||
| // Save the last slot from the previous ledger to restart from it | ||
| let last_slot = if skip_replay || !reset { | ||
| let last_slot = if resume_strategy.resume() { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I wasn't very good providing the naming for these things, just wanted to provide quick examples, but anything returning a bool should be something with is|has|does, etc.
resume looks like an action (in this case I'd think it resumes something and returns true if successful)
is_resuming is much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 540667b
| // ----------------- | ||
| fn maybe_process_ledger(&self) -> ApiResult<()> { | ||
| if self.config.ledger.reset || self.config.ledger.skip_replay { | ||
| if !self.config.ledger.resume_strategy.replay() { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_replaying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 540667b
| 
               | 
          ||
| [ledger] | ||
| reset = false | ||
| resume-strategy = "replay-and-resume" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again I wasn't very good with suggestions for the names. replay kinda implies resume right?
So I'd think the enum values should be: reset | replay | resume-only.
resume-only could also be just resume - up to you. The only part just makes it double clear that we won't replay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 540667b
b70fe3b    to
    9fa235d      
    Compare
  
    Solves #418 This PR: - Adds a config parameter to skip ledger replay when starting a validator, discarding the old ledger but preserving accountsDB and the validator keypair. - Adds a test that checks that the state is still there but not the ledger TOML: ```toml [ledger] resume-strategy = "replay-and-resume" path = "ledger.example.com" size = 1000000000 ``` CLI: ```bash cargo run -- --ledger-resume-strategy replay-and-resume LEDGER_RESUME_STRATEGY=replay-and-resume cargo run ``` Old config: ```toml [ledger] reset = true ``` Becomes: ```toml [ledger] resume-strategy = "reset" ``` --- Old config: ```toml [ledger] reset = false ``` Becomes: ```toml [ledger] resume-strategy = "replay" ```
Solves #418 This PR: - Adds a config parameter to skip ledger replay when starting a validator, discarding the old ledger but preserving accountsDB and the validator keypair. - Adds a test that checks that the state is still there but not the ledger TOML: ```toml [ledger] resume-strategy = "replay-and-resume" path = "ledger.example.com" size = 1000000000 ``` CLI: ```bash cargo run -- --ledger-resume-strategy replay-and-resume LEDGER_RESUME_STRATEGY=replay-and-resume cargo run ``` Old config: ```toml [ledger] reset = true ``` Becomes: ```toml [ledger] resume-strategy = "reset" ``` --- Old config: ```toml [ledger] reset = false ``` Becomes: ```toml [ledger] resume-strategy = "replay" ```
Solves #418 This PR: - Adds a config parameter to skip ledger replay when starting a validator, discarding the old ledger but preserving accountsDB and the validator keypair. - Adds a test that checks that the state is still there but not the ledger TOML: ```toml [ledger] resume-strategy = "replay-and-resume" path = "ledger.example.com" size = 1000000000 ``` CLI: ```bash cargo run -- --ledger-resume-strategy replay-and-resume LEDGER_RESUME_STRATEGY=replay-and-resume cargo run ``` Old config: ```toml [ledger] reset = true ``` Becomes: ```toml [ledger] resume-strategy = "reset" ``` --- Old config: ```toml [ledger] reset = false ``` Becomes: ```toml [ledger] resume-strategy = "replay" ```
Solves #418 This PR: - Adds a config parameter to skip ledger replay when starting a validator, discarding the old ledger but preserving accountsDB and the validator keypair. - Adds a test that checks that the state is still there but not the ledger TOML: ```toml [ledger] resume-strategy = "replay-and-resume" path = "ledger.example.com" size = 1000000000 ``` CLI: ```bash cargo run -- --ledger-resume-strategy replay-and-resume LEDGER_RESUME_STRATEGY=replay-and-resume cargo run ``` Old config: ```toml [ledger] reset = true ``` Becomes: ```toml [ledger] resume-strategy = "reset" ``` --- Old config: ```toml [ledger] reset = false ``` Becomes: ```toml [ledger] resume-strategy = "replay" ```
Solves #418 This PR: - Adds a config parameter to skip ledger replay when starting a validator, discarding the old ledger but preserving accountsDB and the validator keypair. - Adds a test that checks that the state is still there but not the ledger TOML: ```toml [ledger] resume-strategy = "replay-and-resume" path = "ledger.example.com" size = 1000000000 ``` CLI: ```bash cargo run -- --ledger-resume-strategy replay-and-resume LEDGER_RESUME_STRATEGY=replay-and-resume cargo run ``` Old config: ```toml [ledger] reset = true ``` Becomes: ```toml [ledger] resume-strategy = "reset" ``` --- Old config: ```toml [ledger] reset = false ``` Becomes: ```toml [ledger] resume-strategy = "replay" ```
Solves #418 This PR: - Adds a config parameter to skip ledger replay when starting a validator, discarding the old ledger but preserving accountsDB and the validator keypair. - Adds a test that checks that the state is still there but not the ledger TOML: ```toml [ledger] resume-strategy = "replay-and-resume" path = "ledger.example.com" size = 1000000000 ``` CLI: ```bash cargo run -- --ledger-resume-strategy replay-and-resume LEDGER_RESUME_STRATEGY=replay-and-resume cargo run ``` Old config: ```toml [ledger] reset = true ``` Becomes: ```toml [ledger] resume-strategy = "reset" ``` --- Old config: ```toml [ledger] reset = false ``` Becomes: ```toml [ledger] resume-strategy = "replay" ```
Closes #418
This PR:
Snippets
TOML:
CLI:
Migration
Old config:
Becomes:
Old config:
Becomes: