feat: add command neard validate-config #8485
@nikurt Refactored the code so that we can now check multiple config files and panic only once, reporting all the failed checks.
Will fix the tests.
    genesis_validation: GenesisValidationMode,
    validation_errors: &mut ValidationErrors,
) -> anyhow::Result<Self> {
    let mut file = match File::open(path) {
A more readable way would be to use .map_err(), for example (line 488 in 67a527b):
let env_filter = builder.finish().map_err(ReloadError::Parse)?;
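To make the suggestion concrete, here is a minimal sketch of opening a file with .map_err() and ? instead of a match on File::open. OpenError and read_config_text are hypothetical names invented for this example; they are not nearcore types.

```rust
use std::fs::File;
use std::io::Read;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch (not a nearcore type).
#[derive(Debug)]
enum OpenError {
    CannotOpen(PathBuf),
    CannotRead(PathBuf),
}

/// Reads a file into a String. Each I/O error is converted with `.map_err()`
/// and propagated with `?`, so no explicit `match` on the Result is needed.
fn read_config_text(path: &Path) -> Result<String, OpenError> {
    let mut file =
        File::open(path).map_err(|_| OpenError::CannotOpen(path.to_path_buf()))?;
    let mut text = String::new();
    file.read_to_string(&mut text)
        .map_err(|_| OpenError::CannotRead(path.to_path_buf()))?;
    Ok(text)
}
```

The happy path reads top to bottom, and the error conversion stays next to the call that can fail.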
pub fn validate_genesis(genesis: &Genesis) -> Result<(), ValidationError> {
    let mut validation_errors = ValidationErrors::new();
    let mut genesis_validator = GenesisValidator::new(&genesis.config, &mut validation_errors);
    println!("\nValidating Genesis config and records, extracted from genesis.json. This could take a few minutes...");
Records don't have to be in genesis.json; they can be in a separate records file, which is usually records.json.
Suggested change:
- println!("\nValidating Genesis config and records, extracted from genesis.json. This could take a few minutes...");
+ tracing::info!(target: "config", "Validating Genesis config and records. This could take a few minutes...");
nearcore/src/config.rs
Outdated
/// If config file issues occur, a ValidationError::ConfigFileError will be returned;
/// If config semantic checks failed, a ValidationError::ConfigSemanticError will be returned
pub fn from_file(path: &Path) -> Result<Self, ValidationError> {
    match Self::from_file_skip_validation(path) {
Suggested change:
- match Self::from_file_skip_validation(path) {
+ Self::from_file_skip_validation(path).map(|config| {
+     config.validate()?;
+     Ok(config)
+ })
I went for and_then()
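A minimal sketch of why and_then() fits here: the closure itself returns a Result, so .map() with the same closure would produce a nested Result<Result<_, _>, _>, while and_then() keeps a single layer. The Config type, its single field and the String error type below are hypothetical stand-ins, not the real nearcore definitions.

```rust
/// Hypothetical stand-in for the real Config type.
#[derive(Debug, PartialEq)]
struct Config {
    epoch_length: u64,
}

impl Config {
    /// Pretend loader: parses a number and performs no semantic checks.
    fn parse_skip_validation(text: &str) -> Result<Self, String> {
        text.trim()
            .parse::<u64>()
            .map(|epoch_length| Config { epoch_length })
            .map_err(|e| format!("cannot parse config: {e}"))
    }

    /// Semantic check, kept separate from parsing.
    fn validate(&self) -> Result<(), String> {
        if self.epoch_length == 0 {
            return Err("epoch_length must be positive".to_string());
        }
        Ok(())
    }

    /// Parse, then validate. `and_then` flattens the two fallible steps
    /// into one `Result<Config, String>`.
    fn parse(text: &str) -> Result<Self, String> {
        Self::parse_skip_validation(text).and_then(|config| {
            config.validate()?;
            Ok(config)
        })
    }
}
```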
nearcore/src/config.rs
Outdated
// if config.json has file issues, the program will directly panic
let config = Config::from_file_skip_validation(&dir.join(CONFIG_FILENAME))?;
// do config.json validation separately so that genesis_file, validator_file and genesis_file can be validated before program panic
match config.validate() {
Would this work?
- match config.validate() {
+ config.validate().map_err(|e| validation_errors.push_errors(e));
This would complain that the Err arm is not handled. I either have to use let _ =, which is kinda ugly, or the following:
config.validate().map_or_else(|e| validation_errors.push_errors(e), |_| ());
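For comparison, a minimal sketch of two ways to record an error in a collector without leaving an unused Result behind: map_or_else(), as above, and if let Err(..). The ValidationErrors struct, check_positive and collect_errors are hypothetical and only loosely modelled on the names in this thread.

```rust
/// Hypothetical error collector for this sketch.
#[derive(Default)]
struct ValidationErrors(Vec<String>);

impl ValidationErrors {
    fn push_errors(&mut self, error: String) {
        self.0.push(error);
    }
}

/// Hypothetical check that fails when the value is zero.
fn check_positive(name: &str, value: u64) -> Result<(), String> {
    if value == 0 {
        Err(format!("{name} must be positive"))
    } else {
        Ok(())
    }
}

/// Runs both checks and keeps going after a failure, so every problem
/// is reported instead of only the first one.
fn collect_errors(epoch_length: u64, block_time_ms: u64) -> ValidationErrors {
    let mut validation_errors = ValidationErrors::default();
    // Option 1: map_or_else consumes the Result, so nothing is left unused.
    check_positive("epoch_length", epoch_length)
        .map_or_else(|e| validation_errors.push_errors(e), |_| ());
    // Option 2: `if let Err` has the same effect and may read more plainly.
    if let Err(e) = check_positive("block_time_ms", block_time_ms) {
        validation_errors.push_errors(e);
    }
    validation_errors
}
```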
nearcore/src/config_validate.rs
Outdated
pub fn validate_config(config: &Config) -> Result<(), ValidationError> {
    let mut validation_errors = ValidationErrors::new();
    let mut config_validator = ConfigValidator::new(config, &mut validation_errors);
    println!("\nValidating Config, extracted from config.json...");
Please avoid println!(), as all output of neard normally goes to stderr, which is done via the tracing crate.
Suggested change:
- println!("\nValidating Config, extracted from config.json...");
+ tracing::info!(target: "nearcore", "Validating Config, extracted from config.json...");
Is there a rule as to what we put for target? Searching in nearcore, it seems that it does not always match the name of the library the code sits in...
I'm putting config as target for all messages related to config validation.
We have some guidelines https://github.com/near/nearcore/blob/master/docs/practices/style.md#tracing
Avoid creating unnecessary targets, because filtering them in RUST_LOG will be too much work.
    }
}

#[cfg(test)]
Nice tests! 👍
neard/src/cli.rs
Outdated
@@ -146,7 +149,8 @@ struct NeardOpts {
 /// Directory for config and data.
 #[clap(long, parse(from_os_str), default_value_os = crate::DEFAULT_HOME.as_os_str())]
 home: PathBuf,
- /// Skips consistency checks of the 'genesis.json' file upon startup.
+ /// Skips consistency checks of the config files including
+ /// genesis.json, config.json, node_key.json and validator_key.json upon startup.
Looking at fn load_config() I see that config.json is always validated, isn't it?
Forgot to change the comments here. Done.
Looks great, thank you very much!
@nikurt Question: I'm thinking maybe the config validation should only be enforced for mainnet/betanet/testnet? I see we have a bunch of tests that simply use default configs to start node, and we sometimes start localnet nodes with special configs like |
Localnet and the tests should have valid configs.
@nikurt So tracked_accounts should always be non-empty. I'm also seeing many runtime tests using default config where
epoch=0 is not allowed for any net? I will plant an epoch_length = 60 (same as localnet) for these tests.
9436529 to fa6b1ef
nearcore/src/config.rs
Outdated
@@ -1254,7 +1253,8 @@ pub fn init_testnet_configs(

 genesis.to_file(&node_dir.join(&configs[i].genesis_file));
 configs[i].write_to_file(&node_dir.join(CONFIG_FILENAME)).expect("Error writing config");
- info!(target: "near", "Generated node key, validator key, genesis file in {}", node_dir.display());
+ info!(target: "near", "create_testnet_configs_from_seeds: config.tracked_shards are {:?}", &configs[i].tracked_accounts);
+ // info!(target: "near", "Generated node key, validator key, genesis file in {}", node_dir.display());
needed?
nearcore/src/config.rs
Outdated
@@ -1078,7 +1085,8 @@ pub fn init_configs(
 };
 let genesis = Genesis::new(genesis_config, records.into());
 genesis.to_file(&dir.join(config.genesis_file));
- info!(target: "near", "Generated node key, validator key, genesis file in {}", dir.display());
+ //info!(target: "near", "Generated node key, validator key, genesis file in {}", dir.display());
Needed?
I was trying to fix the 3 tests that are still failing: db_migration, upgradable and backward_compatible. The tests report that tracked_accounts is empty, so they panic. I added logic in the code to fix that, but it did not seem to alter the test results. So I wanted to simply change the info!() message to verify whether Buildkite actually picks up my changes. Although I commented out the "Generated ..." message, it still shows up in the Buildkite test results, which makes me wonder whether something is wrong with Buildkite.
4587f65 to 3427709
@nikurt I'm removing the check for nearcore/nearcore/src/shard_tracker.rs Line 21 in 161e3e3
With Wac and Marcelo's help, figured out that previously the |
Will there be any changes required for node owners during a release when upgrading their running binary to a new version? For example, would the node owner need to run the validate-config command and ensure that the config is correct for the new neard version before stopping the old neard and starting the new one? I know that the upgradable test checks that a config generated by the current stable release is valid according to your logic. What will happen if a node owner has a config that was generated much earlier than that? In other words, we know that we're one version backwards compatible. Do we want to be infinitely backwards compatible? If not, how do we ensure that node owners don't get an unpleasant surprise when upgrading to a new neard? For comparison, our database is versioned and we automatically apply migrations when neard is restarted. Would we need a similar (automated or manual) process for config changes?
We don't have processes around config upgrades.
nearcore/src/config.rs
Outdated
} else {
    let error_message =
        format!("validator key file does not exist at the path {}", validator_file.display());
Do we want to give an error here? It's valid not to have a validator key present, and in fact most people probably don't have one, since there are only a handful of validators compared to the total number of nodes in the network.
Yeah, it makes sense not to throw an error here. I also read the master branch code again: the logic there does not throw an error if the file does not exist either, it simply uses None for the signer value.
nearcore/src/config.rs
Outdated
Some(Arc::new(signer) as Arc<dyn ValidatorSigner>)
match InMemoryValidatorSigner::from_file(&validator_file) {
    Ok(signer) => Some(Arc::new(signer) as Arc<dyn ValidatorSigner>),
    Err(_) => None,
We still want to give an error here the way things were before, right? Because with this change, if you mistakenly edit your validator_key.json in a way that results in invalid JSON, neard will just silently run as a non-validator instead of warning you.
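The distinction being asked for can be sketched as follows: a missing key file is fine and yields None, while a file that exists but cannot be used is an error. load_validator_key is a hypothetical helper, and the "parsing" step is only simulated by checking that the text starts with '{'; real code would deserialize the JSON.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical loader for this sketch.
/// - file absent              -> Ok(None): run as a non-validator
/// - file present and usable  -> Ok(Some(contents))
/// - file present but broken  -> Err(..): do not silently ignore it
fn load_validator_key(path: &Path) -> Result<Option<String>, String> {
    let text = match fs::read_to_string(path) {
        Ok(text) => text,
        // Only "not found" is treated as "no validator key configured".
        Err(e) if e.kind() == ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(format!("cannot read {}: {e}", path.display())),
    };
    // Simulated validity check; an assumption standing in for JSON parsing.
    if text.trim_start().starts_with('{') {
        Ok(Some(text))
    } else {
        Err(format!("{} is not valid JSON", path.display()))
    }
}
```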
nice catch!
nearcore/src/config.rs
Outdated
genesis
    .validate(genesis_validation)
    .map_or_else(|e| validation_errors.push_errors(e), |_| ());
if matches!(genesis.config.chain_id.as_ref(), "mainnet" | "testnet" | "betanet")
Did you mean to revert #8509? If not, then it makes sense to add the if validator_signer.is_some() back in.
Oops, I don't think so; it must have been a mistake while rebasing on master. Will modify.
nearcore/src/config.rs
Outdated
    anyhow::ensure!(!config.tracked_shards.is_empty(),
        "Validator must track all shards. Please change `tracked_shards` field in config.json to be any non-empty vector");
}
validation_errors.panic_if_errors();
Why not just return an Err(anyhow::Error)? Could replace the panic_if_errors() function with a function that returns an error, maybe like:
pub fn ok(&self) -> anyhow::Result<()> {
    match self.generate_error_message_per_type() {
        Some(e) => Err(anyhow::Error::msg(e)),
        None => Ok(()),
    }
}
or called whatever makes more sense to you. I bring it up just because I think the codebase in general has a problem with panicking too much as an error handling method in situations where nothing unexpected has happened: #5485
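A self-contained sketch of the same idea using only the standard library: the collector gathers messages and is converted into a single Result at the end, so the caller decides what to do with the failure. The struct, its field and into_result are hypothetical, and String stands in for anyhow::Error to keep the example dependency-free.

```rust
/// Hypothetical error collector for this sketch.
#[derive(Default)]
struct ValidationErrors {
    messages: Vec<String>,
}

impl ValidationErrors {
    /// Records one failed check.
    fn push(&mut self, message: impl Into<String>) {
        self.messages.push(message.into());
    }

    /// Returns Ok(()) when nothing was recorded, otherwise one error that
    /// lists every recorded message on its own line. No panic involved.
    fn into_result(self) -> Result<(), String> {
        if self.messages.is_empty() {
            Ok(())
        } else {
            Err(self.messages.join("\n"))
        }
    }
}
```

All failed checks are still reported together, but the decision to abort moves to the caller.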
neard/src/cli.rs
Outdated
impl ValidateConfigCommand {
    pub(super) fn run(&self, home_dir: &Path) {
        let _ = nearcore::config::load_config(&home_dir, GenesisValidationMode::Full)
            .unwrap_or_else(|e| panic!("Error loading config: {:#}", e));
Same thing with panics here. It's probably cleaner to have this function return an anyhow::Result<()>, and just add a question mark to the place where this is called above. Giving a stacktrace here when there's a validation error doesn't feel great since nothing unexpected is happening. We can just print out the error to stderr normally and exit w/ a nonzero code.
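One way this could look, as a rough sketch: the command returns a Result, and a small helper prints the error and maps it to an exit code instead of panicking. run_validation and report are hypothetical names, the validation itself is faked with a boolean, and String again stands in for anyhow::Error.

```rust
use std::io::Write;

/// Hypothetical stand-in for the body of the validate-config command.
fn run_validation(config_is_valid: bool) -> Result<(), String> {
    if config_is_valid {
        Ok(())
    } else {
        Err("config.json: epoch_length must be positive".to_string())
    }
}

/// Writes the error (if any) to `err_out` and returns the process exit code:
/// 0 on success, 1 on a validation failure. No panic, no stack trace.
fn report<W: Write>(result: Result<(), String>, mut err_out: W) -> i32 {
    match result {
        Ok(()) => 0,
        Err(e) => {
            // Ignore write failures; there is nowhere else to report them.
            let _ = writeln!(err_out, "Error loading config: {e}");
            1
        }
    }
}
```

In a binary this would be wired up as std::process::exit(report(run_validation(..), std::io::stderr())); taking the writer as a parameter only makes the helper easy to test.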
let mut file = File::open(&path).map_err(|_| ValidationError::GenesisFileError {
    error_message: format!(
        "Could not open genesis config file at path {}.",
        &path.as_ref().to_path_buf().display()
could just delete the to_path_buf()
nearcore/src/config_validate.rs
Outdated
    self.config.consensus.max_block_wait_delay
);
self.validation_errors
    .push_errors(ValidationError::ConfigSemanticsError { error_message: error_message })
Feels better to be consistent w/ either push_config_semantics_error() or push_errors(ConfigSemanticsError {}) in every case.
…uding config.json, genesis.json, node_key.json and validator_key.json
… will report all check results from all config files
…g track of ConfigSemanticError
…er than tracked_shards to be non-empty
…le config files are involved and add tracked_accounts to create_testnet_configs_from_seeds()
…age and add one for tracked_shards in init_testnet_configs
… to return Result and some minor improvements
This command validates the config files including: config.json, genesis.json, node_key.json, validator_key.json.
To run the command:
./target/debug/neard --home ~/.near/localnet/node1 validate-config
Example output after modifying config.json to be invalid:
The changes in this PR are roughly the following:

- add a .validate_with_panic() method for Config.
  This function panics when any assertion fails. The assertions include the ones listed in https://pagodaplatform.atlassian.net/browse/ND-272?focusedCommentId=21738. Directly validating Config rather than ClientConfig is handy because Config has the same structure as config.json, while ClientConfig has gone through some transformations, so validating Config directly can report more informative error messages that help users find the invalid fields.
  .validate_with_panic() is to be used in Config::load_file (replacing the current .validate()) and in the validate-config command. The neard Run command calls Config::load_file, and we want the neard Run command to fail if any of the configs are invalid, so panicking is ok here. The panicking behavior is also aligned with that of validate_genesis().
- add a .validate_configs(dir: &Path) method for Config.
  This method is to be used in the validate-config command. It panics on any error encountered and shows the corresponding error message.
  This method loads config.json to create Config, then loads validator_key.json to create SignerKey, then loads node_key.json to create the network signer, then loads genesis.json to create Genesis. Any error, including file not found, cannot open file, failed to match data type, semantic error in genesis.json (detected by validate_genesis()) and semantic error in config.json, will lead to a panic with a corresponding message.
- add a ValidateConfig value in enum NeardSubCommand, a ValidateConfigCommand struct and a ValidateConfigCommand.run method.
  The run method takes the home path as its argument, which is supplied by neard_cmd.opts.
- change the load_config() function: replace the argument genesis_validation, which enables validation for genesis only, with the argument config_validation, which enables validation for all configs.
  Create a new enum ConfigValidationMode with Full and UnsafeFast to represent config_validation. ConfigValidationMode::Full would indicate GenesisValidationMode::Full.
  The existing methods in genesis_config.rs and genesis_validate.rs that take genesis_validation: GenesisValidationMode remain unchanged.
- add a config_validation parameter to Config::from_file().
  We want to make sure the user also has control over whether the validation runs. In Config::load_config(), the config_validation param passed to Config::from_file() is supplied by the user; everywhere else in the code validation is enabled.