[Merged by Bors] - Fallback nodes for eth1 access #1918
pawanjay176
left a comment
LGTM. The macro is pretty neat!
Just a few minor nits.
A question is if we should add more logs so that the user gets warned if his main endpoint is for example just slow and sometimes hits timeouts.
Agree we should add extra logs.
Another nice to have imo would be to add metrics for num_correct_responses/num_requests for each passed endpoint. That way the user can set alerts for failing endpoints.
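A rough sketch of what such per-endpoint counters could look like. The names `EndpointStats` and `EndpointMetrics` are hypothetical and not taken from the PR, and the sketch is not tied to any particular metrics library; it only shows the bookkeeping behind a num_correct_responses/num_requests ratio.

```rust
use std::collections::HashMap;

/// Hypothetical per-endpoint counters; names are illustrative, not from the PR.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct EndpointStats {
    pub num_requests: u64,
    pub num_correct_responses: u64,
}

impl EndpointStats {
    /// Fraction of requests answered correctly, or `None` before any request.
    pub fn success_ratio(&self) -> Option<f64> {
        if self.num_requests == 0 {
            None
        } else {
            Some(self.num_correct_responses as f64 / self.num_requests as f64)
        }
    }
}

/// Maps each endpoint URL to its counters.
#[derive(Debug, Default)]
pub struct EndpointMetrics {
    stats: HashMap<String, EndpointStats>,
}

impl EndpointMetrics {
    /// Record the outcome of one request against `endpoint`.
    pub fn record(&mut self, endpoint: &str, ok: bool) {
        let entry = self.stats.entry(endpoint.to_string()).or_default();
        entry.num_requests += 1;
        if ok {
            entry.num_correct_responses += 1;
        }
    }

    /// Look up the counters for one endpoint, if it has been used.
    pub fn get(&self, endpoint: &str) -> Option<&EndpointStats> {
        self.stats.get(endpoint)
    }
}
```

An alert could then fire when `success_ratio()` for a given endpoint drops below a chosen threshold.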
        .help("Specifies the server for a web3 connection to the Eth1 chain. Also enables the --eth1 flag. Defaults to http://127.0.0.1:8545.")
        .takes_value(true)
    )
    .arg(
We could remove eth1-endpoint and just keep eth1-endpoints, since eth1-endpoints is applicable for both single and multiple endpoints.
I didn't remove eth1-endpoint to stay backwards-compatible, but we can remove that in the next major release...
Currently the |
# Conflicts: # beacon_node/eth1/src/service.rs
I just realized a problem with the current approach: In the update we check the network id and chain id currently with try fallback. If the first endpoint uses the wrong network id or chain id and the second the correct one then the check will succeed with a To fix this we have three possibilities:
I am currently preparing 2. since I think it's the best solution, but if you disagree please tell me ;).
# Conflicts: # beacon_node/src/cli.rs
This is now ready for review. A few notes:
beacon_node/eth1/src/service.rs
Outdated
    /// endpoint. If no endpoint is usable it returns None. Usability of endpoints is checked lazily
    /// using the EndpointsCache structure. For each endpoint if it returns an error the on_err function
    /// is called.
    macro_rules! with_fallback_and_on_err {
The macro is cool, but I'm not convinced that we need to use a macro here. Generally, I find not writing a macro is the more maintainable and extensible solution; compile errors are simpler, imports/exports are easier and the types/generics are more clearly expressed. Of course, there are times where macros are needed or are simpler.
This is how I would go about this problem without templating/macros: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=564a7fc931b10cfc757f752afe261604
Considering that we're going to need this same fallback logic in the VC, I would suggest using a struct-based approach (like in the play above) defined in a new crate in common/ that can be shared between validator_client/ and beacon_node/eth1.
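For readers without access to the playground link, here is a minimal synchronous sketch of what a struct-based fallback could look like. It is not the playground code and not the PR's final implementation: the `Fallback` and `FallbackError::AllErrored` names appear elsewhere in this review, while `new`, `first_success` and the signatures are assumptions, and the real code would need to be async.

```rust
/// Error returned once every server has been tried.
/// `AllErrored` mirrors the variant visible in the diff; the rest is assumed.
#[derive(Debug, PartialEq)]
pub enum FallbackError<E> {
    AllErrored(Vec<E>),
}

/// Holds an ordered list of servers; earlier entries are preferred.
pub struct Fallback<T> {
    servers: Vec<T>,
}

impl<T> Fallback<T> {
    pub fn new(servers: Vec<T>) -> Self {
        Self { servers }
    }

    /// Call `f` on each server in order and return the first `Ok` value.
    /// Errors are collected so the caller can report all of them.
    pub fn first_success<O, E, F>(&self, mut f: F) -> Result<O, FallbackError<E>>
    where
        F: FnMut(&T) -> Result<O, E>,
    {
        let mut errors = Vec::new();
        for server in &self.servers {
            match f(server) {
                Ok(value) => return Ok(value),
                Err(e) => errors.push(e),
            }
        }
        Err(FallbackError::AllErrored(errors))
    }
}
```

Compared with a macro, the generic bounds are spelled out in one place and the type can live in a shared crate, which is the maintainability argument made above.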
You can even include the validator client fallback in this PR, if you like :)
Thanks for the detailed review. I wanted to go a similar route at the beginning but then probably abandoned the idea too early after having problems integrating everything in an async way. I will have a try at that again...
# Conflicts: # beacon_node/eth1/src/service.rs # lcli/src/eth1_genesis.rs
paulhauner
left a comment
Looks good, apart from the unfortunate merge conflict :)
            Err(FallbackError::AllErrored(errors))
        }

        pub fn map_format_error<'a, E, F, S>(&'a self, f: F, error: &FallbackError<E>) -> String
It took me a little while to understand what this is doing, but it looks like it does it faithfully.
An alternate approach could be based upon this rough sketch:
    impl<T: Display> Fallback<T> {
        pub fn format_error(&self, error: &FallbackError<E>) -> String {
            ..
        }
    }

    impl Display for EndpointWithState {
        ..
    }

I won't block on this though, it works well as is :)
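One way the rough sketch above could be filled in, under stated assumptions: the `url` field on `EndpointWithState`, the generic `E: Display` bound, the public `servers` field and the `"server => error"` output format are all invented for illustration and are not taken from the PR.

```rust
use std::fmt::{self, Display};

/// Only the `AllErrored` variant is visible in the diff.
#[derive(Debug)]
pub enum FallbackError<E> {
    AllErrored(Vec<E>),
}

pub struct Fallback<T> {
    pub servers: Vec<T>,
}

impl<T: Display> Fallback<T> {
    /// Pair each server with the error it produced, in order.
    pub fn format_error<E: Display>(&self, error: &FallbackError<E>) -> String {
        match error {
            FallbackError::AllErrored(errors) => self
                .servers
                .iter()
                .zip(errors.iter())
                .map(|(server, err)| format!("{} => {}", server, err))
                .collect::<Vec<_>>()
                .join(", "),
        }
    }
}

/// Hypothetical stand-in for the endpoint type; only the URL is displayed.
pub struct EndpointWithState {
    pub url: String,
}

impl Display for EndpointWithState {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.url)
    }
}
```

With `Display` on the endpoint type, the formatting function no longer needs a mapping closure like the `f: F` parameter of `map_format_error`.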
I didn't do that yet, since I will change that part anyway for restructuring for the bn case.
        .await
        .map_err(error_connecting)?;
    if &network_id != config_network_id {
        warn!(
This is in conflict with #1981 unfortunately. I think what we need to do is return some error if we get chain_id == 0 so that we try again.
I am now returning an EndpointError::FarBehind error and logging a warning.
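A simplified sketch of the check being discussed. Only the `FarBehind` variant name comes from the comment above; the other variants, the `check_ids` helper and the use of plain `u64` ids are assumptions made to keep the example self-contained, and the warning log is omitted.

```rust
/// `FarBehind` is named in the review; the other variants are assumed.
#[derive(Debug, PartialEq)]
pub enum EndpointError {
    WrongNetworkId,
    WrongChainId,
    FarBehind,
}

/// Decide whether an endpoint is usable from the ids it reports (hypothetical helper).
pub fn check_ids(
    network_id: u64,
    chain_id: u64,
    config_network_id: u64,
    config_chain_id: u64,
) -> Result<(), EndpointError> {
    if network_id != config_network_id {
        return Err(EndpointError::WrongNetworkId);
    }
    // Assumption: a chain id of 0 means the node is not ready yet, so it is
    // reported as a retryable condition rather than as a wrong chain id.
    if chain_id == 0 {
        return Err(EndpointError::FarBehind);
    }
    if chain_id != config_chain_id {
        return Err(EndpointError::WrongChainId);
    }
    Ok(())
}
```

Checking for 0 before comparing against the configured chain id is what keeps the endpoint eligible for a later retry instead of rejecting it outright.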
# Conflicts: # beacon_node/eth1/src/service.rs
bors r+
## Issue Addressed

part of #1883

## Proposed Changes

Adds a new cli argument `--eth1-endpoints` that can be used instead of `--eth1-endpoint` to specify a comma-separated list of endpoints. If the first endpoint returns an error for some request the other endpoints are tried in the given order.

## Additional Info

Currently if the first endpoint fails the fallbacks are used silently (except for `try_fallback_test_endpoint` that is used in `do_update` which logs a `WARN` for each endpoint that is not reachable). A question is if we should add more logs so that the user gets warned if his main endpoint is for example just slow and sometimes hits timeouts.
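A minimal sketch of how the two flags could be folded into one ordered endpoint list. The `resolve_endpoints` helper and the precedence rule (the list flag wins when both are given) are assumptions for illustration; the default URL is the one quoted in the `--eth1-endpoint` help text in the diff.

```rust
/// Default taken from the `--eth1-endpoint` help text quoted in the diff.
pub const DEFAULT_ENDPOINT: &str = "http://127.0.0.1:8545";

/// Combine both flag values into one ordered list (hypothetical helper).
/// Assumed precedence: `--eth1-endpoints` wins, then `--eth1-endpoint`, then the default.
pub fn resolve_endpoints(single: Option<&str>, multiple: Option<&str>) -> Vec<String> {
    if let Some(list) = multiple {
        // Split on commas, tolerate surrounding whitespace, drop empty entries.
        let parsed: Vec<String> = list
            .split(',')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect();
        if !parsed.is_empty() {
            return parsed;
        }
    }
    vec![single.unwrap_or(DEFAULT_ENDPOINT).to_string()]
}
```

The order of the returned vector is the order in which endpoints would be tried, so the first entry acts as the main endpoint and the rest as fallbacks.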
paulhauner
left a comment
Great, thanks @blacktemplar. I think users will really appreciate this one.
I'll publish a release in the coming hours :)
Pull request successfully merged into unstable. Build succeeded: