Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements a fallback Eth1Provider between multiple eth1 endpoints #3832

Merged
merged 13 commits into from
Apr 14, 2021

Conversation

tbenr
Copy link
Contributor

@tbenr tbenr commented Apr 7, 2021

PR Description

mainly based on #3377

  • Added FallbackAwareEth1Provider that implements Eth1Provider

    • Takes a list of Eth1Provider corresponding to the candidates
    • Instantiate an Eth1ProviderSelector
  • Added --eth1-endpoints alias for the option

  • Updated to a list of endpoints

    • Updated PowchainService to build the Eth1Provider accordingly
    • If only one endpoint is specified a classical provider is built
    • If more than one endpoint is specified a FallbackAwareEth1Provider is built

TODOs\TBAs

  • implement logic in Eth1ProviderSelector and\or PowchainService to poll and select usable candidates based on ethSyncing() calls and getChainId()
  • tests on FallbackAwareEth1ProviderSelectorTest are not covering all methods in Eth1Provider interface
  • existing pow tests are initializing with one endpoint so they are not running through FallbackAwareEth1Provider
  • rework DepositContractAccessor and DepositContract so that we don't have to provide a specific web3js instance to it

Fixed Issue(s)

part of #3327

Documentation

  • I thought about documentation and added the documentation label to this PR if updates are required.

Changelog

  • I thought about adding a changelog entry, and added one if I deemed necessary.

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good. I'd say with a few little clean ups and improving the tests a bit it could be merged as the first step. Then use follow up PRs for checking syncing/chainid and getting the DepositContract to use the fallback properly.

The CLI option to specify multiple endpoints is hidden anyway which is enough of a feature toggle to prevent users stumbling across this and finding it not fully ready yet.

@tbenr
Copy link
Contributor Author

tbenr commented Apr 8, 2021

@ajsutton should be better now

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good just a couple of little suggestions. Also would be worth changing the description so instead of fixes #3327 it's just part of #3327 otherwise it will close the issue and we need to get the DepositContract using the fallback before the issue is fully fixed.

Also for the record, my theory around doing these reviews is that we're working in partnership so I try and provide actual code suggestions but it's also a change to "mind meld" on code style and the various decisions that mostly come down to personal preference but provide consistency across the code base. So while I could make the final few tweaks and merge this now I just leave them as suggestions to help with that knowledge sharing.

@tbenr
Copy link
Contributor Author

tbenr commented Apr 9, 2021

@ajsutton thanks for providing these reviews.
Actually most of code suggestions you give me were oversights that I could have avoided.

@tbenr
Copy link
Contributor Author

tbenr commented Apr 9, 2021

@ajsutton I'd like to start on the DepositContract rework and endpoint candidates selection. So its up to you if you want to merge now or wait to have a complete fix to #3327

@ajsutton
Copy link
Contributor

ajsutton commented Apr 9, 2021

@ajsutton I'd like to start on the DepositContract rework and endpoint candidates selection. So its up to you if you want to merge now or wait to have a complete fix to #3327

It's still a fairly small change and we're reviewing as we go along so let's just keep going with this pr.

@tbenr
Copy link
Contributor Author

tbenr commented Apr 9, 2021

@ajsutton one question: I see there are various unused public methods in DepositContract like https://github.com/ConsenSys/teku/blob/114e81b5aa0cce3fa83b09ceb19316c1bd330c32/pow/src/main/java/tech/pegasys/teku/pow/contract/DepositContract.java#L169 and various load.
They are the result of the autogeneration thus we can get the rid of them, right?

@tbenr
Copy link
Contributor Author

tbenr commented Apr 11, 2021

@ajsutton I did several changes. I also made the fallback provider to be used in case of single endpoint (unifies exception handling in tests and code) but don't know if you'll agree.
I also started working on the candidates selection, but before continuing I'd like to have some feedback from you at this stage!

@ajsutton
Copy link
Contributor

@ajsutton one question: I see there are various unused public methods in DepositContract like

https://github.com/ConsenSys/teku/blob/114e81b5aa0cce3fa83b09ceb19316c1bd330c32/pow/src/main/java/tech/pegasys/teku/pow/contract/DepositContract.java#L169

and various load.
They are the result of the autogeneration thus we can get the rid of them, right?

Yep we can get rid of them. I think there's methods in DepositContractAccessor that are no longer required as well (in fact all of DepositContractAccessor may be redundant now, not sure).

Will make sure to take a look at the latest changes this morning.

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the direction this is heading. Looking really good.

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks so much for taking this on. I'll do some manual testing on this today but I think this is ready to merge. We can continue to improve by only trying endpoints that are in-sync etc but I think this covers the key use case where a local node is offline so it should fail-over to another provider.

Anything else you want to get done in this PR?

…xception.java

Co-authored-by: Adrian Sutton <adrian@symphonious.net>
@tbenr
Copy link
Contributor Author

tbenr commented Apr 13, 2021

cool :) I'd like to add the endpoint validation logic but can be added on a separate PR

@ajsutton ajsutton mentioned this pull request Apr 14, 2021
Comment on lines 138 to 142
LOG.warn("Retrying with next eth1 endpoint", err);
return run(task, providers, exceptionsContainer);
} else {
LOG.error("All available eth1 endpoints failed", exceptionsContainer);
return SafeFuture.failedFuture(exceptionsContainer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In testing locally I was reminded that we deliberately don't print every error from the eth1 node above debug level because there are so many requests made and retries that they wind up swamping the logs. Otherwise this works great so I'm going to change these to debug level, pull in the latest changes and merge this.

I've logged #3855 to improve the logging around eth1 - it's not terrible now but we can do better and some of it may be worth at least bearing in mind when doing the status checks on endpoints.

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks so much for your work on this - fantastic stuff.

@ajsutton
Copy link
Contributor

Oh, I've put this in the changelog as early access as the new CLI option is still hidden. Let's see where we get to with the endpoint validation stuff but I suspect we'll actually unhide it and make it a fully ready to go feature for the next release.

@ajsutton ajsutton merged commit 15c610c into Consensys:master Apr 14, 2021
@tbenr tbenr deleted the fallback-eth1-nodes branch April 14, 2021 05:14
@tbenr tbenr mentioned this pull request Apr 18, 2021
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants