Remove possibility of conflict between --web3provider and --http-web3provider #5826

handelaar2 · 2020-05-12T11:02:19Z

🐞 Bug Report

Description

There are 2 parameters for pointing to an Eth1-node: --web3provider and --http-web3provider
However these parameters cannot be used "independently".
Only populating one parameter (say --http-web3provider ) and omit the other defaults that one to goerli.prylabs.net.

Having two eth1 sources seems to lead to conflicting data and the beacon log start showing errors:
[2020-05-12 08:16:29] ERROR powchain: Unable to process past logs Could not process deposit log: could not get correct merkle index: latest index observed is not accurate, wanted 30928, but received 3090

The local (Nethermind) Eth1-node is also suffering from this conflict and is bombarded with RPC/http eth_logs-calls, as the conflicting data is causing a loop.

In other words, -web3provider and --http-web3provider have to point to the same eth1-node so there is only one source of eth1 data.

@nisdas confirmed that any mixing here would be the cause of the error message and the looping.

If this is indeed a restriction of the current eth1-code, the web3-parameters should be defined accordingly:
--web3provider =>ip address only
--web3_ws-port
--web3_http-port

Has this worked before in a previous version?

n/a

Yes, the previous version in which this bug was not present was: ....

🔬 Minimal Reproduction

Use Nethermind as local eth1 node. Run the beacon node with "only" the http parameter. That is causing the above (looping) error. Adding the ws-parameter is not possible. Nethermind does not support websocket subscriptions. So activating parameter --web3provider (ws) gives the following error messages (both in Nethermind and in beacon-node):

So leaving the ws-parameter out:
-local_node=>http
-goerli.prylabs.net=>ws
This is causing the RPC loops.
From a user perspective: there is no error/warning message in the beacon that setting only one parameter is actually not possible.

🔥 Error

🌍 Your Environment

Operating System:

  
docker with latest beacon and Nethermind 1.8.30

What version of Prysm are you running? (Which release)

  
96219fc9fad6b72b751352c2c198eb067749f3e0

Anything else relevant (validator index / public key)?

I think it is important to reduce the dependency on goerli.prylabs.net for eth1 data. Any obstacle or unclarity of using a local eth1-node should be removed.

The text was updated successfully, but these errors were encountered:

0xKiwi · 2020-05-12T15:56:57Z

Hello @handelaar2 ! Thanks for the report, asking for clarity here, but what is the issue you'd like to see solved? The flag dependency on eachother matching? Or better UX in terms of setting the flags to ensure they match eachother?

terencechain · 2020-05-12T15:59:14Z

@0xKiwi See the relevant convo here:
https://discordapp.com/channels/476244492043812875/560308336248225812/709701327449751684

0xKiwi · 2020-05-12T16:04:26Z

I see! So solving this would be making a --web3provider flag with --web3-ws-port and --web3-rpc-port flags.

handelaar2 · 2020-05-12T16:11:01Z

@0xKiwi That was indeed what Nishant called a good suggestion. We would have two port parameters but only one web3provider IP address. This way the prysm "eth1-code" would not be confused having "input" from two sources.
It would also block users from potentially entering 2 different IP addresses and/or different hardcoded defaults kicking in.

It would however not solve my problem which is: using Nethermind as my local eth1-node. (Hopefully they prioritize development of web socket subscriptions). In their discord however people write:

@nisdas Anything from your side to add?

@tkstanczak Tomasz, care to comment about websocket subscriptions for Nethermind?

nisdas · 2020-05-13T00:06:15Z

A subscription based api makes code cleaner and easier to write for. The alternative would be polling based apis, which would require a rewrite of portions of our eth1 code. Do you have the reasons stated out for why subscription methods are bad here?
@handelaar2

MicahZoltu · 2020-05-13T06:50:51Z

For context, my recommendation to not use eth_subscribe comes from the fact that its design makes it susceptible to unrecoverable faults. Augur ran into a ton of trouble with this during its development and I eventually wrote https://github.com/ethereumjs/ethereumjs-blockstream/ for them to deal with it, which doesn't use eth_subscribe at all.

The short version of the problem is that if you assume the backend node can disappear at any moment (and then come back later, or you get load balanced to a different node), you cannot rely on witnessing every block/log addition and removal necessary to correctly maintain whatever client-side chain state you are maintaining. The library linked above solves this by maintaining a some number of blocks in-memory client-side, and making sure that it always has a clear chain forward and backwards (during a reorg) before notifying the client of an addition/removal. If you just use eth_subscribe to get your block/log notifications, you may miss an addition, or miss a removal, which can lead to client-side sync issues.

I don't know anything about this project, but I do recommend against anyone using eth_subscribe for anything other than toy projects. For real stuff you really need chain history to be reconciled within the app's failure domain, or you need a much more robust system for maintaining synchronization with a backing node than eth_subscribe offers.

FWIW: This came up because Augur used Infura as its backend early on and when you talk to Infura you aren't talking to a single Ethereum node, you are talking to a load balancer that is backed by many nodes. These nodes may be out of sync with each other, which makes the "missed notification" problem actually pretty common. However, even if you are using a single backing Ethereum node, the missed notifications can occur if that node has an outage (even a very brief one).

handelaar2 · 2020-05-13T09:44:16Z

So if you ask me what would solve all issues: switch to HTTP RPC endpoint only. #2299 was a good start ;-)) but we also need HTTP RPC for fetching future logs.

more robust (Micah's experience)
no more mixing of web3 endpoints (one ip+one port; no more merkle index errors)
better integration with existing web3 providers (Nethermind,Infura, ....)

Just my 2 cents.

prestonvanloon · 2020-05-13T20:31:33Z

It's not as easy as setting a different port. Providers like infura have an entirely different connection URL between websockets and json-rpc.

nisdas · 2020-05-14T05:18:28Z

Looking at our code and what we have in our powchain package, it might not be too much work to sunset websockets and eth_subscribe from our codebase. We only have 1 method that uses it. Although whether we choose to use a subscription based method or instead a polling approach, both methods will require us to have some sort of client synchronization logic on our end to handle missing blocks and logs we receive from the eth1 chain along with any accompanying re-orgs.

What we have now is a bit handwavy, where we basically look at the current head block , and request all logs from Head - N . This is usually fine if N is a large enough number as goerli rarely ever has deep re-rorgs and it is even more unlikely on the mainchain. Although the spec does do the same thing, one of my goals this quarter was to have something more robust for reconcilling the eth1 state of our deposit contract. We do try request any missing blocks and logs in the event they are missing, however dealing with eth1 re-orgs is a bit more difficult. This change could be bundled into it, and I do not think it would be too much extra work.

@prysmaticlabs/core-team what do you guys think ?

prestonvanloon · 2020-05-18T16:12:47Z

I am in favor of removing websockets requirement in Prysm. We don't need on real time events in ETH1 given that we are only supposed to be looking at blocks that are at least 1024 older than the head of ETH1.

Removing websockets eases infrastructure requirements, code complexity, and potential failure vectors.

@nisdas is this something you are interested in working on? If not, I can delegate or work on it myself.

nisdas · 2020-05-19T10:43:15Z

I don't have the immediate bandwidth right now to work on it, so its fine if you want to take this on right now or delegate it @prestonvanloon

alonmuroch · 2020-05-25T14:33:59Z

@prestonvanloon I might be able to help.
The gist of it is to remove any dependency for eth 1 web socket + flags?

prestonvanloon · 2020-05-29T10:44:50Z

@alonmuroch yes exactly. There is some functionality that needs to be rewritten. I wouldn't call this a trivial issue, but seems like something you could do. Let me know if I can assign to you

alonmuroch · 2020-05-31T10:26:00Z

@prestonvanloon I'll give it a go.
Though it's marked as closed..

alonmuroch · 2020-06-01T10:28:54Z

@rauljordan @prestonvanloon this issue is still open? should I take a look at it or the PR attached takes care of it?

prestonvanloon · 2020-06-01T20:00:10Z

@alonmuroch I believe Raul is working on this in #6055.
This turned out to be a high priority item from user feedback on the survey we sent last week.

0xKiwi added the Bug Something isn't working label May 12, 2020

0xKiwi added Enhancement New feature or request Good First Issue Good for newcomers and removed Bug Something isn't working labels May 12, 2020

0xKiwi changed the title ~~Conflict between --web3provider and --http-web3provider~~ Remove possibility of conflict between --web3provider and --http-web3provider May 12, 2020

nisdas added Discussion Simply a thread for talking about stuff Refactor and removed Good First Issue Good for newcomers labels May 14, 2020

prestonvanloon added this to the Diamond milestone May 18, 2020

prestonvanloon added the Help Wanted Extra attention is needed label May 19, 2020

nisdas mentioned this issue May 25, 2020

Req: Add a list of Eth1 sync types that are required for the beaconnode to docs #5970

Closed

handelaar2 mentioned this issue May 27, 2020

Missing config for Witti Prysm beacon goerli/medalla#11

Closed

rauljordan mentioned this issue May 30, 2020

Remove Websockets for ETH1 From Prysm #6055

Merged

prestonvanloon assigned rauljordan Jun 1, 2020

prestonvanloon added the Priority: High High priority item label Jun 1, 2020

prylabs-bulldozer bot closed this as completed in #6055 Jun 3, 2020

Djadih mentioned this issue Oct 2, 2020

Outdated, deprecated startup parameters prysmaticlabs/documentation#246

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove possibility of conflict between --web3provider and --http-web3provider #5826

Remove possibility of conflict between --web3provider and --http-web3provider #5826

handelaar2 commented May 12, 2020 •

edited

0xKiwi commented May 12, 2020

terencechain commented May 12, 2020

0xKiwi commented May 12, 2020

handelaar2 commented May 12, 2020 •

edited

nisdas commented May 13, 2020

MicahZoltu commented May 13, 2020

handelaar2 commented May 13, 2020 •

edited

prestonvanloon commented May 13, 2020

nisdas commented May 14, 2020 •

edited

prestonvanloon commented May 18, 2020

nisdas commented May 19, 2020

alonmuroch commented May 25, 2020

prestonvanloon commented May 29, 2020

alonmuroch commented May 31, 2020

alonmuroch commented Jun 1, 2020

prestonvanloon commented Jun 1, 2020

Remove possibility of conflict between --web3provider and --http-web3provider #5826

Remove possibility of conflict between --web3provider and --http-web3provider #5826

Comments

handelaar2 commented May 12, 2020 • edited

🐞 Bug Report

Description

Has this worked before in a previous version?

🔬 Minimal Reproduction

🔥 Error

🌍 Your Environment

0xKiwi commented May 12, 2020

terencechain commented May 12, 2020

0xKiwi commented May 12, 2020

handelaar2 commented May 12, 2020 • edited

nisdas commented May 13, 2020

MicahZoltu commented May 13, 2020

handelaar2 commented May 13, 2020 • edited

prestonvanloon commented May 13, 2020

nisdas commented May 14, 2020 • edited

prestonvanloon commented May 18, 2020

nisdas commented May 19, 2020

alonmuroch commented May 25, 2020

prestonvanloon commented May 29, 2020

alonmuroch commented May 31, 2020

alonmuroch commented Jun 1, 2020

prestonvanloon commented Jun 1, 2020

handelaar2 commented May 12, 2020 •

edited

handelaar2 commented May 12, 2020 •

edited

handelaar2 commented May 13, 2020 •

edited

nisdas commented May 14, 2020 •

edited