RPC subscriptions: client is not pulling messages fast enough #3935
Comments
After some investigation, I noted that changing Line 170 in 72785a2 to

sub, err := eventBus.Subscribe(subCtx, addr, q, 9999)

fixed the problem. Would it make sense to provide an RPC config param for this buffer size?
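For context, a hedged sketch of what that patched call site might look like; the Subscribe signature is inferred from the snippet above, and the surrounding names (the wrapper function, the 5-second timeout, the package name) are placeholders rather than the actual rpc/core code:

```go
// Hedged sketch of the patched call site; inferred, not verbatim.
package example

import (
	"context"
	"time"

	tmpubsub "github.com/tendermint/tendermint/libs/pubsub"
	"github.com/tendermint/tendermint/types"
)

func subscribeWithBigBuffer(eventBus *types.EventBus, addr string, q tmpubsub.Query) (types.Subscription, error) {
	subCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// The trailing argument sizes the subscription's out channel; raising
	// it to 9999 gives a slow websocket writer far more headroom before
	// the server cancels the subscription with the error in the title.
	return eventBus.Subscribe(subCtx, addr, q, 9999)
}
```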
"i've found that this error can happen if there are two events from the same subscription on a single block. the second event triggers the error because the first event is still in the queue" - Mark from ShapeShift |
Can you give me an example of that? I can't picture it.
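As I read Mark's comment, the scenario is roughly the following (a toy sketch, assuming a per-subscription buffer of a single message; the event names are made up): one block emits two events matching the same query, and because the client has not yet drained the first one, the second finds the buffer full:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	out := make(chan string, 1) // assumed per-subscription buffer of one message

	// One block commits and emits two matching events back-to-back,
	// e.g. tm.event='Tx' with two transactions in the same block.
	go func() {
		out <- "Tx event #1" // fits in the buffer
		select {
		case out <- "Tx event #2":
		default: // buffer still full: the first event hasn't been read yet
			fmt.Println("second event dropped: client is not pulling messages fast enough")
		}
	}()

	time.Sleep(10 * time.Millisecond) // the websocket writer hasn't drained #1 yet
	fmt.Println(<-out)
}
```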
I believe the right thing to do here would be buffering events on the client side. Forcing Tendermint to buffer events for subscriptions would further (a) increase the amount of memory used and (b) add complexity, since we would potentially have to persist buffered events (so that if Tendermint crashes, we don't lose those events).
I understand your concerns, and I agree that buffering events on the client side is the way to go right now.
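A minimal sketch of what client-side buffering could look like, assuming events arrive on a channel from whatever websocket client is in use (the Event type and all names here are placeholders): read from the socket as fast as possible into a large local buffer, and do the slow processing in a separate goroutine:

```go
package main

import (
	"fmt"
	"time"
)

type Event struct{ Data string }

// bufferEvents drains `in` (the raw websocket feed) immediately, so the
// server never sees a slow reader, parking events in a local buffer that a
// slower worker consumes at its own pace.
func bufferEvents(in <-chan Event, size int) <-chan Event {
	out := make(chan Event, size)
	go func() {
		defer close(out)
		for ev := range in {
			out <- ev // blocks only if the local buffer overflows
		}
	}()
	return out
}

func main() {
	raw := make(chan Event)
	go func() { // stand-in for the websocket reader
		for i := 0; i < 5; i++ {
			raw <- Event{Data: fmt.Sprintf("event %d", i)}
		}
		close(raw)
	}()

	for ev := range bufferEvents(raw, 1000) {
		time.Sleep(50 * time.Millisecond) // slow per-event processing
		fmt.Println("processed", ev.Data)
	}
}
```

The trade-off mirrors the server-side discussion below: the client now owns the memory cost and the policy for what happens when its own buffer fills up.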
Solution 1

I would propose allocating a buffer per subscription on the Tendermint (server) side to allow for some slowness in clients (or short bursts of events). A similar buffer should exist on the client side if processing a single event takes time (alternatively, processing should be done in a separate thread). The size of this buffer could be configurable (similar to TCP send/receive buffers) or constant (most users will not tweak it, I think). However, I am not exactly sure what the ideal size would be.

pros:
cons:

Solution 2

Allow blocking subscriptions (i.e., the event bus would block until the subscriber consumes each event; see the sketch after this list).

pros:
cons:
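A toy contrast of the two options (this is not Tendermint's pubsub implementation; the error text is borrowed from the issue title):

```go
package main

import (
	"errors"
	"fmt"
)

var ErrOutOfCapacity = errors.New("client is not pulling messages fast enough")

// Solution 1 flavor: per-subscription buffered channel; publish never blocks,
// but a subscriber whose buffer fills up gets dropped.
type BufferedSub struct{ out chan string }

func NewBufferedSub(capacity int) *BufferedSub {
	return &BufferedSub{out: make(chan string, capacity)}
}

func (s *BufferedSub) Publish(ev string) error {
	select {
	case s.out <- ev:
		return nil
	default:
		return ErrOutOfCapacity // burst exceeded the buffer
	}
}

// Solution 2 flavor: unbuffered channel; publish blocks until the subscriber
// receives, which is exactly what makes a do-nothing client a DoS risk.
type BlockingSub struct{ out chan string }

func NewBlockingSub() *BlockingSub { return &BlockingSub{out: make(chan string)} }

func (s *BlockingSub) Publish(ev string) { s.out <- ev } // blocks indefinitely

func main() {
	b := NewBufferedSub(1)
	fmt.Println(b.Publish("e1")) // <nil>: fits in the buffer
	fmt.Println(b.Publish("e2")) // buffer full -> ErrOutOfCapacity
}
```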
I don't think this is viable; any RPC client would be able to effectively DoS a node simply by opening a subscription and doing nothing.
Is this not already a problem? Without being familiar with the event bus, I'm guessing that if a Tendermint node crashes, any in-flight events would be lost. Also, once the node comes back online, I would think it's possible for it to generate events before the client reconnects, unless we have some sort of sequence number we can resume from, in which case we could use that to resume events that were lost in the buffer as well. I agree that delivery guarantees are fantastic, but if we have the necessary infrastructure to actually give such guarantees, then that infrastructure should be able to easily handle buffer loss as well.
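Purely as an illustration of the sequence-number idea (hypothetical shapes; nothing like this exists in the codebase):

```go
// Hypothetical shapes only, sketching the resume-from-sequence idea above.
package eventseq

// SequencedEvent carries a per-subscription, monotonically increasing
// sequence number alongside the payload.
type SequencedEvent struct {
	Seq   uint64
	Event []byte
}

// ResumeRequest lets a reconnecting client ask for redelivery of every event
// after the last sequence number it durably processed; dropped buffers and
// dropped connections would then share one recovery path.
type ResumeRequest struct {
	Query   string
	FromSeq uint64
}
```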
Is there a workaround for this issue? I am currently running into the exact same issue, and I am pretty sure it is not because of my client's slowness (it is basically a tight loop polling messages from the websocket).
Tendermint version
0.32.3
ABCI app
Environment:
What happened:
When I subscribe to events via the RPC API, after getting 0 or more events (this is completely random) I get the following error: client is not pulling messages fast enough
What you expected to happen:
I should keep getting the events
Have you tried the latest version: yes
How to reproduce it (as minimally and precisely as possible):
step 1
Start tendermint:
go run ./cmd/tendermint/main.go node --proxy_app kvstore
step 2
Subscribe to events; I'm using ws, but any WS client should be sufficient:
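The original command was lost in formatting; as a hypothetical stand-in, something like the following Go program does the same thing (gorilla/websocket and the default RPC address are assumptions; the WAIT constant plays the role of $WAIT below, and the exact JSON-RPC payload may differ from the original script):

```go
// Hypothetical reproduction sketch, not the original script: subscribe over
// the Tendermint websocket and deliberately read slowly.
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/gorilla/websocket"
)

const WAIT = 100 * time.Millisecond // stands in for $WAIT

func main() {
	conn, _, err := websocket.DefaultDialer.Dial("ws://localhost:26657/websocket", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Standard Tendermint JSON-RPC subscribe request.
	sub := `{"jsonrpc":"2.0","method":"subscribe","id":0,"params":{"query":"tm.event='NewBlock'"}}`
	if err := conn.WriteMessage(websocket.TextMessage, []byte(sub)); err != nil {
		log.Fatal(err)
	}

	for {
		_, msg, err := conn.ReadMessage()
		if err != nil {
			log.Fatal(err) // eventually: "client is not pulling messages fast enough"
		}
		fmt.Printf("got %d bytes\n", len(msg))
		time.Sleep(WAIT) // simulate a slow consumer
	}
}
```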
You can increase $WAIT to 1 or more seconds until the error disappears.

Config:
Generated by tendermint init