RPC subscriptions: client is not pulling messages fast enough #3935

Closed
qustavo opened this issue Sep 3, 2019 · 8 comments · Fixed by #4521
Labels
C:rpc Component: JSON RPC, gRPC T:bug Type Bug (Confirmed)

Comments

qustavo commented Sep 3, 2019

Tendermint version
0.32.3

ABCI app

  • kvstore
  • InProc implementation

Environment:

  • OS linux-5.2.10 (Archlinux)
  • Install tools: git

What happened:
When I subscribe to events via the RPC API, after receiving zero or more events (the number is completely random) I get the following error:

{
  "jsonrpc": "2.0",
  "id": "1#event",
  "error": {
    "code": -32000,
    "message": "Server error",
    "data": "subscription was cancelled (reason: client is not pulling messages fast enough)"
  }
}

What you expected to happen:
I should keep getting the events

Have you tried the latest version: yes

How to reproduce it (as minimally and precisely as possible):

  • step 1
    Start tendermint:
    go run ./cmd/tendermint/main.go node --proxy_app kvstore

  • step 2
    Subscribe to events. I'm using ws, but any WebSocket client should be sufficient:

$  ws ws://localhost:26657/websocket
> {"jsonrpc":"2.0", "id": 1, "method": "subscribe", "params": {"query": "tm.event='Tx'"}}
< {
  "jsonrpc": "2.0",
  "id": 1,
  "result": {}
}

  • step 3
    Broadcast tons of Txs:
```bash
WAIT=0.1
for i in $(seq 10000)
do
  echo $i
  sleep $WAIT
  curl http://localhost:26657/broadcast_tx_sync\?tx\=\"$i\"
done
```

You can increase $WAIT to 1 second or more until the error disappears.

Config:
Generated by tendermint init

qustavo commented Sep 3, 2019

After some investigation, I noted that changing this:

sub, err := eventBus.Subscribe(subCtx, addr, q)

to

sub, err := eventBus.Subscribe(subCtx, addr, q, 9999)

fixed the problem.

Would it make sense to provide an RPC config param (e.g. max_subscription_capacity) that allows passing an arbitrary capacity to the Subscribe function?
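
For illustration only, here is a rough sketch of what such a handler could look like if that option existed. The eventBusSubscriber interface, the subscribeWithCapacity helper, and the fallback value are hypothetical stand-ins, not Tendermint's actual API:

```go
package rpcsketch

import "context"

// eventBusSubscriber is a hypothetical stand-in for the event bus; the real
// EventBus.Subscribe takes a pubsub query rather than a plain string.
type eventBusSubscriber interface {
	Subscribe(ctx context.Context, subscriber, query string, outCapacity ...int) (<-chan interface{}, error)
}

// subscribeWithCapacity passes a user-configurable buffer size (the proposed
// max_subscription_capacity) to the event bus instead of a hard-coded default.
func subscribeWithCapacity(
	ctx context.Context,
	bus eventBusSubscriber,
	addr, query string,
	maxSubscriptionCapacity int,
) (<-chan interface{}, error) {
	if maxSubscriptionCapacity <= 0 {
		maxSubscriptionCapacity = 100 // assumed fallback when the option is unset
	}
	return bus.Subscribe(ctx, addr, query, maxSubscriptionCapacity)
}
```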

melekes commented Sep 3, 2019

"i've found that this error can happen if there are two events from the same subscription on a single block. the second event triggers the error because the first event is still in the queue" - Mark from ShapeShift

qustavo commented Sep 3, 2019

Can you give me an example of that? I can't picture it.

melekes added the T:bug Type Bug (Confirmed) and C:rpc Component: JSON RPC, gRPC labels on Sep 4, 2019
melekes commented Sep 5, 2019

I believe the right thing to do here would be buffering events on the client side. Forcing Tendermint to buffer events for subscriptions would a) further increase memory usage and b) potentially bring more complexity, since we'd have to persist buffered events (so that if Tendermint crashes, we don't lose those events).
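
A minimal sketch of that client-side buffering, assuming a github.com/gorilla/websocket client and an arbitrary buffer size of 1000: the reader goroutine drains the socket as fast as it can, while a separate goroutine does the (possibly slow) processing.

```go
package main

import (
	"log"

	"github.com/gorilla/websocket"
)

func main() {
	conn, _, err := websocket.DefaultDialer.Dial("ws://localhost:26657/websocket", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Subscribe to Tx events (same request as in the reproduction steps).
	req := `{"jsonrpc":"2.0","id":1,"method":"subscribe","params":{"query":"tm.event='Tx'"}}`
	if err := conn.WriteMessage(websocket.TextMessage, []byte(req)); err != nil {
		log.Fatal(err)
	}

	events := make(chan []byte, 1000) // buffer absorbs short bursts

	// Reader: keep this loop free of heavy work so the socket is always drained.
	go func() {
		defer close(events)
		for {
			_, msg, err := conn.ReadMessage()
			if err != nil {
				log.Println("read error:", err)
				return
			}
			events <- msg
		}
	}()

	// Consumer: slow processing happens here, off the read path.
	for msg := range events {
		log.Printf("event: %s", msg)
	}
}
```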

qustavo commented Sep 5, 2019

I understand your concerns, and I agree that buffering events on the client side is the way to go right now.
Now, regarding your concerns: a) although that would increase memory usage, you could let the user control the size (max_subscription_capacity), and I don't think memory usage would increase dramatically; and b) if we buffer in the caller, we (the caller program) still need to implement a synchronization mechanism against Tendermint. Assuming TM crashes and loses messages, the caller SHOULD query TM to fill the gap between the last received event and the newly received ones.
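
As a sketch of that gap-filling step: after a reconnect, the caller could query the node's /tx_search endpoint for the heights it missed. This assumes the node indexes transactions by height and exposes the standard Tendermint /tx_search RPC; the height values below are just an example.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

// backfill fetches transactions for the missed height range via /tx_search.
func backfill(node string, fromHeight, toHeight int64) ([]byte, error) {
	q := fmt.Sprintf(`"tx.height>%d AND tx.height<=%d"`, fromHeight, toHeight)
	u := fmt.Sprintf("%s/tx_search?query=%s&per_page=100", node, url.QueryEscape(q))

	resp, err := http.Get(u)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	// Example: the last event seen was at height 100, the node is now at 120.
	body, err := backfill("http://localhost:26657", 100, 120)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", body)
}
```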

melekes commented Mar 3, 2020

Solution 1

I would propose allocating a buffer per subscription on the Tendermint (server) side to tolerate some slowness in clients (or short bursts of events). A similar buffer should exist on the client side if processing a single event takes time (alternatively, processing should be done in a separate thread).

The size of this buffer could be configurable (similar to TCP send/receive buffers) or constant (most users will not tweak it, I think). However, I am not sure what the ideal size would be. A toy model of this buffering behaviour is sketched after the pros/cons below.

pros:

  • we don't block Tendermint if a WS client is slow to consume events
  • allows short bursts

cons:

  • can lose events currently in the buffer if TM crashes
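
The toy model referred to above is only a sketch of the "cancel when the buffer is full" behaviour, not the actual tmpubsub implementation; it shows why a larger buffer tolerates short bursts while a persistently slow client is still cut off.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errSlowClient = errors.New("client is not pulling messages fast enough")

type subscription struct {
	out chan string
	err error
}

// publish does a non-blocking send; when the buffer is full the subscription
// is cancelled, mirroring the error reported in this issue.
func (s *subscription) publish(event string) error {
	if s.err != nil {
		return s.err
	}
	select {
	case s.out <- event:
		return nil
	default:
		s.err = errSlowClient
		close(s.out)
		return s.err
	}
}

func main() {
	sub := &subscription{out: make(chan string, 4)} // the configurable capacity

	// Slow consumer.
	go func() {
		for e := range sub.out {
			time.Sleep(50 * time.Millisecond)
			fmt.Println("processed", e)
		}
	}()

	// A rapid burst soon fills the small buffer and cancels the subscription.
	for i := 0; i < 10; i++ {
		if err := sub.publish(fmt.Sprintf("tx-%d", i)); err != nil {
			fmt.Println("cancelled:", err)
			return
		}
	}
}
```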

Solution 2

Allow blocking subscriptions (/subscribe?unbuffered=true).

pros:

  • guaranteed delivery

cons:

  • blocks Tendermint consensus, other parts using the eventBus, and other clients

melekes added a commit that referenced this issue Mar 3, 2020
@erikgrinaker
Contributor

> Allow blocking subscriptions

I don't think this is viable; any RPC client could effectively DoS a node simply by opening a subscription and doing nothing.

> can lose events currently in the buffer if TM crashes

Is this not already a problem? Without being familiar with the event bus, I'm guessing that if a Tendermint node crashes, any in-flight events would be lost. Also, once the node comes back online, I would think it's possible for it to generate events before the client reconnects, unless we have some sort of sequence number to resume from, in which case we could use that to recover events lost from the buffer as well.

I agree that delivery guarantees are fantastic, but if we have the necessary infrastructure to actually give such guarantees then that infrastructure should be able to easily handle buffer loss as well.

feizerl commented Sep 16, 2021

Is there a workaround for this issue? I am currently running into the exact same issue, and I am pretty sure it is not because of my client's slowness (it is basically a tight loop polling messages from the websocket).
