Tendermint unresponsive under heavy load #1642

Closed
olegomano opened this Issue May 29, 2018 · 8 comments

olegomano commented May 29, 2018

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Tendermint version (use tendermint version or git rev-parse --verify HEAD if installed from source):
0.19.2

ABCI app (name for built-in, URL for self-written if it's publicly available):
BigchainDB 2.0.0a2
Debian 9.4.0

Hi,
I am running a single Tendermint node under BigchainDB for testing and evaluation purposes, using the standard docker-compose setup provided on the BigchainDB GitHub page. Periodically the Tendermint node becomes unresponsive, usually after heavy use. Running the tm-bench tool to simulate heavy load consistently puts Tendermint into the unresponsive state. I can reproduce this on two different machines. It is easier to reproduce when tm-bench runs on the same machine as the Tendermint node; with a remote node it can take a while before the node becomes unresponsive.

You can reproduce the issue by running the following:
sudo ./tm-bench -v 192.168.1.13:32946

Here are the logs from tm-bench

Running 10s test 
I[05-29|22:32:47.189] Latest block height                          h=3839
I[05-29|22:32:47.189] Time started                                 t=2018-05-29T15:32:47.189606179-07:00
I[05-29|22:32:49.246] sent 1000 transactions                       addr= took=64.610214ms
I[05-29|22:32:50.247] sent 1000 transactions                       addr= took=37.131893ms
I[05-29|22:32:51.248] sent 1000 transactions                       addr= took=43.399621ms
I[05-29|22:32:52.249] sent 1000 transactions                       addr= took=41.565253ms
I[05-29|22:32:53.250] sent 1000 transactions                       addr= took=333.455458ms
I[05-29|22:32:55.353] sent 1000 transactions                       addr= took=2.102869612s
txs send failed: write tcp : i/o timeout. Try reducing the connections count and increasing the rate.
chain@debian:~/Documents/tools/tm-bench$ sudo ./tm-bench -v
Running 10s test @



Status: Post : EOF
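
The `took=` values in the tm-bench output above already show the send path slowing down before the i/o timeout. As a rough illustration, the following Python sketch converts those values to seconds and to an approximate send rate. It assumes that `took` is the wall-clock time tm-bench needed to send one batch of 1000 transactions; the helper name `go_duration_seconds` is made up for this example and only handles single-unit Go duration strings.

```python
import re

# The `took=` values from the tm-bench run above, in order.
TOOK = [
    "64.610214ms",
    "37.131893ms",
    "43.399621ms",
    "41.565253ms",
    "333.455458ms",
    "2.102869612s",
]

# Seconds per unit for the single-unit duration strings seen in the log.
_UNITS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3, "s": 1.0}


def go_duration_seconds(text):
    """Convert a single-unit Go duration string such as '64.6ms' to seconds."""
    match = re.fullmatch(r"([0-9.]+)(ns|us|µs|ms|s)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


BATCH_SIZE = 1000  # each log line reports "sent 1000 transactions"

for took in TOOK:
    seconds = go_duration_seconds(took)
    # Assumption: rate = batch size / time spent sending that batch.
    print(f"{took:>14}  {seconds:9.6f}s  ~{BATCH_SIZE / seconds:8.0f} tx/s")
```

Under that assumption, the last batch before the timeout took more than 30 times as long to send as the first one.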

Here are the logs of Tendermint in the unresponsive state.

I[05-29|21:35:46.228] Finalizing commit of block with 0 txs        module=consensus height=59 hash=4C334418A1D2CA607418539A90433D59183ABDA3 root=
I[05-29|21:35:46.229] Block{
  Header{
    ChainID:        test-chain-SAQsl5
    Height:         59
    Time:           2018-05-29 21:35:46.124091018 +0000 UTC
    NumTxs:         0
    TotalTxs:       0
    LastBlockID:    995AE75CAFA55EEE6AEF143BF255343D6C2CD618:1:591719EDB095
    LastCommit:     DF54FE06AFB12D187148F1D600EA892AACCC19EE
    Data:           
    Validators:     30CE96F687B2E956958F38AA301D59719E262013
    App:            
    Consensus:       F66EF1DF8BA6DAC7A1ECCE40CC84E54A1CEBC6A5
    Results:        
    Evidence:       
  }#4C334418A1D2CA607418539A90433D59183ABDA3
  Data{
    
  }#
  Data{
    
  }#
  Commit{
    BlockID:    995AE75CAFA55EEE6AEF143BF255343D6C2CD618:1:591719EDB095
    Precommits: Vote{0:3CC0CEAC73D7 58/00/2(Precommit) 995AE75CAFA5 /DE418CF9DC4C.../ @ 2018-05-29T21:35:45.090Z}
  }#DF54FE06AFB12D187148F1D600EA892AACCC19EE
}#4C334418A1D2CA607418539A90433D59183ABDA3 module=consensus 
I[05-29|21:35:46.303] Executed block                               module=state height=59 validTxs=0 invalidTxs=0
I[05-29|21:35:46.503] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:35:46.503] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:35:46.503] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56528
I[05-29|21:35:49.124] Timed out                                    module=consensus dur=3s height=59 round=0 step=RoundStepPropose
I[05-29|21:35:49.609] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:35:49.609] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:35:49.610] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56532
I[05-29|21:35:52.714] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:35:52.714] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:35:52.714] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56536
I[05-29|21:35:55.818] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:35:55.818] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:35:55.818] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56542
I[05-29|21:35:58.938] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:35:58.938] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:35:58.938] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56546
I[05-29|21:36:02.078] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:02.078] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:02.079] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56550
I[05-29|21:36:05.178] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:05.178] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:05.178] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56554
I[05-29|21:36:08.295] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:08.295] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:08.295] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56558
I[05-29|21:36:11.417] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:11.417] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:11.418] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56562
I[05-29|21:36:11.506] Ensure peers                                 module=p2p numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[05-29|21:36:11.506] No addresses to dial nor connected peers. Falling back to seeds module=p2p 
I[05-29|21:36:14.546] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:14.547] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:14.547] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56566
I[05-29|21:36:17.665] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:17.665] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:17.665] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56570
I[05-29|21:36:20.782] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:20.782] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:20.783] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56574
I[05-29|21:36:23.902] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:23.902] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:23.902] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56580
I[05-29|21:36:27.014] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:27.014] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:27.014] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56584
I[05-29|21:36:30.119] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:30.119] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:30.119] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56588
I[05-29|21:36:33.230] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:33.231] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:33.231] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56592
I[05-29|21:36:36.346] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:36.346] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:36.347] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=0 remoteAddr=172.18.0.4:56596
I[05-29|21:36:39.470] ABCIQuery                                    module=rpc path= data= result="value:\"example result\" "
I[05-29|21:36:39.470] HTTPRestRPC                                  module=rpc-server method=/abci_query args="[ <common.HexBytes Value> <int64 Value> <bool Value>]" returns="[<*core_types.ResultABCIQuery Value> <error Value>]"
I[05-29|21:36:39.471] Served RPC HTTP response                     module=rpc-server method=GET url=/abci_query status=200 duration=1 remoteAddr=172.18.0.4:56600

melekes commented May 31, 2018

  1. What does your setup look like?
$ nproc
...
$ uname -a
...
  2. Could you provide logs with debug output level? https://tendermint.readthedocs.io/projects/tools/en/develop/running-in-production.html#logging

  3. I don't see any transactions in the log

I[05-31|11:47:39.417] WSJSONRPC                                    module=rpc-server protocol=websocket remote=127.0.0.1:60952 method=broadcast_tx_async
I[05-31|11:47:39.417] Added good transaction                       module=mempool tx="ӽ4\ufffdM4\ufffdM4\ufffdM4y\ufffd7\ufffdM4\ufffdM4\ufffdM4\ufffd\ufffd\ufffd{\ufffd\ufffdۍ\u001e۞\ufffd\ufffdgu{\ufffd\ufffd\ufffd\ufffd8\ufffd\ufffdyw\ufffd5m\ufffd|ӝ4\ufffdM4㧽q\ufffd\u001a\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffdǛ\ufffd\ufffd\u001aݮ;m\ufffd\ufffd\ufffdޟq\ufffd;ߧ\u001fמ_\ufffd5sW}׎\ufffd\ufffd\ufffd\ufffd\ufffdW\ufffd\ufffd\ufffdtw\ufffdts\ufffdz\ufffd_\ufffdy׶\ufffd\ufffd\ufffd;\ufffd\ufffd\ufffd۾z{}w\ufffd׸Ӷ\ufffdk\ufffd8\ufffd\ufffd:\ufffd\ufffd\ufffdw\ufffd{\ufffdW\ufffdy\ufffd9\ufffd\ufffdx\ufffdo4\ufffd\ufffd\ufffdo\ufffd\u001eӮzq\ufffd\ufffd{\ufffd\ufffd\ufffd\ufffdw\ufffd\ufffd\u001a\ufffd\ufffdxy\ufffd<\ufffd\ufffd\ufffd\ufffdM\ufffd\ufffdO:\ufffdn7u\ufffdxo}{W\ufffd{m\ufffds~\ufffd\ufffd\ufffd:ߖ\ufffdi\ufffd\u001d\ufffdM\ufffd}\ufffd\ufffd\ufffdN\ufffd\ufffdW\ufffd\ufffd\ufffd8\ufffd_{{\ufffd5u\ufffdzq\ufffd\u001dsn\ufffdm\ufffd8\ufffd\ufffd\ufffd\ufffd\ufffdv\ufffdv\ufffd\ufffd\ufffd:\ufffdO:ѷ\ufffd۾\ufffd\ufffdN\ufffdo}\u001e\ufffd\ufffd\u001b\ufffd\ufffd\ufffd{M5y\ufffd\ufffdq\ufffd\ufffdi\ufffd\ufffd\ufffd\ufffd\ufffd߯<u\ufffd\ufffd\ufffd\ufffdz\ufffdֵm\ufffd\ufffdk\ufffd\ufffdwg=u\ufffd\ufffdӶ\ufffd\ufffd\u001bkG;s];\ufffdM^׏8{\ufffd\ufffds\ufffd=\ufffd\ufffd9\ufffd\ufffd\ufffdu\ufffd_\ufffdn\ufffd" res="&{CheckTx:fee:<> }"

It means either no txs are making it to Tendermint or there is a deadlock?

  4. Any config changes? no_empty_blocks?

kansi commented May 31, 2018

@melekes Here is what it looks like on my system:

➜  tm-bench git:(master) ✗ nproc
8
➜  tm-bench git:(master) ✗ uname -a
Linux arch 4.14.40-1 #1 SMP PREEMPT Wed May 9 20:10:25 UTC 2018 x86_64 GNU/Linux
➜  tm-bench git:(master) ✗  tendermint version
0.19.3-aab98828

tm-bench output

➜  tm-bench git:(master) ✗ tm-bench -c 1 -T 30 -r 10000 -v localhost:46657                 
Running 30s test @ localhost:46657
I[05-31|12:15:57.690] Latest block height                          h=2
I[05-31|12:15:57.690] Time started                                 t=2018-05-31T14:15:57.690134771+02:00
I[05-31|12:15:59.816] sent 10000 transactions                      addr=[::1]:46657 took=1.125194619s
I[05-31|12:16:03.220] sent 10000 transactions                      addr=[::1]:46657 took=3.403545494s
txs send failed: write tcp [::1]:34704->[::1]:46657: i/o timeout. Try reducing the connections count and increasing the rate.

I[05-31|12:16:06.914] WSJSONRPC                                    module=rpc-server protocol=websocket remote=[::1]:34704 method=broadcast_tx_async
I[05-31|12:16:06.914] Added good transaction                       module=mempool tx="\ufffdM4\ufffdM4\ufffdM4\ufffdM4sF\ufffd\ufffd]4\ufffdM4\ufffdM4w\ufffd}\ufffdμ}\ufffd;{\ufffd[o\ufffd\ufffd\ufffdV\ufffd\ufffdM\ufffdoW\u001c\ufffd\ufffd\u001fm\ufffd|ӝ4\ufffdM4\ufffdtsF\ufffd\ufffd\ufffd[o\ufffdz\ufffd\ufffd7\ufffd\ufffd\ufffd\ufffd\ufffd}i\ufffd8y\ufffd\ufffd\ufffd޷s\ufffd\ufffd\ufffd\ufffdt\ufffd^\\\ufffd֝\ufffd\ufffd8\ufffd\ufffdx{\ufffd^\ufffdƛ\ufffd\ufffdto\ufffd\ufffd\ufffd\ufffd\u001fk\ufffd\ufffdy\ufffd\ufffdm\ufffd_s]:\ufffd\ufffd\ufffdu\ufffd\ufffdw_|租sn\\\ufffdͼ\ufffd\ufffd^\ufffd\ufffd\ufffdٷ\ufffd\ufffdָ\ufffd}\ufffdi\ufffd9kֽק\u001e\ufffd\ufffdxq\ufffd\ufffd\ufffd\ufffd}\ufffd\ufffd\ufffd\ufffd}\u001eۍ9\ufffd\ufffd\ufffd\ufffd\ufffd\u001cm\ufffd\ufffdg\ufffdq\ufffd\\\ufffd\ufffd^\ufffdn\ufffd׮\ufffd\ufffdv\ufffd\ufffd6\ufffd\ufffd\ufffd^7ӭ6s\ufffd\u001d\ufffd\ufffd\ufffd}\ufffd|\ufffd\ufffd:\ufffdm\u001au\ufffd\ufffds\ufffd\ufffdߎ\ufffd\ufffd\ufffdZ{]\ufffd\ufffd\ufffdZ\ufffdO:o\ufffd}{\ufffd\ufffd\ufffd\ufffd\u001di\ufffd\ufffd\ufffd_6q\ufffd7\ufffd\ufffd\ufffd\ufffdM]\ufffd\ufffd\ufffd\ufffd޶\ufffdm}k\ufffd\ufffd٦\ufffd\ufffd\ufffd\ufffdo\ufffdZ\ufffd\ufffd\ufffdu\ufffdu\ufffd\ufffd5鮷sO8\ufffd\ufffd\ufffdi\ufffd\u001b\ufffd\ufffdx\ufffd\ufffd<\ufffd]^\ufffdm\u001dm\ufffd9k\ufffd\ufffd\ufffd\ufffdvy\ufffd<\ufffd\ufffd;i\ufffd\ufffdw\ufffd]{ʹk\ufffdu" res="&{CheckTx:fee:<> }"
I[05-31|12:16:06.915] WSJSONRPC                                    module=rpc-server protocol=websocket remote=[::1]:34704 method=broadcast_tx_async
I[05-31|12:16:06.915] Committed state                              module=state height=83 txs=282 appHash=E6DF020000000000
I[05-31|12:16:06.916] Recheck txs                                  module=mempool numtxs=206 height=83
I[05-31|12:16:06.916] Done rechecking txs                          module=mempool 
I[05-31|12:16:09.787] Timed out                                    module=consensus dur=3s height=83 round=0 step=RoundStepPropose
I[05-31|12:16:26.214] Ensure peers                                 module=p2p numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[05-31|12:16:26.214] No addresses to dial nor connected peers. Falling back to seeds module=p2p 
E[05-31|12:16:51.691] Failed to write ping                         module=rpc-server protocol=websocket remote=[::1]:34704 err="write tcp [::1]:46657->[::1]:34704: write: connection reset by peer"
I[05-31|12:16:51.691] Stopping wsConnection                        module=rpc-server protocol=websocket remote=[::1]:34704 impl=wsConnection
I[05-31|12:16:51.691] Served RPC HTTP response                     module=rpc-server method=GET url=/websocket status=200 duration=54000 remoteAddr=[::1]:34704
D[05-31|12:16:56.214] Broadcast                                    module=p2p channel=56 msgBytes=BB4D4D3304
I[05-31|12:16:56.214] Ensure peers                                 module=p2p numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[05-31|12:16:56.214] No addresses to dial nor connected peers. Falling back to seeds module=p2p 

Also, the JSON-RPC API became unresponsive, i.e. curl localhost:46657/status and curl localhost:46657/abci_info never returned.
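
A hung `curl` makes it hard to tell "slow" from "stuck". One way to check this from a script is to probe the endpoint with an explicit timeout. The sketch below uses only the Python standard library; the function name `rpc_responds` and the default timeout are choices made for this example, and the `/status` and `/abci_info` paths are the ones mentioned above.

```python
import http.client
import json
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the node is probed directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def rpc_responds(base_url, path="/status", timeout=5.0):
    """Return True if base_url + path answers HTTP 200 with a JSON body in time."""
    try:
        with _OPENER.open(base_url + path, timeout=timeout) as resp:
            json.loads(resp.read().decode("utf-8"))
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused connection, timeout, HTTP error status, or a non-JSON body
        # are all treated as "not responsive".
        return False
```

For the setup described here, calling `rpc_responds("http://localhost:46657", "/status")` and `rpc_responds("http://localhost:46657", "/abci_info")` before and after a tm-bench run would show whether the RPC server recovers or stays stuck.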

config.toml changes

create_empty_blocks = false

melekes commented Jun 1, 2018


kansi commented Jun 1, 2018

@melekes I tried it out with 0.19.7 and it seems to work fine with the following config.toml

[consensus]
max_block_size_txs = 2000
create_empty_blocks = false

[mempool]
size = 10000

My larger concern is understanding how Tendermint works: it seems to me that until the mempool cap is hit, Tendermint keeps growing the mempool. Only after the cap is hit does it switch context, create and commit a block, and then switch back to receiving more transactions until the mempool limit is reached again. Is my observation even correct?

If that is the case, I would like to understand why Tendermint does not take the opportunity to create a block as soon as max_block_size_txs transactions are available in the mempool.

Furthermore, I would also like to understand what recheck = true does. It seems that transactions which are in the mempool but didn't get included in the ongoing block are re-checked when they are included in the next block. If I set this option to false I see a considerable performance gain, but I wonder what the implications are of having recheck = false and rejecting a transaction during DeliverTx?

@ebuchman ebuchman added the mempool label Jun 1, 2018


olegomano commented Jun 1, 2018

@melekes

Hey, we upgraded to Tendermint 0.19.7. We are still experiencing the same issue, but now it runs longer before locking up.

48C7 /91AD2ED010A5.../ @ 2018-06-01T17:51:03.331Z}" prevotes="VoteSet{H:794 R:1 T:1 +2/3:209289A448C7BF864CC99F1DB44C4B6FAF3A3ADB:1:66E21A9E13B3(1) BA{1:x} map[]}"

I[06-01|17:51:10.882] Updating ValidBlock because of POL.          module=consensus validRound=0 POLRound=1

I[06-01|17:51:11.175] enterPrecommit(794/1). Current: 794/1/RoundStepPrevote module=consensus height=794 round=1

I[06-01|17:51:12.968] enterPrecommit: +2/3 prevoted proposal block. Locking module=consensus height=794 round=1 hash=209289A448C7BF864CC99F1DB44C4B6FAF3A3ADB

I[06-01|17:51:19.470] Ensure peers                                 module=p2p numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10

I[06-01|17:51:20.435] No addresses to dial nor connected peers. Falling back to seeds module=p2p

I[06-01|17:51:20.762] Signed and pushed vote                       module=consensus height=794 round=1 vote="Vote{0:76F91525228F 794/01/2(Precommit) 209289A448C7 /BDE56BCE47FA.../ @ 2018-06-01T17:51:16.001Z}" err=null

ebuchman commented Jun 2, 2018

> it seems to me that until the mempool cap is hit, Tendermint keeps growing the mempool.

This should not be the case. Tendermint should be making blocks as soon as txs are available.

I'm having trouble replicating anything like the mempool growing unbounded. Certainly with too many txs through tm-bench the RPC will time out, but it then becomes immediately responsive again (i.e. I can re-run tm-bench and Tendermint starts making blocks again). Do you see the same behaviour?
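
To make the expected behaviour concrete, here is a toy Python model of the interaction between the mempool cap (`size`), `max_block_size_txs` and `create_empty_blocks` from the config discussed in this thread. It is only a sketch: `ToyMempool` and `propose_block` are hypothetical names, not Tendermint APIs, and the model collapses reaping and committing into one step.

```python
from collections import deque


class ToyMempool:
    """Bounded FIFO of pending transactions (hypothetical, not Tendermint code)."""

    def __init__(self, size):
        self.size = size
        self.txs = deque()

    def check_tx(self, tx):
        """Admit tx if there is room; return whether it was accepted."""
        if len(self.txs) >= self.size:
            return False
        self.txs.append(tx)
        return True

    def reap(self, max_txs):
        """Remove and return up to max_txs transactions, oldest first."""
        count = min(max_txs, len(self.txs))
        return [self.txs.popleft() for _ in range(count)]


def propose_block(mempool, max_block_size_txs, create_empty_blocks):
    """Return the txs for the next block, or None if no block is proposed."""
    if not mempool.txs and not create_empty_blocks:
        return None
    return mempool.reap(max_block_size_txs)


pool = ToyMempool(size=10000)
for i in range(2500):
    pool.check_tx(f"tx{i}")

blocks = []
while (block := propose_block(pool, 2000, create_empty_blocks=False)) is not None:
    blocks.append(len(block))

print(blocks)  # [2000, 500]
```

In this model a block is proposed whenever at least one transaction is pending, well before the 10000-entry cap is reached, and the second block is only partially full.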

> transactions which are in the mempool but didn't get included in the ongoing block are re-checked when they are included in the next block.

Right - committing a block with transactions can change the validity of txs in the mempool. It's really up to you and your app how important the recheck parameter is. It's fine to just include the txs and have them return errors during DeliverTx. Currently, with the mempool, we guarantee that an honest/correct proposer will never include transactions that will error in DeliverTx.
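
The trade-off can be illustrated with a toy application in which a transaction spends from an account balance. This is a minimal Python sketch, not Tendermint or ABCI code: the functions `check_tx`, `deliver_block` and `run` are made up for the example, and one transaction per block is assumed so that the effect of rechecking between blocks is visible.

```python
def check_tx(balances, tx):
    """A tx (sender, amount) is valid if the sender can currently afford it."""
    sender, amount = tx
    return balances.get(sender, 0) >= amount


def deliver_block(balances, block):
    """Apply txs in order and return the ones that errored during delivery."""
    failed = []
    for sender, amount in block:
        if balances.get(sender, 0) >= amount:
            balances[sender] -= amount
        else:
            failed.append((sender, amount))
    return failed


def run(recheck, max_block_size_txs=1):
    """Process two competing txs; return (txs that failed on delivery, balances)."""
    balances = {"alice": 10}
    incoming = [("alice", 7), ("alice", 5)]
    # Both txs pass the initial check against the starting balance of 10.
    mempool = [tx for tx in incoming if check_tx(balances, tx)]
    failures = []
    while mempool:
        block = mempool[:max_block_size_txs]
        mempool = mempool[max_block_size_txs:]
        failures += deliver_block(balances, block)
        if recheck:
            # After the commit, drop txs that are no longer valid.
            mempool = [tx for tx in mempool if check_tx(balances, tx)]
    return failures, balances


print(run(recheck=True))   # ([], {'alice': 3})
print(run(recheck=False))  # ([('alice', 5)], {'alice': 3})
```

With rechecking, the second transaction is evicted from the mempool once the first one is committed. Without it, the transaction still lands in a block and errors on delivery; the final state is the same, but the block carries a failed transaction.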


ldmberman commented Jun 12, 2018

I believe this issue can be closed.

> I am running a single Tendermint node under BigchainDB for testing and evaluation purposes, using the standard docker-compose setup provided on the BigchainDB GitHub page

This was related not to Tendermint but to a weird bug in docker-compose.

> Running the tm-bench tool to simulate a heavy use scenario will consistently put Tendermint into the unresponsive state

This should be fixed by #1673.

@olegomano could you perhaps double-check if you can still reproduce any of it?


xla commented Jun 12, 2018

Please re-open if still an issue.

@xla xla closed this Jun 12, 2018
