This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

broken parity 2.0.5-stable - refuses to accept new transactions after ~500 tx [possibly a multi-threading issue, see new comments] #9582

Closed
drandreaskrueger opened this issue Sep 18, 2018 · 30 comments
Labels
Z0-unconfirmed 🤔 Issue might be valid, but it’s not yet known.
Milestone

Comments

@drandreaskrueger

drandreaskrueger commented Sep 18, 2018

I had to downgrade to parity v1.11.11 because parity 2.0.5 is broken

  • Parity Ethereum version: 2.0.5
  • Operating system: Linux
  • Installation: docker
  • Fully synchronized: does not apply here
  • Network: Aura, private
  • Restarted: yes

Please see the two runs summarized below:

actual behavior / v2.0.5-"stable"

https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run-11-on-amazon-t2large

block 3 | new #TX 153 / 2000 ms =  76.5 TPS_current | total: #TX  154 /  2.2 s =  71.0 TPS_average
block 4 | new #TX 327 / 4000 ms =  81.8 TPS_current | total: #TX  481 /  6.5 s =  74.5 TPS_average
block 5 | new #TX  25 / 4000 ms =   6.2 TPS_current | total: #TX  506 / 10.1 s =  50.0 TPS_average

block 6 | new #TX   0 / 42000 ms =   0.0 TPS_current | total: #TX  506 / 52.3 s =   9.7 TPS_average
block 7 | new #TX   0 / 68000 ms =   0.0 TPS_current | total: #TX  506 / 120.2 s =   4.2 TPS_average
block 8 | new #TX   0 / 6000 ms =   0.0 TPS_current | total: #TX  506 / 126.2 s =   4.0 TPS_average

There are 19495 more transactions hammered at the node, but none of them go through.
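For context, the TPS_current and TPS_average columns in these logs follow directly from the block data; a minimal sketch (function names are mine, not chainhammer's actual tps.py internals):

```python
def tps_current(new_txs: int, block_interval_ms: int) -> float:
    """Transactions mined in one block, per second of block interval."""
    return new_txs / (block_interval_ms / 1000.0)

def tps_average(total_txs: int, elapsed_s: float) -> float:
    """Cumulative transactions per second since the timer started."""
    return total_txs / elapsed_s

# Reproducing the first log line above: 153 new txs in a 2000 ms block
print(tps_current(153, 2000))  # 76.5
```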

expected behavior / v1.11.11

https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#how-to-downgrade-to-a-stable-stable-version

block 1 | new #TX 129 / 4000 ms =  32.2 TPS_current | total: #TX  130 /  4.0 s =  32.5 TPS_average
block 2 | new #TX 215 / 4000 ms =  53.8 TPS_current | total: #TX  345 /  8.3 s =  41.8 TPS_average
block 3 | new #TX 213 / 4000 ms =  53.2 TPS_current | total: #TX  558 / 12.3 s =  45.5 TPS_average
block 4 | new #TX 214 / 4000 ms =  53.5 TPS_current | total: #TX  772 / 16.2 s =  47.6 TPS_average
...
block 90 | new #TX 197 / 4000 ms =  49.2 TPS_current | total: #TX 19666 / 370.1 s =  53.1 TPS_average
block 91 | new #TX 225 / 4000 ms =  56.2 TPS_current | total: #TX 19891 / 374.1 s =  53.2 TPS_average
block 92 | new #TX 110 / 4000 ms =  27.5 TPS_current | total: #TX 20001 / 378.0 s =  52.9 TPS_average

block 93 | new #TX   0 / 90000 ms =   0.0 TPS_current | total: #TX 20001 / 468.0 s =  42.7 TPS_average
block 94 | new #TX   0 / 4000 ms =   0.0 TPS_current | total: #TX 20001 / 471.9 s =  42.4 TPS_average

v1.11.11 works just fine.

reproduce

Reproduce this with my public Amazon AMI image ... quickly in ~10 minutes:

https://gitlab.com/electronDLT/chainhammer/blob/master/reproduce.md#readymade-amazon-ami

@ghost

ghost commented Sep 18, 2018

@drandreaskrueger I think there is a similar issue with geth too, can you check it?

@drandreaskrueger
Author

drandreaskrueger commented Sep 18, 2018

no there is not.

Or rather, more precisely: No, I did not see a similar issue with geth.

@ghost

ghost commented Sep 18, 2018

@drandreaskrueger Have you checked whether geth purges transactions from the mining node's txpool?

Here is my issue for stale transactions on geth ethereum/go-ethereum#17679

It can happen when sending a lot of transactions from another node to the mining node.

@drandreaskrueger
Author

drandreaskrueger commented Sep 18, 2018

Thanks for bumping up my issue @eosclassicteam but ... second time now: You tend to create confusion and chaos with your posts in other people's issue threads. Your post is unrelated to this issue here. I don't have to "purge all transactions on local txpool" because I am always starting from scratch again. geth is a completely different client. And what I am describing does not happen in parity v1.11.11 but does happen in parity v2.0.5.

@ghost

ghost commented Sep 18, 2018

@drandreaskrueger 😂 Sorry for bothering you, do you have any contact info? Are you online on discord or Gitter?

@debris
Collaborator

debris commented Sep 18, 2018

@tomusdrw any ideas?

@debris debris added the Z0-unconfirmed 🤔 Issue might be valid, but it’s not yet known. label Sep 18, 2018
@5chdn 5chdn added this to the 2.2 milestone Sep 19, 2018
@drandreaskrueger
Author

Also tried with v2.0.6 (currently the newest stable) today:

cd ~/paritytech_parity-deploy
sudo ./clean.sh
docker kill $(docker ps -q); docker rm $(docker ps -a -q); docker rmi $(docker images -q)

ARGS="--db-compaction ssd --tracing off --gasprice 0 --gas-floor-target 100000000000 "
ARGS=$ARGS"--pruning fast --tx-queue-size 32768 --tx-queue-mem-limit 0 --no-warp "
ARGS=$ARGS"--jsonrpc-threads 8 --no-hardware-wallets --no-dapps --no-secretstore-http "
ARGS=$ARGS"--cache-size 4096 --scale-verifiers --num-verifiers 16 --force-sealing "

./parity-deploy.sh --nodes 4 --config aura --name myaura --geth $ARGS

jq ".engine.authorityRound.params.stepDuration = 5" deployment/chain/spec.json > tmp; mv tmp deployment/chain/spec.json
sed -i 's/parity:stable/parity:v2.0.6/g' docker-compose.yml

docker-compose up

and then a communication test (deploying contract, reading it, writing a state change):

cd ~/electronDLT_chainhammer && source py3eth/bin/activate
./deploy.py

which worked fine.

Then the benchmarking viewer:

./tps.py

and the send-20000-transactions task:

cd ~/electronDLT_chainhammer && source py3eth/bin/activate
./deploy.py notest; ./send.py threaded2 23
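The "threaded2 23" invocation fans the sends out across 23 worker threads. A hedged sketch of that pattern (not chainhammer's actual code; in a real run, send_one would wrap a web3 contract call):

```python
import queue
import threading

def hammer(send_one, num_txs: int, num_workers: int = 23) -> int:
    """Fire num_txs jobs through num_workers threads; return how many were sent."""
    jobs = queue.Queue()
    for i in range(num_txs):
        jobs.put(i)
    sent = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                i = jobs.get_nowait()
            except queue.Empty:
                return
            tx_hash = send_one(i)   # e.g. a JSON-RPC eth_sendTransaction call
            with lock:
                sent.append(tx_hash)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(sent)

# Stub sender stands in for the node; with a live chain this is the RPC call.
print(hammer(lambda i: "0xtx%d" % i, num_txs=100))  # 100
```

The point of the structure is that many threads hit the node's RPC endpoint concurrently, which is exactly the load pattern that triggers the stall described below.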

Good news

It starts off considerably faster than any of the older v1.x.y parity versions (see e.g. run15-run17). See below.

Bad news

But then it fails to digest any transaction after ~2000 transactions:

./tps.py 
versions: web3 4.3.0, py-solc: 2.1.0, solc 0.4.24+commit.e67f0147.Linux.gpp, testrpc 1.3.4, python 3.5.3 (default, Jan 19 2017, 14:11:04) [GCC 6.3.0 20170118]
web3 connection established, blockNumber = 20, node version string =  Parity-Ethereum//v2.0.6-stable-549e202-20180919/x86_64-linux-gnu/rustc1.29.0
first account of node is 0xF2cFf30d292a04CfEcFAC8D4427A6438490BadeB, balance is 0 Ether
nodeName: Parity, nodeType: Parity, consensus: ???, network: 17, chainName: myaura, chainId: 17
Block  20  - waiting for something to happen
(filedate 1537876315) last contract address: 0xc122e329A84D92a0D675a939e56f411d4b51A03e
(filedate 1537876475) new contract address: 0x03616253630ccFb1F283642Acf1036663d7736dF

blocknumber_start_here = 36
starting timer, at block 36 which has  1  transactions; at timecode 14276.339062377
block 36 | new #TX 578 / 5000 ms = 115.6 TPS_current | total: #TX  579 /  5.3 s = 110.2 TPS_average
block 37 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  579 / 10.1 s =  57.1 TPS_average
block 38 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX  906 / 15.4 s =  58.9 TPS_average
block 39 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 1233 / 20.3 s =  60.8 TPS_average
block 40 | new #TX 1077 / 5000 ms = 215.4 TPS_current | total: #TX 2310 / 25.3 s =  91.3 TPS_average
block 41 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 29.9 s =  77.2 TPS_average
block 42 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 35.1 s =  65.9 TPS_average
block 43 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 40.0 s =  57.8 TPS_average
block 44 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 45.1 s =  51.2 TPS_average
block 45 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 50.0 s =  46.2 TPS_average
block 46 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 54.9 s =  42.1 TPS_average
block 47 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 60.0 s =  38.5 TPS_average
block 48 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 64.9 s =  35.6 TPS_average
block 49 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 70.1 s =  33.0 TPS_average
block 50 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 74.9 s =  30.8 TPS_average
block 51 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 80.1 s =  28.8 TPS_average
block 52 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 85.0 s =  27.2 TPS_average
block 53 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 89.8 s =  25.7 TPS_average
block 54 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 95.0 s =  24.3 TPS_average
block 55 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 100.2 s =  23.1 TPS_average
block 56 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 105.1 s =  22.0 TPS_average
block 57 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 109.9 s =  21.0 TPS_average
block 58 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 115.1 s =  20.1 TPS_average
block 59 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 119.9 s =  19.3 TPS_average
block 60 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 125.1 s =  18.5 TPS_average
block 61 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 130.0 s =  17.8 TPS_average
block 62 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 135.1 s =  17.1 TPS_average
block 63 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 140.0 s =  16.5 TPS_average
block 64 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 144.9 s =  15.9 TPS_average
block 65 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 150.1 s =  15.4 TPS_average
block 66 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 154.9 s =  14.9 TPS_average
block 67 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 160.1 s =  14.4 TPS_average
block 68 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 164.9 s =  14.0 TPS_average
block 69 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 170.1 s =  13.6 TPS_average
block 70 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2310 / 175.2 s =  13.2 TPS_average

@drandreaskrueger
Author

drandreaskrueger commented Sep 25, 2018

Similar with v2.1.1 (currently the newest beta):

./tps.py 
versions: web3 4.3.0, py-solc: 2.1.0, solc 0.4.24+commit.e67f0147.Linux.gpp, testrpc 1.3.4, python 3.5.3 (default, Jan 19 2017, 14:11:04) [GCC 6.3.0 20170118]
web3 connection established, blockNumber = 2, node version string =  Parity-Ethereum//v2.1.1-beta-cb09330-20180919/x86_64-linux-gnu/rustc1.29.0
first account of node is 0x9aB435344522e522aCB4b2FBf66E6FA4A7052B19, balance is 0 Ether
nodeName: Parity, nodeType: Parity, consensus: ???, network: 17, chainName: myaura, chainId: 17
Block  2  - waiting for something to happen
(filedate 1537876475) last contract address: 0x03616253630ccFb1F283642Acf1036663d7736dF
(filedate 1537876930) new contract address: 0x13187e6fE1F2c1D3787Fa35014e0A63dBFEBedb7

blocknumber_start_here = 5
starting timer, at block 5 which has  1  transactions; at timecode 14731.21625647
block 5 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX  328 /  5.2 s =  62.6 TPS_average
block 6 | new #TX 130 / 5000 ms =  26.0 TPS_current | total: #TX  458 / 10.2 s =  45.1 TPS_average
block 7 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 15.0 s =  30.5 TPS_average
block 8 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 20.2 s =  22.7 TPS_average
block 9 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 25.1 s =  18.3 TPS_average
block 10 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 30.2 s =  15.2 TPS_average
block 11 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 35.1 s =  13.0 TPS_average
block 12 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 40.3 s =  11.4 TPS_average
block 13 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 45.1 s =  10.1 TPS_average
block 14 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 50.0 s =   9.2 TPS_average
block 15 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 55.2 s =   8.3 TPS_average
block 16 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX  458 / 60.0 s =   7.6 TPS_average

@drandreaskrueger
Author

In contrast, v1.11.11 keeps on accepting all transactions: "run 13"

https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run-13

./tps.py 
versions: web3 4.3.0, py-solc: 2.1.0, solc 0.4.24+commit.e67f0147.Linux.gpp, testrpc 1.3.4, python 3.5.3 (default, Jan 19 2017, 14:11:04) [GCC 6.3.0 20170118]
web3 connection established, blockNumber = 6, node version string =  Parity//v1.11.11-stable-cb03f38-20180910/x86_64-linux-gnu/rustc1.28.0
first account of node is 0xf82Efc32ad23C8e325068ad5E2692Bf9406b6945, balance is 0 Ether
nodeName: Parity, nodeType: Parity, consensus: ???, network: 17, chainName: myaura, chainId: 17
Block  6  - waiting for something to happen
(filedate 1537877975) last contract address: 0x7A15AB144203f50D78b22F4b660E0066D0C5B978
(filedate 1537878000) new contract address: 0xB3D9f2595a33942cdc320fc6E5D9C300e385bdCA

blocknumber_start_here = 8
starting timer, at block 8 which has  1  transactions; at timecode 15801.264819387
block 8 | new #TX 216 / 5000 ms =  43.2 TPS_current | total: #TX  217 /  5.2 s =  41.9 TPS_average
block 9 | new #TX 284 / 5000 ms =  56.8 TPS_current | total: #TX  501 / 10.4 s =  48.4 TPS_average
block 10 | new #TX 211 / 5000 ms =  42.2 TPS_current | total: #TX  712 / 15.2 s =  46.8 TPS_average
block 11 | new #TX 342 / 5000 ms =  68.4 TPS_current | total: #TX 1054 / 20.2 s =  52.3 TPS_average
block 12 | new #TX 219 / 5000 ms =  43.8 TPS_current | total: #TX 1273 / 25.4 s =  50.2 TPS_average
block 13 | new #TX 275 / 5000 ms =  55.0 TPS_current | total: #TX 1548 / 30.2 s =  51.2 TPS_average
block 14 | new #TX 278 / 5000 ms =  55.6 TPS_current | total: #TX 1826 / 35.5 s =  51.5 TPS_average
block 15 | new #TX 292 / 5000 ms =  58.4 TPS_current | total: #TX 2118 / 40.1 s =  52.9 TPS_average
block 16 | new #TX 195 / 5000 ms =  39.0 TPS_current | total: #TX 2313 / 45.3 s =  51.1 TPS_average
block 17 | new #TX 279 / 5000 ms =  55.8 TPS_current | total: #TX 2592 / 50.2 s =  51.7 TPS_average
block 18 | new #TX 275 / 5000 ms =  55.0 TPS_current | total: #TX 2867 / 55.4 s =  51.8 TPS_average
block 19 | new #TX 321 / 5000 ms =  64.2 TPS_current | total: #TX 3188 / 60.3 s =  52.9 TPS_average
block 20 | new #TX 228 / 5000 ms =  45.6 TPS_current | total: #TX 3416 / 65.2 s =  52.4 TPS_average
block 21 | new #TX 279 / 5000 ms =  55.8 TPS_current | total: #TX 3695 / 70.4 s =  52.5 TPS_average
block 22 | new #TX 280 / 5000 ms =  56.0 TPS_current | total: #TX 3975 / 75.3 s =  52.8 TPS_average
block 23 | new #TX 270 / 5000 ms =  54.0 TPS_current | total: #TX 4245 / 80.1 s =  53.0 TPS_average
block 24 | new #TX 218 / 5000 ms =  43.6 TPS_current | total: #TX 4463 / 85.3 s =  52.3 TPS_average
block 25 | new #TX 271 / 5000 ms =  54.2 TPS_current | total: #TX 4734 / 90.2 s =  52.5 TPS_average
block 26 | new #TX 279 / 5000 ms =  55.8 TPS_current | total: #TX 5013 / 95.4 s =  52.5 TPS_average
block 27 | new #TX 297 / 5000 ms =  59.4 TPS_current | total: #TX 5310 / 100.0 s =  53.1 TPS_average
block 28 | new #TX 261 / 5000 ms =  52.2 TPS_current | total: #TX 5571 / 105.5 s =  52.8 TPS_average
block 29 | new #TX 276 / 5000 ms =  55.2 TPS_current | total: #TX 5847 / 110.4 s =  53.0 TPS_average
block 30 | new #TX 207 / 5000 ms =  41.4 TPS_current | total: #TX 6054 / 115.3 s =  52.5 TPS_average
block 31 | new #TX 334 / 5000 ms =  66.8 TPS_current | total: #TX 6388 / 120.2 s =  53.1 TPS_average
block 32 | new #TX 221 / 5000 ms =  44.2 TPS_current | total: #TX 6609 / 125.1 s =  52.8 TPS_average
block 33 | new #TX 279 / 5000 ms =  55.8 TPS_current | total: #TX 6888 / 130.3 s =  52.9 TPS_average
block 34 | new #TX 278 / 5000 ms =  55.6 TPS_current | total: #TX 7166 / 135.5 s =  52.9 TPS_average
block 35 | new #TX 286 / 5000 ms =  57.2 TPS_current | total: #TX 7452 / 140.4 s =  53.1 TPS_average
... etc.

@drandreaskrueger
Author

I have new information for you.

I have tried again with your v2.2.0-nightly of 3 days ago.

It actually works fine, but only when hammering the node single-threaded.

It does not work (failing as described above) when hammering the node multi-threaded.

Will post more details soon.

@drandreaskrueger
Author

drandreaskrueger commented Oct 4, 2018

parity v2.2.0-rc6 on AWS 'free tier' t2.micro

The easiest way is to use my Amazon AMI.

It now also works on the "free tier" small machine t2.micro, i.e. at no cost.

Have a quick look here https://gitlab.com/electronDLT/chainhammer/blob/master/reproduce.md#readymade-amazon-ami
find my AMI --> https://eu-west-2.console.aws.amazon.com/ec2/v2/home?region=eu-west-2#Images:visibility=public-images;search=chainhammer;sort=name
and launch a machine from it

then

cd ~/paritytech_parity-deploy
sudo ./clean.sh
docker kill $(docker ps -q); docker rm $(docker ps -a -q); docker rmi $(docker images -q)

ARGS="--db-compaction ssd --tracing off --gasprice 0 --gas-floor-target 100000000000 "
ARGS=$ARGS"--pruning fast --tx-queue-size 32768 --tx-queue-mem-limit 0 --no-warp "
ARGS=$ARGS"--jsonrpc-threads 8 --no-hardware-wallets --no-dapps --no-secretstore-http "
ARGS=$ARGS"--cache-size 4096 --scale-verifiers --num-verifiers 16 --force-sealing "

./parity-deploy.sh --nodes 4 --config aura --name myaura --geth $ARGS

sed -i 's/parity:stable/parity:v2.2.0-rc6/g' docker-compose.yml

jq ".engine.authorityRound.params.stepDuration = 5" deployment/chain/spec.json > tmp; mv tmp deployment/chain/spec.json

docker-compose up

@drandreaskrueger
Author

drandreaskrueger commented Oct 4, 2018

For how to reproduce this, see the post above.

no-problem vs. problem = single-threaded versus multi-threaded

Compare the following two runs:

(1) single-threaded RPC calls ...

... are actually working fine !!!

All 20k transactions enter the chain if the chainhammer is started with

./deploy.py notest; ./send.py

See:

./tps.py 
versions: web3 4.7.2, py-solc: 3.1.0, solc 0.4.24+commit.e67f0147.Linux.gpp, testrpc 1.3.5, python 3.5.3 (default, Sep 27 2018, 17:25:39) [GCC 6.3.0 20170516]
web3 connection established, blockNumber = 5, node version string =  Parity-Ethereum//v2.2.0-nightly-ab61722-20181001/x86_64-linux-gnu/rustc1.29.0
first account of node is 0x3DcD1457f5F0a7717B63769A341308a84e393e9B, balance is 0 Ether
nodeName: Parity, nodeType: Parity, consensus: ???, network: 17, chainName: myaura, chainId: 17

Block  5  - waiting for something to happen
(filedate 1538676801) last contract address: 0x731a10897d267e19B34503aD902d0A29173Ba4B1
(filedate 1538677060) new contract address: 0xbe4bad4A14c3fcbA04Fa1B8C718a8f2CFDB0c82e

blocknumber_start_here = 9
starting timer, at block 9 which has  1  transactions; at timecode 3257.272526194
block 9 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX  328 /  5.6 s =  59.0 TPS_average
block 10 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX  655 / 10.6 s =  61.9 TPS_average
block 11 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX  982 / 15.8 s =  62.1 TPS_average
block 12 | new #TX 754 / 5000 ms = 150.8 TPS_current | total: #TX 1736 / 20.5 s =  84.8 TPS_average
block 13 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 1736 / 25.1 s =  69.2 TPS_average
block 14 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 2063 / 30.6 s =  67.4 TPS_average
block 15 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 2390 / 35.6 s =  67.2 TPS_average
block 16 | new #TX 988 / 5000 ms = 197.6 TPS_current | total: #TX 3378 / 40.5 s =  83.4 TPS_average
block 17 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 3378 / 45.2 s =  74.8 TPS_average
block 18 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 3705 / 50.7 s =  73.1 TPS_average
block 19 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 4032 / 55.7 s =  72.4 TPS_average
block 20 | new #TX 934 / 5000 ms = 186.8 TPS_current | total: #TX 4966 / 60.5 s =  82.1 TPS_average
block 21 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 4966 / 65.1 s =  76.3 TPS_average
block 22 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 5293 / 70.6 s =  75.0 TPS_average
block 23 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 5620 / 75.6 s =  74.4 TPS_average
block 24 | new #TX 911 / 5000 ms = 182.2 TPS_current | total: #TX 6531 / 80.5 s =  81.1 TPS_average
block 25 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 6531 / 85.2 s =  76.7 TPS_average
block 26 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 6858 / 90.7 s =  75.6 TPS_average
block 27 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 7185 / 95.6 s =  75.1 TPS_average
block 28 | new #TX 841 / 5000 ms = 168.2 TPS_current | total: #TX 8026 / 100.4 s =  79.9 TPS_average
block 29 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 8026 / 105.4 s =  76.2 TPS_average
block 30 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 8353 / 110.6 s =  75.5 TPS_average
block 31 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 8680 / 115.5 s =  75.1 TPS_average
block 32 | new #TX 834 / 5000 ms = 166.8 TPS_current | total: #TX 9514 / 120.5 s =  78.9 TPS_average
block 33 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 9514 / 125.1 s =  76.0 TPS_average
block 34 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 9841 / 130.7 s =  75.3 TPS_average
block 35 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 10168 / 135.6 s =  75.0 TPS_average
block 36 | new #TX 784 / 5000 ms = 156.8 TPS_current | total: #TX 10952 / 140.5 s =  78.0 TPS_average
block 37 | new #TX 248 / 5000 ms =  49.6 TPS_current | total: #TX 11200 / 145.5 s =  77.0 TPS_average
block 38 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 11527 / 150.7 s =  76.5 TPS_average
block 39 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 11854 / 155.6 s =  76.2 TPS_average
block 40 | new #TX 516 / 5000 ms = 103.2 TPS_current | total: #TX 12370 / 160.3 s =  77.2 TPS_average
block 41 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 12697 / 165.6 s =  76.7 TPS_average
block 42 | new #TX 287 / 5000 ms =  57.4 TPS_current | total: #TX 12984 / 170.5 s =  76.1 TPS_average
block 43 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 13311 / 175.8 s =  75.7 TPS_average
block 44 | new #TX 491 / 5000 ms =  98.2 TPS_current | total: #TX 13802 / 180.2 s =  76.6 TPS_average
block 45 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 13802 / 185.1 s =  74.6 TPS_average
block 46 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 14129 / 190.6 s =  74.1 TPS_average
block 47 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 14456 / 195.5 s =  73.9 TPS_average
block 48 | new #TX 739 / 5000 ms = 147.8 TPS_current | total: #TX 15195 / 200.5 s =  75.8 TPS_average
block 49 | new #TX 229 / 5000 ms =  45.8 TPS_current | total: #TX 15424 / 205.5 s =  75.1 TPS_average
block 50 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 15751 / 210.7 s =  74.8 TPS_average
block 51 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 16078 / 215.6 s =  74.6 TPS_average
block 52 | new #TX 468 / 5000 ms =  93.6 TPS_current | total: #TX 16546 / 220.2 s =  75.1 TPS_average
block 53 | new #TX 316 / 5000 ms =  63.2 TPS_current | total: #TX 16862 / 225.5 s =  74.8 TPS_average
block 54 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 17189 / 230.7 s =  74.5 TPS_average
block 55 | new #TX 296 / 4000 ms =  74.0 TPS_current | total: #TX 17485 / 235.4 s =  74.3 TPS_average
block 56 | new #TX 421 / 6000 ms =  70.2 TPS_current | total: #TX 17906 / 240.3 s =  74.5 TPS_average
block 57 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 17906 / 245.2 s =  73.0 TPS_average
block 58 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 18233 / 250.5 s =  72.8 TPS_average
block 59 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 18560 / 255.4 s =  72.7 TPS_average
block 60 | new #TX 642 / 5000 ms = 128.4 TPS_current | total: #TX 19202 / 260.2 s =  73.8 TPS_average
block 61 | new #TX 226 / 5000 ms =  45.2 TPS_current | total: #TX 19428 / 265.4 s =  73.2 TPS_average
block 62 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 19755 / 270.6 s =  73.0 TPS_average
block 63 | new #TX 246 / 5000 ms =  49.2 TPS_current | total: #TX 20001 / 275.2 s =  72.7 TPS_average
block 64 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 20001 / 280.1 s =  71.4 TPS_average
block 65 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 20001 / 284.9 s =  70.2 TPS_average

but:

(2) multi-threaded RPC calls with 23 workers ...

... i.e. if the chainhammer is started with

./deploy.py notest; ./send.py threaded2 23

then that leads to a refusal after a few thousand transactions:

./tps.py 

versions: web3 4.7.2, py-solc: 3.1.0, solc 0.4.24+commit.e67f0147.Linux.gpp, testrpc 1.3.5, python 3.5.3 (default, Sep 27 2018, 17:25:39) [GCC 6.3.0 20170516]
web3 connection established, blockNumber = 746, node version string =  Parity-Ethereum//v2.2.0-nightly-ab61722-20181001/x86_64-linux-gnu/rustc1.29.0
first account of node is 0x3DcD1457f5F0a7717B63769A341308a84e393e9B, balance is 0 Ether
nodeName: Parity, nodeType: Parity, consensus: ???, network: 17, chainName: myaura, chainId: 17

Block  746  - waiting for something to happen
(filedate 1538677060) last contract address: 0xbe4bad4A14c3fcbA04Fa1B8C718a8f2CFDB0c82e
(filedate 1538680790) new contract address: 0x5a9235856A9AC9808e82eD08F9A47931b5218826

blocknumber_start_here = 752
starting timer, at block 752 which has  1  transactions; at timecode 6987.320943314
block 752 | new #TX 273 / 5000 ms =  54.6 TPS_current | total: #TX  274 /  5.6 s =  48.8 TPS_average
block 753 | new #TX 501 / 5000 ms = 100.2 TPS_current | total: #TX  775 / 10.3 s =  75.4 TPS_average
block 754 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 1102 / 15.6 s =  70.6 TPS_average
block 755 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 1429 / 20.7 s =  69.1 TPS_average
block 756 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 1756 / 25.7 s =  68.2 TPS_average
block 757 | new #TX 695 / 5000 ms = 139.0 TPS_current | total: #TX 2451 / 30.3 s =  81.0 TPS_average
block 758 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 2451 / 34.9 s =  70.1 TPS_average
block 759 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 2778 / 40.5 s =  68.5 TPS_average
block 760 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 3105 / 45.8 s =  67.8 TPS_average
block 761 | new #TX 968 / 5000 ms = 193.6 TPS_current | total: #TX 4073 / 50.5 s =  80.7 TPS_average
block 762 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 4073 / 55.2 s =  73.8 TPS_average
block 763 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 4400 / 60.7 s =  72.4 TPS_average
block 764 | new #TX 327 / 5000 ms =  65.4 TPS_current | total: #TX 4727 / 65.7 s =  71.9 TPS_average
block 765 | new #TX 802 / 5000 ms = 160.4 TPS_current | total: #TX 5529 / 70.4 s =  78.6 TPS_average
block 766 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 75.0 s =  73.7 TPS_average
block 767 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 80.2 s =  68.9 TPS_average
block 768 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 85.1 s =  64.9 TPS_average
block 769 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 90.0 s =  61.4 TPS_average
block 770 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 94.9 s =  58.2 TPS_average
block 771 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 100.1 s =  55.2 TPS_average
block 772 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 105.0 s =  52.7 TPS_average
block 773 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 109.9 s =  50.3 TPS_average
block 774 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 115.1 s =  48.0 TPS_average
block 775 | new #TX   0 / 5000 ms =   0.0 TPS_current | total: #TX 5529 / 120.0 s =  46.1 TPS_average
...

i.e. after 5529 transactions ... no new ones can enter the chain.

Afterwards the node does not accept anything, not even if I restart my experiments.

The only thing that helps is Ctrl-C and restarting docker-compose up.

Then e.g. the single-threaded chainhammering works fine again; I made a 2nd attempt, and again all 20k transactions went through.

(Just run my stuff yourself, it takes you only a few minutes.)

Hope that helps to give you an idea of how to fix this.
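The refusal pattern in these logs (blocks keep sealing but carry zero new transactions while the hammer is still firing) is easy to detect programmatically; a minimal sketch, with names of my own invention:

```python
def is_stalled(new_tx_per_block, threshold: int = 5) -> bool:
    """True once the last `threshold` blocks carried zero new transactions,
    while earlier blocks did carry some (i.e. the load generator was active)."""
    if len(new_tx_per_block) < threshold:
        return False
    tail = new_tx_per_block[-threshold:]
    head = new_tx_per_block[:-threshold]
    return any(head) and all(n == 0 for n in tail)

# Mirroring the multi-threaded run above: traffic, then only empty blocks.
print(is_stalled([273, 501, 327, 695, 0, 0, 0, 0, 0]))  # True
```

A watcher like this, fed from eth_getBlockByNumber transaction counts, could flag the stall automatically instead of eyeballing tps.py output.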

@drandreaskrueger drandreaskrueger changed the title broken parity 2.0.5-stable - refuses to accept new transactions after ~500 tx broken parity 2.0.5-stable - refuses to accept new transactions after ~500 tx [possibly a multi-threading issue, see new comments] Oct 4, 2018
@drandreaskrueger
Author

P.S.: The problem did not exist yet in parity v1; that is why I have run all my recent experiments with v1.11.11.

@5chdn 5chdn modified the milestones: 2.2, 2.3 Oct 29, 2018
@drandreaskrueger
Author

@ddorgan - you had seen this too when I visited you in the Berlin office.

Any new ideas on how to fix this? Then we could finally move up from v1.11.11 to parity v2.x.y?

@5chdn
Contributor

5chdn commented Dec 3, 2018

does this happen with 2.2.3?

cc @tomusdrw

@drandreaskrueger
Author

drandreaskrueger commented Feb 19, 2019

chainhammer v55

You can now try it all yourself; the newest version is so automated that you can run & analyze a whole experiment with one or two CLI lines. That should help you to find the flaw.

Because of more general interest, and useful for several open issues, I have created a new issue:
#10382

@5chdn
Contributor

5chdn commented Feb 20, 2019

Thanks for sharing. I would advise using random accounts to hammer the chain; this will prevent nonce hiccups.
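The suggestion in code form: give every worker its own sender account, so the per-account nonce sequences never interleave. The stand-in hex strings below are illustrative only; with web3.py one would generate real keypairs instead (assumption: an account-creation API is available in your web3 version):

```python
import secrets

def make_senders(n: int):
    """Illustrative stand-ins for n freshly generated sender addresses."""
    return ["0x" + secrets.token_hex(20) for _ in range(n)]

def next_nonce(nonces: dict, sender: str) -> int:
    """Each sender tracks its own nonce, so parallel workers never race
    over one shared account's nonce sequence."""
    nonce = nonces.get(sender, 0)
    nonces[sender] = nonce + 1
    return nonce

senders = make_senders(23)   # one account per hammer thread
nonces = {}
for s in senders:
    assert next_nonce(nonces, s) == 0   # every fresh account starts at nonce 0
print(len(set(senders)))  # 23
```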

@5chdn 5chdn closed this as completed Feb 20, 2019
@drandreaskrueger
Author

drandreaskrueger commented Feb 20, 2019

Thanks for sharing.

You're welcome. If my work can help to fix bugs and make parity faster, that would be great.

advise using random accounts to hammer the chain

Yes, for some cases this might be the way to go, for others not. However, IMHO it makes sense to benchmark the general case, not the special edge case. And geth or quorum don't need such a workaround.

closed this ...

Perhaps reopen this issue until the observed problem (parity v2.x.y just refusing to accept any more transactions after a few thousand tx) is actually fixed?

@jam10o-new
Contributor

@drandreaskrueger perhaps if it is still apparent after #10375 is merged?

@5chdn
Contributor

5chdn commented Feb 20, 2019

after #10375

this and maybe keep everything in one ticket #10382

@drandreaskrueger
Author

@joshua-mir yes, and especially the title of #10344 really sounds like what I had been seeing there.

@ddorgan
Collaborator

ddorgan commented Feb 20, 2019

@drandreaskrueger great to hear about a new chainhammer version, will check it out.

@drandreaskrueger
Author

Yes, it is muuuuuuch more automated. Looking forward to your feedback.

I ran it a few times yesterday, with parity v2.2.3 - and parity refused to accept new transactions even when firing just single-threaded (with my "sequential", not the "threaded2 20" algo).

So ... completely switching back to v1.11.11 for now: drandreaskrueger/chainhammer@f8f2760

But with the new chainhammer version v55, I am optimistic that one of you will find the cause and can easily run enough tests to get rid of this #9582 issue soon.

Good luck ;-)

@Pzixel
Contributor

Pzixel commented Feb 25, 2019

I still have issues with hung TXs. I need to investigate more to gather useful information, but from a user perspective: if I set too much parallelism, then I get multiple timeouts for everything (even for read-only eth_call!).

Thanks to this thread, I'm going to try 1.11.11

@jam10o-new
Contributor

@Pzixel have you checked out the recommendations here: #10382 (comment)

@Pzixel
Contributor

Pzixel commented Feb 25, 2019

Yep, I've bookmarked them. In a nutshell:

1. --gas-floor-target=40M - already set to 6B
2-3. Didn't try them yet, but I don't see how they would change anything. Parity answers RPC quite fast; the problem is with getting the tx mined and executing EVM code
4. Already set, as per "Parity RPC hangs after intensive tx generation" #10321
5. Minor thing that may save a couple of clock cycles but changes nothing
6. Gonna try this one. Not sure how it affects network behavior
7. Won't affect me since I use client-side nonce management for the same reasons, as per "Parity timeout on RPC after intensive work" #8829 (comment)
8. Already set at 5 sec
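The client-side nonce management mentioned above can be sketched as a small thread-safe counter: instead of asking the node for eth_getTransactionCount before every send (which races under parallel load), keep a local per-account counter. A hedged sketch, not any particular project's implementation:

```python
import itertools
import threading

class NonceManager:
    """Hands out strictly increasing nonces for one sender account,
    safe to call from many worker threads at once."""

    def __init__(self, start: int = 0):
        # `start` would normally be seeded once from eth_getTransactionCount
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            return next(self._counter)

nm = NonceManager(start=0)
print([nm.next() for _ in range(5)])  # [0, 1, 2, 3, 4]
```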

@tomusdrw
Collaborator

@Pzixel There is no scheduler for the calls, so if you spam the node with a huge number of requests, most of them will simply time out on the client side.
You need to increase or disable the timeout on the client side.
Have you tried using different transports than HTTP?
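That advice in code form: a raw JSON-RPC call with an explicit, generous client-side timeout. The endpoint URL and 120 s value below are illustrative, not from this thread:

```python
import json

def rpc_payload(method: str, params, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body for an Ethereum node."""
    return json.dumps({"jsonrpc": "2.0", "method": method,
                       "params": params, "id": request_id})

# With the `requests` library one would then do, for example:
#   requests.post("http://127.0.0.1:8545",
#                 data=rpc_payload("eth_blockNumber", []),
#                 headers={"Content-Type": "application/json"},
#                 timeout=120)   # raised client-side timeout
# or switch transports entirely (WebSocket/IPC) to avoid per-request
# HTTP connection overhead under heavy parallel load.

print(rpc_payload("eth_blockNumber", []))
```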

@Pzixel
Contributor

Pzixel commented Feb 25, 2019

@tomusdrw

Have you tried using different transports than HTTP?

I actually didn't, because when I was writing this code it wasn't supported, so I could check it with the new version.

However, I don't really think 1-5k rps is such a big load that requests should remain unanswered for 20 seconds. I may be wrong, so I'd like to hear your opinion on whether parity can handle this rps.

@deanstef

Hey @Pzixel, are you still having that issue? I am experiencing something similar in my 4-node network with this configuration: #10382 (comment)

If I spam the nodes with 1k req/s (simple RPC eth_sendTransaction over HTTP) I see a big degradation of the throughput, and a lot of requests simply get ignored.

@tomusdrw Do you have any updates on that?

@Pzixel
Contributor

Pzixel commented Jul 19, 2019

Under huge load parity eventually stops processing transactions. It just takes as many as specified in the max queue parameter in the toml and then hangs, in the sense that it never generates a block with those transactions, while new txs get rejected with a "queue limit exceeded" error message.

We didn't find any better approach than just rebooting all the mining nodes when this happens. Parity Tech seems to be focused on Substrate, so parity itself doesn't get much attention or support. Considering how bad ethereum is, it leads to a pretty sad development process.


8 participants