
Tune RPS of eth_getTransactionReceipt #11090

@tholcman


System information

Erigon version (./erigon --version):
erigon version 2.60.3-1f73ed55

The same behavior was present in older versions (2.59.x, 2.60.x) as well.

OS & Version: Linux, docker image thorax/erigon:v2.60.3
erigon, prysm, and rpcdaemon run in one Kubernetes pod as the only workload on the node (plus some system services)

Erigon Command (with flags/config):

erigon --datadir /data/erigon --authrpc.jwtsecret /data/shared/jwt_secret --private.api.addr 127.0.0.1:7458 --port 31782 --nat extip:x.x.x.x --p2p.allowed-ports yyyyy,zzzzz --batchSize 512M --chain mainnet --metrics --metrics.addr 0.0.0.0 --metrics.port 9005 --authrpc.port 8551 --db.size.limit 7TB --db.pagesize 16KB --bodies.cache 6GB --maxpeers 300 --prune=

RPCDaemon flags:

rpcdaemon \
  --datadir /data/erigon \
  --db.read.concurrency=48 \
  --http.addr 0.0.0.0 --http.api eth,erigon,web3,net,debug,trace,txpool --http.port 8335 --http.vhosts * --http.timeouts.read 65s --rpc.batch.limit 400 --rpc.batch.concurrency 4 --rpc.returndata.limit 100000000 --ws \
  --txpool.api.addr localhost:7458 \
  --verbosity=debug --pprof --pprof.port=6062 --pprof.addr=0.0.0.0 --metrics --metrics.addr 0.0.0.0 --metrics.port 9007

# for --db.read.concurrency we have tried many different values: 1, 2, 16, 24, 256, 1000, 10000
# we have also tried --private.api.addr localhost:7458

We have also tried omitting --db.read.concurrency and setting GOMAXPROCS to many different values; no change.

Consensus Layer: prysm

Chain/Network: ethereum mainnet

HW Specs:
GCP n4-standard-16 = 16 vCPUs, 64 GB RAM
disk: 4 TB (Hyperdisk Balanced), 36,000 IOPS, 500 MB/s

Actual behaviour

We are running performance tests that call eth_getTransactionReceipt with many different transaction hashes, and it looks like rpcdaemon is doing something synchronously, or in one thread only:

# transactions.txt contains 32k transaction hashes, some from the latest blocks and some from blocks in the last 1/3 of the chain

$ cat transactions.txt | shuf | jq --raw-input -rcM '{"id": 1,"jsonrpc": "2.0","method": "eth_getTransactionReceipt","params": [ . ]}' | jq -rscM 'map({method: "POST", url: "http://127.0.0.1:8335", body: . | @base64 , header: {"Content-Type": ["application/json"]}}) | .[]' | vegeta attack -format json -rate=0 -max-workers 1 -duration 30s -timeout 1s| vegeta report
Requests      [total, rate, throughput]         723, 24.05, 24.04
Duration      [total, attack, wait]             30.076s, 30.068s, 8.641ms
Latencies     [min, mean, 50, 90, 95, 99, max]  586.332µs, 41.594ms, 35.886ms, 83.867ms, 93.202ms, 116.626ms, 148.232ms
Bytes In      [total, mean]                     1490000, 2060.86
Bytes Out     [total, mean]                     101943, 141.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:723
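For reference, the JSON-RPC body that the jq step in the pipeline above produces for each hash looks like this (the hash value below is a placeholder for illustration, not one of our real transaction ids):

```shell
# Build one eth_getTransactionReceipt request body from a tx hash
# (the hash is a placeholder)
jq --null-input -cM --arg h "0xdeadbeef" \
  '{id: 1, jsonrpc: "2.0", method: "eth_getTransactionReceipt", params: [$h]}'
# → {"id":1,"jsonrpc":"2.0","method":"eth_getTransactionReceipt","params":["0xdeadbeef"]}
```

The same body can be POSTed directly with curl (Content-Type: application/json) against http://127.0.0.1:8335 to sanity-check a single call outside vegeta.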

to emphasize:

$ cat ... | vegeta ... -max-workers 1 ...
Requests      [total, rate, throughput]         723, 24.05, 24.04
Latencies     [min, mean, 50, 90, 95, 99, max]  586.332µs, 41.594ms, 35.886ms, 83.867ms, 93.202ms, 116.626ms, 148.232ms

= 24 RPS

If we increase the concurrency of the test, throughput goes up very slightly, but responses start to slow down significantly:

$ cat ... | vegeta attack ... -max-workers 5 ... 
Requests      [total, rate, throughput]         1029, 34.27, 34.09
Latencies     [min, mean, 50, 90, 95, 99, max]  500.26µs, 146.413ms, 142.153ms, 274.241ms, 312.796ms, 373.658ms, 483.553ms

= 34 RPS

With more workers (10, 20, ...) throughput stays at roughly the same ~35 RPS.
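The measured numbers are consistent with requests being handled one at a time: with N in-flight workers, throughput caps at roughly N divided by the mean latency, which is exactly what the two runs above show. A back-of-envelope check (plain arithmetic, not Erigon-specific):

```shell
# 1 worker at 41.6 ms mean latency -> ~24 RPS (matches the report)
awk 'BEGIN { printf "%.0f\n", 1 / 0.0416 }'   # → 24
# 5 workers at 146 ms mean latency -> ~34 RPS (matches the report)
awk 'BEGIN { printf "%.0f\n", 5 / 0.146 }'    # → 34
```

Latency growing roughly in proportion to the worker count while throughput stays flat is the signature of a serialized bottleneck.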

If we try to hold a fixed request rate, the API slows down and eventually starts failing:

$ cat ... | vegeta attack ... -rate=200 -timeout 1s
Requests      [total, rate, throughput]         3000, 100.03, 6.61
Latencies     [min, mean, 50, 90, 95, 99, max]  848.307µs, 990.396ms, 1s, 1.001s, 1.001s, 1.001s, 1.002s
Success       [ratio]                           6.83%
Status Codes  [code:count]                      0:2795  200:205

The daemon is running 22 threads:

ps -T -p $RPCDAEMON_PID | wc -l
23

CPU utilization goes from ~5% (idle) to ~10-14% during the tests.
Disk:
  peak write IOPS ~15k, unchanged during the test
  peak read IOPS ~1.5k (idle) -> ~3k (during the tests)
Both are far below the provisioned limit of 36k IOPS (we actually ran an additional fio workload during the tests and reached ~70k peak IOPS).

The daemon is nowhere near using all the available hardware.

Expected behaviour

With increasing client concurrency I would expect throughput to increase approximately linearly until the hardware is saturated, i.e. the daemon should take advantage of parallelism.
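Concretely, if requests were served in parallel, the ~41.6 ms single-worker service time measured above would predict roughly linear scaling until CPU or disk saturates (illustrative arithmetic only):

```shell
# Ideal linear scaling from the ~41.6 ms single-worker service time
for n in 1 2 5 10 20; do
  awk -v n="$n" 'BEGIN { printf "%2d workers -> ~%.0f RPS expected\n", n, n / 0.0416 }'
done
```

Instead, observed throughput flattens at ~35 RPS from 5 workers upward.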

pprof CPU profile captured during the -max-workers 10 test (attached).
