Report Skeleton + Result for homogeneous network (tendermint#291)
* Report Skeleton

* First batch of results

* All metrics in

* Develop introductory sections

* Polishing the text
sergio-mena committed Feb 9, 2023
1 parent d82bc77 commit b6571c0
Showing 11 changed files with 289 additions and 0 deletions.
289 changes: 289 additions & 0 deletions docs/qa/v034/README.md
@@ -276,3 +276,292 @@ transactions, via RPC, from the load runner process.
Date: 2022-10-10

Version: a28c987f5a604ff66b515dd415270063e6fb069d

# v0.34.x - From Tendermint Core to CometBFT

This section reports on the QA process we followed before releasing the first `v0.34.x` version
from our CometBFT repository.

The changes with respect to the last version of `v0.34.x`
(namely `v0.34.26`, released from the Informal Systems' Tendermint Core fork)
are minimal, and focus on rebranding Tendermint Core to CometBFT at places
where there is no substantial risk of breaking compatibility
with earlier Tendermint Core versions of `v0.34.x`.

Indeed, CometBFT versions of `v0.34.x` (`v0.34.27` and subsequent) should fulfill
the following compatibility-related requirements.

* Operators can easily upgrade a `v0.34.x` version of Tendermint Core to CometBFT.
* Upgrades from Tendermint Core to CometBFT can be uncoordinated for versions of the `v0.34.x` branch.
* Nodes running CometBFT must be interoperable with those running Tendermint Core in the same chain,
as long as all are running a `v0.34.x` version.

These QA tests focus on the third bullet, whereas the first two bullets are tested using our _e2e tests_.

It would be prohibitively time-consuming to test mixed networks of all combinations of existing `v0.34.x`
versions, combined with the CometBFT release candidate under test.
Therefore our testing focuses on the last Tendermint Core version (`v0.34.26`) and the CometBFT release
candidate under test.

We only run the _200 node test_, and not the _rotating node test_.
Since the changes to the system's logic are minimal, we are interested in these performance requirements:

* The CometBFT release candidate under test performs similarly to Tendermint Core
  * when used at scale (i.e., in a large network of CometBFT nodes)
  * when used at scale in a mixed network (i.e., some nodes are running CometBFT
    and others are running an older Tendermint Core version)

Therefore we carry out a complete run of the _200-node test_ on the following networks:

* A homogeneous 200-node testnet, where all nodes are running the CometBFT release candidate under test.
* A mixed network where 1/3 of the nodes are running the CometBFT release candidate under test,
and the rest are running Tendermint Core `v0.34.26`.
* A mixed network where 2/3 of the nodes are running the CometBFT release candidate under test,
and the rest are running Tendermint Core `v0.34.26`.

## 200 Node Testnet

TODO: Get rid of this level of subsection (as there is no rotating node test).
Not doing it now to save merge conflicts to Lásaro and Jasmina

### Saturation Point

As the CometBFT release candidate under test has minimal changes
with respect to Tendermint Core `v0.34.26`, other than the rebranding changes,
we can confidently reuse the results from the `v0.34.x` baseline test regarding
the [saturation point](#finding-the-saturation-point).

Therefore, we will simply use a load of `r=200,c=2`
(see the explanation [here](#finding-the-saturation-point)).
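
As a rough illustration of what this load means, the sketch below computes the offered transaction rate.
It assumes, following the explanation linked above, that `c` is the number of connections opened by the load runner
and `r` is the number of transactions sent per second on each connection.
The function name `offered_load` is hypothetical and not part of any tool.

```python
def offered_load(rate_per_connection: int, connections: int) -> int:
    """Transactions per second offered to the network.

    Assumption: each connection sends `rate_per_connection` transactions
    per second, so the total is the product of the two parameters.
    """
    if rate_per_connection < 0 or connections < 0:
        raise ValueError("rate and connections must be non-negative")
    return rate_per_connection * connections


# Load used in this report: r=200, c=2.
tx_per_second = offered_load(rate_per_connection=200, connections=2)
tx_per_minute = tx_per_second * 60

print(f"offered load: {tx_per_second} tx/s, {tx_per_minute} tx/min")
```

This is the load offered by the load runner under the stated assumption, not the rate at which the network processes transactions.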

### Examining latencies

In this section and the remaining ones, we provide the results of the _200 node test_.
Each section is divided into three parts,
reporting on the homogeneous network (all CometBFT nodes),
mixed network with 1/3 of Tendermint Core nodes,
and mixed network with 2/3 of Tendermint Core nodes.

On each of the three networks, the experiment consists of 4 or 5 runs, with the goal
of ensuring that the data obtained is consistent.
On each of the networks, we pick only one representative run,
and present the results for that run.

#### CometBFT Homogeneous network

![latencies](./img/v034_200node_homog_latencies.png)

TODO: Explain


#### 1/3 Tendermint Core - 2/3 CometBFT

TODO

#### 2/3 Tendermint Core - 1/3 CometBFT

TODO

#### Prometheus Metrics

This section reports on the key Prometheus metrics extracted from the experiments.

* For the CometBFT homogeneous network, we choose to present the third run
(see the latencies section above), as its latency data is representative, and
it contains the maximum latency of all runs (worst case scenario).
* For the mixed network with 1/3 of nodes running Tendermint Core `v0.34.26`
and 2/3 running CometBFT.
TODO
* For the mixed network with 2/3 of nodes running Tendermint Core `v0.34.26`
and 1/3 running CometBFT.
TODO

##### Mempool Size

For reference, the plots below correspond to the baseline results.
The first shows the evolution over time of the cumulative number of transactions
inside all full nodes' mempools at a given time.

![mempool-cumulative](./img/v034_r200c2_mempool_size.png)

The second one shows the evolution of the average over all full nodes, which oscillates between 1500 and 2000
outstanding transactions.

![mempool-avg](./img/v034_r200c2_mempool_size_avg.png)

###### CometBFT Homogeneous network

The mempool size was as stable at all full nodes as in the baseline.
These are the corresponding plots for the homogeneous network test.

![mempool-cumulative-homogeneous](./img/v034_homog_mempool_size.png)

![mempool-avg-homogeneous](./img/v034_homog_mempool_size_avg.png)

###### 1/3 Tendermint Core - 2/3 CometBFT

TODO

###### 2/3 Tendermint Core - 1/3 CometBFT

TODO

##### Peers

The plot below corresponds to the baseline results, for reference.
It shows the stability of peers throughout the experiment.
Seed nodes typically have a higher number of peers.
The fact that non-seed nodes reach more than 50 peers is due to
[#9548](https://github.com/tendermint/tendermint/issues/9548).

![peers](./img/v034_r200c2_peers.png)

###### CometBFT Homogeneous network

The plot below shows the result for the homogeneous network.
It is very similar to the baseline, the only difference being that
the seed nodes seem to lose peers in the middle of the experiment.
However, this cannot be attributed to the differences in the code,
which are mainly rebranding.

![peers-homogeneous](./img/v034_homog_peers.png)

###### 1/3 Tendermint Core - 2/3 CometBFT

TODO

###### 2/3 Tendermint Core - 1/3 CometBFT

TODO

##### Consensus Rounds per Height

For reference, this is the baseline plot.

![rounds](./img/v034_r200c2_rounds.png)


###### CometBFT Homogeneous network

Most heights took just one round. Some nodes needed to advance to round 1 at various moments,
and a few nodes even needed to advance to the third round at one point.
This coincides with the time at which we observed the biggest peak in mempool size
on the corresponding plot, shown above.

![rounds-homogeneous](./img/v034_homog_rounds.png)

###### 1/3 Tendermint Core - 2/3 CometBFT

TODO

###### 2/3 Tendermint Core - 1/3 CometBFT

TODO

##### Blocks Produced per Minute, Transactions Processed per Minute

The blocks produced per minute are the slope of this plot, which corresponds to the baseline results.

![heights](./img/v034_r200c2_heights.png)

The transactions processed per minute are the slope of this plot,
which, again, corresponds to the baseline results.

![total-txs](./img/v034_r200c2_total-txs.png)

###### CometBFT Homogeneous network

![heights-homogeneous](./img/v034_homog_heights.png)

Over a period of 2 minutes and 4 seconds, the height goes from 251 to 295.
This results in an average of 21.3 blocks produced per minute.

![total-txs-homogeneous](./img/v034_homog_total-txs.png)

Over a period of 1 minute and 45 seconds (adjusted time window),
the total goes from 70201 to 104537 transactions,
resulting in 19620 transactions per minute.
This is similar to the baseline.
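
The two rates above are average slopes of the corresponding plots.
The sketch below reproduces the arithmetic from the figures quoted in this section;
the helper name `rate_per_minute` is hypothetical and only serves the illustration.

```python
def rate_per_minute(start_value: float, end_value: float, elapsed_seconds: float) -> float:
    """Average increase per minute of a counter over a time window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (end_value - start_value) * 60 / elapsed_seconds


# Height goes from 251 to 295 over 2 minutes and 4 seconds.
blocks_per_minute = rate_per_minute(251, 295, 2 * 60 + 4)

# Total transactions go from 70201 to 104537 over 1 minute and 45 seconds.
txs_per_minute = rate_per_minute(70201, 104537, 1 * 60 + 45)

print(f"blocks per minute: {blocks_per_minute:.1f}")
# The figure reported above is this value with the fractional part dropped.
print(f"transactions per minute: {int(txs_per_minute)}")
```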

###### 1/3 Tendermint Core - 2/3 CometBFT

TODO

###### 2/3 Tendermint Core - 1/3 CometBFT

TODO

##### Memory Resident Set Size

Reference plot for Resident Set Size (RSS) of all monitored processes.

![rss](./img/v034_r200c2_rss.png)

And this is the baseline average plot.

![rss-avg](./img/v034_r200c2_rss_avg.png)

###### CometBFT Homogeneous network

This is the plot for the homogeneous network, which is slightly more stable than the baseline over
the time of the experiment.

![rss-homogeneous](./img/v034_homog_rss.png)

And this is the average plot. It oscillates around 560 MiB, which is noticeably lower than the baseline.

![rss-avg-homogeneous](./img/v034_homog_rss_avg.png)

###### 1/3 Tendermint Core - 2/3 CometBFT

TODO

###### 2/3 Tendermint Core - 1/3 CometBFT

TODO

##### CPU utilization

This is the baseline `load1` plot, for reference.

![load1](./img/v034_r200c2_load1.png)

###### CometBFT Homogeneous network

![load1-homogeneous](./img/v034_homog_load1.png)

As in the baseline, `load1` stays below 5 in most cases.

###### 1/3 Tendermint Core - 2/3 CometBFT

TODO

###### 2/3 Tendermint Core - 1/3 CometBFT

TODO

### Test Results

#### CometBFT Homogeneous network

**Result: PASS**

Date: 2023-02-08

Version: 3b783434f26b0e87994e6a77c5411927aad9ce3f

#### 1/3 Tendermint Core - 2/3 CometBFT

**Result: ????**

Date: 2023-02-08

Version: xxxxxxxxxxxxxxxxx

#### 2/3 Tendermint Core - 1/3 CometBFT

**Result: ????**

Date: 2023-02-08

Version: xxxxxxxxxxxxxxxxx
Binary file added docs/qa/v034/img/v034_200node_homog_latencies.png
Binary file added docs/qa/v034/img/v034_homog_heights.png
Binary file added docs/qa/v034/img/v034_homog_load1.png
Binary file added docs/qa/v034/img/v034_homog_mempool_size.png
Binary file added docs/qa/v034/img/v034_homog_mempool_size_avg.png
Binary file added docs/qa/v034/img/v034_homog_peers.png
Binary file added docs/qa/v034/img/v034_homog_rounds.png
Binary file added docs/qa/v034/img/v034_homog_rss.png
Binary file added docs/qa/v034/img/v034_homog_rss_avg.png
Binary file added docs/qa/v034/img/v034_homog_total-txs.png