
PBTS: experimental values to be evaluated in QA experiments #2323

Closed
1 task
Tracked by #2100 ...
cason opened this issue Feb 13, 2024 · 9 comments
Labels
metrics · pbts · qa (Quality assurance)
Milestone
2024-Q1

Comments

@cason
Contributor

cason commented Feb 13, 2024

Overview

To define good default synchrony parameters for PBTS, collecting some relevant metrics is essential.

Original issue: tendermint/tendermint#7202

Currently planned test cases

  1. QA/PBTS: Run 200-nodes test with v0.38.x's saturation point #2460 [Auxiliary run - not part of regular QA report]
  • Run on v1
  • PBTS enable height: 1
  • Run QA 200-node TC, but only with v0.38.x saturation point (r=200, c=2)
  • Run the experiment (90 secs each) 4-5 times
    • or just run the experiment for 3-4 mins
  • Using latency emulation
  • output: PBTS team can look at ProposalTimestampDifference to set default values
  2. QA/v1: Run 200-nodes test for final report without latency emulation #2461 [Part of regular QA report]
  • Run on v1
  • PBTS enable height: 1
  • Running saturation discovery
  • Duration ~40 mins
  • NOT Using latency emulation
  • output: QA report, 200-node test, with saturation point report section, and metrics report section (for sat point)
  • output: compare metrics between (v0.38.x, bft time) and (v1, PBTS)
  3. QA/v1: Run 200-nodes test for final report with latency emulation #2513 [Part of regular QA report]
  • Run on v1
  • PBTS enable height: 1
  • Running saturation discovery (mainly because with latencies the saturation point is likely to be different)
  • Duration ~40 mins
  • Using latency emulation
  • output: baseline for future releases (200-node without latency emulation is now deprecated)
  • output: compare metrics between (v1, no-lat-emulation) and (v1, lat-emulation)
  4. New test case: X nodes [PBTS specific - not part of the regular QA report]
  • Run on v1
  • PBTS enable height: 1
  • Duration ~Y mins
  • Using latency emulation
  • No Tx load
  • clock_skew enabled on: close to 1/3 of the voting power
  • clock_skew value: +5 seconds (i.e., 5 s in the future; see the timeliness sketch after this list)
  • output: average (or histogram of) number of rounds to decide, and block latency
  • This test case is a DRAFT:
    • Play around with e2e on docker (Daniel volunteered 😄)
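
For context on what these scenarios exercise: PBTS accepts a proposal as timely only if its timestamp is consistent with the receiver's local clock, up to the synchrony parameters (Precision and MessageDelay). The Go sketch below is an illustrative restatement of that check, useful for reasoning about the clock-skew scenario; it is not the code used in the experiments, and the concrete durations in main are made-up values.

```go
package main

import (
	"fmt"
	"time"
)

// isTimely sketches the PBTS timeliness predicate: a proposal with timestamp
// ts, received locally at time recv, is timely iff
//
//	ts - precision <= recv <= ts + messageDelay + precision
func isTimely(ts, recv time.Time, precision, messageDelay time.Duration) bool {
	lowerBound := ts.Add(-precision)               // tolerates proposer clocks ahead of ours
	upperBound := ts.Add(messageDelay + precision) // tolerates network delay plus clock drift
	return !recv.Before(lowerBound) && !recv.After(upperBound)
}

func main() {
	now := time.Now()
	// A proposer skewed +5s into the future (as in test case 4) only produces
	// timely proposals if Precision is large enough to absorb the skew.
	skewed := now.Add(5 * time.Second)
	fmt.Println(isTimely(skewed, now, 500*time.Millisecond, 2*time.Second)) // false
	fmt.Println(isTimely(skewed, now, 6*time.Second, 2*time.Second))        // true
}
```

With close to 1/3 of the voting power skewed, the interesting outputs are exactly the ones listed for test case 4: how many extra rounds untimely proposals cost, and the resulting block latency.
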
@cason cason added the metrics, qa (Quality assurance), and pbts labels Feb 13, 2024
@cason cason assigned hvanz and unassigned hvanz Feb 13, 2024
@cason cason changed the title QA: define relevant metrics for evaluating PBTS in a distributed environment PBTS: define relevant metrics to be evaluated in QA experiments Feb 13, 2024
@hvanz hvanz mentioned this issue Feb 21, 2024
10 tasks
@cason cason linked a pull request Feb 27, 2024 that will close this issue
4 tasks
@cason
Contributor Author

cason commented Feb 27, 2024

Some comments regarding the existing metrics:

  • QuorumPrevoteDelay and FullPrevoteDelay: is anyone using them? What is the value of the data they produce?
  • ProposalTimestampDifference: it is currently enabled only when PBTS is enabled.
    • I wonder why we don't enable it at all times, as it can provide useful data for chains that are considering switching to PBTS.
    • I don't see much point in creating two data sets, one for timely and one for untimely proposals. The bounds for timely proposals are known, as they are determined by consensus params. Moreover, considering the previous item, what value would we put when PBTS is disabled?

@sergio-mena
Contributor

@cason Regarding your comments on ProposalTimestampDifference: I fully agree with both.

@cason
Contributor Author

cason commented Feb 28, 2024

Ok, I will fix that in #2321

@cason
Contributor Author

cason commented Feb 29, 2024

This issue mixes experiments to be performed with metrics to be implemented/reviewed.

Should we break the concerns into different issues?

@cason cason changed the title PBTS: define relevant metrics to be evaluated in QA experiments PBTS: experimental values and metrics to be evaluated in QA experiments Feb 29, 2024
cason added a commit that referenced this issue Feb 29, 2024
Contributes to #2323.

Add several buckets to better track `ProposalTimestampDifference` in QA
experiments.

Buckets: `-Inf, -1.5, -1.0, -0.5, 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5,
4.0, 6.0, 8.0, 10.0, +Inf`

If they are too many, let me know.

---

#### PR checklist

- [ ] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
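
For readers following along, here is a minimal, hypothetical sketch of what that bucket layout captures, written directly against prometheus/client_golang rather than CometBFT's own metrics layer; the metric name and wiring are assumptions, and the -Inf/+Inf ends of the list in the commit correspond to the histogram's implicit open-ended buckets. The sign convention assumed is receive time minus proposal timestamp, so negative observations indicate a proposer clock ahead of the local one.

```go
package main

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// proposalTimestampDiff is an illustrative stand-in for the
// ProposalTimestampDifference metric (hypothetical name and registration),
// using the finite bucket bounds listed in the commit above.
var proposalTimestampDiff = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "proposal_timestamp_difference_seconds",
	Help:    "Difference in seconds between local receive time and the proposal's timestamp.",
	Buckets: []float64{-1.5, -1.0, -0.5, 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 6.0, 8.0, 10.0},
})

func init() {
	prometheus.MustRegister(proposalTimestampDiff)
}

// observeProposal records the timestamp difference for a received proposal.
// proposalTime is the timestamp in the proposal; receiveTime is when this
// node saw it. Negative values mean the proposal timestamp is in the future.
func observeProposal(proposalTime, receiveTime time.Time) {
	proposalTimestampDiff.Observe(receiveTime.Sub(proposalTime).Seconds())
}

func main() {
	now := time.Now()
	// Example: a proposal whose timestamp is 0.3 s in the past of our clock.
	observeProposal(now.Add(-300*time.Millisecond), now)
}
```
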
@cason
Contributor Author

cason commented Feb 29, 2024

Metrics for QA experiments for PBTS were updated by #2479.

@cason cason changed the title PBTS: experimental values and metrics to be evaluated in QA experiments PBTS: experimental values to be evaluated in QA experiments Feb 29, 2024
@cason
Contributor Author

cason commented Feb 29, 2024

This issue mixes experiments to be performed with metrics to be implemented/reviewed.

Should we break the concerns into different issues?

Created #2480 to track in-production metrics.

@cason
Contributor Author

cason commented Mar 28, 2024

Are we running scenario 4., namely experiments with clock skew?

@cason cason added this to the 2024-Q1 milestone Mar 28, 2024
@sergio-mena
Contributor

@hvanz and I experimented with (a somewhat similar version of) scenario 4 when we were troubleshooting several problems in our e2e nightlies, in particular in one testnet where the clock-skewed validators amounted to more than 1/3 of the voting power. When troubleshooting those runs, we could see that the adaptive mechanism we put in place (#2432) is doing its job properly.

I think we can close this issue, as the other three cases are being tracked by @hvanz in the Q1 tracking issue.

@cason
Contributor Author

cason commented Mar 28, 2024

OK, closing this issue, as experiment 4 was performed and it worked. We didn't have the goal of publishing its results, as it was a proof of concept.

@cason cason closed this as completed Mar 28, 2024