Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timeout params [Tracking issue] #2873

Open
2 of 4 tasks
melekes opened this issue Apr 23, 2024 · 7 comments
Open
2 of 4 tasks

timeout params [Tracking issue] #2873

melekes opened this issue Apr 23, 2024 · 7 comments
Labels
tracking A complex issue broken down into sub-problems
Milestone

Comments

@melekes
Copy link
Contributor

melekes commented Apr 23, 2024

[Discussion]

Summary

timeout_propose = "3s"
timeout_propose_delta = "500ms"
timeout_prevote = "1s"
timeout_prevote_delta = "500ms"
timeout_precommit = "1s"
timeout_precommit_delta = "500ms"
timeout_commit = "1s"

skip_timeout_commit = false

When speaking about timeout parameters, there are multiple concerns.

  1. We don't need that many parameters
  2. App developers want consistency across validators, even though params may vary depending on validator's network topology and runtime
  3. App developers want a way to set block intervals (think of it as if all validators set the same timeout_commit value).

Note there are other potential ways to align parameters in a network #1617.

Tasks

  1. melekes
  2. enhancement
  3. config consensus spec
    nenadmilosevic95

So (1) will be solved by 1,2,3. (3) - by 4. (2) - by social consensus & external tools #1617

@melekes melekes added the tracking A complex issue broken down into sub-problems label Apr 23, 2024
@sergio-mena
Copy link
Contributor

sergio-mena commented Apr 23, 2024

These are my thoughts:

  • Task 1. Fully agree
  • Task 2. Makes sense in principle, although the relations pointed out in this comment are inequations, so they don't really tell you the value of some params in terms of others, but rather bounds (i.e., min or max values). I recently heard @cason say something along these lines.
    • Something I do see clearly here is that I don't see any practical reason not to merge timeout_prevote and timeout_precommit into one single value. As the events these timeouts relate to are vote reception (same number, similar size)
  • Task 3. Fully agree with timeout_prevote_delta and timeout_precommit_delta (both relating to votes). Not 100% sure about timeout_propose_delta, as it has to do with a proposal and its block, which can be way bigger in size than votes.
  • Task 4. Fully agree

@adizere adizere added this to the 2024-Q2 milestone Apr 23, 2024
@melekes
Copy link
Contributor Author

melekes commented Apr 25, 2024

@josef-widder @milosevic how feasible it is to actually calculate values of timeout_prevote (1s) & timeout_precommit (1s) from the round delta (500ms) - tendermint/tendermint#2920 (comment)? As @sergio-mena pointed out they define "min" values. But this is exactly what we're looking for! "min" values, which if validators fail to agree in round 0, get increased in round 1.

@melekes
Copy link
Contributor Author

melekes commented Apr 25, 2024

cc @ebuchman ^

@cason
Copy link
Contributor

cason commented May 8, 2024

See #2547

@cason
Copy link
Contributor

cason commented May 8, 2024

Something I do see clearly here is that I don't see any practical reason not to merge timeout_prevote and timeout_precommit into one single value. As the events these timeouts relate to are vote reception (same number, similar size)

They don't have similar sizes when we have vote extensions......

@cason
Copy link
Contributor

cason commented May 8, 2024

So, jumping here later.

Timeouts should be based on the size of the messages they refer to.

timeout_propose is strictly dependent on the typical block size, it should be computed based on the maximum block size in a network. timeout_propose_delta is the correction for this value, when underestimated. This should also depend on the block size.

timeout_prevote is more an optimization than anything. It defines how much time we still wait, when failing to observe a 2/3+ quorum for the same value, for additional votes that might change our action (we may see a 2/3+ quorum if we wait a little longer). There is some important discussion regarding the removal of this timestamp, or at least ignoring it when we realize we can't get a 2/3+ quorum at all from the already received messages (typical case: 2/3+ prevotes for nil).

timeout_precommit is really relevant for liveness. It is not an optimization. Once we reach GST, i.e., the system respects the current timeouts, we need all correct validators to see a 2/3+ quorum of precommits in the case one correct validator has seen it. Without this, the system might incur in the hidden lock issue, which prevents progress.

Before vote extensions, there was no reason to consider that prevotes and precommits would have distinct latencies, as both are fixed-size messages. This changes with vote extensions.

@melekes
Copy link
Contributor Author

melekes commented May 17, 2024

melekes added a commit that referenced this issue May 17, 2024
melekes added a commit that referenced this issue May 17, 2024
instead of removing it
Refs #2926

#2873 (comment)

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tracking A complex issue broken down into sub-problems
Projects
Status: Todo
Development

No branches or pull requests

4 participants