Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix poh recorder not flushing virtual ticks immediately #5277

Merged
merged 2 commits into from Jul 25, 2019

Conversation

@sagar-solana
Copy link
Contributor

commented Jul 25, 2019

Problem

The problem that likely took town the TDS cluster exposes it self like so,

  1. While timing out a no-show leader for more than 1 slot, poh recorder can build up enough virtual ticks such that a bug is exposed. virtual ticks can't flush since it has no bank.
  2. When the node becomes a leader, poh recorder is finally given a bank
  3. A tx goes through banking stage and passes the age check (this is the bug; there can be a tx using a blockhash that will be purged by the virtual ticks).
  4. banking stage calls record and poh flushes all the virtual ticks before recording the new tx (this is fine).
  5. Rest of the cluster sees BlockhashNotFound on any transaction that passed since flushing virtual ticks moved the tx's blockhash out of age range and the tx failed the age check when they were replaying the tx.
  6. all ledgers(including the leader's) will fail verify with BlockhashNotFound.

Summary of Changes

Make sure all virtual ticks are flushed as soon as poh recorder is given a bank.
This will block set_bank until the virtual ticks are written to the newly set bank.

@carllin

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2019

Yaaay. Now where's your integration test :P

@codecov

This comment has been minimized.

Copy link

commented Jul 25, 2019

Codecov Report

Merging #5277 into master will decrease coverage by 13.6%.
The diff coverage is 81.8%.

@@            Coverage Diff            @@
##           master   #5277      +/-   ##
=========================================
- Coverage    77.3%   63.6%   -13.7%     
=========================================
  Files         200     201       +1     
  Lines       37306   45372    +8066     
=========================================
+ Hits        28857   28881      +24     
- Misses       8449   16491    +8042
1 similar comment
@codecov

This comment has been minimized.

Copy link

commented Jul 25, 2019

Codecov Report

Merging #5277 into master will decrease coverage by 13.6%.
The diff coverage is 81.8%.

@@            Coverage Diff            @@
##           master   #5277      +/-   ##
=========================================
- Coverage    77.3%   63.6%   -13.7%     
=========================================
  Files         200     201       +1     
  Lines       37306   45372    +8066     
=========================================
+ Hits        28857   28881      +24     
- Misses       8449   16491    +8042

@sagar-solana sagar-solana changed the title Fix poh not flushing virtual ticks immediately Fix poh recorder not flushing virtual ticks immediately Jul 25, 2019

@sagar-solana sagar-solana merged commit a233a1c into solana-labs:master Jul 25, 2019

12 checks passed

Rule: v0.16 backport (backport) Backports have been created
Details
Summary 2 rules match and 2 potential rules
Details
buildkite/solana Build #9592 passed (30 minutes, 28 seconds)
Details
buildkite/solana/bench Passed (11 minutes, 54 seconds)
Details
buildkite/solana/checks Passed (1 minute, 10 seconds)
Details
buildkite/solana/coverage Passed (24 minutes, 51 seconds)
Details
buildkite/solana/move-demo Passed (7 minutes, 45 seconds)
Details
buildkite/solana/pipeline-upload Passed (8 seconds)
Details
buildkite/solana/shellcheck Passed (21 seconds)
Details
buildkite/solana/stable Passed (29 minutes, 7 seconds)
Details
buildkite/solana/stable-perf Passed (25 minutes, 53 seconds)
Details
ci-gate Pull Request accepted for CI pipeline

@sagar-solana sagar-solana deleted the sagar-solana:flush_virtual_ticks branch Jul 25, 2019

mergify bot pushed a commit that referenced this pull request Jul 25, 2019

Fix poh recorder not flushing virtual ticks immediately (#5277)
* Fix poh not flushing virtual ticks immediately

* Fix test_would_be_leader_soon

(cherry picked from commit a233a1c)

solana-grimes added a commit that referenced this pull request Jul 25, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.