Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not block pool processing on rebroadcast #13560

Merged
merged 2 commits into from
Jul 13, 2023

Conversation

georgeee
Copy link
Member

@georgeee georgeee commented Jul 10, 2023

Problem: in certain conditions processing of a pool item may stall block processing. This happens because transition frontier extension event is written to a synchronous pipe which is read from the merged reader of a few items:

Strict_pipe.Reader.Merge.iter_sync
             [ Strict_pipe.Reader.map tf_diffs ~f:(fun diff ->
                   Transition_frontier_extension diff )
             ; remote_r
             ; local_r
             ]

Despite remote and local pool items being given a lower priority, processing of them may block transition frontier extension handling. It leads to a significant (multiple seconds) delay on one block, but also leads to accumulation of unprocessed blocks. If the delay repeats a few times (which is probabilistic), block delay may start growing as an avalanche, leading to hundreds of seconds of delays in block processing.

Explain your changes:

  • Wrap broadcasting of a pool item into don't_wait_for

Explain how you tested your changes:

  • Tested on a private cluster

PR is built on top of #13550 because that contains Remove snark tables lock change which significantly simplifies pool processing code and makes implementation of this PR easier.

Checklist:

  • Dependency versions are unchanged
    • Notify Velocity team if dependencies must change in CI
  • Modified the current draft of release notes with details on what is completed or incomplete within this project
  • Document code purpose, how to use it
    • Mention expected invariants, implicit constraints
  • Tests were added for the new behavior
    • Document test purpose, significance of failures
    • Test names should reflect their purpose
  • All tests pass (CI will check this if you didn't)
  • Serialized types are in stable-versioned modules
  • Does this close issues?

@georgeee georgeee requested a review from a team as a code owner July 10, 2023 21:50
@georgeee georgeee force-pushed the georgeee/dont-block-pool-processing-on-broadcast branch from 3eb3721 to 54f3e63 Compare July 11, 2023 10:27
@mergify mergify bot mentioned this pull request Jul 11, 2023
7 tasks
@georgeee georgeee force-pushed the georgeee/dont-block-pool-processing-on-broadcast branch from 54f3e63 to 3bfd073 Compare July 13, 2023 16:17
@georgeee georgeee requested a review from a team as a code owner July 13, 2023 16:17
@georgeee georgeee changed the base branch from georgeee/fix-snark-pool-long-async-cycles to berkeley July 13, 2023 16:17
@nholland94
Copy link
Member

!ci-build-me

@nholland94 nholland94 merged commit 043337c into berkeley Jul 13, 2023
45 checks passed
@nholland94 nholland94 deleted the georgeee/dont-block-pool-processing-on-broadcast branch July 13, 2023 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants