test(flatkv): wait for catchup before partial-loss digest compare#3516

Merged: blindchaser merged 1 commit into main from yirenz/fix-flatkv-partial-loss-flaky on May 29, 2026

@blindchaser

Contributor

The dump-flatkv tool clones a snapshot + WAL into a temp dir and only retries 3 times if a live writer rolls a new snapshot and truncates the WAL mid-clone. Running the digest comparison right after a 20s sleep while the victim is still blocksyncing reliably loses that race on busy CI runners, panicking with "source kept churning".

  • increase default catchup timeout/tolerance
  • print per-node heights on catchup timeout for diagnosis

@cursor

cursor Bot commented May 28, 2026

PR Summary

Low Risk
Changes only an integration shell contract and timing/diagnostics; no production FlatKV or node logic is modified.

Overview
The FlatKV partial-loss integration contract now waits for the victim validator to near-sync with peers before running cross-node FlatKV digest checks.

It adds configurable CATCHUP_TIMEOUT (default 240s) and CATCHUP_TOLERANCE (default 10 blocks), plus a wait_for_catchup loop that polls seid status heights every 5s. On timeout it prints per-node heights and dumps the victim log. The digest step runs only after catch-up (or failure), avoiding flaky dump-flatkv failures when a still-blocksyncing node churns snapshots/WAL faster than the tool’s limited clone retries.
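The polling loop described above can be sketched roughly as follows. This is a minimal illustration, not the contract merged in this PR: only `CATCHUP_TIMEOUT`, `CATCHUP_TOLERANCE`, `wait_for_catchup`, and the 5s poll interval come from the summary. `POLL_INTERVAL`, `node_height`, and the node names are hypothetical, and `node_height` is stubbed so the sketch runs on its own; a real contract would derive heights from `seid status` on each node.

```shell
#!/usr/bin/env bash
# Sketch only. Defaults mirror the values quoted in the PR summary.
CATCHUP_TIMEOUT="${CATCHUP_TIMEOUT:-240}"      # seconds before giving up
CATCHUP_TOLERANCE="${CATCHUP_TOLERANCE:-10}"   # max blocks the victim may lag
POLL_INTERVAL="${POLL_INTERVAL:-5}"            # hypothetical name; seconds between polls

# Hypothetical stub: print the latest block height of the named node.
# Replace with a real lookup (for example, parsing `seid status` output).
node_height() {
  echo 0
}

# wait_for_catchup VICTIM PEER...
# Returns 0 once VICTIM is within CATCHUP_TOLERANCE blocks of the highest
# peer, or 1 if that has not happened after CATCHUP_TIMEOUT seconds.
wait_for_catchup() {
  local victim="$1"; shift
  local elapsed=0 victim_h peer_h max_h node
  while :; do
    victim_h="$(node_height "$victim")"
    max_h=0
    for node in "$@"; do
      peer_h="$(node_height "$node")"
      if [ "$peer_h" -gt "$max_h" ]; then max_h="$peer_h"; fi
    done
    # Caught up: the gap to the highest peer is within tolerance.
    if [ $((max_h - victim_h)) -le "$CATCHUP_TOLERANCE" ]; then
      return 0
    fi
    if [ "$elapsed" -ge "$CATCHUP_TIMEOUT" ]; then
      break
    fi
    sleep "$POLL_INTERVAL"
    elapsed=$((elapsed + POLL_INTERVAL))
  done
  # Timeout: print per-node heights for diagnosis. A real contract would
  # also dump the victim log at this point.
  echo "catchup timed out after ${elapsed}s" >&2
  for node in "$victim" "$@"; do
    echo "  ${node}: height $(node_height "$node")" >&2
  done
  return 1
}
```

Under these assumptions, the digest comparison would be gated on the function's exit status, for example `wait_for_catchup victim peer1 peer2 || exit 1`, so `dump-flatkv` only runs once the victim has stopped churning snapshots and WAL segments at blocksync speed.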

Reviewed by Cursor Bugbot for commit 208e601. Bugbot is set up for automated code reviews on this repo.

@github-actions

github-actions Bot commented May 28, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | May 28, 2026, 3:55 PM |

@codecov

codecov Bot commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.21%. Comparing base (1b322f0) to head (208e601).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3516      +/-   ##
==========================================
- Coverage   59.04%   58.21%   -0.83%     
==========================================
  Files        2199     2129      -70     
  Lines      182096   173921    -8175     
==========================================
- Hits       107510   101249    -6261     
+ Misses      64935    63685    -1250     
+ Partials     9651     8987     -664     
| Flag | Coverage Δ |
| --- | --- |
| sei-db | 70.41% <ø> (ø) |
| sei-db-state-db | ? |

Flags with carried forward coverage won't be shown.
see 70 files with indirect coverage changes

@blindchaser blindchaser added this pull request to the merge queue May 28, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 28, 2026
@blindchaser blindchaser added this pull request to the merge queue May 29, 2026
Merged via the queue into main with commit 0ce9176 May 29, 2026
56 of 57 checks passed
@blindchaser blindchaser deleted the yirenz/fix-flatkv-partial-loss-flaky branch May 29, 2026 20:06