Skip to content

add time budget for integrity checks#20714

Merged
sudeepdino008 merged 15 commits intomainfrom
bound_integrity
Apr 23, 2026
Merged

add time budget for integrity checks#20714
sudeepdino008 merged 15 commits intomainfrom
bound_integrity

Conversation

@sudeepdino008
Copy link
Copy Markdown
Member

@sudeepdino008 sudeepdino008 commented Apr 21, 2026

Add --integrity.budget <duration> to erigon seg integrity that caps total wall-clock time for an integrity run.

  • When set, checks are sorted cheap → heavy (based on integrity.FastChecks order) and each receives a per-check deadline of remaining / remaining_checks, so finishing early on a cheap check rolls budget forward to the heavy tail.
  • On per-check timeout: log [integrity] budget exhausted, moving on at INFO and continue to the next check. Not treated as an integrity failure — --failFast only reacts to actual integrity errors.
  • When not set (default): no change in behavior — user-supplied order preserved, no deadline applied.
  • BlockReader.IntegrityTxnID now takes ctx and polls every 1000 segments so BlocksTxnID honors the budget.
  • CheckKvi producer now checks ctx every 100k keys, so skip-heavy runs (e.g. CommitmentKvi with SampleRatio=0) don't overshoot their slice by minutes.
  • doPublishable takes datadir.Dirs directly instead of *cli.Context.

follow up:

  • pick to release/3.4
  • change automation script to allot 30min budget

@sudeepdino008 sudeepdino008 marked this pull request as ready for review April 22, 2026 07:28
@sudeepdino008 sudeepdino008 changed the title [wip] Bound integrity add time budget for integrity checks Apr 22, 2026
@AskAlexSharov AskAlexSharov requested a review from Copilot April 22, 2026 09:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an optional wall-clock time budget to the erigon seg integrity command so integrity runs can be capped and (when budgeted) checks are executed in a cheap→heavy order with per-check time slices.

Changes:

  • Add --integrity.budget <duration> and run integrity checks sequentially with per-check timeouts derived from remaining budget.
  • Make long-running integrity routines more responsive to cancellation/deadlines (BlockReader.IntegrityTxnID and CheckKvi).
  • Refactor doPublishable to accept datadir.Dirs instead of *cli.Context.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
db/snapshotsync/freezeblocks/block_reader.go Add context.Context to IntegrityTxnID and poll for cancellation periodically.
db/integrity/integrity_kvi.go Add periodic ctx polling in the hot skip path and propagate ctx.Err() on cancellation.
db/integrity/integrity_action_type.go Reorder FastChecks and add SortChecksByCost helper for budgeted runs.
cmd/utils/app/snapshots_cmd.go Add --integrity.budget, implement per-check budgeting/timeout behavior, and adjust doPublishable signature/call sites.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread db/integrity/integrity_action_type.go
Comment thread cmd/utils/app/snapshots_cmd.go
@sudeepdino008 sudeepdino008 removed this pull request from the merge queue due to a manual request Apr 22, 2026
@sudeepdino008 sudeepdino008 enabled auto-merge April 22, 2026 10:05
@sudeepdino008 sudeepdino008 added this pull request to the merge queue Apr 23, 2026
Merged via the queue into main with commit 868f5eb Apr 23, 2026
37 checks passed
@sudeepdino008 sudeepdino008 deleted the bound_integrity branch April 23, 2026 03:09
AskAlexSharov added a commit that referenced this pull request Apr 24, 2026
…0773)

Cherry-pick of #20714 to release/3.4.

Co-authored-by: moskud <sudeepdino008@gmail.com>
sudeepdino008 added a commit that referenced this pull request Apr 27, 2026
Add `--integrity.budget <duration>` to `erigon seg integrity` that caps
total wall-clock time for an integrity run.

- When set, checks are sorted cheap → heavy (based on
`integrity.FastChecks` order) and each receives a per-check deadline of
`remaining / remaining_checks`, so finishing early on a cheap check
rolls budget forward to the heavy tail.
- On per-check timeout: log `[integrity] budget exhausted, moving on` at
INFO and continue to the next check. Not treated as an integrity failure
— `--failFast` only reacts to actual integrity errors.
- When not set (default): no change in behavior — user-supplied order
preserved, no deadline applied.
- `BlockReader.IntegrityTxnID` now takes `ctx` and polls every 1000
segments so `BlocksTxnID` honors the budget.
- `CheckKvi` producer now checks `ctx` every 100k keys, so skip-heavy
runs (e.g. `CommitmentKvi` with `SampleRatio=0`) don't overshoot their
slice by minutes.
- `doPublishable` takes `datadir.Dirs` directly instead of
`*cli.Context`.

follow up:
- pick to release/3.4
- change automation script to allot 30min budget

---------

Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
wmitsuda added a commit that referenced this pull request Apr 27, 2026
… (#20838)

Cherry-picks the integrity-area chain from `main` onto `performance`, in
order:

1. #20435 — `integrity: restore StateProgress check`
2. #20680 — `integrity, stagedsync: cap state collation at block
snapshots progress`
3. #20714 — `add time budget for integrity checks`
4. #20835 — `cmd/utils/app: skip commitment history integrity checks
when disabled`

#20714 is the structural refactor of `doIntegrity`
(`errgroup.Go(switch{...})` → `runCheck := func{}` + budget loop).
Conflicted with performance-only Bloatnet customizations on the same
function (#20806, #20810, #20813); resolved by adopting #20714's shape
and re-applying the Bloatnet handling for `CommitmentKvDeref`,
`CommitmentHistVal`, and `StateRootVerifyByHistory`. `failFast` arg on
`CheckCommitmentHistAtBlkRange` preserved (matches performance's
signature).

Submodule pointer for `execution/tests/execution-spec-tests` left
unchanged.

`make erigon` + `make lint` clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants