db: EliasFano: Seek perf on big sequences #19788

Merged
anacrolix merged 3 commits into main from alex/ef_interpol_34
Apr 15, 2026

@AskAlexSharov
Collaborator

Story: part of HistoryRange() optimizations for erigon snapshots check-commitment-hist-at-blk-range command.

Searching a logarithmically distributed array does not work well.
so, can

  ┌────────────────────┬────────────────────────┬─────────────────────┬─────────┐
  │     Benchmark      │ release/3.4 (baseline) │ alex/ef_interpol_34 │ Speedup │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n32/stride1        │ 56 ns                  │ 26 ns               │ 2.2×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n32/stride123      │ 84 ns                  │ 39 ns               │ 2.2×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n32/stride1000     │ 84 ns                  │ 38 ns               │ 2.2×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n1000/stride1      │ 200 ns                 │ 46 ns               │ 4.3×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n1000/stride123    │ 228 ns                 │ 63 ns               │ 3.6×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n1000/stride1000   │ 229 ns                 │ 61 ns               │ 3.8×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n100000/stride1    │ 354 ns                 │ 47 ns               │ 7.5×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n100000/stride123  │ 388 ns                 │ 62 ns               │ 6.3×    │
  ├────────────────────┼────────────────────────┼─────────────────────┼─────────┤
  │ n100000/stride1000 │ 389 ns                 │ 63 ns               │ 6.2×    │
  └────────────────────┴────────────────────────┴─────────────────────┴─────────┘

Contributor

Copilot AI left a comment

Pull request overview

This PR is part of optimizing HistoryRange() usage by improving EliasFano seek performance on large sequences, especially when the distribution makes plain binary search less efficient.

Changes:

  • Replaced pure binary search in EliasFano seek paths with an interpolation-guess + exponential bracketing + binary search approach for upper-bits probing.
  • Reworked/expanded benchmarks to cover Seek across multiple sequence sizes and added Get2/DoubleEliasFano access benchmarks.
  • Hardened torrent data-file path derivation by using strings.TrimSuffix(..., ".torrent") instead of fixed-length slicing.
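The three-stage search named in the first bullet can be sketched on a plain sorted slice. This is a minimal illustration under stated assumptions, not the PR's code: `seekGE` and its signature are hypothetical, and the real helpers probe the Elias-Fano upper bits rather than a `[]uint64`.

```go
package main

import (
	"fmt"
	"sort"
)

// seekGE (hypothetical name) returns the smallest index i with
// a[i] >= target, or len(a) if there is none. a must be sorted ascending.
func seekGE(a []uint64, target uint64) int {
	n := len(a)
	if n == 0 || target <= a[0] {
		return 0
	}
	if target > a[n-1] {
		return n
	}
	// From here on a[0] < target <= a[n-1], so the answer is in (0, n-1].

	// Stage 1: interpolation guess, assuming roughly linear growth.
	// Float arithmetic avoids uint64 overflow in the product.
	frac := float64(target-a[0]) / float64(a[n-1]-a[0])
	guess := int(frac * float64(n-1))
	if guess > n-1 {
		guess = n - 1
	}

	// Stage 2: exponential bracketing around the guess.
	// Invariant: a[lo] < target <= a[hi].
	lo, hi := 0, n-1
	if a[guess] < target {
		lo = guess
		for step := 1; guess+step < n-1; step *= 2 {
			if a[guess+step] >= target {
				hi = guess + step
				break
			}
			lo = guess + step
		}
	} else {
		hi = guess
		for step := 1; guess-step > 0; step *= 2 {
			if a[guess-step] < target {
				lo = guess - step
				break
			}
			hi = guess - step
		}
	}

	// Stage 3: binary search inside the bracket (lo, hi].
	return lo + 1 + sort.Search(hi-lo-1, func(i int) bool {
		return a[lo+1+i] >= target
	})
}

func main() {
	a := []uint64{1, 2, 4, 8, 16, 32, 64, 128, 256, 512}
	fmt.Println(seekGE(a, 100)) // prints 7 (a[7] == 128)
}
```

When the guess lands near the answer, the bracket stays small and the final binary search touches only a few elements. On skewed data such as the powers of two above, the guess is poor and the doubling step does most of the work.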

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
db/recsplit/eliasfano32/elias_fano.go | Adds searchUpperForward/searchUpperReverse helpers and wires them into forward/reverse seek paths to reduce upper() probes.
db/recsplit/eliasfano32/elias_fano_seek_bench_test.go | Updates benchmarks to better reflect the new seek behavior and adds new access benchmarks.
db/integrity/torrent_verify.go | Switches .torrent stripping logic to TrimSuffix for correctness/robustness.
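The difference between `strings.TrimSuffix` and fixed-length slicing can be shown with a small sketch. Both helper names and the file names below are hypothetical, and the slicing variant is only an assumption about what "fixed-length slicing" looked like before the change.

```go
package main

import (
	"fmt"
	"strings"
)

// dataFilePath (hypothetical) strips ".torrent" only when it is present.
func dataFilePath(torrentPath string) string {
	return strings.TrimSuffix(torrentPath, ".torrent")
}

// dataFilePathBySlicing (hypothetical) always cuts the last 8 bytes,
// whether or not they are ".torrent". It panics on inputs shorter than
// 8 bytes and silently truncates names that lack the suffix.
func dataFilePathBySlicing(torrentPath string) string {
	return torrentPath[:len(torrentPath)-len(".torrent")]
}

func main() {
	fmt.Println(dataFilePath("v1-accounts.seg.torrent")) // prints v1-accounts.seg
	fmt.Println(dataFilePath("v1-accounts.seg"))         // prints v1-accounts.seg
	fmt.Println(dataFilePathBySlicing("snapshot.seg"))   // prints snap
}
```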

Comment thread db/recsplit/eliasfano32/elias_fano_seek_bench_test.go
@yperbasis changed the title from "[wip] EliasFano: Seek perf on big sequences" to "db: EliasFano: Seek perf on big sequences" Apr 14, 2026
@anacrolix added this pull request to the merge queue Apr 14, 2026
Merged via the queue into main with commit b7efec2 Apr 15, 2026
36 checks passed
@anacrolix deleted the alex/ef_interpol_34 branch April 15, 2026 00:04
@AskAlexSharov requested a review from Copilot April 15, 2026 02:58
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

}
n := int(hiIdx - lo)
if n <= 0 {
	return lo

Copilot AI Apr 15, 2026

In searchUpperReverse, the n <= 0 fast path returns lo, but when lo == hiIdx == ef.count this corresponds to the “no solution” case (e.g., hi < upper(0)), where the correct offset should be ef.count+1 (matching the previous sort.Search(int(ef.count+1), ...) behavior). Returning ef.count causes searchReverse to still probe idx=0, touching underlying data on guaranteed-miss queries and changing the function’s documented semantics. Consider returning lo+1 here (mirroring searchUpperForward) so misses return count+1 and the caller loop is skipped.

Suggested change
- return lo
+ return lo + 1

github-merge-queue bot pushed a commit that referenced this pull request Apr 15, 2026
…tion fast path (#20566)

Fixes CR comment from #19788.

In `searchUpperReverse`, the `n <= 0` fast path (when `hiIdx == lo`)
returned `lo`, causing `searchReverse` to still probe `idx=0` on
guaranteed-miss queries. The correct return is `lo + 1`, mirroring
`searchUpperForward` and ensuring the caller loop is skipped entirely
when there is no solution.
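The effect of the sentinel value on the caller can be sketched as follows. This is a simplified model, not the code from either commit: `seekLE`, the slice layout, and the probe counter are hypothetical, and the offset convention (offset 0 is the last element, offset `last` is index 0, `last+1` means "no solution") is an assumption drawn from the review comment.

```go
package main

import "fmt"

// seekLE (hypothetical) scans from reverse offset start and returns the
// largest value <= hi. Offsets count back from the end of vals, so
// offset last maps to vals[0]. A start of last+1 is the miss sentinel:
// the loop body never runs and no element is read.
func seekLE(vals []uint64, start int, hi uint64, probes *int) (uint64, bool) {
	last := len(vals) - 1
	for off := start; off <= last; off++ {
		*probes++ // counts reads of the underlying data
		if v := vals[last-off]; v <= hi {
			return v, true
		}
	}
	return 0, false
}

func main() {
	vals := []uint64{10, 20, 30}
	last := len(vals) - 1
	var oldProbes, newProbes int

	// hi = 5 is below every element, so the query is a guaranteed miss.
	seekLE(vals, last, 5, &oldProbes)   // fast path returning lo
	seekLE(vals, last+1, 5, &newProbes) // fast path returning lo + 1

	fmt.Println(oldProbes, newProbes) // prints "1 0"
}
```

Both calls report a miss. The only difference is that the first still reads index 0 before giving up, which is the extra probe the review comment describes.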

5 participants