Use a more coarse-grained competitive iterator for skipper-based numeric sorts #15632

romseygeek · 2026-01-29T14:09:09Z

Numeric sorts against a field with DocValuesSkippers enabled currently use
DocValuesRangeIterator (DVRI) to implement competitive iterators. This has a
number of disadvantages:

- DVRI cannot efficiently implement docIDRunEnd() or intoBitSet(), meaning that
  bulk conjunction filtering may end up falling into slower code paths.
- For field value distributions that are essentially random, DVRI falls back to
  doc-by-doc value checking, meaning that no skipping happens at all while the
  checks still add overhead.

This commit adds a new SkipBlockRangeIterator that only skips whole blocks
where no document will be competitive, avoiding any individual doc-by-doc value
checks. The docIDRunEnd() and intoBitSet() implementations are very fast and
mean that bulk conjunction filtering will be efficient. The overheads as a whole
are very low, so randomly distributed values are much less adversarial, while
queries against indexes where the document order is roughly correlated with the
query sort get significant boosts.
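
To illustrate the block-level idea described above, here is a minimal sketch. It assumes a hypothetical `BlockSkipSketch` class with a `Block` type that holds each block's doc ID range and min/max value in a plain array; none of these names come from this PR, and the real SkipBlockRangeIterator presumably reads block bounds from the skipper rather than from an array. The point is only that a block is either skipped entirely or accepted entirely, which makes `docIDRunEnd()` and `intoBitSet()` cheap.

```java
import java.util.BitSet;

/** Hypothetical, simplified model of block-level skipping; not Lucene's actual code. */
final class BlockSkipSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** One skip block: a contiguous doc ID range plus the min/max field value inside it. */
  static final class Block {
    final int minDoc, maxDoc;       // inclusive doc ID bounds
    final long minValue, maxValue;  // inclusive value bounds
    Block(int minDoc, int maxDoc, long minValue, long maxValue) {
      this.minDoc = minDoc; this.maxDoc = maxDoc;
      this.minValue = minValue; this.maxValue = maxValue;
    }
  }

  private final Block[] blocks; // assumed sorted by doc ID and non-overlapping
  private final long lo, hi;    // competitive value range, inclusive
  private int blockIdx = 0;
  private int doc = -1;

  BlockSkipSketch(Block[] blocks, long lo, long hi) {
    this.blocks = blocks; this.lo = lo; this.hi = hi;
  }

  int docID() { return doc; }

  /** Moves to the first doc >= target inside a block whose value range overlaps [lo, hi]. */
  int advance(int target) {
    while (blockIdx < blocks.length) {
      Block b = blocks[blockIdx];
      boolean overlaps = b.maxValue >= lo && b.minValue <= hi;
      if (overlaps && b.maxDoc >= target) {
        return doc = Math.max(target, b.minDoc);
      }
      blockIdx++; // whole block is behind target or cannot be competitive: skip it
    }
    return doc = NO_MORE_DOCS;
  }

  /** Call only once positioned: every doc up to the end of the current block matches. */
  int docIDRunEnd() {
    return doc == NO_MORE_DOCS ? NO_MORE_DOCS : blocks[blockIdx].maxDoc + 1;
  }

  /** Sets bits for all matching docs in [docID(), upTo), one range per block. */
  void intoBitSet(int upTo, BitSet bits, int offset) {
    if (doc < 0) advance(0);
    while (doc < upTo) {
      int end = Math.min(upTo, docIDRunEnd());
      bits.set(doc - offset, end - offset); // bulk range set, no per-doc value checks
      advance(end);
    }
  }
}
```

Note the trade-off in this sketch: an accepted block may still contain documents whose values fall outside the competitive range, so the iterator is coarser than a per-document check and relies on the comparator to reject those documents later.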

This commit introduces a SkipBlockRangeIterator that performs better than DocValuesRangeIterator as a competitive iterator, thanks to more useful docIDRunEnd() and intoBitSet() implementations.

romseygeek · 2026-01-29T14:11:24Z

NB: I initially tried to extend this to TermOrdValComparator as well, but that causes test failures. This is because the existing tests use randomly distributed data, so the SkipBlockRangeIterator can't actually filter out any values. This is probably still a better implementation than the existing one, however, because the current competitive iterator ends up checking the value of every document in turn. I think we can address this in a follow-up by adjusting the test to use an index sort.

romseygeek · 2026-01-29T14:12:28Z

Internal Elasticsearch benchmarks show that switching to this implementation doubles the performance of wholly adversarial sorts (e.g. sort by descending timestamp against an index sorted by ascending timestamp) without regressions elsewhere.

romseygeek added 4 commits January 27, 2026 16:40

wip: needs tests and benchmarking

8b0626b

Use block-by-block iteration for numeric competitive iterators

e4cfc34

javadocs; reset termordval

d56ec6d

Merge remote-tracking branch 'origin/main' into skipper/block-iterator

23789e1

romseygeek self-assigned this Jan 29, 2026

github-actions bot added module:core/search module:core/codecs module:test-framework labels Jan 29, 2026

romseygeek mentioned this pull request Jan 29, 2026

Use SkipBlockRangeIterator as a competitive iterator for numeric sorts elastic/elasticsearch#141522

Open

Add changes

075cc29

github-actions bot added this to the 10.4.0 milestone Jan 29, 2026

tidy

99e3b02
