PanamaDocValuesRangeSupport: OR mask directly into FixedBitSet instead of per-bit loop by sgup432 · Pull Request #16123 · apache/lucene

sgup432 · 2026-05-25T22:12:17Z

Description

As a followup to one of the comment in this PR - #16050 by @neoremind and @benwtrent, I am adding this change.

Earlier, after the vectorized comparison, we would loop over the matching bits and do bit by bit update on the bitset. This is not needed as ideally we could directly perform a OR operation and set it in the bitset in just one operation.

Also handles the case where if we need to span across two words, for example:
base = 60, and we compared 8 docs together giving the maskBits as 0b10110001.

Then docs 60-67 span two words:

Docs 60-63 are in bits[0] (bit positions 60-63)
Docs 64-67 are in bits[1] (bit positions 0-3)

Results below. This change gave a further boost in some of the cases!

c5.2xlarge (AVX-512):

Pattern	docCount	Fields	Before OR mask (ops/s)	With OR mask (ops/s)	Change
random	1M	1	200.2	273.5	+37%
random	1M	3	66.8	91.5	+37%
random	1M	5	62.6	81.9	+31%
random	10M	1	24.4	34.4	+41%
random	10M	3	8.1	11.1	+37%
random	10M	5	7.2	9.8	+36%
clustered	1M	1	8,856	11,345	+28%
clustered	1M	3	33,846	36,577	+8%
clustered	1M	5	35,055	34,975	~0%
clustered	10M	1	1,308	1,324	+1%
clustered	10M	3	24,388	24,052	~0%
clustered	10M	5	12,940	12,884	~0%

Mac (Apple M-series, 128-bit NEON):

Pattern	docCount	Fields	Before OR mask (ops/s)	With OR mask (ops/s)	Change
random	1M	1	219.5	250.2	+14%
random	1M	3	86.9	93.6	+8%
random	1M	5	80.2	80.2	~0%
random	10M	1	27.1	28.2	+4%
random	10M	3	9.0	7.5	-17%* (did another run where it gave ~5% improvement)
random	10M	5	7.7	8.7	+12%
clustered	1M	1	27,137	36,003	+33%
clustered	1M	3	56,333	81,988	+46%
clustered	1M	5	77,159	78,315	+2%
clustered	10M	1	3,502	3,617	+3%
clustered	10M	3	57,900	61,969	+7%
clustered	10M	5	31,728	31,502	~0%

…d of per-bit loop

neoremind · 2026-05-26T11:02:33Z

Nice follow-up! The speedup looks solid!

One minor comment, non-blocking: would it be better to move this to a method like orMask in FixedBitSet? Now it feels like leaking the internal bit manipulation outside, maybe other experts have more opinion on the inline way vs. encapsulation in FixetBitSet.

benwtrent · 2026-05-26T12:08:50Z

+        int wordIndex = base >> 6;
+        int bitOffset = base & 63;
+        long[] bits = bitSet.getBits();
+        if (bitOffset + vectorLen <= 64) {
+          bits[wordIndex] |= maskBits << bitOffset;
+        } else {
+          bits[wordIndex] |= maskBits << bitOffset;
+          bits[wordIndex + 1] |= maskBits >>> (64 - bitOffset);


my main concern here is correctness. I don't remember from the previous PR, but do we have a test that takes random docs of various densities, etc. and ensures that the "slow iterative path" and "fast path" produce the exact same results?

Also, I am fine with a new method just on FixedBitSet to achieve this called orBits or orMask or something.

PanamaDocValuesRangeSupport: OR mask directly into FixedBitSet instea…

09f98e8

…d of per-bit loop

github-actions Bot added the module:core/other label May 25, 2026

Add changelog

4e26583

github-actions Bot added this to the 10.5.0 milestone May 25, 2026

benwtrent reviewed May 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PanamaDocValuesRangeSupport: OR mask directly into FixedBitSet instead of per-bit loop#16123

PanamaDocValuesRangeSupport: OR mask directly into FixedBitSet instead of per-bit loop#16123
sgup432 wants to merge 2 commits into
apache:mainfrom
sgup432:or_mask_doc_values

sgup432 commented May 25, 2026 •

edited

Loading

Uh oh!

neoremind commented May 26, 2026

Uh oh!

benwtrent May 26, 2026

Uh oh!

benwtrent May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

sgup432 commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

neoremind commented May 26, 2026

Uh oh!

benwtrent May 26, 2026

Choose a reason for hiding this comment

Uh oh!

benwtrent May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sgup432 commented May 25, 2026 •

edited

Loading