perf: cache CometDecodedVector validityBufferAddress #4435

Merged
mbutrovich merged 3 commits into apache:main from mbutrovich:close_4279
May 26, 2026

@mbutrovich
Contributor

@mbutrovich mbutrovich commented May 26, 2026

Which issue does this PR close?

Closes #4279.

Rationale for this change

CometDecodedVector.isNullAt already caches the current validity bitmap byte, but it re-resolves valueVector.getValidityBuffer().memoryAddress() on every byte-cache miss. The buffer address is stable for the vector's lifetime, so it can be resolved once and cached.

What changes are included in this PR?

  • Cache validityBufferAddress in the CometDecodedVector constructor; drop the per-miss lookup in isNullAt.
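
The pattern described above can be sketched roughly as follows. This is a stand-alone illustration, not the code from this PR: `FakeValidityBuffer` and `CachedAddressVector` are hypothetical names, a `byte[]` stands in for the off-heap Arrow buffer, and the byte-cache fields are assumptions. Only `validityBufferAddress` and `isNullAt` come from the description.

```java
/** Hypothetical stand-in for an off-heap validity buffer; not an Arrow or Comet class. */
final class FakeValidityBuffer {
  private final byte[] bits;
  int addressLookups = 0; // counts how often the "address" is resolved

  FakeValidityBuffer(byte[] bits) {
    this.bits = bits;
  }

  /** Plays the role of getValidityBuffer().memoryAddress(); returns a fake base offset. */
  long memoryAddress() {
    addressLookups++;
    return 0L;
  }

  /** Reads one byte at a fake "address" (here just an index into the array). */
  byte getByte(long address) {
    return bits[(int) address];
  }
}

/** Hypothetical vector that resolves the validity buffer address once, in its constructor. */
final class CachedAddressVector {
  private final FakeValidityBuffer buffer;
  private final long validityBufferAddress; // resolved once, reused on every byte-cache miss
  private int cachedByteIndex = -1;         // assumed byte-cache state
  private byte cachedByte;

  CachedAddressVector(FakeValidityBuffer buffer) {
    this.buffer = buffer;
    this.validityBufferAddress = buffer.memoryAddress();
  }

  boolean isNullAt(int rowId) {
    int byteIndex = rowId >> 3;
    if (byteIndex != cachedByteIndex) {
      // Byte-cache miss: load the byte using the cached address, with no address lookup.
      cachedByte = buffer.getByte(validityBufferAddress + byteIndex);
      cachedByteIndex = byteIndex;
    }
    // Arrow validity bitmaps use least-significant-bit numbering; a set bit means "valid".
    return ((cachedByte >> (rowId & 7)) & 1) == 0;
  }
}
```

In this sketch the `addressLookups` counter stays at 1 no matter how many validity bytes are visited, which is the effect the change is after.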

How are these changes tested?

New TestCometPlainVector.testIsNullAtSequentialAcrossValidityBytes exercises sequential isNullAt reads (forward and reverse) across multiple validity bytes with mixed null/non-null rows.

Otherwise, existing tests.
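
For readers without the repository at hand, the shape of such a check might look like the sketch below. It is not the actual `testIsNullAtSequentialAcrossValidityBytes` test: `ValidityBitmapReader`, `SequentialIsNullAtCheck`, and the every-third-row null pattern are assumptions made for illustration, and a plain `byte[]` replaces the Arrow buffer so the example runs on its own.

```java
/** Hypothetical byte-caching reader over an Arrow-style validity bitmap. */
final class ValidityBitmapReader {
  private final byte[] bitmap;
  private int cachedByteIndex = -1;
  private byte cachedByte;

  ValidityBitmapReader(byte[] bitmap) {
    this.bitmap = bitmap;
  }

  boolean isNullAt(int rowId) {
    int byteIndex = rowId >> 3;
    if (byteIndex != cachedByteIndex) {
      cachedByte = bitmap[byteIndex];
      cachedByteIndex = byteIndex;
    }
    // A cleared bit means the row is null.
    return ((cachedByte >> (rowId & 7)) & 1) == 0;
  }
}

final class SequentialIsNullAtCheck {
  /** Assumed pattern: every third row is null, so each validity byte mixes nulls and values. */
  static boolean[] mixedNulls(int numRows) {
    boolean[] isNull = new boolean[numRows];
    for (int i = 0; i < numRows; i++) {
      isNull[i] = (i % 3 == 0);
    }
    return isNull;
  }

  /** Builds a validity bitmap (bit set = valid) from the expected null flags. */
  static byte[] buildBitmap(boolean[] isNull) {
    byte[] bitmap = new byte[(isNull.length + 7) / 8];
    for (int i = 0; i < isNull.length; i++) {
      if (!isNull[i]) {
        bitmap[i >> 3] |= (byte) (1 << (i & 7));
      }
    }
    return bitmap;
  }

  /** Reads every row forward, then in reverse, and counts disagreements with the expectation. */
  static int countMismatches(int numRows) {
    boolean[] expected = mixedNulls(numRows);
    ValidityBitmapReader reader = new ValidityBitmapReader(buildBitmap(expected));
    int mismatches = 0;
    for (int i = 0; i < numRows; i++) {
      if (reader.isNullAt(i) != expected[i]) mismatches++;
    }
    for (int i = numRows - 1; i >= 0; i--) {
      if (reader.isNullAt(i) != expected[i]) mismatches++;
    }
    return mismatches;
  }
}
```

Choosing a row count that is not a multiple of 8 (for example 20) makes the reads cross several byte boundaries and end in a partially filled byte, in both directions.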

@mbutrovich mbutrovich self-assigned this May 26, 2026
@mbutrovich mbutrovich changed the title from "perf: cache CometDecodedVector validityBufferAddress, simplify CometVector API" to "perf: cache CometDecodedVector validityBufferAddress" May 26, 2026
@mbutrovich
Contributor Author

Just noticed that #4416 has some of the same API changes. Reduced the scope of this PR to just the caching, and will resolve any conflicts after #4416 merges.

@mbutrovich mbutrovich marked this pull request as draft May 26, 2026 14:25
# Conflicts:
#	spark/src/main/java/org/apache/comet/vector/CometDelegateVector.java
#	spark/src/main/java/org/apache/comet/vector/CometPlainVector.java
#	spark/src/main/java/org/apache/comet/vector/CometSelectionVector.java
@mbutrovich mbutrovich marked this pull request as ready for review May 26, 2026 16:58
@mbutrovich mbutrovich requested a review from andygrove May 26, 2026 18:22
Member

@andygrove andygrove left a comment

Thanks @mbutrovich

@mbutrovich mbutrovich merged commit 0d5c592 into apache:main May 26, 2026
62 checks passed

Development

Successfully merging this pull request may close these issues.

CometPlainVector: validity-bitmap byte cache for sequential reads

2 participants