
KAFKA-20597: Fix prefixScan overflow handling in in-memory state stores #22310

Open
pchintar wants to merge 1 commit into apache:trunk from pchintar:prefix_scan_overflow


pchintar commented May 18, 2026

prefixScan currently behaves inconsistently across state store
implementations when the prefix has no lexicographically larger upper
bound.

For example, given the following keys:

  • FF
  • FF 00
  • FF 10
  • FE

Calling prefixScan(FF) should return:

  • FF
  • FF 00
  • FF 10
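
The expected result above can be checked against a brute-force reference: keep every key that starts with the prefix bytes and order the matches as unsigned byte strings. The following is a minimal sketch of that reading, not the Kafka implementation; the class and method names (PrefixScanSemantics, startsWith, prefixScan, hex) are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PrefixScanSemantics {

    // Unsigned lexicographic order, so 0xFF sorts after 0xFE (Java 9+).
    static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

    // True when key begins with every byte of prefix.
    static boolean startsWith(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Brute-force reference: filter by prefix, then sort in unsigned byte order.
    static List<byte[]> prefixScan(List<byte[]> keys, byte[] prefix) {
        List<byte[]> matches = new ArrayList<>();
        for (byte[] key : keys) {
            if (startsWith(key, prefix)) {
                matches.add(key);
            }
        }
        matches.sort(UNSIGNED);
        return matches;
    }

    // Renders bytes as space-separated upper-case hex, e.g. "FF 00".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<byte[]> keys = List.of(
            new byte[] {(byte) 0xFF},
            new byte[] {(byte) 0xFF, 0x00},
            new byte[] {(byte) 0xFF, 0x10},
            new byte[] {(byte) 0xFE});
        // Prints FF, FF 00, FF 10 (one per line); FE is excluded.
        for (byte[] key : prefixScan(keys, new byte[] {(byte) 0xFF})) {
            System.out.println(hex(key));
        }
    }
}
```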

RocksDBStore already handles this case correctly by treating the upper
bound as unbounded when incrementing the prefix overflows.

However, InMemoryKeyValueStore, MemoryNavigableLRUCache, and
CachingKeyValueStore directly call ByteUtils.increment(...), which
throws IndexOutOfBoundsException for prefixes such as 0xFF.

This causes prefixScan to fail before iteration begins for in-memory
state store implementations.
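
To illustrate the failure mode described above, here is a small stand-alone sketch of a strict increment that treats the byte array as a big-endian unsigned number and throws when every byte is already 0xFF. It is an assumption-based stand-in written for this discussion, not the actual Kafka helper; StrictIncrementDemo and its increment method are hypothetical names.

```java
import java.util.Arrays;

public class StrictIncrementDemo {

    // Adds 1 to the array read as a big-endian unsigned number.
    // Throws IndexOutOfBoundsException when the carry runs off the front,
    // which happens exactly when every byte is 0xFF.
    static byte[] increment(byte[] input) {
        byte[] out = input.clone();
        for (int i = out.length - 1; i >= 0; i--) {
            if (out[i] != (byte) 0xFF) {
                out[i]++;
                return out;
            }
            out[i] = 0x00; // carry into the next more significant byte
        }
        throw new IndexOutOfBoundsException("no larger value of the same length");
    }

    public static void main(String[] args) {
        // 0xFE increments cleanly to 0xFF.
        System.out.println(Arrays.toString(increment(new byte[] {(byte) 0xFE})));
        try {
            // 0xFF has no successor of the same length, so this throws.
            increment(new byte[] {(byte) 0xFF});
        } catch (IndexOutOfBoundsException e) {
            System.out.println("overflow: " + e.getMessage());
        }
    }
}
```

A caller that computes the scan's upper bound this way fails before it can create an iterator, which matches the behavior reported for the in-memory stores.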

This PR centralizes overflow-safe increment handling in
ByteUtils.incrementWithoutOverflow(...) and updates the affected state
stores to use the shared implementation when handling prefixes without
an upper bound.
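
The idea of an overflow-safe bound can be sketched outside Kafka with a plain TreeMap: compute an exclusive upper bound for the prefix, and fall back to an unbounded tail scan when no bound exists. This is a rough illustration under stated assumptions, not the PR's code. The names OverflowSafePrefixScan, upperBoundOrNull, and prefixScan are hypothetical, and the sketch drops trailing 0xFF bytes before incrementing, which may differ from how ByteUtils.incrementWithoutOverflow(...) computes its result.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

public class OverflowSafePrefixScan {

    // Unsigned lexicographic order over raw bytes (Java 9+).
    static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

    // Smallest key that sorts after every key starting with prefix,
    // or null when no such key exists (prefix is empty or all 0xFF).
    static byte[] upperBoundOrNull(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] bound = Arrays.copyOf(prefix, i + 1);
                bound[i]++;
                return bound;
            }
        }
        return null;
    }

    // Range scan [prefix, bound), or [prefix, end) when there is no bound.
    static NavigableMap<byte[], String> prefixScan(NavigableMap<byte[], String> store, byte[] prefix) {
        byte[] to = upperBoundOrNull(prefix);
        return to == null
            ? store.tailMap(prefix, true)
            : store.subMap(prefix, true, to, false);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> store = new TreeMap<>(UNSIGNED);
        store.put(new byte[] {(byte) 0xFF}, "a");
        store.put(new byte[] {(byte) 0xFF, 0x00}, "b");
        store.put(new byte[] {(byte) 0xFF, 0x10}, "c");
        store.put(new byte[] {(byte) 0xFE}, "d");

        // Prefix FF has no upper bound, so the scan runs to the end of the map.
        System.out.println(prefixScan(store, new byte[] {(byte) 0xFF}).values());
        // Prefix FE is bounded by FF, so only the FE entry is returned.
        System.out.println(prefixScan(store, new byte[] {(byte) 0xFE}).values());
    }
}
```

Returning null for "no bound" mirrors the contract quoted later in the review thread ("or null if incrementing would overflow"); an Optional return type would be a reasonable alternative in code that does not need to match that contract.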

Regression tests named shouldPrefixScanPrefixWithNoUpperBound have
been added for each of InMemoryKeyValueStore, CachingKeyValueStore,
and MemoryNavigableLRUCache to reproduce the failure for the prefix
0xFF.

Testing

./gradlew spotlessApply
./gradlew :streams:test \
  --tests org.apache.kafka.streams.state.internals.InMemoryKeyValueStoreTest.shouldPrefixScanPrefixWithNoUpperBound \
  --tests org.apache.kafka.streams.state.internals.CachingInMemoryKeyValueStoreTest.shouldPrefixScanPrefixWithNoUpperBound \
  --tests org.apache.kafka.streams.state.internals.InMemoryLRUCacheStoreTest.shouldPrefixScanPrefixWithNoUpperBound
./gradlew :streams:checkstyleTest

Before the fix, the regression tests failed with
IndexOutOfBoundsException. After the fix, they pass.

Reviewers: Sanskar Jhajharia <sjhajharia@confluent.io>,
Chia-Ping Tsai <chia7712@gmail.com>

github-actions bot added the triage (PRs from the community), streams, clients, and small (Small PRs) labels May 18, 2026
pchintar force-pushed the prefix_scan_overflow branch from 622a262 to a0acb77 May 18, 2026 16:21
UladzislauBlok (Contributor) left a comment

Left a minor comment.
LGTM overall.

Comment on lines +69 to +71
/**
 * Returns the incremented {@link Bytes} value, or {@code null} if incrementing would overflow.
 */
Contributor

Please keep the documentation in the same format as for the increment method.

Author

Hi @UladzislauBlok, thanks for the feedback. I have just updated the comments as you suggested.

Author

I hope my new comments resolve this. Thanks.

Author

pchintar May 21, 2026

Hi @UladzislauBlok, just a small reminder: could you kindly approve and merge my PR if what I've done is adequate? Thanks.

Contributor

Hey,
I don't have write access (I'm not a committer yet).
@mjsax
Hello Matthias, sorry for the ping; this PR has been waiting for some time. Could you take a look, please? IMO it looks good.

Author

@mjsax, @chia7712, and @sjhajharia: it's been several days. Could you kindly respond to this PR? Thanks.

Author

pchintar May 30, 2026

@UladzislauBlok they haven't responded yet. I'm new to the Kafka community, so what should I do now?

Contributor

Hello,
We're close to the KIP freeze for Kafka 4.4, so I think Matthias is busy right now.
I suggest just being patient; I'm sure the PR will be approved.
For example, my PR has also been waiting for a week.

pchintar force-pushed the prefix_scan_overflow branch from a0acb77 to 8345245 May 19, 2026 18:09
github-actions bot removed the triage (PRs from the community) label May 20, 2026
Comment on lines +451 to +452
@Test
public void shouldPrefixScanPrefixWithNoUpperBound() {

Should we add similar regression tests for CachingKeyValueStore and MemoryNavigableLRUCache? The fix touches both but their existing prefixScan tests at e.g. https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/state/internals/CachingInMemoryKeyValueStoreTest.java#L416 use ASCII prefixes, so the overflow path isn't hit.

Author


Thanks @Phixsura for the feedback, I've now added focused tests for CachingKeyValueStore and MemoryNavigableLRUCache as well.

The PR now includes overflow-prefix tests for all 3:

  • InMemoryKeyValueStore
  • CachingKeyValueStore
  • MemoryNavigableLRUCache

@Phixsura

Thanks for the fix. One thought on the test coverage below, no blocker.

pchintar force-pushed the prefix_scan_overflow branch from 8345245 to f08806e May 21, 2026 16:25
github-actions bot removed the small (Small PRs) label May 21, 2026
@pchintar
Author

@Phixsura I've updated my code by adding test cases that separately check all three stores (InMemoryKeyValueStore, CachingKeyValueStore, MemoryNavigableLRUCache), as you suggested. I hope this resolves everything for this PR.

Author

pchintar commented Jun 5, 2026

Hi @smjn, could you kindly check and approve my PR if possible? It has already been peer-reviewed and passes all the CI checks. Unfortunately, it has been open for a while, so I really hope you or someone else with approval access can take a look. Thanks.
