feat: Custom ReadOnlyKeyValueStore for 'partitionedKeyValueStore' type optimised interactive queries #23

Closed
hartmut-co-uk opened this issue Jul 10, 2023 · 1 comment
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

hartmut-co-uk (Collaborator) commented Jul 10, 2023

Idea came up while writing a blog post on the topic
-> https://thriving.dev/blog/interactive-queries-with-kafka-streams-cassandra-state-store

Problem

The Cassandra partitionedKeyValueStore is partitioned (CQL table primary key > partition key) by the streams taskId
to support range and prefixScan queries.

In theory, no RPC is required, since every instance can access all table rows by using the taskId as a
CQL query condition. However, for interactive queries other than get(K key), each task still has to be queried
separately and the results combined/merged.

Currently, no custom ReadOnlyKeyValueStore implementation is provided to do that. Just like for RocksDB/InMemory
state stores, the 'RPC layer' pattern has to be used for this store type.

[Image: Kafka-Streams_Cassandra-State-Store_REST-API_v1_og]

Feature

  • implementation of a custom ReadOnlyKeyValueStore facade for 'partitionedKeyValueStore' as described above
    • for get lookup, fetch taskId from metadata and query Cassandra with partition key
    • for all, reverseAll, prefixScan, range, and reverseRange we query Cassandra in parallel for all taskIds (from metadata) and merge the results
    • for approximateNumEntries, a plain `SELECT COUNT(*)` on the table, disregarding partition keys, should suffice (if opted-in)
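
A rough sketch of the "query all taskIds in parallel and merge" idea from the list above, using only the JDK. Everything here is hypothetical: `PartitionFanOut`, `queryAll`, and the `queryPartition` function are illustrative names, and the function stands in for whatever would actually run the CQL query restricted to one partition key. It is not the store's API, just the fan-out shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class PartitionFanOut {

    // Runs one query per partition concurrently and concatenates the
    // results in the order the partitions were given.
    // `queryPartition` is a hypothetical stand-in for a Cassandra query
    // that is restricted to a single partition key.
    public static <K, V> List<Map.Entry<K, V>> queryAll(
            List<Integer> partitions,
            Function<Integer, List<Map.Entry<K, V>>> queryPartition) {
        List<CompletableFuture<List<Map.Entry<K, V>>>> futures = new ArrayList<>();
        for (Integer partition : partitions) {
            // Each partition is queried on the common fork-join pool.
            futures.add(CompletableFuture.supplyAsync(() -> queryPartition.apply(partition)));
        }
        List<Map.Entry<K, V>> merged = new ArrayList<>();
        for (CompletableFuture<List<Map.Entry<K, V>>> future : futures) {
            // join() blocks until that partition's result is available.
            merged.addAll(future.join());
        }
        return merged;
    }
}
```

This variant collects each partition's rows eagerly, which keeps the sketch short; a real facade would more likely return a lazy iterator so large ranges are not held in memory.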

Merging results for KeyValueIterator

org.apache.kafka.streams.state.internals.CompositeKeyValueIterator may be used/copied.
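
For illustration, a minimal sketch of a concatenating iterator in that spirit. This is not the Kafka Streams class: `ChainedIterator` is a hypothetical name, it works on plain `java.util.Iterator`, and it leaves out the `close()` and `peekNextKey()` methods that a `KeyValueIterator` additionally requires.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class ChainedIterator<T> implements Iterator<T> {
    // One underlying iterator per partition/task (hypothetical setup).
    private final Iterator<? extends Iterator<T>> sources;
    private Iterator<T> current;

    public ChainedIterator(Iterator<? extends Iterator<T>> sources) {
        this.sources = sources;
    }

    @Override
    public boolean hasNext() {
        // Advance to the next source until one has elements left,
        // skipping over empty sources.
        while ((current == null || !current.hasNext()) && sources.hasNext()) {
            current = sources.next();
        }
        return current != null && current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

Note that plain concatenation yields each partition's results one after another, so keys are ordered within a partition but not across partitions.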

@hartmut-co-uk hartmut-co-uk added enhancement New feature or request good first issue Good for newcomers labels Jul 10, 2023
@hartmut-co-uk hartmut-co-uk self-assigned this Jul 15, 2023
hartmut-co-uk added a commit that referenced this issue Jul 22, 2023
TODOs:
- polishing
- docs
- see inline TODOs
hartmut-co-uk added a commit that referenced this issue Jul 22, 2023
hartmut-co-uk (Collaborator, Author) commented

solved by #25
