feat: Custom ReadOnlyKeyValueStore for 'partitionedKeyValueStore' type optimised interactive queries #23

hartmut-co-uk · 2023-07-10T14:40:33Z

The idea came up while writing a blog post on the topic:
https://thriving.dev/blog/interactive-queries-with-kafka-streams-cassandra-state-store

Problem

The Cassandra partitionedKeyValueStore table is partitioned by the streams taskId (the partition key within the CQL table's primary key)
to support range and prefixScan queries.

In theory, no RPC is required, since every instance can still access all table rows by using the taskId as
the CQL query condition. However, interactive queries other than get(K key) still have to be run separately
for each task, and the results combined/merged.

Currently, no custom ReadOnlyKeyValueStore implementation is provided to do that. Just as with RocksDB/InMemory
state stores, the 'RPC layer' pattern has to be used for this store type.

Feature

Implementation of a custom ReadOnlyKeyValueStore facade for 'partitionedKeyValueStore' as described above:
- for get lookups, fetch the taskId from metadata and query Cassandra with that partition key
- for all, reverseAll, prefixScan, range, and reverseRange, query Cassandra in parallel for all taskIds (from metadata) and merge the results
- for approximateNumEntries, a plain `SELECT COUNT(*) FROM table`, disregarding partition keys, should suffice (if opted in)
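
A rough sketch of the fan-out step for the multi-partition queries, just to illustrate the idea. Everything here is hypothetical and not the store's actual API: `FanOutQuery`, `PartitionQuery` and `queryAllPartitions` are made-up names, the per-partition query stands in for one CQL query restricted to a single taskId, and the list of partitions is assumed to come from streams metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper: runs one query per task partition in parallel and combines the results. */
final class FanOutQuery {

    /** Stand-in (assumption) for a CQL query restricted to a single partition key / taskId. */
    interface PartitionQuery<R> {
        List<R> run(int partition);
    }

    static <R> List<R> queryAllPartitions(List<Integer> partitions, PartitionQuery<R> query) {
        // Start all per-partition queries first so they run concurrently.
        List<CompletableFuture<List<R>>> futures = new ArrayList<>();
        for (Integer partition : partitions) {
            final int p = partition;
            futures.add(CompletableFuture.supplyAsync(() -> query.run(p)));
        }
        // Then wait for each one and append its rows, keeping partition order.
        List<R> combined = new ArrayList<>();
        for (CompletableFuture<List<R>> future : futures) {
            combined.addAll(future.join());
        }
        return combined;
    }
}
```

Note that this simply appends the per-partition results one after another, so the combined list is ordered by partition first, not globally by key.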

Merging results for `KeyValueIterator`

org.apache.kafka.streams.state.internals.CompositeKeyValueIterator may be used/copied.
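
As far as I can tell, CompositeKeyValueIterator chains the underlying iterators one after another rather than interleaving them by key. If a globally key-ordered result is wanted, a k-way merge would be an alternative. A minimal sketch, assuming each per-partition iterator is already sorted; `SortedMergeIterator` is a hypothetical name, and it works on plain `java.util.Iterator`, so `close()` and `peekNextKey()` of the real `KeyValueIterator` are left out:

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical helper: merges several sorted iterators into one sorted iterator. */
final class SortedMergeIterator<T> implements Iterator<T> {

    /** The next unread element of one source, together with the rest of that source. */
    private static final class Head<T> {
        final T value;
        final Iterator<T> rest;

        Head(T value, Iterator<T> rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    // Min-heap holding at most one head element per source.
    private final PriorityQueue<Head<T>> heads;

    SortedMergeIterator(List<Iterator<T>> sources, Comparator<? super T> order) {
        this.heads = new PriorityQueue<Head<T>>(
                Math.max(1, sources.size()),
                (a, b) -> order.compare(a.value, b.value));
        for (Iterator<T> source : sources) {
            if (source.hasNext()) {
                heads.add(new Head<>(source.next(), source));
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !heads.isEmpty();
    }

    @Override
    public T next() {
        Head<T> smallest = heads.poll();
        if (smallest == null) {
            throw new NoSuchElementException();
        }
        // Refill the heap from the source that just yielded its head.
        if (smallest.rest.hasNext()) {
            heads.add(new Head<>(smallest.rest.next(), smallest.rest));
        }
        return smallest.value;
    }
}
```

For reverseAll and reverseRange the same sketch would work with a reversed comparator, provided the per-partition iterators are themselves in descending order.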

hartmut-co-uk · 2023-07-23T10:22:55Z

solved by #25

hartmut-co-uk added the enhancement and good first issue labels Jul 10, 2023

hartmut-co-uk self-assigned this Jul 15, 2023

hartmut-co-uk added a commit that referenced this issue Jul 22, 2023

feat(#23): impl. of CassandraReadOnlyKeyValueStore

a47b46c

TODOs: - polishing - docs - see inline TODOs

hartmut-co-uk mentioned this issue Jul 22, 2023

feat(#23): impl. of CassandraReadOnlyKeyValueStore #25

Merged

4 tasks

hartmut-co-uk added a commit that referenced this issue Jul 22, 2023

feat(#23): adds new example 'partitioned-store-restapi'

8d16680

hartmut-co-uk added a commit that referenced this issue Jul 22, 2023

feat(#23): updates README

6572b99

hartmut-co-uk added a commit that referenced this issue Jul 23, 2023

feat(#23): impl. of CassandraReadOnlyKeyValueStore (#25)

e808f50

hartmut-co-uk closed this as completed Jul 23, 2023
