You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The Cassandra partitionedKeyValueStore is partitioned (CQL table primary key > partition key) by the streams taskId
to support range and prefixScan queries.
In theory, no RPC would be required since each instance still can access all table rows with the taskId as
CQL query condition - but results for interactive queries other than get(K key) still need to be queried separately
for all tasks and the results combined/merged.
Currently, no custom ReadOnlyKeyValueStore implementation is provided to do that. Just like for RocksDB/InMemory
state stores the 'RPC layer' pattern has to be utilised for this store type.
Feature
implementation of a custom ReadOnlyKeyValueStore facade for 'partitionedKeyValueStore' as described above
for get lookup, fetch taskId from metadata and query Cassandra with partition key
for all, reverseAll, prefixScan, range, and reverseRange we query Cassandra in parallel for all taskIds (from metadata) and merge the results
for approximateNumEntries a plain `SELECT COUNT(*) from table disregarding partition keys should satisfy (if opted-in)
Merging results for KeyValueIterator
org.apache.kafka.streams.state.internals.CompositeKeyValueIterator may be used/copied.
The text was updated successfully, but these errors were encountered:
Idea came up while writing a blog post on the topic
-> https://thriving.dev/blog/interactive-queries-with-kafka-streams-cassandra-state-store
Problem
The Cassandra partitionedKeyValueStore is partitioned (CQL table primary key > partition key) by the streams taskId
to support
range
andprefixScan
queries.In theory, no RPC would be required since each instance still can access all table rows with the taskId as
CQL query condition - but results for interactive queries other than
get(K key)
still need to be queried separatelyfor all tasks and the results combined/merged.
Currently, no custom
ReadOnlyKeyValueStore
implementation is provided to do that. Just like for RocksDB/InMemorystate stores the 'RPC layer' pattern has to be utilised for this store type.
Feature
get
lookup, fetch taskId from metadata and query Cassandra with partition keyall
,reverseAll
,prefixScan
,range
, andreverseRange
we query Cassandra in parallel for all taskIds (from metadata) and merge the resultsapproximateNumEntries
a plain `SELECT COUNT(*) from table disregarding partition keys should satisfy (if opted-in)Merging results for
KeyValueIterator
org.apache.kafka.streams.state.internals.CompositeKeyValueIterator
may be used/copied.The text was updated successfully, but these errors were encountered: