- https://cassandra.apache.org/doc/latest/
- https://cassandra.apache.org/doc/latest/cassandra/architecture/index.html
- https://cassandra.apache.org/doc/latest/cassandra/data_modeling/index.html
- https://www.datastax.com/blog/basic-rules-cassandra-data-modeling
- You can also use List, Set, Map as columns in Cassandra: https://www.datastax.com/blog/coming-12-collections-support-cql3
- https://www.datastax.com/blog/cql-improvements-cassandra-21
- https://www.datastax.com/blog/whats-new-cassandra-21-better-implementation-counters
- https://www.datastax.com/blog/lightweight-transactions-cassandra-20
- https://www.datastax.com/examples
- https://www.datastax.com/examples/astra-netflix
- https://www.datastax.com/examples/astra-tik-tok
- https://youtu.be/fcohNYJ1FAI
- https://youtu.be/u6pKIrfJgkU
- https://academy.datastax.com/#/courses/c5b626ca-d619-45b3-adf2-a7d2b940a7ee
- https://www.youtube.com/playlist?list=PLqq-6Pq4lTTYzKjjA1C_jbic95tmq9FQt
- https://docs.datastax.com/en/archived/cql/3.1/cql/cql_using/use_counter_t.html
- https://www.datastax.com/blog/putting-some-structure-storage-engine
Empty folder (no sent items yet):
A new email lands in every recipient's inbox:
It is also kept in the sender's sent items:
- These are the two high-level goals for data modeling in Cassandra:
- Spread data evenly around the cluster
- Minimize the number of partitions you read from
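The two goals above can be illustrated with a small sketch. It is not how Cassandra is implemented: the MD5-modulo mapping is a stand-in for the real Murmur3 token ring, and the three-node cluster, the example addresses, and the `Inbox` / `Sent Items` labels are assumptions made up for the illustration.

```python
import hashlib
from collections import Counter

NODES = 3  # assumption: a three-node cluster, for illustration only


def node_for(partition_key):
    """Map a partition key (a tuple of strings) to a node index.

    Stand-in for Cassandra's partitioner: a real cluster uses Murmur3
    tokens and token ranges, not MD5 modulo the node count.
    """
    raw = "\x00".join(partition_key).encode("utf-8")
    digest = hashlib.md5(raw).digest()
    return int.from_bytes(digest[:8], "big") % NODES


def partitions_touched(user_id, labels):
    """Partitions read when listing the given folders of one user,
    with (user_id, label) as the partition key."""
    return {(user_id, label) for label in labels}


# Goal 1: many distinct partition keys end up spread over all nodes.
keys = [(f"user{i}@example.com", label)
        for i in range(1000)
        for label in ("Inbox", "Sent Items")]
load = Counter(node_for(key) for key in keys)
print("rows per node:", dict(load))

# Goal 2: listing one folder names one full partition key,
# so the read touches exactly one partition.
print(len(partitions_touched("user1@example.com", ["Inbox"])))
```

A query that has to visit several folders of the same user touches one partition per folder, which is why the table is keyed by the exact lookup the application performs.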
Let's see which tables were created. If we describe the `main`
keyspace, we can see the schema of every table.
token@cqlsh> use main;
token@cqlsh:main> describe main;
CREATE KEYSPACE main WITH replication = {'class': 'NetworkTopologyStrategy', 'asia-south1': '3'} AND durable_writes = true;
CREATE TABLE main.folders_by_user (
    user_id text,
    created_at_uuid uuid,
    label text,
    color text,
    PRIMARY KEY (user_id, created_at_uuid, label)
) WITH CLUSTERING ORDER BY (created_at_uuid ASC, label ASC)
    AND additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';
CREATE TABLE main.messages_by_id (
    id uuid PRIMARY KEY,
    body text,
    "from" text,
    subject text,
    "to" list<text>
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';
CREATE TABLE main.messages_by_user_folder (
    user_id text,
    label text,
    created_at_uuid uuid,
    "from" text,
    subject text,
    unread boolean,
    PRIMARY KEY ((user_id, label), created_at_uuid)
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';
Sending email to multiple users at a time
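As a rough sketch of the writes behind one send, the function below builds one row in `messages_by_id`, one `messages_by_user_folder` row per recipient, and one more for the sender. Several details are assumptions, not taken from the application: the folder labels `Inbox` and `Sent Items`, reusing the message id as `created_at_uuid`, generating it with `uuid.uuid1()`, and the `unread` values. The function only builds statements and bind values; executing them with a driver is left out.

```python
import uuid

# Column names come from the schema shown by `describe main`.
INSERT_MESSAGE = (
    'INSERT INTO main.messages_by_id (id, "from", "to", subject, body) '
    "VALUES (?, ?, ?, ?, ?)"
)
INSERT_FOLDER_ROW = (
    "INSERT INTO main.messages_by_user_folder "
    '(user_id, label, created_at_uuid, "from", subject, unread) '
    "VALUES (?, ?, ?, ?, ?, ?)"
)


def build_send_statements(sender, recipients, subject, body):
    """Return (cql, bind_values) pairs for sending one email.

    Assumption: one time-based UUID serves both as the message id and
    as the clustering value in each folder row.
    """
    message_id = uuid.uuid1()
    statements = [
        (INSERT_MESSAGE, (message_id, sender, list(recipients), subject, body))
    ]
    # One inbox row per recipient (label name is hypothetical).
    for user in recipients:
        statements.append(
            (INSERT_FOLDER_ROW, (user, "Inbox", message_id, sender, subject, True))
        )
    # One row in the sender's sent folder (label name is hypothetical).
    statements.append(
        (INSERT_FOLDER_ROW, (sender, "Sent Items", message_id, sender, subject, False))
    )
    return statements


stmts = build_send_statements(
    "alice@example.com",
    ["bob@example.com", "carol@example.com"],
    "Hello",
    "First message",
)
for cql, values in stmts:
    print(cql, values[:2])
```

The message body is stored once, while the lightweight folder rows are duplicated per user, so listing a folder reads a single partition.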
DB console after this new message:
- A counter column in Cassandra has to live in a dedicated counter table.
- i.e., every column in a counter table must be either a primary key column (partition key or clustering key) or a counter column.
- That means you cannot have a non-primary-key, non-counter column in a counter table.
- You can run increment/decrement queries on the counter columns of a counter table, like `UPDATE TABLE_NAME SET COUNTER_COL_NAME = COUNTER_COL_NAME + 1 WHERE SOME_OTHER_COL = SOME_VAL;` (`TABLE_NAME` is a placeholder).

Note that a row matching the `WHERE` clause does not have to exist in advance. If no such row exists, Cassandra creates it and treats the counter being updated as starting from 0. So if there is no entry for `SOME_OTHER_COL = SOME_VAL` when the query is executed, Cassandra creates a new entry, and the value of `COUNTER_COL_NAME` for that entry goes from 0 to 1.
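The upsert behaviour of counter updates can be modelled in a few lines. This is a toy in-memory model, not Cassandra code: the `CounterTable` class is hypothetical, and the CQL string only shows the general shape of the statement being modelled, with `TABLE_NAME` and the other upper-case names as placeholders.

```python
# Shape of the CQL statement the model mirrors (all names are placeholders).
INCREMENT_CQL = (
    "UPDATE TABLE_NAME SET COUNTER_COL_NAME = COUNTER_COL_NAME + 1 "
    "WHERE SOME_OTHER_COL = SOME_VAL"
)


class CounterTable:
    """Toy model of a counter table with a single counter column."""

    def __init__(self):
        self._rows = {}  # primary key -> current counter value

    def update(self, primary_key, delta=1):
        """Apply an increment (or a decrement, with a negative delta).

        A missing row is created on the fly and counted from 0.
        """
        self._rows[primary_key] = self._rows.get(primary_key, 0) + delta
        return self._rows[primary_key]

    def read(self, primary_key):
        """Return the counter value, or None if the row was never updated."""
        return self._rows.get(primary_key)


table = CounterTable()
print(table.read("SOME_VAL"))        # no row yet
print(table.update("SOME_VAL"))      # row created, counter goes from 0 to 1
print(table.update("SOME_VAL", -1))  # decrement back to 0
```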