- https://cassandra.apache.org/doc/latest/
- https://cassandra.apache.org/doc/latest/cassandra/architecture/index.html
--
- https://cassandra.apache.org/doc/latest/cassandra/data_modeling/index.html
- https://www.datastax.com/blog/basic-rules-cassandra-data-modeling
- You can also use List, Set, Map as columns in Cassandra: https://www.datastax.com/blog/coming-12-collections-support-cql3
- https://www.datastax.com/blog/cql-improvements-cassandra-21
- https://www.datastax.com/blog/whats-new-cassandra-21-better-implementation-counters
- https://www.datastax.com/blog/lightweight-transactions-cassandra-20
- https://www.datastax.com/examples
- https://www.datastax.com/examples/astra-netflix
- https://www.datastax.com/examples/astra-tik-tok
- https://youtu.be/fcohNYJ1FAI
- https://youtu.be/u6pKIrfJgkU
- https://academy.datastax.com/#/courses/c5b626ca-d619-45b3-adf2-a7d2b940a7ee
- https://www.youtube.com/playlist?list=PLqq-6Pq4lTTYzKjjA1C_jbic95tmq9FQt
- https://docs.datastax.com/en/archived/cql/3.1/cql/cql_using/use_counter_t.html
- https://www.datastax.com/blog/putting-some-structure-storage-engine
Empty folder (no sent items yet):
A new email lands in every recipient's inbox:
It is also kept in the sender's sent items:
- These are the two high-level goals for data modeling in Cassandra:
  - Spread data evenly around the cluster
  - Minimize the number of partitions you read from
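As a rough illustration of both goals, here is a read against the `messages_by_user_folder` table whose schema appears later in this document. The user id and label values are made-up samples, not data from the application.

```cql
-- Sample values only: 'alice' and 'Inbox' are hypothetical.
-- The composite partition key (user_id, label) gives every user/folder pair
-- its own partition, which helps spread data across the cluster.
-- Supplying both partition key columns means this query reads one partition.
SELECT created_at_uuid, "from", subject, unread
FROM main.messages_by_user_folder
WHERE user_id = 'alice' AND label = 'Inbox'
LIMIT 20;
```

Because the table is clustered by `created_at_uuid DESC`, the rows come back in that descending order without any extra sorting.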
Let's see which tables were created. If we describe the main
keyspace, we can see the schema of every table.
token@cqlsh> use main;
token@cqlsh:main> describe main;
CREATE KEYSPACE main WITH replication = {'class': 'NetworkTopologyStrategy', 'asia-south1': '3'} AND durable_writes = true;
CREATE TABLE main.folders_by_user (
user_id text,
created_at_uuid uuid,
label text,
color text,
PRIMARY KEY (user_id, created_at_uuid, label)
) WITH CLUSTERING ORDER BY (created_at_uuid ASC, label ASC)
AND additional_write_policy = '99PERCENTILE'
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair = 'BLOCKING'
AND speculative_retry = '99PERCENTILE';
CREATE TABLE main.messages_by_id (
id uuid PRIMARY KEY,
body text,
"from" text,
subject text,
"to" list<text>
) WITH additional_write_policy = '99PERCENTILE'
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair = 'BLOCKING'
AND speculative_retry = '99PERCENTILE';
CREATE TABLE main.messages_by_user_folder (
user_id text,
label text,
created_at_uuid uuid,
"from" text,
subject text,
unread boolean,
PRIMARY KEY ((user_id, label), created_at_uuid)
) WITH CLUSTERING ORDER BY (created_at_uuid DESC)
AND additional_write_policy = '99PERCENTILE'
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair = 'BLOCKING'
AND speculative_retry = '99PERCENTILE';
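Each of these tables is shaped around one lookup. The queries below are a sketch of the reads the other two tables appear designed for; the user id and the UUID are made-up sample values.

```cql
-- Sample values only: 'alice' and the UUID are hypothetical.

-- List a user's folders: user_id is the partition key of folders_by_user,
-- so this reads a single partition.
SELECT label, color
FROM main.folders_by_user
WHERE user_id = 'alice';

-- Fetch one full message by its id, the primary key of messages_by_id.
-- "from" and "to" are quoted because they are reserved words in CQL.
SELECT "from", "to", subject, body
FROM main.messages_by_id
WHERE id = 50554d6e-29bb-11e5-b345-feff819cdc9f;
```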
Sending an email to multiple users at once
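With the three tables above, sending one email to two recipients could translate into the writes sketched here. This is an assumption about the write path, not the application's actual code: the addresses, the `Sent Items` label, the subject and body are invented, and reusing the message id as `created_at_uuid` is one plausible way to link a folder row back to `messages_by_id`.

```cql
-- All values are hypothetical sample data.

-- 1. Store the full message once, keyed by its id.
INSERT INTO main.messages_by_id (id, "from", "to", subject, body)
VALUES (50554d6e-29bb-11e5-b345-feff819cdc9f, 'alice', ['bob', 'carol'],
        'Lunch?', 'Are you free at noon?');

-- 2. One summary row per recipient, in that recipient's Inbox partition.
INSERT INTO main.messages_by_user_folder
    (user_id, label, created_at_uuid, "from", subject, unread)
VALUES ('bob', 'Inbox', 50554d6e-29bb-11e5-b345-feff819cdc9f,
        'alice', 'Lunch?', true);

INSERT INTO main.messages_by_user_folder
    (user_id, label, created_at_uuid, "from", subject, unread)
VALUES ('carol', 'Inbox', 50554d6e-29bb-11e5-b345-feff819cdc9f,
        'alice', 'Lunch?', true);

-- 3. One summary row in the sender's sent-items partition, already read.
INSERT INTO main.messages_by_user_folder
    (user_id, label, created_at_uuid, "from", subject, unread)
VALUES ('alice', 'Sent Items', 50554d6e-29bb-11e5-b345-feff819cdc9f,
        'alice', 'Lunch?', false);
```

The message body is written once, while the small summary row is duplicated per folder so that each folder view stays a single-partition read.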
DB console after this new message:
- Counter columns in Cassandra have to live in a dedicated counter table.
- i.e., every column in a counter table must be either a primary key column (partition key or clustering key) or a counter column.
- That means you cannot have a non-primary-key, non-counter column in a counter table.
- You can run increment/decrement queries on the counter columns of a counter table, like this:
update COUNTER_TABLE_NAME set COUNTER_COL_NAME = COUNTER_COL_NAME + 1 where SOME_OTHER_COL = SOME_VAL
Note that a row matching the where clause does not have to exist in advance. If no such row exists, Cassandra creates one and treats the counter being updated as starting from 0. So if there is no entry for SOME_OTHER_COL = SOME_VAL when the query is executed, Cassandra creates a new entry, and the value of COUNTER_COL_NAME for that entry goes from 0 to 1.
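To make the counter rules concrete, here is a minimal sketch using a hypothetical table. The name `unread_email_stats`, its columns, and the sample values are illustrative assumptions and are not part of the schema shown earlier.

```cql
-- Hypothetical counter table: every column is either part of the
-- primary key (user_id, label) or a counter (unread_count).
CREATE TABLE IF NOT EXISTS main.unread_email_stats (
    user_id text,
    label text,
    unread_count counter,
    PRIMARY KEY ((user_id), label)
);

-- Increment: no prior INSERT is needed (counter tables do not accept INSERT
-- at all). If the row is missing, it is created and the counter goes 0 -> 1.
UPDATE main.unread_email_stats
SET unread_count = unread_count + 1
WHERE user_id = 'bob' AND label = 'Inbox';

-- Decrement, for example when a message is marked as read.
UPDATE main.unread_email_stats
SET unread_count = unread_count - 1
WHERE user_id = 'bob' AND label = 'Inbox';

-- Read the current value.
SELECT unread_count
FROM main.unread_email_stats
WHERE user_id = 'bob' AND label = 'Inbox';
```

The where clause of a counter update has to identify the row by its full primary key, which is why both `user_id` and `label` appear in each statement.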