Atomicity and Batches in Apache Cassandra® ℹ️ For technical support, please contact us via email or LinkedIn.

⬅️ Back Step 8 of 8 Next ➡️

Batch limitations

Single-partition batches are quite efficient and can perform better than individual statements because they save on client-to-coordinator and coordinator-to-replica communication. However, sending a large batch with hundreds of statements to a single coordinator node can negatively affect workload balancing.
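
As a rough illustration of a single-partition batch, the sketch below assumes a hypothetical `cart_items` table (not part of this tutorial's schema) whose partition key is `cart_id`, and a keyspace that is already selected with `USE`. Every statement targets the same `cart_id`, so the batch touches only one partition.

```cql
-- Hypothetical table: all rows of one cart live in a single partition
CREATE TABLE IF NOT EXISTS cart_items (
  cart_id  UUID,
  item_id  TEXT,
  quantity INT,
  PRIMARY KEY ((cart_id), item_id)
);

-- Single-partition batch: every statement uses the same cart_id
BEGIN BATCH
  INSERT INTO cart_items (cart_id, item_id, quantity)
  VALUES (b7255608-4a42-4829-9b84-a355e0e5100d, 'Box1', 2);
  INSERT INTO cart_items (cart_id, item_id, quantity)
  VALUES (b7255608-4a42-4829-9b84-a355e0e5100d, 'Box2', 1);
  UPDATE cart_items SET quantity = 3
  WHERE cart_id = b7255608-4a42-4829-9b84-a355e0e5100d AND item_id = 'Box3';
APPLY BATCH;
```

Keeping such batches small (a handful of statements rather than hundreds) avoids concentrating too much work on one coordinator.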

Multi-partition batches are substantially more expensive because they require maintaining a batchlog in a separate Cassandra table. Therefore, even for their main use case, updating the same data duplicated across multiple partitions due to denormalization, use multi-partition batches only when atomicity is truly important for your application. For less critical data, there are other ways to check and ensure consistency among duplicates without sacrificing write performance.
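
To make the denormalization use case concrete, here is a minimal sketch of a multi-partition batch. The tables `ratings_by_user` and `ratings_by_movie` are hypothetical and only serve as an example of the same data stored under two different partition keys; a keyspace is assumed to be already selected.

```cql
-- Hypothetical denormalized tables: the same rating stored under two partition keys
CREATE TABLE IF NOT EXISTS ratings_by_user (
  email  TEXT,
  title  TEXT,
  year   INT,
  rating INT,
  PRIMARY KEY ((email), title, year)
);

CREATE TABLE IF NOT EXISTS ratings_by_movie (
  title  TEXT,
  year   INT,
  email  TEXT,
  rating INT,
  PRIMARY KEY ((title, year), email)
);

-- Logged batch (the default): once the batch is accepted,
-- both inserts are eventually applied, at the cost of the batchlog write
BEGIN BATCH
  INSERT INTO ratings_by_user (email, title, year, rating)
  VALUES ('joe@datastax.com', 'Alice in Wonderland', 2010, 9);
  INSERT INTO ratings_by_movie (title, year, email, rating)
  VALUES ('Alice in Wonderland', 2010, 'joe@datastax.com', 9);
APPLY BATCH;
```

If the two copies of a rating may briefly disagree without harming the application, two separate inserts would avoid the batchlog overhead.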

Finally, do not use batches to group operations just for the sake of grouping. This example is an anti-pattern:

-- This is an anti-pattern
BEGIN BATCH
  INSERT INTO users (email, name, age, date_joined) 
  VALUES ('joe@datastax.com', 'Joe', 25, '2020-01-01');
  INSERT INTO users (email, name, age, date_joined) 
  VALUES ('jen@datastax.com', 'Jen', 27, '2020-01-01');
  INSERT INTO movies (title, year, duration, avg_rating, price) 
  VALUES ('Alice in Wonderland', 2010, 108, 8.33, 1.99);
  INSERT INTO movies (title, year, duration, avg_rating, price) 
  VALUES ('Alice in Wonderland', 1951, 75, 6.5, 0.99);  
APPLY BATCH;
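
A possible alternative, assuming the four rows have no atomicity requirement between them, is to send each insert as its own statement with no batch at all. The sketch reuses the `users` and `movies` tables from the example above.

```cql
-- Independent statements: no batchlog, and no single coordinator
-- has to handle all four writes as one unit
INSERT INTO users (email, name, age, date_joined)
VALUES ('joe@datastax.com', 'Joe', 25, '2020-01-01');

INSERT INTO users (email, name, age, date_joined)
VALUES ('jen@datastax.com', 'Jen', 27, '2020-01-01');

INSERT INTO movies (title, year, duration, avg_rating, price)
VALUES ('Alice in Wonderland', 2010, 108, 8.33, 1.99);

INSERT INTO movies (title, year, duration, avg_rating, price)
VALUES ('Alice in Wonderland', 1951, 75, 6.5, 0.99);
```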

