Commit 153e334

author
TheovanKraay
committed
changes after review
1 parent ec5dc29 commit 153e334

3 files changed

+25
-21
lines changed

articles/cosmos-db/TOC.yml

+3-3
@@ -573,9 +573,7 @@
573573
- name: What is Cassandra API in Cosmos DB?
574574
href: cassandra-introduction.md
575575
- name: Wire protocol support
576-
href: cassandra-support.md
577-
- name: Frequently asked questions
578-
href: cassandra-faq.md
576+
href: cassandra-support.md
579577
- name: Quickstarts
580578
items:
581579
- name: .NET
@@ -599,6 +597,8 @@
599597
href: cassandra-import-data.md
600598
- name: Concepts
601599
items:
600+
- name: Frequently asked questions
601+
href: cassandra-faq.md
602602
- name: Elastic scale
603603
href: manage-scale-cassandra.md
604604
- name: Secondary Indexes

articles/cosmos-db/cassandra-faq.md

+16-16
@@ -10,31 +10,26 @@ ms.custom: seodec20
1010
---
1111
# Frequently asked questions about the Azure Cosmos DB API for Cassandra
1212

13-
### What are some reasons I would want to chose the Cosmos DB Cassandra API over a native Apache Cassandra implementation?
13+
### What are some key differences between Apache Cassandra and Cassandra API?
1414

15-
- Cassandra API is fully managed Platform as a Service (PaaS) offering, providing consistent performance for reads/writes and throughput without the need for touching any of the typical configuration settings in a native Apache Cassandra setup. This significantly simplifies maintenance.
16-
- Despite the PaaS nature of Cassandra API, it still supports a large surface area of the native [Apache Cassandra wire protocol](cassandra-support.md), allowing you to build applications on a widely used and cloud agnostic open source standard.
17-
- Cassandra API provides a number of mechanisms and choices for [elastic scale](manage-scale-cassandra.md), which are much more difficult to achieve in a native Apache Cassandra implementation.
18-
- Cassandra API provides access to a persistent change log, the [Change Feed](cassandra-change-feed.md), which can facilitate event sourcing directly from the database. In native Cassandra, the only equivalent is change data capture (CDC), which is merely a mechanism to flag specific tables for archival as well as rejecting writes to those tables once a configurable size-on-disk for the CDC log is reached (these capabilities are redundant in Cosmos DB as the relevant aspects are automatically governed).
19-
- Apache Cassandra has a 100MB limit on the size of a partition key. Cassandra API allows up to 10GB per partition.
15+
- Apache Cassandra recommends a 100 MB limit on the size of a partition. Cassandra API allows up to 10 GB per partition.
2016
- Apache Cassandra allows you to disable durable commits - i.e. skip writing to the commit log and go directly to the Memtable(s). This can lead to data loss if the node goes down prior to Memtables being flushed to SStables on disk. Cosmos DB always does durable commits so you will never have data loss.
2117
- Apache Cassandra can see diminished performance if the workload involves a lot of replaces and/or deletes. The reason for this is tombstones that the read workload needs to skip over to fetch the latest data. Cassandra API will not see diminished read performance when the workload has a lot of replaces and/or deletes.
22-
- In addition, during high replace workload scenarios, compaction needs to run to merge SSTables on disk (merge is needed because native Cassandra's writes are append only, thus multiple updates are stored as individual SSTable entries that need to be periodically merged). This can also lead to lowered read performance during compaction. This is not a problem with Cassandra API as it does not allow compaction.
23-
- Setting a replication factor of 1 is possible with native Cassandra. However, this will lead to low availability should the only node with the data go down. This is not an issue with Cassandra API as there is always a replication factor = 4 (quorum of 3).
24-
- Adding/removing nodes in native Cassandra requires a lot of manual intervention, but also high CPU on the new node while existing nodes move some of their token ranges to the new node. This is the same when decommissioning an existing node. However, scaling out is done seamlessly under the hood in the Cassandra API, without any issues observed in the service/application.
25-
- There is no need to set num_tokens on each node in the cluster as in native Cassandra. Nodes and token ranges are fully managed by Cosmos DB.
26-
- No need for nodetool commands such as repair, decommission etc. as in native Cassandra. Cassandra API is fully managed by Cosmos DB and none of this is needed.
27-
18+
- During high replace workload scenarios, compaction needs to run to merge SSTables on disk (merge is needed because Apache Cassandra's writes are append only, thus multiple updates are stored as individual SSTable entries that need to be periodically merged). This can also lead to lowered read performance during compaction. This does not occur in Cassandra API as it does not implement compaction.
19+
- Setting a replication factor of 1 is possible with Apache Cassandra. However, this will lead to low availability should the only node with the data go down. This is not an issue in Cassandra API, as there is always a replication factor of 4 (quorum of 3).
20+
- Adding or removing nodes in Apache Cassandra requires a lot of manual intervention and causes high CPU load on the new node while existing nodes move some of their token ranges to it. The same applies when decommissioning an existing node. In the Cassandra API, scaling out is done seamlessly under the hood, without any issues observed in the service or application.
21+
- There is no need to set num_tokens on each node in the cluster as in Apache Cassandra. Nodes and token ranges are fully managed by Cosmos DB.
22+
- There is no need for nodetool commands such as repair, decommission, etc., as in Apache Cassandra. Cassandra API is fully managed by Cosmos DB, so none of this is needed.
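
The replication bullet above quotes a replication factor of 4 with a quorum of 3. A minimal Python sketch of that arithmetic follows, assuming the usual majority-quorum formula used in Apache Cassandra (`floor(RF / 2) + 1`). Whether Cosmos DB derives its internal quorum this way is an assumption, and the function names are hypothetical.

```python
def quorum_size(replication_factor: int) -> int:
    """Majority quorum: strictly more than half of the replicas."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while a quorum is still reachable."""
    return replication_factor - quorum_size(replication_factor)


# Compare a single-replica setup with replication factors of 3 and 4.
for rf in (1, 3, 4):
    print(rf, quorum_size(rf), tolerated_replica_failures(rf))
```

With a replication factor of 1 the quorum is the single replica, so no failure can be tolerated. With a replication factor of 4 the quorum is 3, which leaves room for one unavailable replica.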
2823

2924
### What is the protocol version supported by Azure Cosmos DB Cassandra API? Is there a plan to support other protocols?
3025

31-
Apache Cassandra API for Azure Cosmos DB supports today CQL version 4. If you have feedback about supporting other protocols, let us know via [user voice feedback](https://feedback.azure.com/forums/263030-azure-cosmos-db) or send an email to [askcosmosdbcassandra@microsoft.com](mailto:askcosmosdbcassandra@microsoft.com).
26+
Apache Cassandra API for Azure Cosmos DB supports CQL version 3.x. If you have feedback about supporting other protocols, let us know via [user voice feedback](https://feedback.azure.com/forums/263030-azure-cosmos-db) or send an email to [askcosmosdbcassandra@microsoft.com](mailto:askcosmosdbcassandra@microsoft.com).
3227

3328
### Why is choosing a throughput for a table a requirement?
3429

3530
Azure Cosmos DB sets default throughput for your container based on where you create the table from - portal or CQL.
3631
Azure Cosmos DB provides guarantees for performance and latency, with upper bounds on operation. This guarantee is possible when the engine can enforce governance on the tenant's operations. Setting throughput ensures that you get the guaranteed throughput and latency, because the platform reserves this capacity and guarantees operation success.
37-
You can elastically change throughput to benefit from the seasonality of your application and save costs.
32+
You can [elastically change throughput](manage-scale-cassandra.md) to benefit from the seasonality of your application and save costs.
3833

3934
The throughput concept is explained in the [Request Units in Azure Cosmos DB](request-units.md) article. The throughput for a table is distributed across the underlying physical partitions equally.
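
The equal split described above is simple arithmetic. A minimal sketch follows, assuming a hypothetical table provisioned with 10,000 RU/s over 5 physical partitions. Both numbers and the function name are illustrative and do not come from the article.

```python
def throughput_per_partition(provisioned_ru_per_sec: float,
                             physical_partitions: int) -> float:
    """Evenly divide a table's provisioned throughput across its physical partitions."""
    if physical_partitions < 1:
        raise ValueError("a table needs at least one physical partition")
    return provisioned_ru_per_sec / physical_partitions


# Hypothetical table: 10,000 RU/s spread over 5 physical partitions.
share = throughput_per_partition(10_000, 5)
print(share)  # 2000.0
```

Under this model, a partition that receives more than its share of requests would reach its limit before the table as a whole does.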
4035

@@ -100,7 +95,7 @@ There's no physical limit on number of keyspaces as they're metadata containers,
10095

10196
### Is it possible to bring in a lot of data after starting from a normal table?
10297

103-
The storage capacity is automatically managed and increases as you push in more data. So you can confidently import as much data as you need without managing and provisioning nodes, and more.
98+
Yes, assuming uniformly distributed partitions, the storage capacity is automatically managed and increases as you push in more data. So you can confidently import as much data as you need without managing and provisioning nodes, and more. However, if you are anticipating a lot of immediate data growth, it makes more sense to directly [provision for the anticipated throughput](set-throughput) rather than starting lower and increasing it immediately.
10499

105100
### Is it possible to supply yaml file settings to configure Apache Cassandra API of Azure Cosmos DB behavior?
106101

@@ -158,14 +153,19 @@ You can add as many regions as you want for the account and control where it can
158153

159154
### Does the Apache Cassandra API index all attributes of an entity by default?
160155

161-
Cassandra API is planning to support Secondary indexing to help create selective index on certain attributes.
156+
No. Cassandra API supports [secondary indexes](cassandra-secondary-index.md), which behave in a very similar way to those in Apache Cassandra. It does not index every attribute by default.
162157

163158

164159
### Can I use the new Cassandra API SDK locally with the emulator?
165160

166161
Yes, this is supported. You can find details of how to enable this [here](local-emulator.md#cassandra-api).
167162

168163

164+
### How can I migrate data from my Apache Cassandra clusters to Cosmos DB?
165+
166+
You can read about migration options [here](cassandra-import-data.md).
167+
168+
169169
### Feature x of regular Cassandra API isn't working today. Where can feedback be provided?
170170

171171
Provide feedback via [user voice feedback](https://feedback.azure.com/forums/263030-azure-cosmos-db).

articles/cosmos-db/cassandra-introduction.md

+6-2
@@ -14,24 +14,28 @@ ms.date: 05/21/2019
1414

1515
Azure Cosmos DB Cassandra API can be used as the data store for apps written for [Apache Cassandra](https://cassandra.apache.org). This means that by using existing [Apache drivers](https://cassandra.apache.org/doc/latest/getting_started/drivers.html?highlight=driver) compliant with CQLv4, your existing Cassandra application can now communicate with the Azure Cosmos DB Cassandra API. In many cases, you can switch from using Apache Cassandra to using Azure Cosmos DB's Cassandra API, by just changing a connection string.
1616

17-
The Cassandra API enables you to interact with data stored in Azure Cosmos DB using the Cassandra Query Language (CQL) , Cassandra-based tools (like cqlsh) and Cassandra client drivers that youre already familiar with.
17+
The Cassandra API enables you to interact with data stored in Azure Cosmos DB using the Cassandra Query Language (CQL), Cassandra-based tools (like cqlsh), and Cassandra client drivers that you're already familiar with.
1818

1919
## What is the benefit of using Apache Cassandra API for Azure Cosmos DB?
2020

2121
**No operations management**: As a fully managed cloud service, Azure Cosmos DB Cassandra API removes the overhead of managing and monitoring a myriad of settings across OS, JVM, and yaml files and their interactions. Azure Cosmos DB provides monitoring of throughput, latency, storage, availability, and configurable alerts.
2222

23+
**Open source standard**: Despite being a fully managed service, Cassandra API still supports a large surface area of the native [Apache Cassandra wire protocol](cassandra-support.md), allowing you to build applications on a widely used and cloud agnostic open source standard.
24+
2325
**Performance management**: Azure Cosmos DB provides guaranteed low latency reads and writes at the 99th percentile, backed up by the SLAs. Users do not have to worry about operational overhead to ensure high performance and low latency reads and writes. This means that users do not need to deal with scheduling compaction, managing tombstones, setting up bloom filters and replicas manually. Azure Cosmos DB removes the overhead to manage these issues and lets you focus on the application logic.
2426

2527
**Ability to use existing code and tools**: Azure Cosmos DB provides wire protocol level compatibility with existing Cassandra SDKs and tools. This compatibility ensures you can use your existing codebase with Azure Cosmos DB Cassandra API with trivial changes.
2628

27-
**Throughput and storage elasticity**: Azure Cosmos DB provides guaranteed throughput across all regions and can scale the provisioned throughput with Azure portal, PowerShell, or CLI operations. You can elastically scale storage and throughput for your tables as needed with predictable performance.
29+
**Throughput and storage elasticity**: Azure Cosmos DB provides guaranteed throughput across all regions and can scale the provisioned throughput with Azure portal, PowerShell, or CLI operations. You can [elastically scale](manage-scale-cassandra.md) storage and throughput for your tables as needed with predictable performance.
2830

2931
**Global distribution and availability**: Azure Cosmos DB provides the ability to globally distribute data across all Azure regions and serve the data locally while ensuring low latency data access and high availability. Azure Cosmos DB provides 99.99% high availability within a region and 99.999% read and write availability across multiple regions with no operations overhead. Learn more in the [Distribute data globally](distribute-data-globally.md) article.
3032

3133
**Choice of consistency**: Azure Cosmos DB provides the choice of five well-defined consistency levels to achieve optimal tradeoffs between consistency and performance. These consistency levels are strong, bounded-staleness, session, consistent prefix, and eventual. These well-defined, practical, and intuitive consistency levels allow developers to make precise trade-offs between consistency, availability, and latency. Learn more in the [consistency levels](consistency-levels.md) article.
3234

3335
**Enterprise grade**: Azure Cosmos DB provides [compliance certifications](https://www.microsoft.com/trustcenter) to ensure users can use the platform securely. Azure Cosmos DB also provides encryption at rest and in motion, IP firewall, and audit logs for control plane activities.
3436

37+
**Event Sourcing**: Cassandra API provides access to a persistent change log, the [Change Feed](cassandra-change-feed.md), which can facilitate event sourcing directly from the database. In Apache Cassandra, the only equivalent is change data capture (CDC), which is merely a mechanism to flag specific tables for archival as well as rejecting writes to those tables once a configurable size-on-disk for the CDC log is reached (these capabilities are redundant in Cosmos DB as the relevant aspects are automatically governed).
38+
3539
## Next steps
3640

3741
* You can quickly get started with building the following language-specific apps to create and manage Cassandra API data:
