From 344cb3b2ccc37098d8d9fbb2f6d18636e5fef7fa Mon Sep 17 00:00:00 2001 From: Jesse Seldess Date: Tue, 31 Jan 2017 23:53:05 -0500 Subject: [PATCH 1/2] cockroachdb and CAP --- frequently-asked-questions.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/frequently-asked-questions.md b/frequently-asked-questions.md index c3bc0cdb290..d7f7d7bb297 100644 --- a/frequently-asked-questions.md +++ b/frequently-asked-questions.md @@ -70,6 +70,21 @@ CockroachDB guarantees the SQL isolation level "serializable", the highest defin It does so by combining the Raft consensus algorithm for writes and a custom time-based synchronization algorithms for reads. See our description of [strong consistency](strong-consistency.html) for more details. +## How is CockroachDB both highly availability and strongly consistent? + +The [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem) states that it is impossible for a distributed system to simultaneously provide more than two out of the following three guarantees: + +- Consistency +- Availability +- Partition Tolerance + +CockroachDB is a CP (consistent and partition tolerant) system. This means +that, in the presence of partitions, the system will become unavailable rather than do anything which might cause inconsistent results. For example, writes require acknowledgements from a majority of replicas, and reads require a lease, which can only be transferred to a different node when writes are possible. + +Separately, CockroachDB is also Highly Available, although "available" here means something a bit different than the way that term is used in the CAP theorem. In the CAP theorem, availability is a binary property, but for High Availability, we talk about availability as a spectrum (using terms like "five nines" for a system that is available 99.999% of the time). + +So being both CP and HA means that whenever a majority of replicas can talk to each other, they should be able to make progress. For example, if you deploy CockroachDB to three datacenters and the network link to one of them fails, the other two datacenters should be able to operate normally with only a few seconds' disruption. We do this by attempting to detect partitions and failures quickly and efficiently, transferring leadership to nodes that are able to communicate with the majority, and routing internal traffic away from nodes that are partitioned away. + ## Why is CockroachDB SQL? At the lowest level, CockroachDB is a distributed, strongly-consistent, transactional key-value store, but the external API is Standard SQL with extensions. This provides developers familiar relational concepts such as schemas, tables, columns, and indexes and the ability to structure, manipulate, and query data using well-established and time-proven tools and processes. Also, since CockroachDB supports the PostgreSQL wire protocol, it’s simple to get your application talking to Cockroach; just find your [PostgreSQL language-specific driver](install-client-drivers.html) and start building. From 02fae811ffe9b8652567b291ad2250a55f206a83 Mon Sep 17 00:00:00 2001 From: Jesse Seldess Date: Wed, 1 Feb 2017 17:56:41 -0500 Subject: [PATCH 2/2] when cockroach is good and not good choice --- frequently-asked-questions.md | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/frequently-asked-questions.md b/frequently-asked-questions.md index d7f7d7bb297..2752dd7d9ca 100644 --- a/frequently-asked-questions.md +++ b/frequently-asked-questions.md @@ -15,12 +15,11 @@ CockroachDB is inspired by Google's [Spanner](http://research.google.com/archive ## When is CockroachDB a good choice? -CockroachDB is especially well suited for applications that require: +CockroachDB is well suited for applications that require reliable, available, and correct data regardless of scale. It is built to automatically replicate, rebalance, and recover with minimal configuration and operational overhead. Specific use cases include: -- Horizontal scalability -- Business continuity (survivability) -- High levels of consistency -- Support for distributed ACID transactions +- Distributed or replicated OLTP +- Multi-datacenter deployments +- Cloud-native infrastructure initiatives ## When is CockroachDB not a good choice? @@ -28,9 +27,8 @@ CockroachDB is not a good choice when very low latency reads and writes are crit Also, CockroachDB is not yet suitable for: -- Use cases requiring SQL joins ([the feature still needs optimization](https://www.cockroachlabs.com/blog/cockroachdbs-first-join/)) -- Use cases requiring JSON/Protobuf data ([slated for an upcoming release](https://github.com/cockroachdb/cockroach/issues/2969)) -- Real-time analytics (on our long-term roadmap) +- Complex SQL JOINS ([the feature still needs optimization](https://www.cockroachlabs.com/blog/cockroachdbs-first-join/)) +- Heavy analytics / OLAP ## How easy is it to install CockroachDB? @@ -81,9 +79,9 @@ The [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem) states that it is i CockroachDB is a CP (consistent and partition tolerant) system. This means that, in the presence of partitions, the system will become unavailable rather than do anything which might cause inconsistent results. For example, writes require acknowledgements from a majority of replicas, and reads require a lease, which can only be transferred to a different node when writes are possible. -Separately, CockroachDB is also Highly Available, although "available" here means something a bit different than the way that term is used in the CAP theorem. In the CAP theorem, availability is a binary property, but for High Availability, we talk about availability as a spectrum (using terms like "five nines" for a system that is available 99.999% of the time). +Separately, CockroachDB is also Highly Available, although "available" here means something different than the way it is used in the CAP theorem. In the CAP theorem, availability is a binary property, but for High Availability, we talk about availability as a spectrum (using terms like "five nines" for a system that is available 99.999% of the time). -So being both CP and HA means that whenever a majority of replicas can talk to each other, they should be able to make progress. For example, if you deploy CockroachDB to three datacenters and the network link to one of them fails, the other two datacenters should be able to operate normally with only a few seconds' disruption. We do this by attempting to detect partitions and failures quickly and efficiently, transferring leadership to nodes that are able to communicate with the majority, and routing internal traffic away from nodes that are partitioned away. +Being both CP and HA means that whenever a majority of replicas can talk to each other, they should be able to make progress. For example, if you deploy CockroachDB to three datacenters and the network link to one of them fails, the other two datacenters should be able to operate normally with only a few seconds' disruption. We do this by attempting to detect partitions and failures quickly and efficiently, transferring leadership to nodes that are able to communicate with the majority, and routing internal traffic away from nodes that are partitioned away. ## Why is CockroachDB SQL?