# System Design Concepts - CAP Theorem

An exploration of some of the concepts related to systems design, inspired by an article found [here](http://www.acodersjourney.com/2018/07/system-design-interview-cap-theorem/). The __CAP Theorem__ states that:

> _Any distributed computer system can support only any two among_ ___consistency, availability, and partition tolerance___.

***

### 'C' is for 'Consistency'

When we consider ___consistency___, its importance cannot be overstated. Some systems must behave consistently: an action triggered by a human interaction should work the same way each time it is taken, e.g., _retweeting a tweet should always work the same, or following a user and following a topic can be made to provide a similar experience._

This principle creates an environment where a user can anticipate an _expected_ outcome ___before___ performing the action. Consistency becomes something like a promise made to the user, who can carry their expectations over to features of the product that they haven't explored yet, which shortens the learning curve.

In the context of systems design,

>- When data is partitioned (distributed), all the nodes see the same data at a given time, and this should be true for all times.
>- When queried, each node will return the latest data. If not, the system will return an error message, or exit with an error.
>- Consistency is achieved by ___updating___ several nodes before allowing further reads.
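
The three points above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements consistency; the names `Replica`, `ConsistentStore`, and `ConsistencyError` are hypothetical and exist only for this example.

```python
class ConsistencyError(Exception):
    """Raised when the store cannot guarantee the latest data."""


class Replica:
    """Hypothetical node holding one copy of the data."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0
        self.reachable = True


class ConsistentStore:
    """Hypothetical store that updates every replica before allowing reads."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.latest_version = 0

    def write(self, value):
        # Reject the write unless all replicas can be updated together.
        if not all(r.reachable for r in self.replicas):
            raise ConsistencyError("cannot update all replicas; write rejected")
        self.latest_version += 1
        for r in self.replicas:
            r.value = value
            r.version = self.latest_version

    def read(self, replica):
        # Return the latest data or fail; never return stale data.
        if not replica.reachable or replica.version != self.latest_version:
            raise ConsistencyError(f"{replica.name} cannot serve the latest data")
        return replica.value


a, b = Replica("A"), Replica("B")
store = ConsistentStore([a, b])
store.write("v1")
print(store.read(a), store.read(b))
```

Every replica returns the same value after a write, and if one replica becomes unreachable the store raises an error instead of letting the copies diverge.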

So in principle, consistency implies the similarity in behaviour associated with an action.

***

### 'A' is for 'Availability'

Availability is the probability that a system will work as required when required. It accounts for the non-operational periods associated with ___reliability, maintenance, and logistics___. There are two types of availability: ___operational___ and ___predicted___. Predicted availability is based on a model of the system before it is built. Operational availability is presumed to be the same as predicted availability until operational metrics have been observed. We would like to have a system that has minimal downtime (or none if possible). That is to say:

>- At all times, every request being fired at the system generates a valid response.
>- This does not mean that every request will receive a response with the latest information (data).
>- Availability is achieved by replicating the data across different servers.
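
To make the link between replication and availability concrete, here is a rough sketch. It assumes availability is measured as the fraction of time the system is up, and that replicas fail independently of each other (a simplification that real deployments rarely satisfy exactly). The function names are made up for this example.

```python
def availability(uptime_hours, downtime_hours):
    """Fraction of the total time during which the system was operational."""
    return uptime_hours / (uptime_hours + downtime_hours)


def replicated_availability(single, replicas):
    """Availability when at least one of `replicas` copies must be up.

    Assumes every copy has the same availability `single` and that
    failures are independent.
    """
    return 1 - (1 - single) ** replicas


one_server = availability(uptime_hours=99, downtime_hours=1)
print(one_server)
print(replicated_availability(one_server, replicas=2))
```

Under these assumptions, a single server that is up 99% of the time gives roughly 99.99% availability once its data is replicated to a second, independent server.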

***

### 'P' is for 'Partition Tolerance'

A network partition is either the decomposition of a network into relatively independent subnets for their separate optimization, or a network split caused by the failure of network devices. In both cases, partition-tolerant behavior is expected of the subnets: even after a network is partitioned into multiple sub-systems, it still works correctly.

>- The system is able to perform continuously even if a network failure or data loss occurs.
>- Partition tolerance can be achieved by replicating data and system functionality sufficiently across a cluster of nodes and networks.
>- The redundancy introduced ensures the system as a whole continues to function even in situations where a node or a set of nodes cannot communicate with each other.

As an example, consider a network with two subnets, where nodes 'A' and 'B' are located in one subnet and nodes 'C' and 'D' are in the other. A partition occurs if the connection between the two subnets fails: nodes 'A' and 'B' are unable to communicate with nodes 'C' and 'D', but all nodes 'A' - 'D' work the same as before.
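
The four-node example can be modelled as a small graph. The sketch below assumes, purely for illustration, that the two subnets are joined by a single link between 'B' and 'C'; the helper `reachable_from` is a hypothetical name.

```python
from collections import deque


def reachable_from(start, links):
    """Return the set of nodes reachable from `start` over undirected links."""
    neighbours = {}
    for u, v in links:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Assumed topology: A-B in one subnet, C-D in the other, B-C joins them.
healthy = [("A", "B"), ("B", "C"), ("C", "D")]
# The B-C link has failed, so the network is partitioned.
partitioned = [("A", "B"), ("C", "D")]

print(sorted(reachable_from("A", healthy)))
print(sorted(reachable_from("A", partitioned)))
print(sorted(reachable_from("C", partitioned)))
```

In the healthy network every node can reach every other node. After the link fails, each node still runs and can still reach its subnet neighbour, but nothing crosses between the two halves.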

***

### System Classification based on CAP Theorem

Because only two of the three properties stated by CAP can be guaranteed at any time, systems are usually classified into three types under CAP Theorem:

- **CA System**: Data is consistent between all nodes, and you can read/write from any node, but you cannot afford to let your network go down.

> Examples: 
>- RDBMS (MySQL, MSSQL Server, Oracle and columnar relational stores)

- **CP System**: Data is consistent, and the system tolerates partitions by preventing data from going out of sync.

> Examples: 
>- MemcacheDB
>- Redis
>- Google Big Table
>- MongoDB (document oriented)
>- HBase (columnar)

- **AP System**: Nodes are always online, but they may not return the latest data; however, they sync once the links are back up.

> Examples: 
>- DynamoDB
>- Voldemort
>- CouchDB (document oriented)
>- Cassandra (columnar)
>- SimpleDB
>- Riak
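
The difference between CP and AP behaviour during a partition can be sketched as follows. This is a toy model under stated assumptions, not a description of how any of the systems listed above work internally; `Node` and `Unavailable` are hypothetical names.

```python
class Unavailable(Exception):
    """Raised by a CP node that cannot confirm it holds the latest data."""


class Node:
    """Toy node that is configured as either "CP" or "AP"."""

    def __init__(self, mode, value):
        self.mode = mode
        self.value = value
        self.partitioned = False

    def read(self):
        # A CP node prefers an error over a possibly stale answer.
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm the latest value during a partition")
        # An AP node always answers, even if its copy may be out of date.
        return self.value


cp_node = Node("CP", "v1")
ap_node = Node("AP", "v1")
cp_node.partitioned = True
ap_node.partitioned = True

print(ap_node.read())
try:
    cp_node.read()
except Unavailable as err:
    print("CP node refused:", err)
```

During the partition the AP node keeps answering with the value it has, while the CP node refuses to answer until it can confirm that its data is current.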




***

#### References

1. _System Design Interview Concepts - CAP Theorem_ - The inspiration for this article, which can be found [here](http://www.acodersjourney.com/2018/07/system-design-interview-cap-theorem).

2. _Why is consistency important in design?_ - Take a look at the answer by [David Cole](https://www.quora.com/profile/David-Cole), the director of design at [Quora](https://www.quora.com), especially where he speaks about consistent behaviour and its importance.

3. Wikipedia is a good place to start in understanding systems design principles. Start [here](https://en.wikipedia.org/wiki/CAP_theorem) for an introductory discourse on the __CAP Theorem__. The article on availability is also a useful starting point and can be found [here](https://en.wikipedia.org/wiki/Availability_(system)).