Data Replication

👍 Advantages of Replication

| Advantage | Description |
| --- | --- |
| Scalability | Being able to handle a higher volume of reads than a single machine could handle, by performing reads on replicas. |
| High availability / Redundancy | Keeping the system running, even when one machine (or several machines, or an entire datacenter) goes down. |
| Disconnected operation | Allowing an application to continue working when there is a network interruption. |
| Latency | Placing data geographically close to users, so that users can interact with it faster. |

Popular Algos of Replication

| Algo | Use Cases | Description |
| --- | --- | --- |
| Single-Leader replication | - SQL-DBs like Amazon Aurora, PostgreSQL etc.<br>- Message Brokers like Kafka etc.<br>- NoSQL-DBs like DynamoDB, MongoDB etc. | - Clients send all writes to a single node (the leader), which sends a stream of data change events to the other replicas (followers).<br>- Reads can be performed on any replica, but reads from followers might be stale. |
| Leaderless replication | Cassandra, Dynamo-style systems etc. | Clients send each write to several nodes, and read from several nodes in parallel in order to detect and correct nodes with stale data. |
| Multi-Leader replication | - Multi-datacenter deployments (a leader in each datacenter)<br>- Clients with offline operation (like mobile apps)<br>- Collaborative editing (like Google Docs) etc. | - Clients send each write to one of several leader nodes, any of which can accept writes.<br>- The leaders send streams of data change events to each other and to any follower nodes.<br>- The biggest problem with this algo is write conflicts and their resolution. |
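
The first two rows of the table can be sketched in a few lines of Python. This is a toy in-memory model, not how any of the databases named above is implemented. The names `Replica`, `SingleLeader` and `Leaderless` are hypothetical, and the plain integer versions are a simplifying assumption (in the leaderless case the caller supplies them). It shows a follower returning stale data until the change stream is applied, and a leaderless read that picks the newest version and repairs the stale node it contacted.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node holding a key -> (version, value) map (hypothetical model)."""
    name: str
    data: dict = field(default_factory=dict)

    def apply(self, key, version, value):
        # Keep only the newest version seen for a key.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key)  # None if the key is unknown


class SingleLeader:
    """All writes go to the leader; followers catch up when replicate() runs."""

    def __init__(self, follower_count):
        self.leader = Replica("leader")
        self.followers = [Replica(f"follower-{i}") for i in range(follower_count)]
        self.log = []      # pending stream of data change events
        self.version = 0   # assumption: a simple counter assigned by the leader

    def write(self, key, value):
        self.version += 1
        self.leader.apply(key, self.version, value)
        self.log.append((key, self.version, value))

    def replicate(self):
        # Ship pending change events to every follower.
        for event in self.log:
            for follower in self.followers:
                follower.apply(*event)
        self.log.clear()


class Leaderless:
    """Writes and reads go to several nodes; reads repair stale nodes."""

    def __init__(self, node_count):
        self.nodes = [Replica(f"node-{i}") for i in range(node_count)]

    def write(self, key, version, value, targets):
        # `targets` lists the indexes of the nodes reachable for this write.
        for i in targets:
            self.nodes[i].apply(key, version, value)

    def read(self, key, targets):
        replies = [self.nodes[i].get(key) for i in targets]
        known = [r for r in replies if r is not None]
        if not known:
            return None
        newest = max(known, key=lambda r: r[0])
        # Read repair: push the newest version to the nodes we contacted.
        for i in targets:
            self.nodes[i].apply(key, *newest)
        return newest[1]


cluster = SingleLeader(follower_count=2)
cluster.write("x", "v1")
stale = cluster.followers[0].get("x")      # follower has not caught up yet
cluster.replicate()
fresh = cluster.followers[0].get("x")      # follower now has the write

ring = Leaderless(node_count=3)
ring.write("x", 1, "v1", targets=[0, 1, 2])
ring.write("x", 2, "v2", targets=[0, 1])   # node 2 misses this write
value = ring.read("x", targets=[1, 2])     # read set overlaps the write set
repaired = ring.nodes[2].get("x")          # node 2 was fixed by the read
```

The leaderless read only sees the latest value because the nodes it contacts overlap with the nodes that accepted the write. With 3 nodes, writing to 2 and reading from 2 guarantees at least one node in common.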
