Data Replication

Replication means keeping copies of the same data on multiple nodes.
It is widely used in distributed, highly available database management systems (DBMS).

👍 Advantages of Replication

Advantage	Description
Scalability	Handling a higher volume of reads than a single machine could serve, by performing reads on replicas.
High availability/Redundancy	Keeping the system running, even when one machine (or several machines, or an entire datacenter) goes down.
Disconnected operation	Allowing an application to continue working when there is a network interruption.
Latency	Placing data geographically close to users, so that they can interact with it faster.

Algorithm	Use Cases	Description
⭐ Single-Leader replication	SQL databases such as Amazon Aurora and PostgreSQL - Message brokers such as Kafka - NoSQL databases such as DynamoDB and MongoDB	Clients send all writes to a single node (the leader), which sends a stream of data change events to the other replicas (followers). - Reads can be performed on any replica, but reads from followers might be stale.
Leaderless replication	Cassandra and other Dynamo-style systems	Clients send each write to several nodes, and read from several nodes in parallel in order to detect and correct nodes with stale data.
Multi-Leader replication	Multi-datacenter operation (a leader in each datacenter) - Clients with offline operation (like mobile apps) - Collaborative editing (like Google Docs)	Clients send each write to one of several leader nodes, any of which can accept writes. - The leaders send streams of data change events to each other and to any follower nodes. - The biggest problem with this approach is write conflicts and how to resolve them.
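
The leaderless row in the table above (write to several nodes, read from several nodes, detect and correct stale data) can be illustrated with a small in-memory sketch. Everything here is hypothetical and for illustration only: the `Replica` class, the `quorum_write` and `quorum_read` helpers, the `w` and `r` thresholds, and the caller-supplied integer version numbers are assumptions, not the API of Cassandra or any other real system. Real databases typically track versions with timestamps or version vectors and talk to nodes over the network.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node holding a versioned copy of each key (hypothetical model)."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (version, value)
    up: bool = True

    def write(self, key, version, value):
        # Accept the write only if it is newer than what this node already has.
        current_version, _ = self.store.get(key, (0, None))
        if version > current_version:
            self.store[key] = (version, value)

    def read(self, key):
        return self.store.get(key, (0, None))


def quorum_write(replicas, key, version, value, w):
    """Send the write to every reachable replica; succeed if at least w accept it."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.write(key, version, value)
            acks += 1
    return acks >= w


def quorum_read(replicas, key, r):
    """Read from reachable replicas, return the newest value, repair stale copies.

    Simplification: this reads from all reachable replicas rather than
    stopping after the first r responses.
    """
    reachable = [replica for replica in replicas if replica.up]
    if len(reachable) < r:
        raise RuntimeError("not enough replicas for a read quorum")
    responses = [(replica, replica.read(key)) for replica in reachable]
    newest = max((resp for _, resp in responses), key=lambda resp: resp[0])
    for replica, (version, _) in responses:
        if version < newest[0]:
            replica.write(key, newest[0], newest[1])  # read repair
    return newest[1]


# Three replicas; node "c" is down during the second write and comes back stale.
nodes = [Replica("a"), Replica("b"), Replica("c")]
first_ok = quorum_write(nodes, "x", 1, "old", w=2)
nodes[2].up = False
second_ok = quorum_write(nodes, "x", 2, "new", w=2)
nodes[2].up = True
stale_before_read = nodes[2].read("x")
value = quorum_read(nodes, "x", r=2)
```

In this sketch the read still returns the latest value even though one replica missed a write, and the stale replica is brought up to date as a side effect of the read. Choosing `w + r` greater than the number of replicas is what makes the read and write sets overlap in at least one node.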