This is the course project of MIT 6.824 Distributed Systems: a distributed key-value store based on the Raft consensus algorithm, written in Go. It supports concurrent requests using mutexes, channels, and goroutines, and is tested in a simulated environment that covers network loss, network partitions, and partial cluster failures.
- Leader Election
- Log Replication
- Log Persistence
Passed all the test cases of Lab 2. The test results are shown below:
```
Test (2A): initial election ...
labgob warning: Decoding into a non-default variable/field ConflictIndex may not work
... Passed -- 3.0 3 64 0
Test (2A): election after network failure ...
... Passed -- 4.6 3 105 0
Test (2B): basic agreement ...
... Passed -- 0.8 5 32 3
Test (2B): agreement despite follower disconnection ...
... Passed -- 5.7 3 110 8
Test (2B): no agreement if too many followers disconnect ...
... Passed -- 3.7 5 172 4
Test (2B): concurrent Start()s ...
... Passed -- 0.6 3 26 6
Test (2B): rejoin of partitioned leader ...
... Passed -- 5.2 3 165 4
Test (2B): leader backs up quickly over incorrect follower logs ...
... Passed -- 15.0 5 2756 102
Test (2B): RPC counts aren't too high ...
... Passed -- 2.0 3 41 12
Test (2C): basic persistence ...
... Passed -- 13.4 3 271 6
Test (2C): more persistence ...
... Passed -- 20.8 5 1159 16
Test (2C): partitioned leader and one follower crash, leader restarts ...
... Passed -- 1.4 3 35 4
Test (2C): Figure 8 ...
... Passed -- 35.4 5 1024 33
Test (2C): unreliable agreement ...
... Passed -- 1.7 5 1120 246
Test (2C): Figure 8 (unreliable) ...
... Passed -- 36.0 5 15715 576
Test (2C): churn ...
... Passed -- 16.1 5 11763 2096
Test (2C): unreliable churn ...
... Passed -- 16.1 5 7956 1519
PASS
ok github.com/dianchengwangCHN/raft-key-value-store/raft 181.813s
```
- Client Interface
- Log Compaction
Note:
- A tricky problem in Lab 3 is that directly `append()`ing part of the old log onto the new log does not eliminate the reference to the old log's underlying array. Be careful to copy the relevant part of the old slice into a fresh slice first, and then `append()` it to the new slice.
- When performing `InstallSnapshot`, be sure to check whether the snapshot should actually be applied to the server. If the server's state is already ahead of the snapshot's state, applying the snapshot would roll the server's state back, which leads to inconsistency.
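The slice-aliasing pitfall above can be shown in a few lines of Go. Here `copyTail` is a hypothetical helper (not from the lab code) that makes an independent copy of a log suffix, while the naive re-slice keeps sharing the old backing array:

```go
package main

import "fmt"

// copyTail returns an independent copy of log[from:], so the result holds
// no reference to log's backing array (avoiding the pitfall noted above).
func copyTail(log []int, from int) []int {
	tail := make([]int, len(log[from:]))
	copy(tail, log[from:])
	return tail
}

func main() {
	oldLog := []int{10, 20, 30, 40}

	aliasedTail := oldLog[2:]        // WRONG: still shares oldLog's array
	freshTail := copyTail(oldLog, 2) // RIGHT: owns its own array

	oldLog[2] = -1 // a later in-place write to the old log

	fmt.Println(aliasedTail[0], freshTail[0]) // -1 30: only the alias sees it
}
```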
Passed all the test cases of Lab 3. The test results are shown below:
```
Test: one client (3A) ...
labgob warning: Decoding into a non-default variable/field LeaderID may not work
... Passed -- 15.1 5 18277 2317
Test: many clients (3A) ...
... Passed -- 15.3 5 12648 2456
Test: unreliable net, many clients (3A) ...
... Passed -- 15.8 5 12009 1625
Test: concurrent append to same key, unreliable (3A) ...
... Passed -- 1.2 3 277 52
Test: progress in majority (3A) ...
... Passed -- 0.5 5 61 2
Test: no progress in minority (3A) ...
... Passed -- 1.0 5 268 3
Test: completion after heal (3A) ...
... Passed -- 1.0 5 58 3
Test: partitions, one client (3A) ...
... Passed -- 22.8 5 52918 2158
Test: partitions, many clients (3A) ...
... Passed -- 22.8 5 112397 2293
Test: restarts, one client (3A) ...
... Passed -- 19.8 5 26400 2512
Test: restarts, many clients (3A) ...
... Passed -- 20.0 5 45776 2862
Test: unreliable net, restarts, many clients (3A) ...
... Passed -- 20.3 5 13392 1665
Test: restarts, partitions, many clients (3A) ...
... Passed -- 26.8 5 87722 2355
Test: unreliable net, restarts, partitions, many clients (3A) ...
... Passed -- 27.7 5 28405 1236
Test: unreliable net, restarts, partitions, many clients, linearizability checks (3A) ...
... Passed -- 25.8 7 42744 1180
Test: InstallSnapshot RPC (3B) ...
... Passed -- 4.5 3 6012 63
Test: snapshot size is reasonable (3B) ...
... Passed -- 0.3 3 2396 800
Test: restarts, snapshots, one client (3B) ...
... Passed -- 20.1 5 158888 27146
Test: restarts, snapshots, many clients (3B) ...
... Passed -- 21.3 5 182495 26132
Test: unreliable net, snapshots, many clients (3B) ...
... Passed -- 15.6 5 13239 1778
Test: unreliable net, restarts, snapshots, many clients (3B) ...
... Passed -- 20.5 5 15269 1843
Test: unreliable net, restarts, partitions, snapshots, many clients (3B) ...
... Passed -- 27.9 5 23393 1040
Test: unreliable net, restarts, partitions, snapshots, many clients, linearizability checks (3B) ...
... Passed -- 25.9 7 65346 1833
PASS
ok github.com/dianchengwangCHN/raft-key-value-store/kvraft 372.602s
```
- Shard Master
- Sharded Key-Value Server
The Shard Master handles configuration updates. Specifically, it assigns the partitions (shards) to the server groups as evenly as possible, and it dynamically rebalances the assignment as the number of server groups changes.
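The rebalancing idea can be sketched as follows. This is a simplified, self-contained version (the `rebalance` function and its signature are illustrative, not the lab's `ShardMaster` API): shards stay with a surviving group up to a fair ceiling, and only the excess moves to the least-loaded groups.

```go
package main

import (
	"fmt"
	"sort"
)

// rebalance assigns nShards shards to the given group IDs as evenly as
// possible, moving as few shards as possible relative to prev (the old
// shard -> GID assignment; GID 0 means unassigned).
func rebalance(prev []int, gids []int, nShards int) []int {
	next := make([]int, nShards)
	if len(gids) == 0 {
		return next // no groups left: all shards unassigned
	}
	sort.Ints(gids) // deterministic order so every replica computes the same config
	ceil := (nShards + len(gids) - 1) / len(gids)

	valid := make(map[int]bool)
	for _, g := range gids {
		valid[g] = true
	}
	counts := make(map[int]int)
	var free []int
	for s, g := range prev {
		if valid[g] && counts[g] < ceil {
			next[s] = g // keep the shard where it is if the group survives
			counts[g]++
		} else {
			free = append(free, s)
		}
	}
	// Hand each freed shard to the currently least-loaded group.
	for _, s := range free {
		best := gids[0]
		for _, g := range gids {
			if counts[g] < counts[best] {
				best = g
			}
		}
		next[s] = best
		counts[best]++
	}
	return next
}

func main() {
	// 10 shards over 3 groups: a 4/3/3 split.
	fmt.Println(rebalance(make([]int, 10), []int{1, 2, 3}, 10))
}
```

Sorting the GIDs matters: map iteration order in Go is randomized, so any order-dependent step must be made deterministic or replicas would diverge on the same command.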
Note: In Lab 4A, when passing commands to Raft, pass them by value rather than by pointer.
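The pitfall can be reproduced with a plain channel standing in for the Raft hand-off (the `Op` struct and channel here are illustrative, not the lab's API): a command passed by pointer can still be mutated by the caller before it is applied, whereas a value is copied at the moment it is handed over.

```go
package main

import "fmt"

type Op struct{ Key, Value string }

func main() {
	ch := make(chan interface{}, 1)

	// WRONG: sending a pointer; the caller can still mutate the command
	// after handing it off, so the applied command may differ.
	p := &Op{Key: "x", Value: "1"}
	ch <- p
	p.Value = "2"                   // later reuse of the same struct
	fmt.Println((<-ch).(*Op).Value) // 2, not the intended 1

	// RIGHT: sending the value; the struct is copied into the interface.
	v := Op{Key: "x", Value: "1"}
	ch <- v
	v.Value = "2"
	fmt.Println((<-ch).(Op).Value) // 1
}
```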
Passed all the test cases of Lab 4A. The test results are shown below:
```
Test: Basic leave/join ...
labgob warning: Decoding into a non-default variable/field ConflictIndex may not work
... Passed
Test: Historical queries ...
... Passed
Test: Move ...
... Passed
Test: Concurrent leave/join ...
... Passed
Test: Minimal transfers after joins ...
... Passed
Test: Minimal transfers after leaves ...
... Passed
Test: Multi-group join/leave ...
... Passed
Test: Concurrent multi leave/join ...
... Passed
Test: Minimal transfers after multijoins ...
... Passed
Test: Minimal transfers after multileaves ...
... Passed
PASS
ok github.com/dianchengwangCHN/raft-key-value-store/shardmaster 1.965s
```