Raft Consensus Algorithm
******High-level approach******:

We split the machines into different roles. Each role has its own functionality
and reacts differently to the messages it receives.

Leader:
	Can receive:
		1. Client requests
		2. ACKs from followers
	Can send:
		1. Heartbeats to maintain authority
		2. AppendLog RPCs to servers
		3. Response to client

Candidate:
	Can receive:
		1. RequestVote RPCs -> refuse
		2. RequestVote RPC responses -> processCollectVote
		3. Heartbeat -> revert to Follower
	Can send:
		RequestVote RPCs

Follower:
	Can receive:
		1. Heartbeats from leader (reset timeout)
		2. RequestVote RPCs (if haven't voted and candidate term is better or log is longer, vote for them)
		3. Client request -> redirect to leader
		4. AppendLog RPCs
	Can send:
		1. Response to RequestVote
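The role tables above can be sketched as a simple dispatcher. This is a minimal illustration with hypothetical names (`Server`, `handle_follower`, etc.), not the actual message format used by 3700kvstore:

```python
class Server:
    def __init__(self):
        self.role = "follower"   # "follower", "candidate", or "leader"
        self.leader_id = None    # last known leader, used for redirects

    def handle(self, msg):
        # Route an incoming message to the handler for the current role.
        handler = {"follower": self.handle_follower,
                   "candidate": self.handle_candidate,
                   "leader": self.handle_leader}[self.role]
        return handler(msg)

    def handle_follower(self, msg):
        if msg["type"] == "heartbeat":
            self.leader_id = msg["src"]   # remember the leader, reset timeout
            return {"type": "ok"}
        if msg["type"] == "client_request":
            # Followers do not serve clients; point them at the leader.
            return {"type": "redirect", "leader": self.leader_id}

    def handle_candidate(self, msg):
        if msg["type"] == "heartbeat":
            # A legitimate leader exists: step down to follower.
            self.role = "follower"
            self.leader_id = msg["src"]
            return {"type": "ok"}

    def handle_leader(self, msg):
        if msg["type"] == "client_request":
            # In the real system the command is replicated before replying.
            return {"type": "response"}
```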


We also split the project into different phases.
Phase 0. Implemented leader election, and made sure it worked before moving on to
the next part
Phase 1. Implemented log replication
Phase 2. Improved performance by reducing the number of messages sent between servers
Phase 3. Handled network partitions

Beginning:
	1. Every server starts as a follower, and no heartbeats are sent yet
	2. If a follower receives no communication during its election timeout, it starts an election
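The timeout check can be sketched as follows. The timeout range here is illustrative (Raft suggests values well above typical network round-trip time), not the one this project uses:

```python
import random
import time

# Hypothetical bounds; real values depend on network latency.
TIMEOUT_MIN, TIMEOUT_MAX = 0.15, 0.30   # seconds

def new_election_timeout():
    # Randomizing the timeout keeps followers from all timing out at once.
    return random.uniform(TIMEOUT_MIN, TIMEOUT_MAX)

def should_start_election(last_heartbeat, timeout, now=None):
    # A follower that hears nothing for a full timeout becomes a candidate.
    now = time.monotonic() if now is None else now
    return now - last_heartbeat >= timeout
```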

Election:
	1. To begin an election, a timed out follower:
		-Increments its current term
		-Becomes a candidate
		-Votes for itself
		-Sends RequestVote RPCs
	2. A candidate continues in the election until:
		a. It wins the election
			-Candidate received votes from the majority of servers
			-It becomes leader and sends heartbeats to all of the other servers to establish
			its authority and prevent new elections
		b. Another server establishes itself as leader
			-Candidate receives heartbeat from claimed leader
			-If the claimed leader's term is at least as large as the current term, the leader
			is accepted as legitimate
			-If the claimed leader's term is smaller, it rejects the RPC and continues in
			candidate state
		c. A period of time goes by with no winner
			-Each candidate times out and starts a new election by incrementing its term
			-Without randomized timeouts, split votes can remain indefinitely
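The three election outcomes above can be sketched like this. The state dictionary and function names are illustrative, not the project's actual data structures:

```python
def start_election(state):
    state["term"] += 1                 # increment current term
    state["role"] = "candidate"
    state["votes"] = {state["id"]}     # vote for itself
    # ...then send RequestVote RPCs to every peer

def on_vote(state, voter, cluster_size):
    # Outcome (a): becoming leader on a majority of votes.
    state["votes"].add(voter)
    if len(state["votes"]) > cluster_size // 2:
        state["role"] = "leader"       # start sending heartbeats

def on_heartbeat(state, leader_term):
    # Outcome (b): accept a leader whose term is at least our own.
    if leader_term >= state["term"]:
        state["role"] = "follower"
        state["term"] = leader_term
        return True
    return False   # stale leader: reject the RPC, stay candidate
```

Outcome (c) is just `start_election` firing again after another randomized timeout.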

Log Replication:
	1. Leader receives client command
	2. Leader appends the command to its log as a new entry and then issues AppendEntries
	    RPCs in parallel to each of the other servers
	3. When the entry has been safely replicated (process described below), the leader will
	    apply the entry to its state machine and return the result to the client
	4. If followers crash or packets are lost, the leader retries AppendEntries RPCs
	    indefinitely until all followers eventually store all log entries
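Steps 1-4 can be sketched for a single command, with the cluster reduced to in-memory lists and no real networking or retries (names are illustrative):

```python
def client_command(leader_log, follower_logs, term, command):
    entry = {"term": term, "command": command}
    leader_log.append(entry)              # step 2: append to the leader's log
    acks = 1                              # the leader counts itself
    for log in follower_logs:             # step 2: AppendEntries in parallel
        log.append(entry)                 # (assume every follower succeeds)
        acks += 1
    majority = (1 + len(follower_logs)) // 2 + 1
    committed = acks >= majority          # step 3: safely replicated?
    return committed                      # if True: apply and reply to client
```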

Log Organization:
	1. Each log entry stores a state machine command along with the term number when the entry was received by the leader
	2. The term numbers in log entries are used to detect inconsistencies between logs
	3. A log entry is committed once the leader that created the entry has replicated it on a majority of servers
	4. This also commits all preceding entries in the leader's log, including entries created by preceding leaders
	5. Leader keeps track of the highest index it knows to be committed, and *includes
	    that index in future AppendEntries RPCs (including heartbeats)* so that other servers eventually find out
	6. Once a follower learns that a log entry is committed, it applies the entry to its local state machine (in log order)
		-Consistency check: When sending an AppendEntries RPC,
		    the leader includes the index and term of the entry in its log
		    that immediately precedes the new entries. If the follower does not find
		    an entry in its log with the same index and term, it refuses the new entries
	7. To bring a follower's log into consistency with its own:
		-The leader must find the latest log entry where the two logs agree
		-Delete any entries in the follower's log after that point
		-Send the follower all of the leader's entries after that point
		-The leader maintains a nextIndex for each follower, which is the index of the
		    next log entry the leader will send to that follower. When a leader first comes
		    to power, it initializes all the nextIndex values to the index just after the last one in its log
		-If a follower's log is inconsistent with the leader's, the AppendEntries
		    consistency check will fail and the leader will decrement nextIndex and retry.
		    Eventually nextIndex will reach a point where the leader and follower logs match
	8. The RequestVote RPC includes information about the candidate's log,
	    and the voter denies its vote if its own log is more up-to-date than that of the candidate
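The consistency check (step 6) and nextIndex backtracking (step 7) can be sketched together. Logs are lists of `{"term": t}` entries and the function names are illustrative:

```python
def append_entries(follower_log, prev_index, prev_term, entries):
    # Follower side: refuse unless the entry just before the new
    # ones matches the leader's (same index and term).
    if prev_index >= 0:
        if prev_index >= len(follower_log) or \
           follower_log[prev_index]["term"] != prev_term:
            return False
    # Delete conflicting entries after the match point, then append.
    del follower_log[prev_index + 1:]
    follower_log.extend(entries)
    return True

def replicate(leader_log, follower_log):
    # Leader side: on failure, decrement nextIndex and retry until
    # the logs agree, then send everything after that point.
    next_index = len(leader_log)   # just past the leader's last entry
    while next_index > 0:
        prev = next_index - 1
        prev_term = leader_log[prev]["term"]
        if append_entries(follower_log, prev, prev_term,
                          leader_log[next_index:]):
            return
        next_index -= 1
    # next_index == 0: no common entry, resend the whole log.
    append_entries(follower_log, -1, -1, leader_log)
```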

******Challenges faced******:
One of the challenges we faced was clearly identifying errors in the code.
Debugging distributed systems is hard. With several machines running at once, it is
extremely difficult to work out why a test failed, in what scenario it failed, how to
recreate that situation, and how to modify the code so that it works properly.
We followed the Raft paper closely and implemented everything in it.


******An overview of how we tested our code******:
We isolated the code into separate parts and created isolated environments that we
could fully understand. We tested each chunk of code in its own environment and used
print statements to make sure the results matched what we expected.
We also wrote smaller programs that simulate the code, and tested those simulations
carefully.

In addition, we used targeted print statements to dump the state of each machine,
to make sure that the machines were behaving exactly the way we wanted.