
huffleraft

Replicated fault-tolerant key-value store driven by hashicorp/raft and dgraph-io/badger

Please read the example to see the system in action, especially the CRUD operations on a 5-node cluster!
For full documentation and more thorough descriptions: godoc

import "github.com/s4ayub/huffleraft"

Features and Purpose:

A RaftStore instance encapsulates a raft node (from hashicorp/raft) for consensus, a storage system (from dgraph-io/badger) for key-value pairs, and an HTTP server that accepts requests and redirects them to the leader node.

The purpose of this package is to explore the raft consensus algorithm, specifically hashicorp's implementation. The package is effective for experimenting with raft because the user never has to build HTTP requests by hand; the package does that in the background through Get, Set, Delete, and Join. This makes it quick and easy to play around with raft in conjunction with other code.

  • Perform CRUD operations on dgraph-io's Badger storage in a distributed manner
  • Fault tolerant as per the raft consensus algorithm
  • Commands can be performed on any node in a cluster and they'll be redirected to the leader of the cluster
  • Logs are truncated using the Snapshot, Restore and Persist mechanism provided by hashicorp/raft (check out fsm.go)

Example logs:

Once a few RaftStore instances are running and a cluster has formed, the logs may look something like this after a Set command has been performed:

2017/09/04 04:02:07 [DEBUG] raft: Node 127.0.0.1:8740 updated peer set (2): [127.0.0.1:8730 127.0.0.1:8750 127.0.0.1:8740]
2017/09/04 04:02:07 [DEBUG] raft-net: 127.0.0.1:8730 accepted connection from: 127.0.0.1:62560
2017/09/04 04:02:07 [DEBUG] raft-net: 127.0.0.1:8740 accepted connection from: 127.0.0.1:62561
[raftStore | 127.0.0.1:8750]2017/09/04 04:02:11 key: yes, value: no, set

Motivation and References:

I wanted to dive deeper into distributed systems by building one of my own. The following repository was helpful in the development of huffleraft:

  • hraftd
    • This is a simple reference use of hashicorp/raft, also built to study the implementation of the consensus algorithm. However, it requires the user to interact with the system as a client, using curl to submit requests to the server. My package builds on top of this by being embeddable inside code, implementing command redirection to the leader node, and using dgraph-io's badger as a faster storage back-end than a native Go map. The people at hashicorp were also very quick to answer any questions I had.

Design Decisions:

  • Leader re-direction:

    • The RaftStore struct encases an HTTP listener to handle requests and a RaftServer to handle consensus. There are two important addresses associated with each raft store: the httpAddr, on which the store listens for HTTP requests, and the raftAddr, which the raft server uses for its transport layer between raft nodes. As per the consensus algorithm, commands such as setting, deleting and joining must be sent to the leader node only. The leader's raftAddr can easily be accessed in a cluster via raftServer.Leader() (raftServer being an instance of Raft from hashicorp/raft), but its associated httpAddr cannot be easily retrieved. Hence, a link must be made between the raftAddr and the httpAddr, since the httpAddr is the one the user sends commands to. I decided that the httpAddr would always be 1 port away from the raftAddr, so the httpAddr of any RaftStore instance can be retrieved by adding 1 to the port of its raftAddr.
  • Using dgraph-io/badger:

    • Badger was proposed as a performant Go alternative to RocksDB, which made it an attractive storage back-end for the key-value store.

Caveats and Suggested Improvements:

  • Two raft servers cannot use side-by-side ports (8866 and 8867, for example), because the port next to the one a raft server is started on is used for HTTP requests. This is done to establish a link between the raftAddr and the httpAddr, as discussed above. This routing system could likely be improved with a custom multiplexer setup where requests are sent to the same port the raft node runs on; rqlite does something like this: https://github.com/rqlite/rqlite/tree/master/tcp
  • More robust error handling
