- Multiple clients communicate with a single coordinating server (Master) using a JSON-based message format, sending data over TCP sockets.
- Fault Tolerance: Data storage is durable, i.e., the system does not lose data if a single node fails. Data is replicated with a replication factor of 2 using consistent hashing.
- Atomic Operations (using the Two-Phase Commit protocol): Operations on the store are atomic, i.e., an operation either succeeds completely or fails altogether without any side effects.
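The atomicity guarantee can be sketched as a Two-Phase Commit round driven by the coordinator. The `Replica` interface below is purely illustrative (in the real system each vote and commit would travel as a JSON message over a TCP socket), but the control flow is the standard 2PC pattern: a write is committed on every replica or on none.

```cpp
#include <string>
#include <vector>

// Hypothetical replica handle; in the real system prepare/commit/abort
// would be JSON messages exchanged over TCP sockets.
struct Replica {
    bool healthy;           // can this replica prepare the write?
    std::string staged;     // value staged during phase 1
    std::string committed;  // durable value after phase 2

    bool prepare(const std::string& v) {    // phase 1: cast a vote
        if (!healthy) return false;         // vote ABORT
        staged = v;
        return true;                        // vote COMMIT
    }
    void commit() { committed = staged; }   // phase 2: make durable
    void abort()  { staged.clear(); }       // phase 2: roll back
};

// Coordinator side of 2PC: commit only if every replica votes COMMIT;
// a single ABORT vote rolls the write back everywhere.
bool two_phase_put(std::vector<Replica>& replicas, const std::string& value) {
    for (auto& r : replicas)                       // phase 1: collect votes
        if (!r.prepare(value)) {
            for (auto& x : replicas) x.abort();    // any NO vote aborts all
            return false;
        }
    for (auto& r : replicas) r.commit();           // phase 2: all voted yes
    return true;
}
```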
- Piazza Clients
They register or log in to the system and access the data using the following operations:
- GET <key>
- PUT <key> <value>
- DELETE <key>
- UPDATE <key> <new_value>
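Since the README does not spell out the wire schema, the field names below (`op`, `key`, `value`) are assumptions; a client request might be serialized along these lines before being written to the TCP socket.

```cpp
#include <string>

// Build a minimal JSON request for the four client operations.
// The field names ("op", "key", "value") are illustrative; the
// project's real message schema may differ.
std::string make_request(const std::string& op,
                         const std::string& key,
                         const std::string& value = "") {
    std::string msg = "{\"op\":\"" + op + "\",\"key\":\"" + key + "\"";
    if (!value.empty())
        msg += ",\"value\":\"" + value + "\"";  // PUT / UPDATE carry a value
    return msg + "}";
}
```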
- Coordination Server
- Acts as an intermediary between the clients and the slave servers.
- Whenever a new slave server registers itself, the Coordination Server hashes the slave's ip:port. Each slave server is thus hashed and placed on a ring structure.
- Whenever a client sends a request, the key is hashed and placed in the 'own' table of the nearest slave server clockwise on the ring, and a second copy is placed in the 'prev' table of that server's successor.
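The ring lookup can be sketched as follows. The hash function here is a stand-in (the README does not specify how ip:port strings and keys are actually hashed): the primary for a key is the first server whose ID is at or clockwise past the key's hash, wrapping around the ring, and the replica goes to that server's successor.

```cpp
#include <cstddef>
#include <functional>
#include <iterator>
#include <map>
#include <string>
#include <utility>

using Hash = std::size_t;
using Ring = std::map<Hash, std::string>;  // hashed server ID -> "ip:port"

Hash hash_of(const std::string& s) {       // placeholder hash; the real
    return std::hash<std::string>{}(s);    // project may use a different one
}

// First server clockwise from h (wrapping past the largest ID back to
// the smallest) holds the first copy of the key.
Ring::const_iterator successor(const Ring& ring, Hash h) {
    auto it = ring.lower_bound(h);
    return it == ring.end() ? ring.begin() : it;
}

// Primary stores the key in its 'own' table; the primary's successor
// keeps the second copy in its 'prev' table (replication factor 2).
std::pair<std::string, std::string> placement(const Ring& ring,
                                              const std::string& key) {
    auto primary = successor(ring, hash_of(key));
    auto replica = std::next(primary);
    if (replica == ring.end()) replica = ring.begin();  // wrap around
    return {primary->second, replica->second};
}
```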
- WRITE-THROUGH CACHE: The coordinator maintains a write-through LRU cache and uses it to serve requests without contacting the (slave) key-value servers it coordinates. The slave key-value servers are contacted only upon a cache miss at the coordinator.
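A minimal sketch of such a cache, with the slave servers modelled by two callbacks (an assumption for illustration, not the project's actual interface): every write goes to both the backing store and the cache, and reads fall through to the store only on a miss.

```cpp
#include <functional>
#include <list>
#include <string>
#include <unordered_map>

// Write-through LRU cache sketch for the coordinator. Writes go to the
// backing store (the slave servers, modelled here by callbacks) AND the
// cache; reads contact the slaves only on a cache miss.
class WriteThroughLRU {
    using Entry = std::pair<std::string, std::string>;  // (key, value)
    std::size_t cap;
    std::list<Entry> order;                             // MRU at the front
    std::unordered_map<std::string, std::list<Entry>::iterator> index;
    std::function<void(const std::string&, const std::string&)> store_put;
    std::function<std::string(const std::string&)> store_get;

    void insert(const std::string& k, const std::string& v) {
        auto it = index.find(k);
        if (it != index.end()) order.erase(it->second);
        order.emplace_front(k, v);
        index[k] = order.begin();
        if (order.size() > cap) {           // evict the least recently used
            index.erase(order.back().first);
            order.pop_back();
        }
    }

public:
    WriteThroughLRU(std::size_t capacity,
                    std::function<void(const std::string&, const std::string&)> put_fn,
                    std::function<std::string(const std::string&)> get_fn)
        : cap(capacity), store_put(std::move(put_fn)), store_get(std::move(get_fn)) {}

    void put(const std::string& k, const std::string& v) {
        store_put(k, v);   // write-through: the slaves see every write
        insert(k, v);
    }

    std::string get(const std::string& k) {
        auto it = index.find(k);
        if (it != index.end()) {
            order.splice(order.begin(), order, it->second);  // refresh recency
            return it->second->second;
        }
        std::string v = store_get(k);  // cache miss: ask the slave servers
        insert(k, v);
        return v;
    }
};
```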
- Slave Servers
Stores the actual data, i.e., the key-value pairs. Each slave server consists of two tables:
- OWN TABLE: stores the first copies of keys whose hash values lie above the ID of its immediate predecessor, up to and including its own ID.
- PREV TABLE: stores the keys whose first copies are stored in its predecessor.
Case 1: When a new server registers:
- The predecessor, successor, and successor of the successor are found using the hashed ID of the newly registered slave server.
- The successor updates its 'own' table by removing the keys hashed between the predecessor and the newly registered server.
- These removed keys are placed in the 'prev' table of the successor.
- The removed keys are also placed in the 'own' table of the newly registered slave server.
- The newly registered slave server fills its 'prev' table with the 'own' table of its predecessor.
- The successor of the successor replaces its 'prev' table with the 'own' table of the successor.
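The steps above can be sketched with integer hash IDs and in-memory maps standing in for the real tables (an illustration under those assumptions, not the project's code):

```cpp
#include <map>
#include <string>

using Table = std::map<int, std::string>;  // hashed key -> value

struct Node {
    int id;      // hashed ip:port
    Table own;   // first copies
    Table prev;  // replicas of the predecessor's 'own' table
};

// True if h lies clockwise in the half-open ring interval (lo, hi].
bool in_range(int h, int lo, int hi) {
    return lo < hi ? (h > lo && h <= hi) : (h > lo || h <= hi);
}

// Case 1: node n joins between pred and succ (ss = succ's successor).
void handle_join(Node& n, const Node& pred, Node& succ, Node& ss) {
    Table moved;
    for (auto it = succ.own.begin(); it != succ.own.end(); ) {
        if (in_range(it->first, pred.id, n.id)) {  // keys now owned by n
            moved.insert(*it);
            it = succ.own.erase(it);
        } else {
            ++it;
        }
    }
    n.own = moved;      // new node takes the first copies
    succ.prev = moved;  // successor keeps them as replicas
    n.prev = pred.own;  // replicate the predecessor's first copies
    ss.prev = succ.own; // succ-of-succ re-replicates succ's shrunk table
}
```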
Case 2: When a slave server dies:
- The successor, predecessor, and successor of the successor are found using the hashed ID of the dead slave.
- The successor (also known as the leader) of the dead slave copies the contents of its 'prev' table into its 'own' table: all keys lying between the predecessor and the dead slave now belong to the successor of the dead slave, and their replicas were already present in that 'prev' table.
- The successor then sends the contents of its updated 'own' table to its own successor, i.e., the successor of the successor of the dead slave server, so that the replicas there can be rebuilt.
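The failure-handling step reduces to a promotion plus re-replication, sketched below with in-memory maps standing in for the real tables. Refreshing the successor's own 'prev' table from its new (live) predecessor is omitted here, since the README does not describe that step.

```cpp
#include <map>
#include <string>

using Table = std::map<int, std::string>;  // hashed key -> value

struct Node {
    Table own;   // first copies
    Table prev;  // replicas of the predecessor's first copies
};

// Case 2: the slave just before `succ` on the ring has died. succ's
// 'prev' table already holds replicas of the dead node's first copies,
// so succ promotes them into its 'own' table and re-replicates its new
// 'own' table to the successor of the successor.
void handle_failure(Node& succ, Node& succ_of_succ) {
    succ.own.insert(succ.prev.begin(), succ.prev.end());  // promote replicas
    succ_of_succ.prev = succ.own;  // rebuild the replica set downstream
}
```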
A map is maintained at the CS for each slave server (ip:port), and the corresponding count is incremented on receipt of each alive message.
A dedicated heartbeat thread on the slave server side opens a UDP connection to the CS and sends a heartbeat message every 5 seconds. A heartbeat listener thread on the CS side receives the heartbeat messages from the different slave servers.
A timer thread at the Coordination Server wakes up every 30 seconds and iterates over the map. If a count of 0 is discovered, the CS knows that the corresponding slave server is down, and data migration is initiated.
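Stripped of the threading and UDP details, the detector is just a counter map and a periodic sweep. One assumption made explicit below: counters are reset to 0 after each sweep so the next 30-second interval starts fresh (the README does not spell this out).

```cpp
#include <map>
#include <string>
#include <vector>

// Heartbeat bookkeeping at the Coordination Server: one counter per
// slave ("ip:port"), incremented each time an alive message arrives
// over UDP (threading and socket code omitted).
using HeartbeatMap = std::map<std::string, int>;

void on_alive(HeartbeatMap& m, const std::string& slave) {
    ++m[slave];
}

// Timer sweep (every 30 s in the real system): any slave whose counter
// is still 0 is declared dead and returned so that data migration can
// start. Resetting the counters afterwards is an assumption.
std::vector<std::string> sweep(HeartbeatMap& m) {
    std::vector<std::string> dead;
    for (auto& [slave, count] : m) {
        if (count == 0) dead.push_back(slave);
        count = 0;  // start the next interval fresh
    }
    return dead;
}
```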
- Compile Coordination_Server.cpp (link with -lpthread for pthread support) and run it as:
  ./coord <ip> <port>
- Register slave servers at the CS by running slaveServer.cpp as:
  ./slave <ip> <port>
- Multiple clients can connect to the Coordination Server by running client.cpp as:
  ./client <ip> <port>