# Implementing a Key-Value Database

In this project, we'll implement a key-value database that is flexible and easy for other developers to use in their own projects. We'll build on a B-tree class and turn it into a fully functional key-value store.

A key-value database operates similarly to a Python dictionary, but it also lets users perform range queries. Our database is similar in spirit to the key-based data model behind open-source stores such as Redis, CouchDB, MongoDB, and Cassandra.
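
To make the idea of a range query concrete, here is a minimal sketch that keeps keys sorted in a plain list and uses `bisect` to find every key between two bounds. The sample data and the `range_query` helper are hypothetical names used only for illustration; they are not part of the classes built in this project.

```python
import bisect

# Hypothetical sample data: sorted keys with their values in a parallel list.
keys = [1, 3, 5, 7, 9]
values = ["a", "b", "c", "d", "e"]

def range_query(keys, values, low, high):
    """Return the (key, value) pairs with low <= key <= high."""
    start = bisect.bisect_left(keys, low)    # first index whose key is >= low
    end = bisect.bisect_right(keys, high)    # first index whose key is > high
    return list(zip(keys[start:end], values[start:end]))

print(range_query(keys, values, 3, 7))  # [(3, 'b'), (5, 'c'), (7, 'd')]
```

A plain dictionary has no notion of key order, so the same query would need a scan over every key. Keeping keys sorted, as a B-tree does, is what makes this kind of lookup cheap.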

## Implementing a B-Tree

Before we create our key-value database, we'll need to add in a B-tree class implementation.

### Node Class

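To implement a B-tree from scratch, we'll need to create a `Node` class that uses three separate lists to represent the keys, the values, and the children. Each node also keeps a reference to its parent so that a split can push its middle entry upward.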

We'll also implement an `is_leaf` method so we know whether the node is a leaf (a node with no children). We'll also add a `__len__()` method so that we can track the number of entries contained within a B-tree node.

In [1]:
import bisect

class Node:

    def __init__(self, keys=None, values=None, children=None, parent=None):
        self.keys = keys or []
        self.values = values or []
        self.parent = parent
        self.set_children(children)

    def set_children(self, children):
        self.children = children or []
        for child in self.children:
            child.parent = self

    def is_leaf(self):
        return len(self.children) == 0

    def contains_key(self, key):
        return key in self.keys
    
    def get_value(self, key):
        for i, k in enumerate(self.keys):
            if k == key:
                return self.values[i]
        return None

    def get_insert_index(self, key):
        return bisect.bisect(self.keys, key)

    def insert_entry(self, key, value):
        insert_index = self.get_insert_index(key)
        self.keys.insert(insert_index, key)
        self.values.insert(insert_index, value)
        return insert_index

    def split(self):
        if self.parent is None:
            return self.split_no_parent()
        return self.split_with_parent()

    def split_no_parent(self):
        split_index = len(self) // 2
        key_to_move_up = self.keys[split_index]
        value_to_move_up = self.values[split_index]
        # Create right node
        right_node = Node(self.keys[split_index+1:], self.values[split_index+1:], self.children[split_index+1:])
        # Update left node (self)
        self.keys = self.keys[:split_index]
        self.values = self.values[:split_index]
        self.children = self.children[:split_index+1]
        # Create parent
        parent = Node([key_to_move_up], [value_to_move_up], [self, right_node])
        return parent

    def insert_child(self, insert_index, child): 
        self.children.insert(insert_index, child)
        child.parent = self

    def split_with_parent(self): 
        split_index = len(self) // 2
        key_to_move_up = self.keys[split_index]
        value_to_move_up = self.values[split_index]
        # Create right node
        right_node = Node(self.keys[split_index+1:], self.values[split_index+1:], self.children[split_index+1:])
        # Update left node (self)
        self.keys = self.keys[:split_index]
        self.values = self.values[:split_index]
        self.children = self.children[:split_index+1]
        # Add new child to parent
        key_insert_index = self.parent.insert_entry(key_to_move_up, value_to_move_up)
        self.parent.insert_child(key_insert_index + 1, right_node)
        return self.parent

    def __len__(self):
        return len(self.values)
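
The two split methods share the same slicing logic, which is easier to see on plain lists. The sketch below is a simplified illustration, not the `Node` code itself: `split_entries` is a hypothetical helper that only handles keys and values, and it ignores children and parents.

```python
import bisect

def split_entries(keys, values):
    """Split parallel key/value lists around their middle entry."""
    split_index = len(keys) // 2
    middle = (keys[split_index], values[split_index])            # moves up to the parent
    left = (keys[:split_index], values[:split_index])            # stays in the original node
    right = (keys[split_index + 1:], values[split_index + 1:])   # goes to the new right node
    return left, middle, right

left, middle, right = split_entries([10, 20, 30], ["a", "b", "c"])
print(left)    # ([10], ['a'])
print(middle)  # (20, 'b')
print(right)   # ([30], ['c'])

# Picking the child to descend into when the key is not stored in the node.
parent_keys = [20]
print(bisect.bisect(parent_keys, 5))   # 0, the left child
print(bisect.bisect(parent_keys, 25))  # 1, the right child
```

The same `bisect.bisect` call serves both purposes in `Node`: it gives the position at which to insert a new key and the index of the child that covers a key's range.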

### B-Tree Class

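The `BTree` class below wraps the root node and tracks the split threshold, the height, and the number of entries. It supports adding an entry, checking whether a key exists, and looking up the value stored for a key.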

In [2]:
class BTree:

    def __init__(self, split_threshold):
        self.root = Node()
        self.split_threshold = split_threshold 
        self.height = 0
        self.size = 0

    def __len__(self):
        return self.size
    
    def _find_node(self, current_node, key):
        if current_node.contains_key(key):
            return current_node
        if current_node.is_leaf():
            return None
        child_index = current_node.get_insert_index(key) 
        return self._find_node(current_node.children[child_index], key)
    
    def contains(self, key):
        return self._find_node(self.root, key) is not None
    
    def _add(self, current_node, key, value):
        if current_node.is_leaf(): 
            current_node.insert_entry(key, value) 
        else:
            child_index = current_node.get_insert_index(key) 
            self._add(current_node.children[child_index], key, value)
        # Split any node that has grown past the threshold; splitting the root adds a level.
        if len(current_node) > self.split_threshold:
            parent = current_node.split()
            if current_node is self.root:
                self.root = parent
                self.height += 1
                
    def add(self, key, value):
        self._add(self.root, key, value)
        self.size += 1
        
    def get_value(self, key):
        node = self._find_node(self.root, key)
        if node is None:
            return None
        return node.get_value(key)
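
One way to sanity-check a tree like this is to walk it in order and confirm that the keys come out sorted. The sketch below assumes only that each node exposes `keys` and `children` in the same way `Node` does. `StubNode` and `in_order_keys` are hypothetical names introduced here so the snippet runs on its own.

```python
class StubNode:
    """Hypothetical stand-in exposing the same `keys` and `children` attributes as `Node`."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def in_order_keys(node):
    """Yield the keys of the subtree rooted at `node` in ascending order."""
    if not node.children:                 # leaf: its keys are already sorted
        yield from node.keys
        return
    for i, key in enumerate(node.keys):
        yield from in_order_keys(node.children[i])   # keys smaller than `key`
        yield key
    yield from in_order_keys(node.children[-1])      # keys larger than the last key

# A hand-built two-level tree: root [20] with children [10] and [30, 40].
root = StubNode([20], [StubNode([10]), StubNode([30, 40])])
print(list(in_order_keys(root)))  # [10, 20, 30, 40]
```

With the real classes, `list(in_order_keys(tree.root))` should come back sorted after any sequence of `add` calls, which makes it a convenient assertion while experimenting with different split thresholds.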

## Override the Initializer

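Next, we'll declare a new class called `KVStore` that extends the `BTree` class above. Its initializer fixes the split threshold at 2, and its `add` method overwrites the value when the key already exists instead of inserting a second entry with the same key.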

In [3]:
class KVStore(BTree):
    
    def __init__(self):
        super().__init__(split_threshold=2)
        
    def add(self, key, value):
        node = self._find_node(self.root, key)
        if node is None:
            super().add(key, value)
        else:
            for i, node_key in enumerate(node.keys):
                if node_key == key:
                    node.values[i] = value

### Testing KVStore

Let's test the implementation we just created to make sure `KVStore` works properly. We'll use assertions to check that the state of the object is what it's supposed to be.

In [4]:
kvs = KVStore()

# Test the split threshold
assert kvs.split_threshold == 2, "Split threshold should be 2"

# Test the .add() and .get_value() methods
for i in range(10):
    kvs.add(i, i)
    
for i in range(10):
    assert kvs.get_value(i) == i, f"Value for key {i} should be {i}"

# Test that adding an existing key overwrites its value
for i in range(10):
    kvs.add(i, i + 1)
    
for i in range(10):
    assert kvs.get_value(i) == i + 1, f"Value for key {i} should be {i + 1}"

## Implement the Get & Set

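We'll copy our `KVStore` class from above and add the `__setitem__` and `__getitem__` methods so that entries can be written and read with bracket syntax, as with a dictionary. Note that `__getitem__` delegates to `get_value`, so a missing key returns `None` rather than raising `KeyError`.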

In [5]:
class KVStore(BTree):
    
    def __init__(self):
        super().__init__(split_threshold=2)
        
    def add(self, key, value):
        node = self._find_node(self.root, key)
        if node is None:
            super().add(key, value)
        else:
            for i, node_key in enumerate(node.keys):
                if node_key == key:
                    node.values[i] = value
                    
    def __setitem__(self, key, value):
        self.add(key, value)
        
    def __getitem__(self, key):
        return self.get_value(key)

### Testing the Get & Set Methods

Currently in progress...
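
One possible shape for these tests is sketched below; it is a suggestion, not the project's final test suite. `check_item_access` is a hypothetical helper that exercises any object supporting bracket reads and writes, and `DictStore` is a hypothetical dictionary-backed stand-in included only so the snippet runs on its own. In the notebook, the helper would be called with `KVStore()` instead.

```python
class DictStore:
    """Hypothetical stand-in with the same bracket interface as `KVStore`."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = value

    def __getitem__(self, key):
        # Mirror `KVStore.get_value`, which returns None for a missing key.
        return self._data.get(key)

def check_item_access(store):
    """Assert that bracket writes, reads, and overwrites behave as expected."""
    for i in range(10):
        store[i] = i
    for i in range(10):
        assert store[i] == i, f"Value for key {i} should be {i}"

    # Writing to an existing key should replace its value.
    for i in range(10):
        store[i] = i + 1
    for i in range(10):
        assert store[i] == i + 1, f"Value for key {i} should be {i + 1}"

    # A key that was never written should read back as None.
    assert store[100] is None, "Missing key should return None"

check_item_access(DictStore())
```

Because the helper depends only on `__setitem__` and `__getitem__`, the same checks apply unchanged to `KVStore`.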