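# B-Tree
- A B-Tree is a self-balancing m-way search tree: each node holds several sorted keys and can have more than two children. It is designed to reduce the number of node reads per operation, especially on disk-based storage systems.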
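
As a rough illustration of what "m-way" means in practice, the sketch below lists how many keys and children a node may hold. It assumes the minimum-degree convention (parameter `t`) that the Python code later in this document uses; `node_bounds` is a hypothetical helper name, not part of that code.

```python
def node_bounds(t):
    """Key and child limits for a B-Tree node of minimum degree t (assumed convention)."""
    if t < 2:
        raise ValueError("minimum degree t must be at least 2")
    return {
        "min_keys": t - 1,       # every node except the root
        "max_keys": 2 * t - 1,   # a node with this many keys is "full"
        "min_children": t,       # internal nodes except the root
        "max_children": 2 * t,
    }


print(node_bounds(3))
```

With `t = 3`, a non-root node holds between 2 and 5 keys, while the root of a non-empty tree may hold as few as 1.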


## Applications of B-Trees
- Large databases use B-Trees to index data stored on disk.
- Searching a data set takes significantly less time with a B-Tree, because each node read narrows the search to one of many subtrees.
  - An index can itself be indexed, which gives multilevel indexing.
- Many servers, database servers in particular, use B-Trees or close variants for their indexes.
- Also used in
  - CAD systems to organize and search geometric data.
  - Other areas, e.g. NLP, computer networks, and cryptography.

## Advantages of B-Trees
- Guaranteed O(log n) time complexity for the basic operations (insertion, deletion, and search),
  - which makes them suitable for large data sets and real-time applications.
- Self-balancing: all leaves stay at the same depth.
- Support high concurrency and high throughput.
- Efficient storage utilization: every node except the root is kept at least roughly half full.
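
The O(log n) guarantee follows from a bound on the height of the tree. A short derivation, assuming the minimum-degree convention used by the code in this document (every non-root node has at least $t - 1$ keys, with $t \ge 2$) and a tree with $n \ge 1$ keys and height $h$ (root at depth 0):

$$
n \;\ge\; 1 + (t-1)\sum_{i=1}^{h} 2\,t^{\,i-1}
\;=\; 1 + 2(t-1)\,\frac{t^{h}-1}{t-1}
\;=\; 2t^{h} - 1
\quad\Longrightarrow\quad
h \;\le\; \log_t \frac{n+1}{2}
$$

The root contributes at least one key, and each depth $i \ge 1$ contains at least $2t^{\,i-1}$ nodes with at least $t - 1$ keys each.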

## Disadvantages of B-Trees
- B-Trees are designed around disk-based storage and can have high disk usage.
- They are not the best choice in every case.
- For small datasets, a search in a B-Tree can be slower than in a binary search tree, because each node may hold many keys that have to be scanned.
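
The small-dataset point can be made concrete with a back-of-the-envelope comparison. The sketch below assumes the standard worst-case height bound for a B-Tree of minimum degree `t` (the largest `h` with `2 * t**h - 1 <= n`) and a perfectly balanced binary search tree; both function names are hypothetical helpers.

```python
def btree_max_height(n, t):
    """Worst-case height (root at height 0) of a B-Tree with n keys and minimum degree t."""
    if n < 1 or t < 2:
        raise ValueError("need n >= 1 and t >= 2")
    h = 0
    # A B-Tree of height h holds at least 2 * t**h - 1 keys.
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


def balanced_bst_height(n):
    """Height of a perfectly balanced binary search tree with n keys: floor(log2(n))."""
    if n < 1:
        raise ValueError("need n >= 1")
    return n.bit_length() - 1


for n in (100, 1_000_000):
    print(n, btree_max_height(n, t=100), balanced_bst_height(n))
```

With `t = 100`, one million keys sit at most 3 node reads deep (height 2), against up to 20 for the binary tree, which is where the B-Tree wins on disk. With only 100 keys everything fits in a single node, so a linear scan over the node's keys may need up to 100 comparisons, while the balanced binary tree needs at most 7.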


| Sr. No. | Operation | Time Complexity | Description |
| :-- | :-- | :-- | :-- |
| 1 | Search | $O(\log n)$ | Searches for a key in the B-Tree by comparing the target with the keys of the current node and descending into the matching child. |
| 2 | Insert | $O(\log n)$ | Inserts a new key into the B-Tree by finding the appropriate leaf, splitting full nodes on the way to maintain the B-Tree structure. |
| 3 | Delete | $O(\log n)$ | Removes a key from the B-Tree by locating its node and reorganizing the tree (borrowing from or merging with siblings) as needed. |
| 4 | Traverse | $O(n)$ | Visits every key of the B-Tree, for example in sorted (in-order) order. |

In [0]:
# To insert a key k:
# 1. Start from the root and traverse down the tree until we reach the appropriate leaf node.
# 2. Insert k into that leaf, keeping its keys sorted.
# Note:
# Unlike Binary Search Trees (BSTs), nodes in a B-Tree have a predefined range for the number of keys they can hold.
# Before descending into a node, we make sure it has room for one more key.
# If the node is full, an operation called splitChild() is performed to create space by splitting the node.

# procedure B-Tree-Insert(Node x, Key k)    (x is assumed to be non-full)
#    find i such that x.keys[i] > k or i >= numkeys(x)
#    if x is a leaf then
#       insert k into x.keys at i
#    else
#       if x.child[i] is full then
#          split x.child[i]
#          if k > x.keys[i] then
#             i = i + 1
#          end if
#       end if
#       B-Tree-Insert(x.child[i], k)
#    end if
# end procedure

# Inserting a key on a B-tree in Python

class BTreeNode:
    def __init__(self, t, leaf):
        self.keys = [None] * (2 * t - 1) # An array of keys
        self.t = t # Minimum degree (defines the range for number of keys)
        self.C = [None] * (2 * t) # An array of child pointers
        self.n = 0 # Current number of keys
        self.leaf = leaf # Is true when node is leaf. Otherwise false

    # A utility function to insert a new key in the subtree rooted with
    # this node. The assumption is, the node must be non-full when this
    # function is called
    def insertNonFull(self, k):
        i = self.n - 1
        if self.leaf:
            while i >= 0 and self.keys[i] > k:
                self.keys[i + 1] = self.keys[i]
                i -= 1
            self.keys[i + 1] = k
            self.n += 1
        else:
            while i >= 0 and self.keys[i] > k:
                i -= 1
            if self.C[i + 1].n == 2 * self.t - 1:
                self.splitChild(i + 1, self.C[i + 1])
                if self.keys[i + 1] < k:
                    i += 1
            self.C[i + 1].insertNonFull(k)

    # A utility function to split the child y of this node. i is index of y in
    # child array C[].  The Child y must be full when this function is called
    def splitChild(self, i, y):
        z = BTreeNode(y.t, y.leaf)
        z.n = self.t - 1
        for j in range(self.t - 1):
            z.keys[j] = y.keys[j + self.t]
        if not y.leaf:
            for j in range(self.t):
                z.C[j] = y.C[j + self.t]
        y.n = self.t - 1
        for j in range(self.n, i, -1):
            self.C[j + 1] = self.C[j]
        self.C[i + 1] = z
        for j in range(self.n - 1, i - 1, -1):
            self.keys[j + 1] = self.keys[j]
        self.keys[i] = y.keys[self.t - 1]
        self.n += 1

    # A function to traverse all nodes in a subtree rooted with this node
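    def traverse(self):
        for i in range(self.n):
            # Visit the subtree to the left of keys[i], then keys[i] itself
            if not self.leaf:
                self.C[i].traverse()
            print(self.keys[i], end=' ')
        # Finally visit the rightmost subtree; index it by n rather than by
        # the loop variable, which is undefined when the node has no keys
        if not self.leaf:
            self.C[self.n].traverse()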

            
    # A function to search a key in the subtree rooted with this node.
    def search(self, k):
        i = 0
        while i < self.n and k > self.keys[i]:
            i += 1
        if i < self.n and k == self.keys[i]:
            return self
        if self.leaf:
            return None
        return self.C[i].search(k)

# A BTree
class BTree:
    def __init__(self, t):
        self.root = None # Pointer to root node
        self.t = t  # Minimum degree

    # function to traverse the tree
    def traverse(self):
        if self.root is not None:
            self.root.traverse()

    # function to search a key in this tree
    def search(self, k):
        return None if self.root is None else self.root.search(k)

    # The main function that inserts a new key in this B-Tree
    def insert(self, k):
        if self.root is None:
            self.root = BTreeNode(self.t, True)
            self.root.keys[0] = k # Insert key
            self.root.n = 1
        else:
            if self.root.n == 2 * self.t - 1:
                s = BTreeNode(self.t, False)
                s.C[0] = self.root
                s.splitChild(0, self.root)
                i = 0
                if s.keys[0] < k:
                    i += 1
                s.C[i].insertNonFull(k)
                self.root = s
            else:
                self.root.insertNonFull(k)
                
# Driver program to test above functions
if __name__ == '__main__':
    t = BTree(3) # A B-Tree with minimum degree 3
    t.insert(10)
    t.insert(20)
    t.insert(5)
    t.insert(6)
    t.insert(12)
    t.insert(30)
    t.insert(7)
    t.insert(17)

    print("Traversal of the constructed tree is ", end = ' ')
    t.traverse()
    print()

    k = 6
    if t.search(k) is not None:
        print("Present")
    else:
        print("Not Present")

    k = 15
    if t.search(k) is not None:
        print("Present")
    else:
        print("Not Present")
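
The effect of `splitChild` on the keys can be shown in isolation. The sketch below works on plain Python lists rather than `BTreeNode` objects, and `split_full_keys` is a hypothetical helper name; it only illustrates how the `2t - 1` keys of a full node are divided.

```python
def split_full_keys(keys, t):
    """Split the sorted keys of a full node (2t - 1 keys) into left keys, median, right keys."""
    if len(keys) != 2 * t - 1:
        raise ValueError("a full node must hold exactly 2t - 1 keys")
    left = keys[:t - 1]     # stays in the original child (t - 1 keys)
    median = keys[t - 1]    # moves up into the parent
    right = keys[t:]        # goes to the new sibling (t - 1 keys)
    return left, median, right


left, median, right = split_full_keys([2, 5, 7, 10, 15], t=3)
print(left, median, right)  # [2, 5] 7 [10, 15]
```

Both halves end up with exactly `t - 1` keys, the minimum a non-root node may hold, so each has room for the key that triggered the split.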

![](/Workspace/Users/jif170122@gmail.com/python_algorithm/pictures/b-tree.png)

In [0]:
class BTreeNode:
    
    def __init__(self, t, leaf):
        # Constructor for BTreeNode
        self.t = t  # Minimum degree (defines the range for the number of keys)
        self.leaf = leaf  # Is true when the node is leaf, otherwise false
        self.keys = [0] * (2 * t - 1)  # An array of keys
        self.C = [None] * (2 * t)  # An array of child pointers
        self.n = 0  # Current number of keys

    def find_key(self, k):
        # A utility function to find the index of the first key greater than or equal to k
        idx = 0
        while idx < self.n and self.keys[idx] < k:
            idx += 1
        return idx

    def remove(self, k):
        # A function to remove key k from the sub-tree rooted with this node
        idx = self.find_key(k)

        if idx < self.n and self.keys[idx] == k:
            if self.leaf:
                self.remove_from_leaf(idx)
            else:
                self.remove_from_non_leaf(idx)
        else:
            if self.leaf:
                print(f"The key {k} does not exist in the tree")
                return

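            # Remember whether k would be in the subtree of the last child
            flag = idx == self.n
            # The child we descend into must hold at least t keys
            if self.C[idx].n < self.t:
                self.fill(idx)

            # If the last child was merged into its left sibling, self.n has
            # shrunk and the key now lives in child idx - 1
            if flag and idx > self.n:
                self.C[idx - 1].remove(k)
            else:
                self.C[idx].remove(k)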

    def remove_from_leaf(self, idx):
        # A function to remove the idx-th key from this node, which is a leaf node
        for i in range(idx + 1, self.n):
            self.keys[i - 1] = self.keys[i]
        self.n -= 1

    def remove_from_non_leaf(self, idx):
        # A function to remove the idx-th key from this node, which is a non-leaf node
        k = self.keys[idx]

        if self.C[idx].n >= self.t:
            pred = self.get_pred(idx)
            self.keys[idx] = pred
            self.C[idx].remove(pred)
        elif self.C[idx + 1].n >= self.t:
            succ = self.get_succ(idx)
            self.keys[idx] = succ
            self.C[idx + 1].remove(succ)
        else:
            self.merge(idx)
            self.C[idx].remove(k)

    def get_pred(self, idx):
        # A function to get the predecessor of the key at the idx-th position in the node
        cur = self.C[idx]
        while not cur.leaf:
            cur = cur.C[cur.n]

        return cur.keys[cur.n - 1]

    def get_succ(self, idx):
        # A function to get the successor of the key at the idx-th position in the node
        cur = self.C[idx + 1]
        while not cur.leaf:
            cur = cur.C[0]

        return cur.keys[0]

    def fill(self, idx):
        # A function to fill child C[idx], which has fewer than t keys (exactly t - 1),
        # by borrowing a key from a sibling or merging with one
        if idx != 0 and self.C[idx - 1].n >= self.t:
            self.borrow_from_prev(idx)
        elif idx != self.n and self.C[idx + 1].n >= self.t:
            self.borrow_from_next(idx)
        else:
            if idx != self.n:
                self.merge(idx)
            else:
                self.merge(idx - 1)

    def borrow_from_prev(self, idx):
        # A function to borrow a key from C[idx-1] and insert it into C[idx]
        child, sibling = self.C[idx], self.C[idx - 1]

        for i in range(child.n - 1, -1, -1):
            child.keys[i + 1] = child.keys[i]

        if not child.leaf:
            for i in range(child.n, -1, -1):
                child.C[i + 1] = child.C[i]

        child.keys[0] = self.keys[idx - 1]

        if not child.leaf:
            child.C[0] = sibling.C[sibling.n]

        self.keys[idx - 1] = sibling.keys[sibling.n - 1]

        child.n += 1
        sibling.n -= 1

    def borrow_from_next(self, idx):
        # A function to borrow a key from C[idx+1] and place it in C[idx]
        child, sibling = self.C[idx], self.C[idx + 1]

        child.keys[child.n] = self.keys[idx]

        if not child.leaf:
            child.C[child.n + 1] = sibling.C[0]

        self.keys[idx] = sibling.keys[0]

        for i in range(1, sibling.n):
            sibling.keys[i - 1] = sibling.keys[i]

        if not sibling.leaf:
            for i in range(1, sibling.n + 1):
                sibling.C[i - 1] = sibling.C[i]

        child.n += 1
        sibling.n -= 1

    def merge(self, idx):
        # A function to merge C[idx] with C[idx+1]
        child, sibling = self.C[idx], self.C[idx + 1]

        child.keys[self.t - 1] = self.keys[idx]

        for i in range(sibling.n):
            child.keys[i + self.t] = sibling.keys[i]

        if not child.leaf:
            for i in range(sibling.n + 1):
                child.C[i + self.t] = sibling.C[i]

        for i in range(idx + 1, self.n):
            self.keys[i - 1] = self.keys[i]

        for i in range(idx + 2, self.n + 1):
            self.C[i - 1] = self.C[i]

        child.n += sibling.n + 1
        self.n -= 1

    def insert_non_full(self, k):
        # A utility function to insert a new key in this node
        # The assumption is that the node must be non-full when this function is called
        i = self.n - 1

        if self.leaf:
            while i >= 0 and self.keys[i] > k:
                self.keys[i + 1] = self.keys[i]
                i -= 1

            self.keys[i + 1] = k
            self.n += 1
        else:
            while i >= 0 and self.keys[i] > k:
                i -= 1

            i += 1
            if self.C[i].n == (2 * self.t - 1):
                self.split_child(i, self.C[i])

                if self.keys[i] < k:
                    i += 1

            self.C[i].insert_non_full(k)

    def split_child(self, i, y):
        # A utility function to split the child y of this node
        # i is the index of y in the child array C[]
        z = BTreeNode(y.t, y.leaf)
        z.n = self.t - 1

        for j in range(self.t - 1):
            z.keys[j] = y.keys[j + self.t]

        if not y.leaf:
            for j in range(self.t):
                z.C[j] = y.C[j + self.t]

        y.n = self.t - 1

        for j in range(self.n, i, -1):
            self.C[j + 1] = self.C[j]

        self.C[i + 1] = z

        for j in range(self.n - 1, i - 1, -1):
            self.keys[j + 1] = self.keys[j]

        self.keys[i] = y.keys[self.t - 1]
        self.n += 1

    def traverse(self):
        # A function to traverse all nodes in a subtree rooted with this node
        i = 0
        while i < self.n:
            if not self.leaf:
                self.C[i].traverse()
            print(self.keys[i], end=" ")
            i += 1

        if not self.leaf:
            self.C[i].traverse()


class BTree:
    def __init__(self, t):
        # Constructor for BTree
        self.root = None  # Pointer to the root node
        self.t = t  # Minimum degree

    def traverse(self):
        # A function to traverse the B-tree
        if self.root:
            self.root.traverse()
            print()

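    def search(self, k):
        # A function to search for a key in the B-tree.
        # Returns the node containing k, or None if k is absent.
        # Written iteratively because this BTreeNode class defines no search method.
        node = self.root
        while node:
            i = node.find_key(k)
            if i < node.n and node.keys[i] == k:
                return node
            if node.leaf:
                return None
            node = node.C[i]
        return None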

    def insert(self, k):
        # The main function that inserts a new key in the B-tree
        if not self.root:
            self.root = BTreeNode(self.t, True)
            self.root.keys[0] = k
            self.root.n = 1
        else:
            if self.root.n == (2 * self.t - 1):
                s = BTreeNode(self.t, False)
                s.C[0] = self.root
                s.split_child(0, self.root)

                i = 0
                if s.keys[0] < k:
                    i += 1

                s.C[i].insert_non_full(k)
                self.root = s
            else:
                self.root.insert_non_full(k)

    def remove(self, k):
        # The main function that removes a key from the B-tree
        if not self.root:
            print("The tree is empty")
            return

        self.root.remove(k)

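        # If the root is left with no keys, shrink the tree: its first child
        # becomes the new root, or the tree becomes empty if the root was a leaf
        if self.root.n == 0:
            self.root = None if self.root.leaf else self.root.C[0]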


if __name__ == "__main__":
    b_tree = BTree(3)

    keys_to_insert = [10, 5, 15, 2, 7, 12, 20]

    for key in keys_to_insert:
        b_tree.insert(key)

    print("Before Deletion:", end = " ")
    b_tree.traverse()

    b_tree.remove(5)

    print("After Deletion:", end = " ")
    b_tree.traverse()
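
One way to gain confidence in insertion and deletion code like the above is to check the B-Tree invariants after each operation. The following is a minimal sketch, not part of the original implementation: `check_btree` and `make_node` are hypothetical names, and `make_node` only builds a stand-in object with the same `keys`, `C`, `n`, and `leaf` attributes so that the example runs on its own.

```python
from types import SimpleNamespace


def make_node(keys, children=()):
    """Hypothetical stand-in exposing the keys / C / n / leaf attributes used above."""
    return SimpleNamespace(keys=list(keys), C=list(children),
                           n=len(keys), leaf=not children)


def check_btree(node, t, is_root=True, lo=None, hi=None):
    """Return the height below `node`; raise AssertionError if an invariant fails."""
    keys = node.keys[:node.n]
    min_keys = 1 if is_root else t - 1
    assert min_keys <= node.n <= 2 * t - 1, "key count out of range"
    assert all(a <= b for a, b in zip(keys, keys[1:])), "keys not sorted"
    assert lo is None or lo <= keys[0], "key below allowed range"
    assert hi is None or keys[-1] <= hi, "key above allowed range"
    if node.leaf:
        return 0
    # Child i may only hold keys between keys[i - 1] and keys[i].
    bounds = [lo] + keys + [hi]
    heights = {
        check_btree(node.C[i], t, False, bounds[i], bounds[i + 1])
        for i in range(node.n + 1)
    }
    assert len(heights) == 1, "leaves are not all at the same depth"
    return heights.pop() + 1


# Shape obtained by hand-tracing the driver above after removing 5 (t = 3).
root = make_node([10], [make_node([2, 7]), make_node([12, 15, 20])])
print(check_btree(root, t=3))  # 1
```

Because the checker only reads `keys`, `C`, `n`, and `leaf`, it should also accept the real nodes, e.g. `check_btree(b_tree.root, b_tree.t)` on a non-empty tree.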