
#11.1 Binary Search Trees
Use a search tree structure to efficiently implement a *sorted map*. See details in Chapter 10.

A **binary search tree** is a binary tree $T$ with each position $p$ storing a key-value pair $(k,v)$ such that:
* Keys stored in the left subtree of $p$ are less than $k$
* Keys stored in the right subtree of $p$ are greater than $k$

##11.1.1 Navigating a Binary Search Tree
*An inorder traversal of a binary search tree visits positions in increasing order of their keys.*

Since an inorder traversal can be executed in linear time, a consequence of this
proposition is that a sorted iteration of the keys of a map can be produced in linear time when represented as a binary search tree.
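
As a minimal sketch of this proposition, the cell below uses a hypothetical stand-alone `Node` class (not the `TreeMap` node class used later) and a recursive generator `inorder` to show that an inorder traversal of a small binary search tree yields its keys in increasing order.

```python
class Node:
    """Minimal BST node for illustration only (hypothetical, not part of TreeMap)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def inorder(node):
    """Yield the keys of the subtree rooted at node: left subtree, node, right subtree."""
    if node is not None:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)

# A small binary search tree:
#        44
#       /  \
#     17    88
#       \   /
#       32 65
root = Node(44, Node(17, right=Node(32)), Node(88, left=Node(65)))
sorted_keys = list(inorder(root))
print(sorted_keys)  # [17, 32, 44, 65, 88]
```

Each node is yielded exactly once, which is why the sorted iteration takes time linear in the number of keys.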

With a binary search tree, we can provide additional navigation based on the natural order of the keys stored in the tree:
* `first()`: Return the position containing the least key, or `None` if the tree is empty.
* `last()`: Return the position containing the greatest key, or `None` if the tree is empty.
* `before(p)`: Return the position containing the greatest key that is less than that of
position `p` (i.e., the position that would be visited immediately before `p`
in an inorder traversal), or `None` if `p` is the first position.
* `after(p)`: Return the position containing the least key that is greater than that of
position `p` (i.e., the position that would be visited immediately after `p`
in an inorder traversal), or `None` if `p` is the last position.

If `p` has a right subtree, that right subtree is recursively traversed immediately after `p` is visited, and so the
first position to be visited after `p` is the *leftmost* position within the right subtree.
If `p` does not have a right subtree, then the flow of control of an inorder traversal
returns to `p`’s parent. If `p` were in the right subtree of that parent, then the parent’s
subtree traversal is complete and the flow of control progresses to its parent and
so on. Once an ancestor is reached in which the recursion is returning from its
left subtree, then that ancestor becomes the next position visited by the inorder
traversal, and thus is the successor of `p`.
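
The successor rule described above can be sketched on a stand-alone node type with parent pointers. The names `PNode`, `attach`, and `successor` are assumptions made for this illustration; they are not part of the `TreeMap` class.

```python
class PNode:
    """Hypothetical BST node that also stores a parent reference."""
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.left = None
        self.right = None

def attach(parent, child, left):
    """Link child under parent as its left or right child."""
    if left:
        parent.left = child
    else:
        parent.right = child
    child.parent = parent

def successor(node):
    """Return the node visited immediately after node in an inorder traversal."""
    if node.right is not None:
        # leftmost position of the right subtree
        walk = node.right
        while walk.left is not None:
            walk = walk.left
        return walk
    # otherwise climb until we arrive at an ancestor from its left subtree
    walk = node
    above = walk.parent
    while above is not None and walk is above.right:
        walk = above
        above = walk.parent
    return above  # None if node held the greatest key

n = {k: PNode(k) for k in (44, 17, 32, 88, 65)}
attach(n[44], n[17], left=True)
attach(n[17], n[32], left=False)
attach(n[44], n[88], left=False)
attach(n[88], n[65], left=True)

print(successor(n[32]).key)  # 44: climbs from 32 past 17 up to 44
print(successor(n[44]).key)  # 65: leftmost position of the right subtree
print(successor(n[88]))      # None: 88 is the last position
```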

##11.1.2 Searches
The binary search tree can be treated as a decision tree. The question asked at each position `p` is whether the desired key `k` is less than, equal to, or greater than the key stored at position `p`, which is denoted as `p.key()`. If the
answer is “less than,” then the search continues in the left subtree. If the answer
is “equal,” then the search terminates successfully. If the answer is “greater than,”
then the search continues in the right subtree. Finally, if an empty subtree is reached, then the search terminates unsuccessfully.

Since we spend $O(1)$ time per position encountered in the search, the overall search runs in $O(h)$ time, where $h$ is the height of the binary search tree $T$.
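
A minimal iterative sketch of this search follows. The `Node` class and the `bst_search` function are hypothetical helpers for illustration; the function also counts the positions it visits to make the $O(h)$ bound concrete.

```python
class Node:
    """Minimal BST node for illustration only (hypothetical, not part of TreeMap)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def bst_search(root, k):
    """Return (node, steps): the node holding k (or None) and the positions visited."""
    walk, steps = root, 0
    while walk is not None:
        steps += 1
        if k == walk.key:
            return walk, steps                         # "equal": success
        walk = walk.left if k < walk.key else walk.right
    return None, steps                                 # reached an empty subtree

root = Node(44, Node(17, right=Node(32)), Node(88, left=Node(65)))
found, steps = bst_search(root, 32)   # visits 44, 17, 32
missing, _ = bst_search(root, 50)     # visits 44, 88, 65, then an empty subtree
print(found.key, steps, missing)      # 32 3 None
```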

##11.1.3 Insertions and Deletions

###Insertion
The map command `M[k] = v` begins with a search for key `k`. If found, that item's existing value is reassigned. Otherwise, a node for the new item can be inserted into the underlying tree `T` in place of the empty subtree that was reached at the end of the failed search. Insertions are always enacted at the bottom of a path.
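
The insertion rule can be sketched without the `LinkedBinaryTree` machinery. The `Node` class and `insert` function below are assumptions made for this example: a failed search ends at an empty subtree, and the new node is attached exactly there.

```python
class Node:
    """Minimal key-value BST node for illustration only (hypothetical)."""
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None

def insert(root, k, v):
    """Insert (k, v), or overwrite the value if k is present; return the root."""
    if root is None:
        return Node(k, v)
    walk = root
    while True:
        if k == walk.key:
            walk.value = v              # key found: reassign existing value
            return root
        elif k < walk.key:
            if walk.left is None:       # failed search ended here
                walk.left = Node(k, v)
                return root
            walk = walk.left
        else:
            if walk.right is None:      # failed search ended here
                walk.right = Node(k, v)
                return root
            walk = walk.right

root = None
for k in (44, 17, 88, 32, 65):
    root = insert(root, k, str(k))
root = insert(root, 32, 'updated')      # existing key: value is replaced
print(root.left.right.key, root.left.right.value)  # 32 updated
```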

###Deletion
To delete an item with key `k`, first search for the position `p` of `T` storing an item with key equal to `k`. If the search is successful, there are two cases:
* If `p` has at most one child, the node at position `p` is deleted and replaced by its child (if any).
* If position `p` has two children, proceed in three steps:
 * Locate position `r` containing the item having the greatest key that is strictly less than that of position `p`, that is, `r = before(p)`. Because `p` has two children, its predecessor is the rightmost position of the left subtree of `p`.
 * Use `r`'s item as a replacement for the one being deleted at position `p`. Because `r` has the immediately preceding key in the map, any items in `p`’s right subtree will have keys greater than that of `r` and any other items in `p`’s left subtree will have keys less than that of `r`. Therefore, the binary search tree property is satisfied after the replacement.
 * Having used `r`'s item as a replacement for `p`, delete the node at position `r` from the tree. Since `r` was located as the rightmost position in a subtree, `r` does not have a right child. Therefore, its deletion can be performed using the first approach.

As with searching and insertion, this algorithm for a deletion involves the
traversal of a single path downward from the root, possibly moving an item between
two positions of this path, and removing a node from that path and promoting its
child. Therefore, it executes in time $O(h)$ where $h$ is the height of the tree.

##11.1.4 Python Implementation

In [0]:
class TreeMap(LinkedBinaryTree, MapBase):
  '''Sorted map implementation using a binary search tree'''
  
  #-------------------override Position class--------------------------
  class Position(LinkedBinaryTree.Position):
    def key(self):
      '''Return key of map's key-value pair'''
      return self.element()._key
    
    def value(self):
      '''Return value of map's key-value pair'''
      return self.element()._value
    
  #-------------------nonpublic utilities------------------------------
  def _subtree_search(self, p, k):
    '''Return Position of p's subtree having key k, or last node searched'''
    if k == p.key(): # found match
      return p
    elif k < p.key(): # search left subtree
      if self.left(p) is not None:
        return self._subtree_search(self.left(p), k)
    else: # search right subtree
      if self.right(p) is not None:
        return self._subtree_search(self.right(p), k)
    return p # unsuccessful search
  
  def _subtree_first_position(self, p):
    '''Return Position of first item in subtree rooted at p'''
    walk = p
    while self.left(walk) is not None: # keep walking left
      walk = self.left(walk)
    return walk
  
  def _subtree_last_position(self, p):
    '''Return Position of last item in subtree rooted at p'''
    walk = p
    while self.right(walk) is not None: # keep walking right
      walk = self.right(walk)
    return walk
  
  def first(self):
    '''Return the first Position in the tree (or None if empty)'''
    return self._subtree_first_position(self.root()) if len(self)>0 else None
  
  def last(self):
    '''Return the last Position in the tree (or None if empty)'''
    return self._subtree_last_position(self.root()) if len(self)>0 else None
  
  def before(self, p):
    '''Return the Position just before p in the natural order
    
    Return None if p is the first position
    '''
    self._validate(p) # inherited from LinkedBinaryTree
    if self.left(p):
      return self._subtree_last_position(self.left(p))
    else:
      # walk upward
      walk = p
      above = self.parent(walk)
      while above is not None and walk == self.left(above):
        walk = above
        above = self.parent(walk)
      return above
    
  def after(self, p):
    '''Return the Position just after p in the natural order
    
    Return None if p is the last position
    '''
    # symmetric to before(p)
    self._validate(p)
    if self.right(p):
      return self._subtree_first_position(self.right(p))
    else:
      # walk upward
      walk = p
      above = self.parent(walk)
      while above is not None and walk == self.right(above):
        walk = above
        above = self.parent(walk)
      return above
    
  def find_position(self, k):
    '''Return position with key k, or else neighbor (or None if empty)'''
    if self.is_empty():
      return None
    else:
      p = self._subtree_search(self.root(), k)
      self._rebalance_access(p) # hook for balanced tree subclasses
      return p
    
  def find_min(self):
    '''Return (key, value) pair with minimum key (or None if empty)'''
    if self.is_empty():
      return None
    else:
      p = self.first()
      return (p.key(), p.value())
    
  def find_ge(self, k):
    '''Return (key,value) pair with least key greater than or equal to k
    
    Return None if there does not exist such a key
    '''
    if self.is_empty():
      return None
    else:
      p = self.find_position(k) # may not find exact match
      if p.key() < k: # p's key is too small
        p = self.after(p)
      return (p.key(), p.value()) if p is not None else None
  
  def find_range(self, start, stop):
    '''Iterate all (key,value) pairs such that start<=key<stop
    
    If start is None, iteration begins with minimum key of map
    If stop is None, iteration continues through the maximum key of map
    '''
    if not self.is_empty():
      if start is None:
        p = self.first()
      else:
        # we initialize p with logic similar to find_ge
        p = self.find_position(start)
        if p.key() < start:
          p = self.after(p)
      while p is not None and (stop is None or p.key() < stop):
        yield (p.key(), p.value())
        p = self.after(p)
        
  def __getitem__(self, k):
    '''Return value associated with key k (raise KeyError if not found)'''
    if self.is_empty():
      raise KeyError('Key Error: ' + repr(k))
    else:
      p = self._subtree_search(self.root(), k)
      self._rebalance_access(p) # hook for balanced tree subclasses
      if k != p.key():
        raise KeyError('Key Error: ' + repr(k))
      return p.value()
    
  def __setitem__(self, k, v):
    '''Assign value v to key k, overwriting existing value if present'''
    if self.is_empty():
      leaf = self._add_root(self._Item(k,v)) # from LinkedBinaryTree
    else:
      p = self._subtree_search(self.root(), k)
      if p.key() == k:
        p.element()._value = v # replace existing item's value
        self._rebalance_access(p) # hook for balanced tree subclasses
        return
      else:
        item = self._Item(k,v) # _Item is the nonpublic composite class from MapBase
        if p.key() < k:
          leaf = self._add_right(p, item) # inherited from LinkedBinaryTree
        else:
          leaf = self._add_left(p, item) # inherited from LinkedBinaryTree
    self._rebalance_insert(leaf) # hook for balanced tree subclasses
    
  def __iter__(self):
    '''Generate an iteration of all keys in the map in order'''
    p = self.first()
    while p is not None:
      yield p.key()
      p = self.after(p)
      
  def delete(self, p):
    '''Remove the item at given Position'''
    self._validate(p) # inherited from LinkedBinaryTree
    if self.left(p) and self.right(p): # p has two children
      replacement = self._subtree_last_position(self.left(p))
      self._replace(p, replacement.element()) # from LinkedBinaryTree
      p = replacement
    # now p has at most one child
    parent = self.parent(p)
    self._delete(p) # inherited from LinkedBinaryTree
    self._rebalance_delete(parent) # if root deleted, parent is None
    
  def __delitem__(self, k):
    '''Remove item associated with key k (raise KeyError if not found)'''
    if not self.is_empty():
      p = self._subtree_search(self.root(), k)
      if k == p.key():
        self.delete(p) # rely on positional version
        return # successful deletion complete
      self._rebalance_access(p) # hook for balanced tree subclasses
    raise KeyError('Key Error: ' + repr(k))

The `TreeMap` class takes advantage of **multiple inheritance** for code reuse, inheriting from the `LinkedBinaryTree` class in Chapter 8 for representation as a positional binary tree and from the `MapBase` class in Chapter 10 to provide the key-value composite item and the concrete behaviors from the `collections.abc.MutableMapping` abstract base class.

The `_subtree_search(p,k)` method returns a position, ideally one that contains the key `k` or otherwise the last position that is visited on the search path.

The `TreeMap` is also implemented with calls to presumed methods named `_rebalance_insert`, `_rebalance_delete`, and `_rebalance_access`, which serve as **hooks** for future use when balancing search trees.

Almost all operations have a worst-case running time that depends on $h$, the height of the current tree. This is because most operations perform a constant amount of work for each node along a particular path of the tree, and the maximum path length within a tree is proportional to its height.

On average, a binary search tree with $n$ keys generated from a random series of insertions and removals of keys has expected height $O(\log n)$. In applications where one cannot guarantee the random nature of updates, it
is better to rely on variations of search trees that guarantee a worst-case height of $O(\log n)$, and thus $O(\log n)$ worst-case
time for searches, insertions, and deletions.

#11.2 Balanced Search Trees
The primary operation to rebalance a binary search tree is known as a **rotation**. During a rotation, a child is "rotated" to be above its parent. To maintain the binary search tree property through a rotation, we note that
if position x was a left child of position y prior to a rotation (and therefore the
key of x is less than the key of y), then y becomes the right child of x after the
rotation, and vice versa. Furthermore, a subtree T holding keys that are greater than that of position x and less than that of position y (the right subtree of x before the rotation) must be relinked: it becomes the left subtree of y, which preserves the ordering of the keys.

Because a single rotation modifies a constant number of parent-child relationships, it can be implemented in $O(1)$ time with a linked binary tree representation.
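
A rotation can be sketched on a stand-alone node type without parent references. The names `Node`, `rotate_right`, `rotate_left`, and `keys_inorder` are assumptions made for this example. Each rotation changes a constant number of links and leaves the inorder sequence of keys unchanged.

```python
class Node:
    """Minimal BST node for illustration only (hypothetical, no parent pointer)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_right(y):
    """Rotate y's left child x above y; return x, the new subtree root."""
    x = y.left
    y.left = x.right    # middle subtree (keys between x and y) moves under y
    x.right = y         # y becomes the right child of x
    return x

def rotate_left(x):
    """Mirror image: rotate x's right child y above x; return y."""
    y = x.right
    x.right = y.left    # middle subtree moves under x
    y.left = x          # x becomes the left child of y
    return y

def keys_inorder(node):
    """Return the keys of the subtree rooted at node in sorted order."""
    if node is None:
        return []
    return keys_inorder(node.left) + [node.key] + keys_inorder(node.right)

y = Node(20, Node(10, Node(5), Node(15)), Node(30))
x = rotate_right(y)             # 10 moves above 20; 15 is relinked under 20
print(x.key, x.right.key, x.right.left.key)  # 10 20 15
print(keys_inorder(x))          # [5, 10, 15, 20, 30]
```

Because these helpers return the new subtree root, the caller is responsible for reattaching it to the rest of the tree.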

In the context of a tree-balancing algorithm, a rotation allows the shape of a
tree to be modified while maintaining the search tree property. If used wisely, this
operation can be performed to avoid highly unbalanced tree configurations. One or more rotations can be combined to provide broader rebalancing within a
tree. One such compound operation we consider is a **trinode restructuring**, which involves a position x, its parent y, and its grandparent z: \\
Algorithm `restructure(x)`: \\
Input: A position x of a binary search tree `T` that has both a parent `y` and a grandparent `z` \\
Output: Tree `T` after a trinode restructuring (which corresponds to a single or double rotation) involving positions x, y, and z \\
1. Let (a, b, c) be a left-to-right (inorder) listing of the positions x, y, and z, and
let (T1,T2,T3,T4) be a left-to-right (inorder) listing of the four subtrees of x,
y, and z not rooted at x, y, or z.
2. Replace the subtree rooted at z with a new subtree rooted at b.
3. Let a be the left child of b and let T1 and T2 be the left and right subtrees of a,
respectively.
4. Let c be the right child of b and let T3 and T4 be the left and right subtrees of
c, respectively.

The trinode restructuring is completed with $O(1)$ running time.
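
The four steps of the algorithm can be followed literally in code. The sketch below is an assumption-laden illustration: it uses a hypothetical `Node` class without parent references, takes `x`, `y`, and `z` as explicit arguments, and returns `b` so that the caller can put it in place of `z`.

```python
class Node:
    """Minimal BST node for illustration only (hypothetical, no parent pointer)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def restructure(x, y, z):
    """Trinode restructuring of x, its parent y, and its grandparent z; return b."""
    # Step 1: inorder listing (a, b, c) of x, y, z and (t1..t4) of their other subtrees
    if y is z.left and x is y.left:
        a, b, c = x, y, z
        t1, t2, t3, t4 = x.left, x.right, y.right, z.right
    elif y is z.left and x is y.right:
        a, b, c = y, x, z
        t1, t2, t3, t4 = y.left, x.left, x.right, z.right
    elif y is z.right and x is y.right:
        a, b, c = z, y, x
        t1, t2, t3, t4 = z.left, y.left, x.left, x.right
    else:  # y is z.right and x is y.left
        a, b, c = z, x, y
        t1, t2, t3, t4 = z.left, x.left, x.right, y.right
    # Steps 3 and 4: a and c become the children of b, adopting the four subtrees
    a.left, a.right = t1, t2
    c.left, c.right = t3, t4
    b.left, b.right = a, c
    return b  # Step 2 is completed by the caller, who links b where z used to be

def keys_inorder(node):
    """Return the keys of the subtree rooted at node in sorted order."""
    if node is None:
        return []
    return keys_inorder(node.left) + [node.key] + keys_inorder(node.right)

# Opposite alignments (y is a left child, x is a right child): a double rotation
x = Node(20, Node(15), Node(25))
y = Node(10, Node(5), x)
z = Node(30, y, Node(40))
b = restructure(x, y, z)
print(b.key, b.left.key, b.right.key)  # 20 10 30
print(keys_inorder(b))                 # [5, 10, 15, 20, 25, 30, 40]
```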

##11.2.1 Python Framework for Balancing Search Trees
The inheritance hierarchy is shown below:
$$(AVLTreeMap, SplayTreeMap, RedBlackTreeMap)\rightarrow(TreeMap)\rightarrow(LinkedBinaryTree, MapBase)$$

###Hooks for Rebalancing Operations
* A call to `_rebalance_insert(p)` is made from within the `__setitem__` method
immediately after a new node is added to the tree at position `p`.
* A call to `_rebalance_delete(p)` is made each time a node has been deleted
from the tree, with position `p` identifying the parent of the node that has just
been removed. Formally, this hook is called from within the public `delete(p)`
method, which is indirectly invoked by the public `__delitem__(k)` behavior.
* A call to `_rebalance_access(p)` is made when an item at
position `p` of a tree is accessed through a public method such as `__getitem__` .
This hook is used by the **splay tree** structure to restructure
a tree so that more frequently accessed items are brought closer to the root.

###Nonpublic Methods for Rotating and Restructuring
Implement the `_rotate` and `_restructure` methods. There are other utilities defined to simplify the code: (attach these methods to the `TreeMap` class)

In [0]:
def _relink(self, parent, child, make_left_child):
  '''Relink parent node with child node (allowing child to be None)'''
  if make_left_child: # make it a left child
    parent._left = child
  else: # make it a right child
    parent._right = child
  if child is not None: # make child point to parent
    child._parent = parent
    
def _rotate(self, p):
  '''Rotate Position p above its parent'''
  x = p._node
  y = x._parent # assume this exists
  z = y._parent # grandparent (possibly None)
  if z is None:
    self._root = x # x becomes root
    x._parent = None
  else:
    self._relink(z,x,y == z._left) # x becomes a direct child of z
  # now rotate x and y, including transfer of middle subtree
  if x == y._left:
    self._relink(y, x._right, True) # x._right becomes left child of y
    self._relink(x, y, False) # y becomes right child of x
  else:
    self._relink(y, x._left, False) # x._left becomes right child of y
    self._relink(x, y, True) # y becomes left child of x
    
def _restructure(self, x):
  '''Perform trinode restructure of Position x with parent/grandparent'''
  y = self.parent(x)
  z = self.parent(y)
  if (x == self.right(y)) == (y == self.right(z)): # matching alignments
    self._rotate(y) # single rotation (of y)
    return y # y is new subtree root
  else: # opposite alignments
    self._rotate(x) # double rotation (of x)
    self._rotate(x)
    return x # x is new subtree root

The `_relink` utility links parent and child nodes to each other, including the special case in which a "child" is a `None` reference.

#11.3 AVL Trees
The worst-case running time of the various `TreeMap` operations so far is linear, because the height of the tree can be linear in the number of items. The AVL tree in this section guarantees worst-case logarithmic running time for all map operations.

###Definition of an AVL Tree
Originally, the height of a subtree rooted at position $p$ of a tree was defined to be the number of *edges* on the longest path from $p$ to a leaf. \\
Now, consider the height to be the number of *nodes* on such a longest path. Under this definition, a leaf position has height 1, while the height of a "null" child is defined to be 0.

**Height-Balance Property**: For every position $p$ of $T$, the heights of the children of $p$ differ by at most 1.

Any binary search tree $T$ that satisfies the height-balance property is an **AVL tree**.

Now, a subtree of an AVL tree is itself an AVL tree.

*The height of an AVL tree storing $n$ entries is $O(\log n)$.*

Now the operation `__getitem__` implemented with an AVL tree runs in time $O(\log n)$, where $n$ is the number of items in the map.
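
The logarithmic height bound can be checked numerically. The sketch below relies on the standard argument that the smallest AVL tree of height $h$ consists of a root, a smallest AVL subtree of height $h-1$, and a smallest AVL subtree of height $h-2$. The function name `min_nodes` is hypothetical, and heights are counted in nodes as defined above.

```python
import math

def min_nodes(h):
    """Fewest nodes in any AVL tree of height h (height counted in nodes)."""
    if h == 0:
        return 0
    a, b = 0, 1                 # minimum sizes for heights 0 and 1
    for _ in range(h - 1):
        a, b = b, 1 + a + b     # root + smallest subtrees of heights h-1 and h-2
    return b

print([min_nodes(h) for h in range(1, 8)])  # [1, 2, 4, 7, 12, 20, 33]

# The minimum size grows exponentially in h, so h grows logarithmically in n.
# Here we check the (loose) bound h < 2*log2(n) + 2 for a range of heights.
bound_holds = all(h < 2 * math.log2(min_nodes(h)) + 2 for h in range(1, 60))
print(bound_holds)  # True
```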

##11.3.1 Update Operations
Given a binary search tree $T$, a position is **balanced** if the absolute
value of the difference between the heights of its children is at most 1; otherwise, it is unbalanced. Therefore, the height-balance property characterizing AVL trees is equivalent to saying that every position is balanced.
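
As a small sketch of these definitions, the helpers below (all hypothetical names) compute heights from scratch and test whether a position is balanced. This is for illustration only: recomputing heights at every call is inefficient, which is why an AVL tree implementation stores the height at each node.

```python
class Node:
    """Minimal BST node for illustration only (hypothetical)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    """Number of nodes on the longest downward path; a 'null' child has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def is_balanced(node):
    """True if the heights of node's children differ by at most 1."""
    return abs(height(node.left) - height(node.right)) <= 1

def has_height_balance_property(node):
    """True if every position in the subtree rooted at node is balanced."""
    return node is None or (is_balanced(node)
                            and has_height_balance_property(node.left)
                            and has_height_balance_property(node.right))

avl = Node(44, Node(17, right=Node(32)), Node(78, Node(50), Node(88)))
chain = Node(1, right=Node(2, right=Node(3)))   # heights 0 and 2 below the root

print(height(avl), has_height_balance_property(avl))      # 3 True
print(is_balanced(chain), is_balanced(chain.right))       # False True
```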

###Insertion
An insertion of a new item in a binary search tree results in a new node at a leaf position $p$. This action may violate the height-balance property. We restore the balance of the nodes in the binary search tree $T$ by a simple "search-and-repair" strategy.

Let z be the first position encountered in going up from $p$ toward the root of $T$ such that z is unbalanced. Let y denote the child of z with greater height (y must be an ancestor of $p$). Finally, let x be the child of y with greater height (there cannot be a tie, and position x must also be an ancestor of $p$, possibly $p$ itself). Now rebalance the subtree rooted at z by calling the **trinode restructuring** method, `restructure(x)`. After the trinode restructuring, each of x, y, and z is balanced. Writing $h$ for the height of the sibling of y, the node that becomes the root of the subtree after the restructuring has height $h+2$, which is precisely the height that z had before the insertion of the new item. Therefore, any ancestor of z that became temporarily unbalanced becomes balanced again, and this one restructuring restores the height-balance property *globally*.

###Deletion
A deletion from a regular binary search tree results in the structural removal of a node having either zero or one child.

As with insertion, use trinode restructuring to restore balance in the tree $T$. Let $p$ denote the parent of the removed node, and let z be the first unbalanced position encountered going up from $p$
toward the root of $T$. Also, let y be the child of z with larger height (note that
position y is the child of z that is not an ancestor of $p$), and let x be the child of y
defined as follows: If one of the children of y is taller than the other, let x be the
taller child of y; else (both children of y have the same height), let x be the child of
y on the same side as y (that is, if y is the left child of z, let x be the left child of
y, else let x be the right child of y). In any case, perform a `restructure(x)`
operation. Unlike the case of insertion, one restructuring may not restore the height-balance property globally after a deletion, so we continue walking up $T$ looking for further unbalanced positions until the root is reached.

###Performance of AVL Trees
The height of an AVL tree with $n$ items is guaranteed to be $O(\log n)$. Because the standard binary search tree operations have running times
bounded by the height, and because the additional work in maintaining
balance factors and restructuring an AVL tree can be bounded by the length
of a path in the tree, the traditional map operations run in worst-case logarithmic
time with an AVL tree.

##11.3.2 Python Implementation
The `AVLTreeMap` class inherits from the standard `TreeMap` class. The `AVLTreeMap` overrides the definition of the nested `_Node` class in order to provide support for storing the height of the subtree stored at a node.

In [0]:
class AVLTreeMap(TreeMap):
  '''Sorted map implementation using an AVL tree'''
  
  #---------------------nested _Node class------------------------------
  class _Node(TreeMap._Node):
    '''Node class for AVL maintains height value for balancing'''
    __slots__ = '_height' # additional data member to store height
    
    def __init__(self, element, parent=None, left=None, right=None):
      super().__init__(element, parent, left, right)
      self._height = 0 # will be recomputed during balancing
      
    def left_height(self):
      return self._left._height if self._left is not None else 0
    
    def right_height(self):
      return self._right._height if self._right is not None else 0
    
  #---------------positional-based utility methods----------------------
  def _recompute_height(self, p):
    p._node._height = 1 + max(p._node.left_height(), p._node.right_height())
    
  def _isbalanced(self, p):
    return abs(p._node.left_height() - p._node.right_height()) <= 1
  
  def _tall_child(self, p, favorleft=False): # parameter controls tiebreaker
    if p._node.left_height() + (1 if favorleft else 0) > p._node.right_height():
      return self.left(p)
    else:
      return self.right(p)
    
  def _tall_grandchild(self, p):
    child = self._tall_child(p)
    # if child is on left, favor left grandchild, else favor right grandchild
    alignment = (child == self.left(p))
    return self._tall_child(child, alignment)
  
  def _rebalance(self, p):
    while p is not None:
      old_height = p._node._height # trivially 0 if new node
      if not self._isbalanced(p): # imbalance detected
        # perform trinode restructuring setting p to resulting root,
        # and recompute new local heights after the restructuring
        p = self._restructure(self._tall_grandchild(p))
        self._recompute_height(self.left(p))
        self._recompute_height(self.right(p))
      self._recompute_height(p) # adjust for recent changes
      if p._node._height == old_height: # has height changed?
        p = None # no further changes needed
      else:
        p = self.parent(p) # repeat with parent
  
  #---------------------override balancing hooks-------------------------
  def _rebalance_insert(self, p):
    self._rebalance(p)
    
  def _rebalance_delete(self, p):
    self._rebalance(p)

The `_rebalance` method suffices as a hook for restoring the height-balance property after an insertion or a deletion. Although the inherited behaviors for insertion and
deletion are quite different, the necessary post-processing for an AVL tree can be
unified.

In both cases, we trace an upward path from the position p at which the
change took place, recalculating the height of each position based on the (updated)
heights of its children, and using a trinode restructuring operation if an imbalanced
position is reached. If we reach an ancestor with height that is unchanged by the
overall map operation, or if we perform a trinode restructuring that results in the
subtree having the same height it had before the map operation, we stop the process;
no further ancestor’s height will change.

#11.4 Splay Trees
**Splay trees** do not strictly enforce a logarithmic upper bound on the height of the tree. There are no additional height, balance, or other auxiliary data associated with the nodes of this tree.

The efficiency of splay trees is due to a certain move-to-root operation, called **splaying**, that is performed at the bottommost position $p$ reached during every insertion, deletion, or search. A splay operation causes more frequently accessed elements to remain nearer to the root, thereby reducing the typical search times.

##11.4.1 Splaying
Given a node $x$ of a binary search tree $T$, $x$ is splayed by moving $x$ to the root of $T$ through a sequence of restructurings. The operation to move $x$ up depends upon the relative positions of $x$, its parent $y$, and (if it exists) $x$'s grandparent $z$. The splaying step consists of repeating the restructuring at $x$ until $x$ becomes the root of $T$.
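
Splaying can be sketched on a stand-alone node type with parent references. The names `SNode`, `link`, `rotate`, `splay`, and `keys_inorder` are assumptions made for this illustration. The three cases depend on whether $x$ has a grandparent and on whether $x$ and $y$ are children on the same side.

```python
class SNode:
    """Hypothetical BST node with a parent reference."""
    def __init__(self, key):
        self.key = key
        self.parent = self.left = self.right = None

def link(parent, child, make_left):
    """Make child the left or right child of parent (child may be None)."""
    if make_left:
        parent.left = child
    else:
        parent.right = child
    if child is not None:
        child.parent = parent

def rotate(x):
    """Rotate x above its parent y, which must exist."""
    y, z = x.parent, x.parent.parent
    if z is None:
        x.parent = None                 # x becomes the root
    else:
        link(z, x, y is z.left)         # x takes y's place under z
    if x is y.left:
        link(y, x.right, True)          # middle subtree moves under y
        link(x, y, False)
    else:
        link(y, x.left, False)
        link(x, y, True)

def splay(x):
    """Repeat restructurings at x until x is the root; return x."""
    while x.parent is not None:
        y, z = x.parent, x.parent.parent
        if z is None:
            rotate(x)                               # no grandparent
        elif (y is z.left) == (x is y.left):
            rotate(y)                               # same side: parent first,
            rotate(x)                               # then x
        else:
            rotate(x)                               # opposite sides: x twice
            rotate(x)
    return x

def keys_inorder(node):
    """Return the keys of the subtree rooted at node in sorted order."""
    if node is None:
        return []
    return keys_inorder(node.left) + [node.key] + keys_inorder(node.right)

# A degenerate tree: 5 at the root, then 4, 3, 2, 1 each as a left child
n = {k: SNode(k) for k in range(1, 6)}
for k in range(5, 1, -1):
    link(n[k], n[k - 1], True)

root = splay(n[1])                      # splay the deepest position
print(root.key, keys_inorder(root))     # 1 [1, 2, 3, 4, 5]
```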

##11.4.2 When to Splay
Rules dictating when splaying is performed:
* When searching for key k, if k is found at position p, p is splayed; otherwise, the leaf position at which the search terminates unsuccessfully is splayed.
* When inserting key k, the newly created internal node where k gets inserted is splayed.
* When deleting a key k, the position p that is the parent of the removed
node is splayed; recall that by the removal algorithm for binary search trees, the
removed node may be that originally containing k, or a descendant node with
a replacement key.

##11.4.3 Python Implementation

In [0]:
class SplayTreeMap(TreeMap):
  '''Sorted map implementation using a splay tree'''
  #--------------------------splay operation-----------------------------
  def _splay(self, p):
    while p != self.root():
      parent = self.parent(p)
      grand = self.parent(parent)
      if grand is None:
        # zig case
        self._rotate(p)
      elif (parent == self.left(grand)) == (p == self.left(parent)):
        # zig-zig case
        self._rotate(parent) # move PARENT up
        self._rotate(p) # then move p up
      else:
        # zig-zag case
        self._rotate(p) # move p up
        self._rotate(p) # move p up again
        
  #------------------------override balancing hooks----------------------
  def _rebalance_insert(self, p):
    self._splay(p)
    
  def _rebalance_delete(self, p):
    if p is not None:
      self._splay(p)
      
  def _rebalance_access(self, p):
    self._splay(p)

In the worst case, the overall running time of a search, insertion, or deletion in a
splay tree of height $h$ is $O(h)$, since the position we splay might be the deepest
position in the tree.

#11.5 (2,4) Trees
A **(2,4) tree** is an example of a **multiway search tree**, in which internal nodes may have more than two children.

##11.5.1 Multiway Search Trees
Map items stored in a search tree are pairs of the form $(k,v)$, where $k$ is the key and $v$ is the value associated with the key.

###Definition of a Multiway Search Tree
Let $w$ be a node of an ordered tree. $w$ is a **d-node** if $w$ has $d$ children. A multiway search tree is defined as an ordered tree $T$ that has the following properties:
* Each internal node of $T$ has at least two children. That is, each internal node is a $d$-node such that $d\geq 2$.
* Each internal $d$-node $w$ of $T$ with children $c_1, \ldots, c_d$ stores an ordered set of $d-1$ key-value pairs $(k_1,v_1), \ldots, (k_{d-1}, v_{d-1})$, where $k_1\leq \cdots \leq k_{d-1}$.
* Define $k_0=-\infty$ and $k_d=+\infty$. For each item $(k,v)$ stored at a node in the subtree of $w$ rooted at $c_i, i=1,...,d$, $k_{i-1}\leq k \leq k_i$.

That is, if we think of the set of keys stored at $w$ as including the special fictitious keys $k_0=-\infty$ and $k_d=+\infty$, then a key $k$ stored in the subtree of $T$ rooted at a child node $c_i$ must be "in between" two keys stored at $w$.

Note that the external nodes of a multiway search tree do not store any data and serve only as "placeholders".

*An $n$-item multiway search tree has $n+1$ external nodes.*

###Searching in a Multiway Tree
To search for key $k$, trace a path in $T$ starting at the root. At a $d$-node $w$ during this search, the key $k$ is compared with the keys $k_1,..., k_{d-1}$ stored at $w$. If $k=k_i$ for some $i$, the search is successfully completed. Otherwise, the search is continued in the child $c_i$ of $w$ such that $k_{i-1}<k<k_i$. If an external node is reached, there is no item with key $k$ in $T$, and the search terminates unsuccessfully.
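
The multiway search can be sketched with a hypothetical `MultiwayNode` class that stores sorted lists of keys and values plus a list of children, with external nodes modeled as `None`. The standard-library function `bisect.bisect_left` finds the index of the smallest key at a node that is greater than or equal to $k$, which is also the index of the child to descend into when there is no match.

```python
from bisect import bisect_left

class MultiwayNode:
    """Hypothetical d-node: d-1 sorted keys, matching values, and d children."""
    def __init__(self, keys, values, children=None):
        self.keys = keys
        self.values = values
        # external (placeholder) children are modeled as None
        self.children = children if children is not None else [None] * (len(keys) + 1)

def multiway_search(node, k):
    """Return the value stored with key k, or None if k is not in the tree."""
    while node is not None:
        i = bisect_left(node.keys, k)          # smallest index with keys[i] >= k
        if i < len(node.keys) and node.keys[i] == k:
            return node.values[i]              # successful search
        node = node.children[i]                # continue where k must lie
    return None                                # reached an external node

left = MultiwayNode([5, 10], ['five', 'ten'])
right = MultiwayNode([15, 17, 21], ['fifteen', 'seventeen', 'twenty-one'])
root = MultiwayNode([12], ['twelve'], [left, right])

print(multiway_search(root, 17))   # seventeen
print(multiway_search(root, 13))   # None
```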

##11.5.2 (2,4)- Tree Operations
During a search for key $k$ in a multiway search tree, the primary operation needed when navigating a node is finding the smallest key at that node that is greater than or equal to $k$. Therefore, it is natural to model the information at a node itself as a *sorted map*. This map serves as a **secondary** data structure to support the **primary** data structure represented by the entire multiway search tree.

A multiway search tree that keeps the secondary data structure stored at each node small and also keeps the primary multiway tree balanced is the **(2,4) tree**. This data structure achieves these goals by maintaining:
* **Size Property**: Every internal node has at most four children.
* **Depth Property**: All the external nodes have the same depth.

For simplicity, assume all external nodes are empty.

**(2,4) tree** also implies that each internal node in the tree has 2, 3, or 4 children.

*The height of a (2,4) tree storing $n$ items is $O(\log n)$.*
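
One way to see why this bound holds, as a sketch combining the two properties above with the earlier fact that an $n$-item multiway search tree has $n+1$ external nodes: let $h$ be the height of the tree. By the depth property, all external nodes lie at depth $h$. Since every internal node has at least 2 and at most 4 children, the number of external nodes is at least $2^h$ and at most $4^h$. Therefore:

$$2^h \leq n+1 \leq 4^h \quad\Longrightarrow\quad \tfrac{1}{2}\log_2(n+1) \leq h \leq \log_2(n+1)$$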

#11.6 Red-Black Trees
AVL trees may require many restructure operations (rotations) to be performed after a deletion, and (2,4) trees may require many split or fusion operations to be performed after an insertion or removal. The red-black trees introduced here do not have these drawbacks: they use $O(1)$ structural changes after an update in order to stay balanced.

A **red-black tree** is a binary search tree with nodes colored red and black in a way that satisfies the following properties:
* **Root Property**: The root is black.
* **Red Property**: The children of a red node (if any) are black.
* **Depth Property**: All nodes with zero or one child have the same **black depth**, defined as the number of black ancestors. (Recall that a node is its own ancestor.)

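More intuitively, given a red-black tree, we can construct a corresponding (2,4) tree by merging every red node $w$ into its parent, storing the entry from $w$ at its parent, and with the children of $w$ becoming ordered children of the parent. Conversely, any (2,4) tree can be transformed into a corresponding red-black tree by coloring each node $w$ black and then performing the following transformation: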
* If $w$ is a 2-node, then keep the (black) children of $w$ as is.
* If $w$ is a 3-node, then create a new red node $y$, give $w$’s last two (black)
children to $y$, and make the first child of $w$ and $y$ be the two children of $w$.
* If $w$ is a 4-node, then create two new red nodes $y$ and $z$, give $w$’s first two
(black) children to $y$, give $w$’s last two (black) children to $z$, and make $y$ and
$z$ be the two children of $w$.

Note that a red node always has a black parent in this construction.

*The height of a red-black tree storing $n$ entries is $O(\log n)$.*
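
One way to justify this bound, sketched under the assumption that the height $h$ counts edges and ignores empty external positions: let $d$ be the common black depth from the depth property. Merging each red node into its black parent yields a (2,4) tree whose $n+1$ external nodes all lie at depth $d$, and since every node of that tree has at least two children, $2^d \le n+1$. By the root and red properties, every red node on a root-to-leaf path has a black parent on that path, so the path contains at most $d$ red nodes in addition to its $d$ black nodes:

$$h \le 2d - 1 \le 2\log_2(n+1) - 1 = O(\log n)$$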

###Python Implementation

In [0]:
class RedBlackTreeMap(TreeMap):
  '''Sorted map implementation using a red-black tree'''
  class _Node(TreeMap._Node):
    '''Node class for red-black tree maintains bit that denotes color'''
    __slots__ = '_red' # add additional data member to the Node class
    
    def __init__(self, element, parent=None, left=None, right=None):
      super().__init__(element, parent, left, right)
      self._red = True # new node red by default
      
  #----------------positional-based utility methods----------------------
  # consider a nonexistent child to be trivially black
  def _set_red(self, p):
    p._node._red = True
  def _set_black(self, p):
    p._node._red = False
  def _set_color(self, p, make_red):
    p._node._red = make_red
  def _is_red(self, p):
    return p is not None and p._node._red
  def _is_red_leaf(self, p):
    return self._is_red(p) and self.is_leaf(p)
  
  def _get_red_child(self, p):
    '''Return a red child of p (or None if no such child)'''
    for child in (self.left(p), self.right(p)):
      if self._is_red(child):
        return child
    return None
  
  #----------------------support for insertions--------------------------
  def _rebalance_insert(self, p):
    self._resolve_red(p) # new node is always red
    
  def _resolve_red(self, p):
    if self.is_root(p):
      self._set_black(p) # make root black
    else:
      parent = self.parent(p)
      if self._is_red(parent): # double red problem
        uncle = self.sibling(parent)
        if not self._is_red(uncle): # Case 1: misshapen 4-node
          middle = self._restructure(p) # do trinode restructuring
          self._set_black(middle) # and then fix colors
          self._set_red(self.left(middle))
          self._set_red(self.right(middle))
        else: # Case 2: overfull 5-node
          grand = self.parent(parent)
          self._set_red(grand) # grandparent becomes red
          self._set_black(self.left(grand)) # its children become black
          self._set_black(self.right(grand))
          self._resolve_red(grand) # recur at red grandparent
          
  #-----------------support for deletions--------------------------------
  def _rebalance_delete(self, p):
    if len(self) == 1:
      self._set_black(self.root()) # special case: ensure that root is black
    elif p is not None:
      n = self.num_children(p)
      if n == 1: # deficit exists unless child is a red leaf
        c = next(self.children(p))
        if not self._is_red_leaf(c):
          self._fix_deficit(p,c)
      elif n == 2: # removed black node with red child
        if self._is_red_leaf(self.left(p)):
          self._set_black(self.left(p))
        else:
          self._set_black(self.right(p))
            
  def _fix_deficit(self, z, y):
    '''Resolve black deficit at z, where y is the root of z's heavier subtree'''
    if not self._is_red(y): # y is black; will apply Case 1 or 2
      x = self._get_red_child(y)
      if x is not None: # Case 1: y is black and has red child x; do "transfer"
        old_color = self._is_red(z)
        middle = self._restructure(x)
        self._set_color(middle, old_color) # middle gets old color of z
        self._set_black(self.left(middle)) # children become black
        self._set_black(self.right(middle))
      else: # Case 2: y is black, but no red children; recolor as "fusion"
        self._set_red(y)
        if self._is_red(z):
          self._set_black(z) # this resolves the problem
        elif not self.is_root(z):
          self._fix_deficit(self.parent(z), self.sibling(z)) # recur upward
    else: # Case 3: y is red; rotate misaligned 3-node and repeat
      self._rotate(y)
      self._set_black(y)
      self._set_red(z)
      if z == self.right(y):
        self._fix_deficit(z, self.left(z))
      else:
        self._fix_deficit(z, self.right(z))

When an element has been inserted as a leaf in the tree, the `_rebalance_insert`
hook is called, allowing us the opportunity to modify the tree. The new node is
red by default, so we need only look for the special case of the new node being
the root (in which case it should be colored black), or the possibility that we have
a double-red violation because the new node’s parent is also red