# Hudson Arney
## Lab 4 - Scapegoat Trees
### CSC 3310 001 - Dr. Berisha

### Introduction: 
In this lab I hope to




### 1a. How do scapegoat trees compare with Red-Black, AVL, and splay trees? Why might you prefer to use or not use a scapegoat tree?

Scapegoat trees are binary search trees that, unlike Red-Black or AVL trees, store no balance information (such as a color bit or a height) in each node. They achieve a worst-case search time of O(log n) by occasionally rebuilding subtrees that have become too deep or unbalanced [1]. Splay trees also store no per-node balance information, but they only guarantee O(log n) amortized time: a single search can take O(n) in the worst case, and every search restructures the tree [2].

**Some possible advantages of using scapegoat trees are:**

- They are simple and practical to implement, and do not require complex rotations or color changes.
- They have low storage overhead, since each node only contains a key and two pointers.
- They have fast search performance, especially for search-intensive applications, since they do not incur any balancing overhead during searches.
- They can be easily adapted to other tree-like data structures, such as k-d trees or quad trees, by using relaxed rebuilding routines.

**Some possible disadvantages of using scapegoat trees are:**

- They have slower insertion performance than Red-Black or splay trees, especially for sorted sequences, since they may trigger more rebuilds.
- The choice of alpha is a trade-off between operations. A larger alpha tolerates more imbalance, so inserts trigger fewer rebuilds, but the tree can grow deeper, which slows searches and deletes. A smaller alpha does the opposite.
- They have a higher constant factor in the amortized update cost than Red-Black or splay trees, since they need to find the scapegoat node and rebuild its subtree.
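
To make the alpha trade-off concrete, the sketch below tabulates the height limit floor(log_{1/alpha}(n)) that an alpha-height-balanced tree with n nodes must respect. The helper name `height_bound` and the sample values of alpha and n are my own choices for illustration, not part of the lab code.

```python
import math


def height_bound(n, alpha):
    """Height limit floor(log_{1/alpha}(n)) for an alpha-height-balanced tree of n nodes."""
    return math.floor(math.log(n) / math.log(1 / alpha))


# Larger alpha values allow a taller tree, so a search may visit more nodes.
for alpha in (0.55, 0.65, 0.75, 0.85, 0.95):
    bounds = [height_bound(n, alpha) for n in (1_000, 1_000_000)]
    print(f"alpha={alpha:.2f}  bound for n=1e3, 1e6: {bounds}")
```

The bound grows as alpha approaches 1, which is why a large alpha favors inserts (fewer rebuilds) while a small alpha favors searches (shallower tree).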

### 1b. What does it mean for a node to be weight balanced? What does it mean for a tree to be weight-balanced? Draw some examples and calculate their weight balances.


In the context of scapegoat trees, a node is α-weight-balanced if each of its children satisfies the condition **size(child) ≤ α * size(node)**, where size(node) is the number of nodes in the subtree rooted at that node (including the node itself) and α is a constant with 0.5 ≤ α < 1.

A tree is α-weight-balanced if all the nodes in it are α-weight-balanced. Intuitively, a tree is α-weight-balanced if, for any subtree, the sizes of its left and right subtree are approximately equal.
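
As a small worked example of these definitions, the sketch below computes the weight balance of two three-node trees. The helper names (`size`, `weight_balance`, `is_weight_balanced`) and the two example trees are assumptions made for illustration. They are separate from the `ScapegoatTree` class used later in this lab.

```python
from fractions import Fraction


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def size(node):
    """Number of nodes in the subtree rooted at node (0 for an empty subtree)."""
    return 0 if node is None else 1 + size(node.left) + size(node.right)


def weight_balance(node):
    """Fraction of the subtree's nodes that sit in the heavier child."""
    return Fraction(max(size(node.left), size(node.right)), size(node))


def is_weight_balanced(node, alpha):
    """True if every node in the subtree satisfies size(child) <= alpha * size(node)."""
    if node is None:
        return True
    if weight_balance(node) > alpha:
        return False
    return is_weight_balanced(node.left, alpha) and is_weight_balanced(node.right, alpha)


# Perfectly balanced: 2 is the root, 1 and 3 are its children.
balanced = Node(2, Node(1), Node(3))
# Right-leaning chain: 1 -> 2 -> 3.
chain = Node(1, right=Node(2, right=Node(3)))

print(weight_balance(balanced))  # 1/3
print(weight_balance(chain))     # 2/3
print(is_weight_balanced(chain, 0.75), is_weight_balanced(chain, 0.6))  # True False
```

The chain's root holds 2 of its 3 nodes in one child, so it passes for α = 0.75 (2 ≤ 2.25) but fails for α = 0.6 (2 > 1.8).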

### 1c. What is the interpretation of the α parameter?


A

### 1d. What are the conditions for triggering a rebuild of a subtree (during inserts) or the entire tree (during deletes)?


A

### 2. Implement a scapegoat tree that supports insert, size, delete, and contains operations.

In [1]:
import unittest

In [2]:
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

In [3]:
class ScapegoatTree:
    def __init__(self, alpha):
        self.root = None
        self.size = 0
        self.alpha = alpha
        self.rebuild_count = 0

    def size_of_subtree(self, node):
        return 0 if node is None else 1 + self.size_of_subtree(node.left) + self.size_of_subtree(node.right)

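    def insert(self, key):
        # _insert ignores keys that are already present, so only count new keys.
        if self.contains(key):
            return
        self.root = self._insert(self.root, key)
        self.size += 1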

    def _insert(self, node, key):
        if node is None:
            return Node(key)

        if key < node.key:
            node.left = self._insert(node.left, key)
        elif key > node.key:
            node.right = self._insert(node.right, key)

        if self._is_unbalanced(node):
            self.rebuild_count += 1
            return self._rebuild(node)

        return node

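    def _is_unbalanced(self, node):
        """Return True if node violates size(child) <= alpha * size(node)."""
        if node is None:
            return False

        size_left = self.size_of_subtree(node.left)
        size_right = self.size_of_subtree(node.right)
        # size(node) counts the node itself, hence the + 1. Without it, any
        # node with a single child would be reported as unbalanced.
        size_node = size_left + size_right + 1

        return size_left > self.alpha * size_node or size_right > self.alpha * size_node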

    def _rebuild(self, node):
        nodes = self._inorder_traversal(node)
        return self._build_tree_from_sorted_nodes(nodes, 0, len(nodes) - 1)

    def _inorder_traversal(self, node):
        result = []
        self._inorder_traversal_recursive(node, result)
        return result

    def _inorder_traversal_recursive(self, node, result):
        if node is not None:
            self._inorder_traversal_recursive(node.left, result)
            result.append(node)
            self._inorder_traversal_recursive(node.right, result)

    def _build_tree_from_sorted_nodes(self, nodes, start, end):
        if start > end:
            return None

        mid = (start + end) // 2
        root = nodes[mid]
        root.left = self._build_tree_from_sorted_nodes(nodes, start, mid - 1)
        root.right = self._build_tree_from_sorted_nodes(nodes, mid + 1, end)

        return root

    def toList(self):
        return [node.key for node in self._inorder_traversal(self.root)]

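    def __len__(self):
        # The instance attribute `size` set in __init__ would shadow a method
        # named `size`, so the count is exposed through len(tree) instead.
        return self.size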

    def contains(self, key):
        return self._contains(self.root, key)

    def _contains(self, node, key):
        if node is None:
            return False

        if key == node.key:
            return True
        elif key < node.key:
            return self._contains(node.left, key)
        else:
            return self._contains(node.right, key)

    def delete(self, key):
        if self.contains(key):
            self.root = self._delete(self.root, key)
            self.size -= 1

    def _delete(self, node, key):
        if node is None:
            return None

        if key < node.key:
            node.left = self._delete(node.left, key)
        elif key > node.key:
            node.right = self._delete(node.right, key)
        else:
            if node.left is None:
                return node.right
            elif node.right is None:
                return node.left
            else:
                successor = self._get_min(node.right)
                node.key = successor.key
                node.right = self._delete(node.right, successor.key)

        if self._is_unbalanced(node):
            self.rebuild_count += 1
            return self._rebuild(node)

        return node

    def _get_min(self, node):
        while node.left is not None:
            node = node.left
        return node
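
The class above tests the weight balance of every ancestor on each insert, which means recomputing subtree sizes on every update. In the usual textbook formulation of scapegoat trees, an insert only looks for a scapegoat when the new node ends up deeper than log_{1/α}(n). The sketch below illustrates that trigger under my own assumptions: the names `DepthTriggeredTree`, `n`, `flatten`, and `build` are hypothetical, and deletion is left out.

```python
import math


class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def size(node):
    return 0 if node is None else 1 + size(node.left) + size(node.right)


def flatten(node, out):
    """Append the nodes of the subtree to out in sorted (in-order) order."""
    if node is not None:
        flatten(node.left, out)
        out.append(node)
        flatten(node.right, out)


def build(nodes, lo, hi):
    """Rebuild a balanced subtree from the sorted nodes[lo..hi]."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    root = nodes[mid]
    root.left = build(nodes, lo, mid - 1)
    root.right = build(nodes, mid + 1, hi)
    return root


class DepthTriggeredTree:
    def __init__(self, alpha=0.75):
        self.alpha, self.root, self.n, self.rebuild_count = alpha, None, 0, 0

    def insert(self, key):
        path, node = [], self.root          # path holds the ancestors, root first
        while node is not None:
            if key == node.key:
                return False                # duplicate: nothing to do
            path.append(node)
            node = node.left if key < node.key else node.right
        new = Node(key)
        if not path:
            self.root = new
        elif key < path[-1].key:
            path[-1].left = new
        else:
            path[-1].right = new
        self.n += 1
        # len(path) is the depth of the new node; only search when it is too deep.
        if len(path) > math.log(self.n, 1 / self.alpha):
            self._rebuild_scapegoat(path, new)
        return True

    def _rebuild_scapegoat(self, path, child):
        child_size = 1
        for i in range(len(path) - 1, -1, -1):   # walk from the parent up to the root
            parent = path[i]
            sibling = parent.right if parent.left is child else parent.left
            parent_size = 1 + child_size + size(sibling)
            if child_size > self.alpha * parent_size:   # parent is the scapegoat
                nodes = []
                flatten(parent, nodes)
                rebuilt = build(nodes, 0, len(nodes) - 1)
                if i == 0:
                    self.root = rebuilt
                elif path[i - 1].left is parent:
                    path[i - 1].left = rebuilt
                else:
                    path[i - 1].right = rebuilt
                self.rebuild_count += 1
                return
            child, child_size = parent, parent_size


tree = DepthTriggeredTree(alpha=0.75)
for key in range(1, 101):                    # sorted input, the worst case for a plain BST
    tree.insert(key)
print(tree.n, tree.rebuild_count)
```

Because sizes are only computed while walking up from a node that is already too deep, most inserts in this variant do no balance work at all.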


### 3. Write unit tests that involve the insert, remove, size, contains, and toList() operations.

In [4]:
class TestScapegoatTree(unittest.TestCase):
    def test_insert_contains_size(self):
        tree = ScapegoatTree(0.75)
        keys = [3, 1, 5, 2, 4]

        for key in keys:
            tree.insert(key)

        self.assertEqual(tree.size, len(keys))
        for key in keys:
            self.assertTrue(tree.contains(key))

    def test_delete_contains_size(self):
        tree = ScapegoatTree(0.75)
        keys = [3, 1, 5, 2, 4]

        for key in keys:
            tree.insert(key)

        tree.delete(2)
        tree.delete(5)

        self.assertEqual(tree.size, len(keys) - 2)
        self.assertFalse(tree.contains(2))
        self.assertFalse(tree.contains(5))

    def test_toList(self):
        tree = ScapegoatTree(0.75)
        keys = [3, 1, 5, 2, 4]

        for key in keys:
            tree.insert(key)

        sorted_keys = sorted(keys)
        self.assertEqual(tree.toList(), sorted_keys)

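    def test_rebuild_count(self):
        tree = ScapegoatTree(0.75)

        # Sorted input builds a right-leaning chain, which must eventually
        # violate the balance condition and force at least one rebuild.
        for key in range(1, 11):
            tree.insert(key)
        rebuilds_after_inserts = tree.rebuild_count
        self.assertGreater(rebuilds_after_inserts, 0)

        # A delete may or may not trigger a rebuild, but the counter must
        # never decrease and the remaining keys must stay in sorted order.
        tree.delete(2)
        self.assertGreaterEqual(tree.rebuild_count, rebuilds_after_inserts)
        self.assertEqual(tree.toList(), [1] + list(range(3, 11)))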

    def test_empty_tree(self):
        tree = ScapegoatTree(0.75)

        self.assertEqual(tree.size, 0)
        self.assertFalse(tree.contains(1))
        self.assertEqual(tree.toList(), [])


In [5]:
if __name__ == '__main__':
    # Inside Jupyter, sys.argv holds the kernel connection file, which unittest
    # would try to load as a test name. Pass a dummy argv and skip sys.exit().
    unittest.main(argv=['first-arg-is-ignored'], exit=False)


### 4. Benchmark the insert, delete, and contains operations of your implementation on data sets of different sizes. Create tables and plots that include both run times and the number of times the rebuild operation was performed.
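
One way this benchmark could be set up is sketched below. It times each operation over random keys for several data set sizes and records how many rebuilds happened during the inserts and during the deletes. The names `benchmark`, `time_calls`, and `SetBackedStandIn` are hypothetical. The stand-in class only mimics the tree's interface so the harness runs on its own, and the sizes are arbitrary.

```python
import random
import time


class SetBackedStandIn:
    """Hypothetical placeholder with the same interface as the tree under test."""

    def __init__(self):
        self._keys = set()
        self.rebuild_count = 0  # a set never rebuilds, so this stays 0

    def insert(self, key):
        self._keys.add(key)

    def contains(self, key):
        return key in self._keys

    def delete(self, key):
        self._keys.discard(key)


def time_calls(operation, keys):
    """Return the wall-clock seconds needed to call operation(key) for every key."""
    start = time.perf_counter()
    for key in keys:
        operation(key)
    return time.perf_counter() - start


def benchmark(make_tree, sizes, seed=0):
    """Time insert, contains, and delete on random keys for each data set size."""
    rows = []
    for n in sizes:
        rng = random.Random(seed)
        keys = rng.sample(range(10 * n), n)  # n distinct keys
        tree = make_tree()
        insert_s = time_calls(tree.insert, keys)
        rebuilds_insert = tree.rebuild_count
        rng.shuffle(keys)                    # query and delete in a different order
        contains_s = time_calls(tree.contains, keys)
        delete_s = time_calls(tree.delete, keys)
        rows.append({
            "n": n,
            "insert_s": insert_s,
            "contains_s": contains_s,
            "delete_s": delete_s,
            "rebuilds_insert": rebuilds_insert,
            "rebuilds_delete": tree.rebuild_count - rebuilds_insert,
        })
    return rows


rows = benchmark(SetBackedStandIn, [100, 1_000, 10_000])
for row in rows:
    print(row)
```

To measure the lab's implementation, the stand-in would be replaced by a factory such as `lambda: ScapegoatTree(0.75)`. The resulting rows can then be turned into the required tables and plots.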

### 5. Analyze and interpret the benchmark results to determine if the run time of your implementation is consistent with the theoretical analysis.
