# COMP0005 - GROUP COURSEWORK
# Experimental Evaluation of Search Data Structures and Algorithms

The cell below defines **AbstractSearchInterface**, an interface supporting basic insert/search operations. You will need to implement it three times, once for each of your three chosen search data structures from: (1) *2-3 Tree*; (2) *AVL Tree*; (3) *LLRB BST*; (4) *B-Tree*; and (5) *Scapegoat Tree*. <br><br>**Do NOT modify the next cell** - use the dedicated cells further below for your implementation instead. <br>

In [16]:
# DO NOT MODIFY THIS CELL

from abc import ABC, abstractmethod  

class AbstractSearchInterface(ABC):
    '''
    Abstract class to support search/insert operations (plus underlying data structure)
    
    '''
        
    @abstractmethod
    def insertElement(self, element):     
        '''
        Insert an element in a search tree
            Parameters:
                    element: string to be inserted in the search tree (string)

            Returns:
                    "True" after successful insertion, "False" if element is already present (bool)
        '''
        
        pass 
    

    @abstractmethod
    def searchElement(self, element):
        '''
        Search for an element in a search tree
            Parameters:
                    element: string to be searched in the search tree (string)

            Returns:
                    "True" if element is found, "False" otherwise (bool)
        '''

        pass

Use the cell below to define any auxiliary data structures and Python functions you may need. Leave the implementation of the main API to the subsequent code cells.

In [17]:
# ADD AUXILIARY DATA STRUCTURE DEFINITIONS AND HELPER CODE HERE

class LLRBNode:
    
    def __init__(self, key, is_red=True):
        self.key = key
        self.is_red = is_red  # new nodes are attached with a red link by default
        self.left = None
        self.right = None

Use the cell below to implement the requested API by means of a **2-3 Tree** (if among your chosen data structures).

In [18]:
class TwoThreeTree(AbstractSearchInterface):
        
    def insertElement(self, element):
        inserted = False
        # ADD YOUR CODE HERE
      
        
        return inserted
    
    

    def searchElement(self, element):     
        found = False
        # ADD YOUR CODE HERE

        
        return found    

Use the cell below to implement the requested API by means of an **AVL Tree** (if among your chosen data structures).

In [19]:
class AVLTree(AbstractSearchInterface):
        
    def insertElement(self, element):
        inserted = False
        # ADD YOUR CODE HERE
      
        
        return inserted
    
    

    def searchElement(self, element):     
        found = False
        # ADD YOUR CODE HERE

        
        return found  

Use the cell below to implement the requested API by means of an **LLRB BST** (if among your chosen data structures).

In [20]:


class LLRBBST(AbstractSearchInterface):
        
    def __init__(self):
        self.root = None
        
    
    def _is_red(self, n):
        if not n:
            return False
        return n.is_red
    
    
    def _rotate_left(self, n):
        x = n.right
        n.right = x.left
        x.left = n
        x.is_red = n.is_red
        n.is_red = True
        return x
    
    
    def _rotate_right(self, n):
        x = n.left
        n.left = x.right
        x.right = n
        x.is_red = n.is_red
        n.is_red = True
        return x
        
        
    def _flip_colours(self, n):
        n.is_red = not n.is_red
        if n.left:
            n.left.is_red = not n.left.is_red
        if n.right:
            n.right.is_red = not n.right.is_red
        
        
    def insertHelper(self, n, element):
        if n is None:
            return LLRBNode(element), True
        
        inserted = False
        
        if element == n.key:
            return n, False
        elif element < n.key:
            n.left, inserted = self.insertHelper(n.left, element)
        else:
            n.right, inserted = self.insertHelper(n.right, element)
            
            
        if self._is_red(n.right) and not self._is_red(n.left):
            n = self._rotate_left(n)
        if self._is_red(n.left) and self._is_red(n.left.left):
            n = self._rotate_right(n)
        if self._is_red(n.left) and self._is_red(n.right):
            # _flip_colours mutates n in place and returns None, so n must not be reassigned
            self._flip_colours(n)
            
        return n, inserted
        
    def insertElement(self, element):
        self.root, inserted = self.insertHelper(self.root, element)
        if self.root:
            self.root.is_red = False
        return inserted


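    def searchElement(self, element):
        '''
        Search for an element in a search tree
            Parameters:
                    element: string to be searched in the search tree (string)

            Returns:
                    "True" if element is found, "False" otherwise (bool)
        '''
        # Standard iterative BST lookup; node colours play no role in search.
        n = self.root
        while n is not None:
            if element == n.key:
                return True
            n = n.left if element < n.key else n.right
        return False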
    
    
if __name__ == "__main__":
    tree = LLRBBST()
    tree.insertElement("hello")
    tree.insertElement("helloo")
    print(tree.root.left.key)

hello
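
The rotations and colour flips in the insertion code above exist to maintain the usual left-leaning red-black invariants: the root is black, red links lean left, no node has two red links in a row, and every path from the root to an empty link crosses the same number of black links. As a minimal sketch of how these could be checked in a test, assuming nodes expose `key`, `is_red`, `left` and `right` as `LLRBNode` does, the hypothetical helper `check_llrb` below returns the black height of a valid tree or `-1` on a violation. `SketchNode` is a stand-in defined only to keep the example self-contained, and key ordering is deliberately not checked here.

```python
class SketchNode:
    """Hypothetical stand-in with the same fields as LLRBNode."""

    def __init__(self, key, is_red=True, left=None, right=None):
        self.key = key
        self.is_red = is_red
        self.left = left
        self.right = right


def _red(n):
    # Empty links count as black.
    return n is not None and n.is_red


def check_llrb(n, is_root=True):
    """Return the black height of the subtree rooted at n,
    or -1 if a colour invariant is violated anywhere in it."""
    if n is None:
        return 0
    if is_root and n.is_red:
        return -1  # the root must be black
    if _red(n.right):
        return -1  # red links must lean left
    if n.is_red and _red(n.left):
        return -1  # no two red links in a row
    left_height = check_llrb(n.left, is_root=False)
    right_height = check_llrb(n.right, is_root=False)
    if left_height == -1 or left_height != right_height:
        return -1  # violation below, or unequal black heights
    return left_height + (0 if n.is_red else 1)


# Black "b" with a red left child "a": a valid two-node tree.
valid = SketchNode("b", is_red=False, left=SketchNode("a"))
# Same keys, but the red link leans right: invalid.
invalid = SketchNode("a", is_red=False, right=SketchNode("b"))

print(check_llrb(valid) != -1)    # True
print(check_llrb(invalid) != -1)  # False
```

Calling such a checker on `tree.root` after each insertion in a test loop is one way to catch rebalancing bugs that only appear once the tree holds three or more keys.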


Use the cell below to implement the requested API by means of a **B-Tree** (if among your chosen data structures).

In [21]:
class BTree(AbstractSearchInterface):
        
    def insertElement(self, element):
        inserted = False
        # ADD YOUR CODE HERE
      
        
        return inserted
    
    

    def searchElement(self, element):     
        found = False
        # ADD YOUR CODE HERE

        
        return found

Use the cell below to implement the requested API by means of a **Scapegoat Tree** (if among your chosen data structures).

In [22]:
class ScapegoatTree(AbstractSearchInterface):
        
    def insertElement(self, element):
        inserted = False
        # ADD YOUR CODE HERE
      
        
        return inserted
    
    

    def searchElement(self, element):     
        found = False
        # ADD YOUR CODE HERE

        
        return found 

Use the cell below to implement the **synthetic data generator** needed by your experimental framework (be mindful of code readability and reusability).

In [None]:
import string
import random

class TestDataGenerator():
    
    def __init__(self, num_samples=1000):
        self.num_samples = num_samples
        self.letters = string.ascii_letters
        self.lowercase = string.ascii_lowercase

    #generates random strings of fixed length, can be uppercase or lowercase
    def gen_rand(self, strlen):
        result = []
        for _ in range(self.num_samples):
            result.append(''.join(random.choices(self.letters, k=strlen)))
        return result
    
    # generates up to num_samples lowercase strings of fixed length in sorted order
    # (reverse order if reversed=True), e.g. [aaa, aab, aac, ..., aaz, aba, abb, ...]
    def gen_sorted(self, strlen, reversed=False):
        # only lowercase letters are used below, so count combinations over them
        if len(self.lowercase)**strlen < self.num_samples:
            print("Not enough combinations. Use a larger string length")
            return

        characters = self.lowercase if not reversed else self.lowercase[::-1]
        result = []

        def generate_combinations(prefix, length):
            if length == 0:
                result.append(prefix)
                return
            for char in characters:
                # stop as soon as enough samples have been collected
                if len(result) >= self.num_samples:
                    return
                generate_combinations(prefix + char, length - 1)

        generate_combinations('', strlen)
        return result
            
    
    def gen_high_dup(self):
        pass
    
    def gen_rand_len(self):
        pass
    
    def gen_increasing_len(self):
        pass
    
    def gen_huge_len(self):
        pass
    
test_code = TestDataGenerator(100)
print(test_code.gen_sorted(5, True))
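
The `gen_high_dup` method above is still a stub. One possible approach, shown here only as a sketch under stated assumptions, is to build a small pool of distinct random strings and sample from it with replacement, so that most insertions hit a key already present. The function name `gen_high_dup_sketch` and its parameters `num_distinct` and `seed` are hypothetical and not part of the coursework template.

```python
import random
import string


def gen_high_dup_sketch(num_samples, strlen, num_distinct, seed=None):
    """Return num_samples strings of length strlen drawn from a pool of
    exactly num_distinct distinct values (hypothetical helper)."""
    alphabet = string.ascii_letters
    if num_distinct < 1 or num_distinct > len(alphabet) ** strlen:
        raise ValueError("num_distinct must be between 1 and the number of possible strings")

    rng = random.Random(seed)  # a private generator keeps runs reproducible
    pool = set()
    while len(pool) < num_distinct:
        pool.add(''.join(rng.choices(alphabet, k=strlen)))

    # Sort so the pool order does not depend on string hash randomisation.
    ordered_pool = sorted(pool)
    return [rng.choice(ordered_pool) for _ in range(num_samples)]


data = gen_high_dup_sketch(num_samples=1000, strlen=5, num_distinct=10, seed=42)
print(len(data), len(set(data)))
```

Keeping `num_distinct` small relative to `num_samples` stresses the duplicate-handling path of `insertElement`, which should return `False` for most calls on such data.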




Use the cell below to implement the requested **experimental framework** (be mindful of code readability and reusability).

In [24]:
import timeit, random, string
import matplotlib.pyplot as plt

class ExperimentalFramework():
    '''
    A class to represent an experimental framework.

    Attributes
    ----------
    results : dict
        A dictionary to store timing results. Each key could be the data size or a label,
        and each value is the timing measurement.

    Methods
    -------
    run_experiment(data_structure, data)
        Measures insertion and/or search time on a given data structure with given data.
    plot_results()
        Visualizes the results with matplotlib.
    '''
            
    def __init__(self):
        # You can store your collected timing results here.
        self.results = {}

    def run_experiment(self, data_structure, data, label="Experiment1"):
        pass

    def plot_results(self):
        pass
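
As a rough illustration of what `run_experiment` might measure, the sketch below times one pass of insertions followed by one pass of searches using `timeit.default_timer`. It is an assumption about one workable design, not the required implementation: `time_operations` and `SetBackedSketch` are hypothetical names, and the set-backed class merely stands in for any structure that provides `insertElement` and `searchElement`.

```python
import timeit


class SetBackedSketch:
    """Hypothetical stand-in exposing the same two methods as the interface."""

    def __init__(self):
        self._items = set()

    def insertElement(self, element):
        if element in self._items:
            return False
        self._items.add(element)
        return True

    def searchElement(self, element):
        return element in self._items


def time_operations(data_structure, data):
    """Return (insert_seconds, search_seconds) for one pass over data."""
    start = timeit.default_timer()
    for item in data:
        data_structure.insertElement(item)
    insert_seconds = timeit.default_timer() - start

    start = timeit.default_timer()
    for item in data:
        data_structure.searchElement(item)
    search_seconds = timeit.default_timer() - start
    return insert_seconds, search_seconds


# Collect timings keyed by input size, mirroring the `results` dict idea.
results = {}
for size in (1_000, 2_000, 4_000):
    data = [str(i) for i in range(size)]
    results[size] = time_operations(SetBackedSketch(), data)

for size, (ins, srch) in results.items():
    print(f"n={size}: insert {ins:.6f}s, search {srch:.6f}s")
```

A single pass is noisy, so repeating each measurement several times on a fresh structure and keeping the minimum or median is usually a more stable choice.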
    

Use the cell below to illustrate the Python code you used to **fully evaluate** your three chosen search data structures and algorithms. The code below should illustrate, for example, how you made use of the **TestDataGenerator** class to generate test data of various sizes and properties; how you instantiated the **ExperimentalFramework** class to evaluate each data structure using such data, collect information about their execution time, plot results, etc. Any results you illustrate in the companion PDF report should have been generated using the code below.

In [25]:
# ADD YOUR TEST CODE HERE 



