# Instructions
* You must work on this assignment individually. We check for similarities and find collusion cases every semester. You've been warned!
* This assignment contributes 20% towards your final mark in FIT5211.
* The subjects are sorting, trees and dynamic programming.
* The exercises are given in roughly increasing order of difficulty. Obtaining a 100% mark may be very hard, depending on your background, and will probably take many hours. The assignment is written such that everyone can correctly complete the first exercises, but it is likely that the last ones will only be successfully completed by a few.
* You are expected to comment your code. Marks will be removed if your code is insufficiently documented.
* You may create auxiliary/helper functions for each question, as long as you use the pre-defined functions, if any are given.
* Unit tests have only been provided for Exercise 1. You will need to write your own tests when they are not provided. 
* The expected deliverable is this Jupyter Notebook, completed with your answers.
* For questions on this assignment, please use the [Moodle forum](https://moodle.vle.monash.edu/mod/forum/view.php?id=5193080).
* The deadline is October 19, 23:55, submission via [Moodle](https://moodle.vle.monash.edu/course/view.php?id=45761&section=5#5). If this does not work, and only in this case, send your Notebook to pierre.lebodic@monash.edu.
* The late penalty is 10 marks (deducted from your original mark) per late day.
* Have fun!

# Exercise 1: Sorting Lists of Pairs (45 marks)

(Note that in this exercise, questions may be attempted without having completed all previous questions)

Consider the abstract Python class below:

In [1]:
class Comparison:
    def __init__(self):
        pass
    
    #returns True if the two objects are comparable,
    #False otherwise
    def areComparable(self, other):
        raise Exception("NotImplementedException")
        
    #returns True if the two objects are equal,
    #False otherwise
    def __eq__(self, other):
        raise Exception("NotImplementedException")
   
    #returns True if self > other,
    #False otherwise
    def __gt__(self, other):
        raise Exception("NotImplementedException")
        
    #returns True if self < other,
    #False otherwise
    def __lt__(self, other):
        raise Exception("NotImplementedException")    
        
    def __ne__(self, other):
        return not self.__eq__(other)
                
    def __ge__(self, other):
        return self.__eq__(other) or self.__gt__(other)

    def __le__(self, other):
        return self.__eq__(other) or self.__lt__(other)
        
    def compare(self, other):
        if self.areComparable(other) is False:
            return None
        elif self == other:
            return 0
        elif self < other:
            return -1
        elif self > other:
            return 1
        else:
            assert False, "Inconsistent operation definitions"

The Comparison class provides a way to model items that are not always comparable. For instance, the pair of integers $(5, 10)$ is greater than $(4, 8)$, but it is not comparable to $(6, 5)$, because $5 < 6$ and $10 > 5$.
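
To make this partial order concrete, here is a minimal sketch on plain tuples rather than on the classes of this notebook. The helper name `dominance_compare` is hypothetical; it only mirrors the return convention of `compare` above (`None`, `0`, `-1` or `1`).

```python
def dominance_compare(p, q):
    """Compare two integer pairs component-wise.

    Returns 0 if p == q, 1 if p > q, -1 if p < q,
    and None if the two pairs are not comparable.
    """
    (a, b), (c, d) = p, q
    if a == c and b == d:
        return 0
    if a >= c and b >= d:
        return 1
    if a <= c and b <= d:
        return -1
    return None  # one coordinate is larger, the other one is smaller


print(dominance_compare((5, 10), (4, 8)))  # 1: (5, 10) is greater
print(dominance_compare((5, 10), (6, 5)))  # None: not comparable
```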

In this exercise, we will look into different ways to sort lists of pairs. We will suppose that the pairs in a list are all different.  

## Question 1.1 (10 marks)

The rules of comparison between two Pairs $(a, b)$ and $(c, d)$ are:
* $(a, b) == (c, d)$ if and only if $a == c$ and $b == d$,
* $(a, b) > (c, d)$ if and only if ($a > c$ and $b \geq d$) or ($a \geq c$ and $b > d$),
* $(a, b) < (c, d)$ if and only if ($a < c$ and $b \leq d$) or ($a \leq c$ and $b < d$).

We say that $(a, b)$ and $(c, d)$ are comparable if 
* $(a, b) == (c, d)$, or
* $(a, b) > (c, d)$, or
* $(a, b) < (c, d)$.

We ask that you implement the rules above in the class called Pair below, by completing the functions that have a comment "#TODO" in the body. Note that the class Pair inherits from Comparison.

In [2]:
class Pair(Comparison):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    
    def __str__(self):
        return "({},{})".format(self.x, self.y)
        
    def areComparable(self, other):
        if (self.x >= other.x and self.y >= other.y):
            return True
        
        if (self.x<= other.x and self.y<= other.y):
            return True
        
        return False
        

        
        
    def __eq__(self, other):
        
        return (self.x == other.x and self.y == other.y)
    
    
    def __gt__(self, other):
  
        if (self.x >= other.x and self.y > other.y) or (self.x > other.x and self.y>= other.y):
            return True
        else:
            return False
        

        
    def __lt__(self, other):
        
        if (self.x <= other.x and self.y < other.y) or (self.x < other.x and self.y<= other.y):
            return True
        else:
            return False
        

We provide a test class below. You don't need to edit it, but the class Pair you write needs to pass these tests! 

In [3]:
import unittest

class TestPair(unittest.TestCase):
    def setUp(self):
        self.v00 = Pair(0,0)
        self.v01 = Pair(0,1)
        self.v10 = Pair(1, 0)
        self.v11 = Pair(1, 1)
        self.v21 = Pair(2, 1)
        self.v31 = Pair(3, 1)
        self.v23 = Pair(2, 3)
        self.v23other = Pair(2, 3)
        
    def test_areComparable(self):
        self.assertTrue(self.v00.areComparable(self.v01))
        self.assertTrue(self.v01.areComparable(self.v00))
        
        self.assertTrue(self.v11.areComparable(self.v00))
        self.assertTrue(self.v00.areComparable(self.v11))
        
        self.assertTrue(self.v21.areComparable(self.v23))
        self.assertTrue(self.v23.areComparable(self.v21))
        
        self.assertTrue(self.v23.areComparable(self.v23))
        
        self.assertTrue(self.v23.areComparable(self.v23other))
        self.assertTrue(self.v23other.areComparable(self.v23))
        
        self.assertFalse(self.v01.areComparable(self.v10))
        self.assertFalse(self.v10.areComparable(self.v01))
        
        self.assertFalse(self.v31.areComparable(self.v23))
        self.assertFalse(self.v23.areComparable(self.v31))
        
    def test_eq(self):
        self.assertTrue(self.v00 == self.v00)
        self.assertTrue(self.v21 == self.v21)
        self.assertTrue(self.v23 == self.v23)
        self.assertTrue(self.v23 == self.v23other)
        
        self.assertFalse(self.v00 == self.v11)
        self.assertFalse(self.v21 == self.v11)
        self.assertFalse(self.v21 == self.v23)
        
    def test_ne(self):
        self.assertFalse(self.v00 != self.v00)
        self.assertFalse(self.v21 != self.v21)
        self.assertFalse(self.v23 != self.v23)
        self.assertFalse(self.v23 != self.v23other)
        
        self.assertTrue(self.v00 != self.v11)
        self.assertTrue(self.v21 != self.v11)
        self.assertTrue(self.v21 != self.v23)
        
    def test_gt(self):
        self.assertTrue(self.v01 > self.v00)
        self.assertTrue(self.v10 > self.v00)
        self.assertTrue(self.v31 > self.v21)
        
        self.assertFalse(self.v00 > self.v01)
        self.assertFalse(self.v00 > self.v10)
        self.assertFalse(self.v21 > self.v31)
        
        self.assertFalse(self.v10 > self.v01)
        self.assertFalse(self.v01 > self.v10)
        self.assertFalse(self.v31 > self.v23)
        self.assertFalse(self.v23 > self.v31)
        
    def test_lt(self):
        self.assertFalse(self.v01 < self.v00)
        self.assertFalse(self.v10 < self.v00)
        self.assertFalse(self.v31 < self.v21)
        
        self.assertTrue(self.v00 < self.v01)
        self.assertTrue(self.v00 < self.v10)
        self.assertTrue(self.v21 < self.v31)
        
        self.assertFalse(self.v10 < self.v01)
        self.assertFalse(self.v01 < self.v10)
        self.assertFalse(self.v31 < self.v23)
        self.assertFalse(self.v23 < self.v31)

In [4]:
test = TestPair()
suite = unittest.TestLoader().loadTestsFromModule(test)
unittest.TextTestRunner().run(suite)

.....
----------------------------------------------------------------------
Ran 5 tests in 0.005s

OK


<unittest.runner.TextTestResult run=5 errors=0 failures=0>

In the following questions, we suppose that we have a set of Pairs, and that not every two pairs in that set are comparable.

## Question 1.2 (10 marks)

Given a list $l$ of Pairs in no particular order, use a sorting algorithm similar to *selection sort* to sort $l$ such that, at the end of the algorithm, for every two pairs $l[i]=(a, b)$ and $l[j]=(c, d)$ at indices $i$ and $j$ in $l$, respectively, with $i < j$, we have:
* either $(a, b) \leq (c, d)$,
* or $(a, b)$ and $(c, d)$ are not comparable.
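
The two conditions above amount to saying that no pair is strictly smaller than a pair placed before it. The following sketch checks this property on plain `(a, b)` tuples; the names `is_smaller` and `is_well_ordered` are hypothetical and not part of the assignment.

```python
def is_smaller(p, q):
    """True if pair p is strictly smaller than pair q (component-wise)."""
    return p != q and p[0] <= q[0] and p[1] <= q[1]


def is_well_ordered(l):
    """Check that no later pair is strictly smaller than an earlier one."""
    return not any(
        is_smaller(l[j], l[i])
        for i in range(len(l))
        for j in range(i + 1, len(l))
    )


print(is_well_ordered([(5, 8), (12, 5), (5, 10), (6, 10)]))  # True
print(is_well_ordered([(5, 10), (5, 8)]))                    # False
```

Note that several different orders of the same list can satisfy the property, because incomparable pairs may appear in either order.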

In [5]:
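def pairSort(l):
    # Selection sort adapted to a partial order: on each pass, move a maximal
    # element of the unsorted prefix l[0..change_pos] to position change_pos.
    n = len(l)
    
    for change_pos in range(n-1, 0, -1):
        biggest_pos = 0
        
        for check_pos in range(1, change_pos+1):
            # only replace the candidate by a strictly greater (hence comparable) pair
            if l[check_pos] > l[biggest_pos]:
                biggest_pos = check_pos
        
        # no pair to the left of change_pos is greater than the one placed there
        l[biggest_pos], l[change_pos] = l[change_pos], l[biggest_pos]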

Again, we provide a test class below. You don't need to edit it, but the function pairSort you wrote needs to pass these tests! 

In [6]:
import unittest
import random

class TestPairSort(unittest.TestCase):
    def setUp(self):
        self.PairClass = Pair
        self.sortAlgo = pairSort
    
    def test1(self):
        #in this test we suppose x=0 for all entries
        #this means that the algorithm should do a regular sort
        #we also suppose everything is already ordered
        l = [self.PairClass(0, 1), self.PairClass(0, 3), self.PairClass(0, 4), self.PairClass(0, 6), 
            self.PairClass(0, 8), self.PairClass(0, 15), ]
        self.sortAlgo(l)
        #for item in l: print(item)
        self.checkorder(l)
            
    def test2(self):
        #in this test we suppose x=0 for all entries
        #this means that the sort algorithms should do a "regular" sort
        l = [self.PairClass(0, 8), self.PairClass(0, 4), self.PairClass(0, 3), self.PairClass(0, 9), 
            self.PairClass(0, 10), self.PairClass(0, 5), ]
        self.sortAlgo(l)
        #for item in l: print(item)
        self.checkorder(l)

            
    def test3(self):
        #in this test we suppose x and y are not fixed
        #we also suppose that everything is in a good order
        l = [self.PairClass(5, 8), self.PairClass(5, 10), self.PairClass(6, 10), self.PairClass(7, 12), 
            self.PairClass(5, 12), self.PairClass(9, 12), ]
        self.sortAlgo(l)
        #for item in l: print(item)
        self.checkorder(l)
        
    def test4(self):
        #in this test we suppose x and y are not fixed
        #we suppose there is no two pairs that are comparable
        l = [self.PairClass(8, 8), self.PairClass(5, 10), self.PairClass(10, 8), self.PairClass(12, 5)]
        self.sortAlgo(l)
        #for item in l: print(item)
        self.checkorder(l)
        
    def test5(self):
        #in this test we suppose x and y are not fixed
        #the input is in arbitrary order
        #we only test one case
        l = [self.PairClass(5, 8), self.PairClass(10, 10), self.PairClass(12, 5), self.PairClass(9, 5),
             self.PairClass(5, 10), self.PairClass(7, 2), self.PairClass(6, 10) ]
        self.sortAlgo(l)
        #for item in l: print(item)
        self.checkorder(l)
    
    def test6(self):
        #in this test we suppose x and y are not fixed
        #the input is in arbitrary order
        #we generate many cases
        for size in range(0, 100):
            #create a list
            l = []
            xs = random.sample(range(0, 100), size)
            xy = random.sample(range(0, 100), size)
            for i in range(0, size):
                l.append(self.PairClass(xs[i], xy[i]))
            self.sortAlgo(l)
            #for item in l: print(item)
            self.checkorder(l)
        
    def checkorder(self, l):
        #check every pair of indices i < j, including the last element of l
        for i in range(0, len(l)):
            for j in range(i+1, len(l)):
                if l[i].areComparable(l[j]):
                    self.assertTrue(l[i] <= l[j], "{} and {} are not well ordered".format(l[i], l[j]) )

In [7]:
test = TestPairSort()
suite = unittest.TestLoader().loadTestsFromModule(test)
unittest.TextTestRunner().run(suite)

......
----------------------------------------------------------------------
Ran 6 tests in 0.379s

OK


<unittest.runner.TextTestResult run=6 errors=0 failures=0>

## Question 1.3 (5 marks)

Suppose we have a list of Pairs of integers.
We want to implement a comparison mechanism between two pairs of integers using a function **pairKey**, which takes a pair as input and outputs a single number. The function pairKey should be such that for two pairs $(a, b)$ and $(c, d)$,
* $(a,b) > (c,d)$ implies $pairKey((a,b)) > pairKey((c,d))$,
* $(a,b) < (c,d)$ implies $pairKey((a,b)) < pairKey((c,d))$,
* $(a,b) == (c,d)$ implies $pairKey((a,b)) == pairKey((c,d))$.

In [8]:
def pairKey(pair):
    # The key is the sum of the two coordinates. If one pair is greater than
    # another, one coordinate is strictly larger and the other is not smaller,
    # so its sum is strictly larger. Equal pairs have equal sums.
    # For incomparable pairs, the keys may compare in any way.
    return pair.x + pair.y

If you have defined the function pairKey correctly, we should now be able to use the built-in sort function of Python to sort our Pairs:

In [9]:
def pairSortWithKey(l):
    l.sort(key = pairKey)

(For more information on what the above does, please refer to https://docs.python.org/3/howto/sorting.html)

In [10]:
class TestKeyPairSort(TestPairSort):
    def setUp(self):
        self.PairClass = Pair
        self.sortAlgo = pairSortWithKey

In [11]:
test = TestKeyPairSort()
suite = unittest.TestLoader().loadTestsFromModule(test)
unittest.TextTestRunner().run(suite)

......
----------------------------------------------------------------------
Ran 6 tests in 0.290s

OK


<unittest.runner.TextTestResult run=6 errors=0 failures=0>

## Question 1.4 (5 marks)

Prove that the pairKey function you have defined above provides a guarantee that, if used as key for Python's sort, then the list will be sorted as stated in Question 1.2.

Question 1.2 requires that, for any two pairs $l[i]=(a, b)$ and $l[j]=(c, d)$ at indices $i$ and $j$ in $l$ with $i < j$:

* either $(a, b) \leq (c, d)$,
* or $(a, b)$ and $(c, d)$ are not comparable.

Two comparable pairs are either equal, or one of them is smaller than the other. The two rules are therefore equivalent to a single one: there must be no $i < j$ such that $(c, d) < (a, b)$.

Suppose that $(c, d) < (a, b)$. By definition, there are two possible cases:

1. $c \leq a$ and $d < b$, which gives $pairKey((c,d)) = c + d < a + b = pairKey((a,b))$;
2. $c < a$ and $d \leq b$, which gives $pairKey((c,d)) = c + d < a + b = pairKey((a,b))$.

Therefore, whenever a pair is smaller than another, pairKey returns a strictly smaller value for it. Python's sort arranges the list in non-decreasing order of key, so an element with a strictly smaller key is never placed after an element with a larger key. Hence $(c, d)$ cannot end up at a larger index than $(a, b)$, which is exactly the condition required by Question 1.2.
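
As a sanity check of the argument above (not a replacement for the proof), the implication can be verified exhaustively on a small grid. This is a sketch on plain `(a, b)` tuples; `pair_key`, `is_smaller` and the bound `U = 6` are choices made for this illustration only.

```python
from itertools import product


def pair_key(pair):
    # same idea as pairKey above, but on plain (a, b) tuples
    return pair[0] + pair[1]


def is_smaller(p, q):
    """True if pair p is strictly smaller than pair q (component-wise)."""
    return p != q and p[0] <= q[0] and p[1] <= q[1]


U = 6
pairs = list(product(range(U + 1), repeat=2))

# collect every (p, q) with p < q whose keys are NOT strictly increasing
violations = [
    (p, q) for p in pairs for q in pairs
    if is_smaller(p, q) and not pair_key(p) < pair_key(q)
]
print(len(violations))  # 0
```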

## Question 1.5 (5 marks)

(We have *not* covered the concept of stability in this unit. If you are not yet familiar with this concept, learning about it is part of the question. See e.g. [here](https://en.wikipedia.org/wiki/Sorting_algorithm#Stability) to start.)

In this question we ask you to use Python's built-in sort. Use the fact that Python's sort is stable to provide a simple solution to sort a list of pairs as stated in Question 1.2. Use unit testing as in the previous questions to test your solution. Both your solution and the unit testing will be assessed. (no need to rewrite *new* test cases; use the old ones).

In [12]:
def pairSortUsingStability(l):
    l.sort(key=(lambda x: (x.x , x.y)))  
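
Stability means that elements with equal keys keep their relative order. One way this could be exploited, sketched here on plain `(a, b)` tuples with a hypothetical function name, is to sort twice: first on the second coordinate, then on the first one. The result is the same lexicographic order as the one obtained with a tuple key.

```python
def sort_pairs_two_passes(pairs):
    """Sort (a, b) tuples in place with two stable passes."""
    pairs.sort(key=lambda p: p[1])  # secondary key first
    pairs.sort(key=lambda p: p[0])  # primary key last; ties keep the order of b


data = [(5, 8), (10, 10), (12, 5), (9, 5), (5, 10), (7, 2), (6, 10)]
sort_pairs_two_passes(data)
print(data)
# [(5, 8), (5, 10), (6, 10), (7, 2), (9, 5), (10, 10), (12, 5)]
```

A pair that is smaller component-wise is also smaller lexicographically, so this order satisfies the condition of Question 1.2.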

In [13]:
class TestPairSortUsingStability(TestPairSort):
    def setUp(self):
        self.PairClass = Pair
        self.sortAlgo = pairSortUsingStability
        
    # test1 to test6 and checkorder are inherited unchanged from TestPairSort,
    # so only setUp needs to be overridden to plug in the new sorting function.

In [14]:
test = TestPairSortUsingStability()
suite = unittest.TestLoader().loadTestsFromModule(test)
unittest.TextTestRunner().run(suite)

......
----------------------------------------------------------------------
Ran 6 tests in 0.302s

OK


<unittest.runner.TextTestResult run=6 errors=0 failures=0>

## Question 1.6 (5 marks)

Define a function pairSortFast that takes a list $l$ of $n$ Pairs as input and sorts this list in $O(n)$ time in the worst case (not the amortised worst case). Suppose that any pair $(a,b)$ in $l$ is such that $a$ and $b \in \{0, \dots, U\}$, where $U$ is a small integer. Hint: *a clever hashing function* may help.

In [15]:
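def pairSortFast(l, u):
    # LSD radix sort with two stable bucket passes: first on y, then on x.
    # Each pass costs O(n + u), so the whole sort is O(n) in the worst case
    # when u is a small constant. The result is ordered by (x, y), which
    # satisfies the ordering required in Question 1.2.
    for coordinate in ("y", "x"):
        buckets = [[] for _ in range(u + 1)]
        
        for each_pair in l:
            # appending keeps the order of the previous pass (stability)
            buckets[getattr(each_pair, coordinate)].append(each_pair)
        
        # write the buckets back into l, in increasing key order
        l[:] = [each_pair for bucket in buckets for each_pair in bucket]
    
    return l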
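
One possible reading of the hint in the question is to map each pair to a unique bucket index. The sketch below does this on plain `(a, b)` tuples with the key `a * (u + 1) + b`; the function name is hypothetical, and the approach uses $(U+1)^2$ buckets, so it only makes sense when $U$ is small.

```python
def counting_sort_pairs(pairs, u):
    """Sort (a, b) tuples with 0 <= a, b <= u using one bucket per pair value.

    The key a * (u + 1) + b is a perfect hash: distinct pairs get distinct
    buckets, and a component-wise smaller pair always gets a smaller key.
    """
    buckets = [[] for _ in range((u + 1) ** 2)]
    for a, b in pairs:
        buckets[a * (u + 1) + b].append((a, b))
    # reading the buckets in key order yields the sorted list
    return [pair for bucket in buckets for pair in bucket]


print(counting_sort_pairs([(2, 1), (0, 3), (2, 0), (1, 1)], 3))
# [(0, 3), (1, 1), (2, 0), (2, 1)]
```

The running time is $O(n + U^2)$, with no comparison between pairs at any point.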

## Question 1.7 (5 marks)

Thoroughly benchmark all the sorting algorithms you have written, and plot their running time as a function of the size of the input list. You may use methods and code seen in the lectures or tutes and pracs during the semester.
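
Single timings can be noisy. As a complement to the benchmark in this section, here is a small sketch that repeats each measurement with `timeit.repeat` and keeps the best run. The helper `time_sort` is hypothetical, it works on plain tuples, and it uses the built-in `sorted` as a stand-in for the sorting functions of this notebook.

```python
import random
import timeit


def time_sort(sort_function, n, repeats=3, seed=0):
    """Return the best wall-clock time (in seconds) of sort_function
    over `repeats` runs on a random list of n pairs."""
    rng = random.Random(seed)
    data = [(rng.randrange(1000), rng.randrange(1000)) for _ in range(n)]
    # each run sorts its own copy, so earlier runs do not pre-sort the input;
    # the cost of the copy is included in the measurement
    runs = timeit.repeat(lambda: sort_function(list(data)), number=1, repeat=repeats)
    return min(runs)


for n in (100, 1000, 10000):
    print(n, time_sort(sorted, n))
```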

In [16]:
import random
from timeit import default_timer as timer
import datetime
import matplotlib.pyplot as plt 
%matplotlib



def generatelist(n, lower = 0, upper = 1000, seed = 0):
    random.seed(seed)
    l = [None] * n
    for i in range(0,n):
        l[i] = Pair(random.randrange(lower, upper),random.randrange(lower, upper))
    return l





sorting_algorithms = [pairSort, pairSortWithKey,pairSortUsingStability,pairSortFast]

styles = {}
styles[pairSort] = "r"
styles[pairSortWithKey] = "g"
styles[pairSortUsingStability] = "b"
styles[pairSortFast] = "g--"




def computeandplot(sorting_algorithms):
    step = 100
    nsamples = 100
    samples = range(0, nsamples)
    timelimit = 0.05 #seconds, per run
    totaltime = 0
    print("Maximum computing time:", \
          str(datetime.timedelta(seconds=\
            timelimit*nsamples*len(sorting_algorithms))))
    for sortalgo in sorting_algorithms:
            attimelimit = False
            times = [timelimit for _ in samples]
            for i in samples:
                if i == 0:
                    times[i] = 0
                    continue
                n = step * i
                if attimelimit == False:
                    l = generatelist(n)
                    if sortalgo != pairSortFast:
                        start = timer()
                        sortalgo(l)
                        end = timer()
                    else:
                        start=timer()
                        sortalgo(l,1000)
                        end=timer()
                    times[i] = end - start
                    totaltime += times[i]
                    if times[i] > timelimit:
                        times[i] = timelimit
                        attimelimit = True

            plt.plot(samples, times, \
                     styles[sortalgo], \
                     label=sortalgo.__name__)
    print("Actual computing time:", str(datetime.timedelta(seconds=int(totaltime))))
    plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
    plt.xlabel("list size / {}".format(step))
    plt.ylabel("time in seconds")
    plt.show()

computeandplot(sorting_algorithms)


Using matplotlib backend: MacOSX
Maximum computing time: 0:00:20
Actual computing time: 0:00:01


# Exercise 2 (25 marks)

You are given a tree $T$ with an integer value (negative or positive) at each node. We want to select a subtree of $T$, which contains at least the root of $T$, that minimises the sum of the values at its nodes. Note that the answer is trivially $T$ if all nodes have a negative value, and simply the root of $T$ if all nodes have a positive value.

## Question 2.1 (5 marks)

Define a data structure to store $T$.

In [17]:
class NArytree:
    def __init__(self,key):
        self.key = key
        self.children =[]
        
    def __str__(self):
        return str(self.key)
    
    def addChild(self,tree):
        self.children.append(tree)
    
    def getChild(self,k):
        return self.children[k]
    
    def getNchildren(self):
        return len(self.children)
    


## Question 2.2 (15 marks)

Design and code a dynamic programming algorithm that solves this problem and returns the optimal value. What is its $O()$ complexity in the worst-case?

In [18]:
def optimalValFinder(t):
    # The root must always be selected, so the optimal value is the key of the
    # root plus the contribution of each child, computed by the helper function.
    result = t.key
    
    for each_child in t.children:
        result += _optimalValFinder(each_child)
    
    return result


def _optimalValFinder(t):
    # Returns the contribution of the subtree rooted at t to the optimum of its
    # parent: the minimum sum of a subtree containing t if that sum is negative,
    # and 0 otherwise (meaning that nothing below the parent is taken here).
    best = t.key
    
    for each_child in t.children:
        # A child can only be taken together with its parent, and it is only
        # worth taking if its own best sum is negative.
        best += _optimalValFinder(each_child)
    
    # A leaf with a positive value, or a subtree whose best sum is positive,
    # is never selected, because it would only increase the total.
    return min(best, 0)

In my approach, each node is visited exactly once, and the work done at a node (adding up the contributions of its children and taking a minimum) is proportional to its number of children. The values stored at the nodes only affect the result of the final minimum, not the amount of work. Summed over the whole tree, this gives $O(n)$ in the worst, average and best cases, where $n$ is the number of nodes in the tree. Only sums are passed up the tree, not lists of selected nodes: copying such lists at every level could cost $O(n^2)$ on a deep tree.
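
The same bottom-up idea can also be written without recursion, which avoids Python's recursion limit on very deep trees. This is a minimal sketch under its own assumptions: the `Node` class and the function name `min_rooted_subtree_sum` are hypothetical and independent of the classes defined in this notebook.

```python
class Node:
    """Hypothetical minimal tree node: a value and a list of children."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)


def min_rooted_subtree_sum(root):
    """Minimum sum of a connected subtree that contains the root."""
    # list the nodes so that every parent appears before its children
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(node.children)
    # process the nodes in reverse, so children are finished before parents
    best = {}
    for node in reversed(order):
        total = node.value
        for child in node.children:
            total += min(0, best[id(child)])  # keep a child only if it helps
        best[id(node)] = total
    return best[id(root)]


tree = Node(100, [
    Node(-5, [Node(-10), Node(8)]),
    Node(3, [Node(-5), Node(-1)]),
    Node(57, [Node(-8, [Node(-6, [Node(-23), Node(51)]), Node(-5)])]),
])
print(min_rooted_subtree_sum(tree))  # 82
```

Each node is pushed, popped and processed once, so the running time is linear in the number of nodes.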

## Question 2.3 (5 marks)

 Write and run unit tests and performance tests.

In [19]:
import unittest

class TestNArytree(unittest.TestCase):
    def setUp(self):             # Creating a Tree structure by adding parent-child relationships
        self.t1=NArytree(100)    
        self.t2=NArytree(-5)
        self.t3=NArytree(3)
        self.t4=NArytree(57)
        self.t5=NArytree(-10)
        self.t6=NArytree(8)
        self.t7=NArytree(-5)
        self.t8=NArytree(-1)
        self.t9=NArytree(-8)
        self.t10=NArytree(-6)
        self.t11=NArytree(-5)
        self.t12=NArytree(-23)
        self.t13=NArytree(51)
        
        
        
        
        self.t1.addChild(self.t2)
        self.t1.addChild(self.t3)
        self.t1.addChild(self.t4)
        

        self.t2.addChild(self.t5)
        self.t2.addChild(self.t6)

        self.t3.addChild(self.t7)
        self.t3.addChild(self.t8)

        self.t4.addChild(self.t9)

        self.t9.addChild(self.t10)
        self.t9.addChild(self.t11)

        self.t10.addChild(self.t12)
        self.t10.addChild(self.t13)
        
        
    
    def test_addChild(self):
        
        
        self.assertTrue(self.t2 in self.t1.children)
        self.assertTrue(self.t3 in self.t1.children)
        self.assertTrue(self.t4 in self.t1.children)
        
        self.assertTrue(self.t5 in self.t2.children)
        self.assertTrue(self.t6 in self.t2.children)
        
        self.assertTrue(self.t7 in self.t3.children)
        self.assertTrue(self.t8 in self.t3.children)
        
        self.assertTrue(self.t9 in self.t4.children)
        
        self.assertTrue(self.t10 in self.t9.children)
        self.assertTrue(self.t11 in self.t9.children)

        self.assertTrue(self.t12 in self.t10.children)
        self.assertTrue(self.t13 in self.t10.children)
        
        
        self.assertFalse(self.t10 in self.t3.children)
        self.assertFalse(self.t11 in self.t3.children)

        self.assertFalse(self.t2 in self.t10.children)
        self.assertFalse(self.t3 in self.t10.children)
        
        
        
        
        
        
    
    def test_getChild(self):

        
        self.assertTrue(self.t2 == self.t1.getChild(0))
        self.assertTrue(self.t3 == self.t1.getChild(1))
        self.assertTrue(self.t4 == self.t1.getChild(2))
        
        self.assertTrue(self.t5 == self.t2.getChild(0))
        self.assertTrue(self.t6 == self.t2.getChild(1))

        self.assertTrue(self.t7 == self.t3.getChild(0))
        self.assertTrue(self.t8 == self.t3.getChild(1))
        
        self.assertTrue(self.t9 == self.t4.getChild(0))

        self.assertTrue(self.t10 == self.t9.getChild(0))
        self.assertTrue(self.t11 == self.t9.getChild(1))
        
        self.assertTrue(self.t12 == self.t10.getChild(0))
        self.assertTrue(self.t13 == self.t10.getChild(1))
        
        
        self.assertFalse(self.t10 == self.t3.getChild(0))
        self.assertFalse(self.t11 == self.t3.getChild(1))

        self.assertFalse(self.t2 == self.t10.getChild(0))
        self.assertFalse(self.t3 == self.t10.getChild(1))
        
    
    
    def test_getNchildren(self):
        self.assertTrue(self.t1.getNchildren() == 3)
        self.assertTrue(self.t2.getNchildren() == 2)
        self.assertTrue(self.t3.getNchildren() == 2)
        self.assertTrue(self.t4.getNchildren() == 1)
        self.assertTrue(self.t9.getNchildren() == 2)
        self.assertTrue(self.t10.getNchildren() == 2)
        self.assertTrue(self.t5.getNchildren() == 0)
        
        
        
        self.assertFalse(self.t1.getNchildren() == 5)
        self.assertFalse(self.t2.getNchildren() == 1)
        self.assertFalse(self.t3.getNchildren() == 0)
        self.assertFalse(self.t4.getNchildren() == 3)
        self.assertFalse(self.t9.getNchildren() == 4)
        self.assertFalse(self.t10.getNchildren() == 1)
        self.assertFalse(self.t5.getNchildren() == 1)
        
        

        

        
    
    def test_optimalValFinder(self):
        self.assertTrue(optimalValFinder(self.t1) == 82)
        self.assertTrue(optimalValFinder(self.t2) == -15)
        self.assertTrue(optimalValFinder(self.t3) == -3)
        self.assertTrue(optimalValFinder(self.t4) == 15)
        self.assertTrue(optimalValFinder(self.t5) == -10)
        self.assertTrue(optimalValFinder(self.t6) == 8)
        self.assertTrue(optimalValFinder(self.t7) == self.t7.key)
        self.assertTrue(optimalValFinder(self.t8) == self.t8.key)
        self.assertTrue(optimalValFinder(self.t9) == -42)
        self.assertTrue(optimalValFinder(self.t10) == -29)
        self.assertTrue(optimalValFinder(self.t11) == self.t11.key)
        self.assertTrue(optimalValFinder(self.t12) == self.t12.key)
        self.assertTrue(optimalValFinder(self.t13) == self.t13.key)
        

In [20]:
test = TestNArytree()
suite = unittest.TestLoader().loadTestsFromModule(test)
unittest.TextTestRunner().run(suite)

....
----------------------------------------------------------------------
Ran 4 tests in 0.003s

OK


<unittest.runner.TextTestResult run=4 errors=0 failures=0>

In [21]:
import random
from timeit import default_timer as timer
import datetime
import matplotlib.pyplot as plt 
%matplotlib



def generatelist(n, lower = -1000, upper = 1000, seed = 0): # Generating a list of single-node NArytrees
    random.seed(seed)
    l = [None] * n
    for i in range(0,n):
        l[i] = NArytree(random.randrange(lower, upper))
    return l

def make_tree(l):  # Using the generated list, create parent-child relationships to end up with one tree
    random.seed(0)
    n=len(l)
    
    for i in range(1,n):            # One NArytree is left at the end, and it is the root.
        if len(l) == 2:              
            l[0].addChild(l[1])
            del l[1]
            break
        to_make_child = random.randrange(1,len(l)-1)   # A randomly chosen element of l is added as a child of
        l[to_make_child-1].addChild(l[to_make_child])  # the element one index before it.
        del l[to_make_child]         # Once an element has been added to another tree as a child, it is removed,
                                     # because every element can have only one parent.
    
    return l[0]
    

def performance(max_sample_size):
    times=[None]*max_sample_size
    sample_sizes=[None]* max_sample_size

    for n in range(1,max_sample_size+1): # Measures the running time of optimalValFinder
        l=generatelist(n)                # for every tree size between 1 and max_sample_size.
        root = make_tree(l)
        start = timer()
        optimalValFinder(root)
        end = timer()
        times[n-1] = end - start
        sample_sizes[n-1] = n
    
    
    
    plt.plot(sample_sizes,times)
    plt.xlabel("list size ")
    plt.ylabel("time in seconds")
    plt.show()

    
    
performance(1000)

Using matplotlib backend: MacOSX


# Exercise 3 (30 marks)

We are again interested in lists of pairs of integers. However, this time, we want to store these pairs in a rooted tree $T$. More precisely, each node of $T$ will store a pair of the list $l$. This tree $T$ will have the property that for each inner node $k$ that stores a pair $(a, b)$, every pair $(c, d)$ stored at a node in the subtree of $k$ satisfies $(a, b) \leq (c, d)$. Furthermore, the pairs stored at sibling nodes must be non-comparable.

We will suppose that the root of the tree is a dummy pair $(-1, -1)$.

You need to provide a unit test for each question in this exercise where code is required. Each question is evaluated on the correctness of your code and the thoroughness of your unit tests.

## Question 3.1 (10 marks)

Define a class that stores the tree $T$ for any given list of pairs. The \_\_init\_\_() function must take a list as input and build the tree accordingly. An insert() function must allow the insertion of a pair to an initialised tree. Also provide a function printTree() which prints a tree with the given format:  
at level 0$~~~$(-1,-1)  
at level 1$~~~~~~$(1,2)  
at level 2$~~~~~~~~~$(3,2)  
at level 3$~~~~~~~~~~~~$(4,2)  
at level 3$~~~~~~~~~~~~$(3,3)  
at level 4$~~~~~~~~~~~~~~~$(3,4)  
at level 2$~~~~~~~~~$(1,3)  
at level 3$~~~~~~~~~~~~$(1,4)  
at level 4$~~~~~~~~~~~~~~~$(1,5)  
at level 1$~~~~~~$(2,1)  
(the pairs above are given as an example)
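
The property required of $T$ can be checked mechanically. The sketch below is only an illustration: it stores the example tree above in a plain dictionary rather than in a class, and it assumes that pairs are compared componentwise, i.e. $(a, b) \leq (c, d)$ when $a \leq c$ and $b \leq d$. The names `leq`, `comparable`, `satisfies_property` and `example_tree` are hypothetical helpers, not part of the required interface.

```python
def leq(p, q):
    """Assumed partial order: (a, b) <= (c, d) iff a <= c and b <= d."""
    return p[0] <= q[0] and p[1] <= q[1]


def comparable(p, q):
    """Two pairs are comparable if one is <= the other."""
    return leq(p, q) or leq(q, p)


def satisfies_property(tree, node):
    """Check the subtree rooted at node.

    tree maps each pair to the list of its children (leaves may be absent).
    Every child must be >= its parent and siblings must be incomparable.
    """
    children = tree.get(node, [])
    if any(not leq(node, child) for child in children):
        return False
    for i in range(len(children)):
        for j in range(i + 1, len(children)):
            if comparable(children[i], children[j]):
                return False
    return all(satisfies_property(tree, child) for child in children)


# The example tree printed above, written as parent -> children.
example_tree = {
    (-1, -1): [(1, 2), (2, 1)],
    (1, 2): [(3, 2), (1, 3)],
    (3, 2): [(4, 2), (3, 3)],
    (3, 3): [(3, 4)],
    (1, 3): [(1, 4)],
    (1, 4): [(1, 5)],
}

print(satisfies_property(example_tree, (-1, -1)))
```

Because the assumed order is transitive, checking each parent against its direct children is enough to guarantee that every pair in a subtree is at least as large as the pair at the top of that subtree.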

In [22]:
class Node:
    def __init__(self,key , parent = None):
        self.key = key
        self.children = []
        self.parent = parent
        
    def __str__(self):
        return str(self.key)
    
    def addChild(self,node):
        node.parent=self
        self.children.append(node)
    
    def hasChild(self):
        return len(self.children) != 0 
    
    def getNchildren(self):
        return len(self.children)
    
    
    
    
class Tree:
    def __init__(self,l):              # Takes a list as an argument to initialise Tree building process.
        self.size = 1                  # Defines the root as (-1,-1) for all Tree class objects
        self.root = Node(Pair(-1,-1))  # Then invokes buildTree() function using given list to build the tree structure.
        self.buildTree(l)
        
    def buildTree(self,l):     # Takes a list of pairs and inserts each one into the tree, in list order,
        for pair in l:         # using the insert() function. Iterating instead of recursing leaves the
            self.insert(pair)  # caller's list unchanged and avoids deep recursion on long lists.
            
            
            
    def printTree(self):       # Calls helper function by giving root of tree and level 0 as a starting point argument
        level = 0
        self._printTree(self.root,level)
        
    
    
    def _printTree(self, current , level):                         # Recursive depth-first traversal.
        white_space ='   '+'   '*level                             # Every recursive call increases level by 1;
        print('at level {}{}{}'.format(level,white_space,current)) # level then determines the number of
                                                                   # spaces printed before the pair.
        for each_child in current.children :
            self._printTree(each_child,level+1)
              
            
    
    
    def insert(self,pair):                # Function to start inserting process from the root of the tree
        self._insert(pair,self.root)      # for every element.
                                          # Calls helper _insert() function to handle rest of the process.

            
    
    def _insert(self,pair,currentNode):   
        
        bigger=[]      # Children of currentNode that are comparable with the input pair and not smaller than it.
        smaller=[]     # Children of currentNode that are smaller than the input pair.
        uncomp=[]      # Children of currentNode that are not comparable with the input pair.
        
        if pair < currentNode.key:               # A pair smaller than currentNode belongs closer to the root,
            self._insert(pair,currentNode.parent)# so the search restarts from currentNode's parent.
            return True
        
        
        

        
        
        # Classify each child of currentNode as bigger, smaller or
        # incomparable with respect to the pair.
        for each_child in currentNode.children:     
            if each_child.key.areComparable(pair):  
                if each_child.key>= pair:
                    bigger.append(each_child)
                elif each_child.key < pair:
                    smaller.append(each_child)
            else:
                uncomp.append(each_child)
                
        
        
        
    
        
        
        # If none of the children of the currentNode is comparable,
        # Inserts the pair as a child of currentNode.
        if (len(uncomp) == len(currentNode.children)) and pair >= currentNode.key :

            currentNode.children.append(Node(pair,parent=currentNode))
     
            self.size +=1
            return True
            
     
        
        
        # If exactly one child is comparable with the pair and not smaller than it,
        # the pair is inserted as a child of currentNode and that child
        # becomes the pair's child.
        if len(smaller) == 0 and len(bigger) == 1 and pair >= currentNode.key:

            
            
            new = Node(pair,parent=bigger[0].parent)
            bigger[0].parent.children.append(new)
            bigger[0].parent.children.remove(bigger[0])
            bigger[0].parent = new
            new.children.append(bigger[0])
            self.size +=1
            
            
            return True
            

        
        # If exactly one child is comparable with the pair and smaller than it,
        # recurse with that child as currentNode, because the only subtree
        # that can hold the pair is the one below that child.
        if len(smaller) == 1 and len(bigger) == 0 and pair >= currentNode.key:
            self._insert(pair,smaller[0])
            return True
            
        


        
        # If all the children of currentNode are bigger than the pair and the pair is not smaller than currentNode,
        # the pair becomes the new parent of those children and currentNode becomes the parent of the pair.
        if (len(bigger) == len(currentNode.children) and pair>= currentNode.key):

            
            
            new = Node(pair,parent=currentNode)
            for each_child in currentNode.children:
                new.children.append(each_child)
                each_child.parent=new
            currentNode.children=[]
            currentNode.children.append(new)
            self.size+=1
            

            return True
        
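        # If several (but not all) children are bigger than the pair, the pair becomes a child of currentNode,
        # the bigger children move under the pair and the incomparable children stay where they are.
        if (len(smaller)==0 and len(bigger)>1 and pair>=currentNode.key ):

            currentNode.children=[]                
                
            new=Node(pair,parent=currentNode)
            currentNode.children.append(new)
            for each in uncomp:
                currentNode.children.append(each)
                each.parent=currentNode
                
            for each in bigger:
                each.parent=new
                new.children.append(each)
                
            self.size+=1
            return True   # Report success, like every other insertion case.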
            


        # If none of the cases above applies, several children of currentNode are smaller than the pair.
        # The search continues in the subtree of the first such child.
        if ((len(smaller) > 1 ) and (len(bigger) != len(currentNode.children)) and (pair >= currentNode.key)) :
            self._insert(pair,smaller[0])
            return True
                


In [23]:
import unittest
import random   # used by TestTree to generate random pairs

class TestNode(unittest.TestCase):
    def setUp(self):             # Creating Nodes
        self.n1=Node(Pair(1,2))
        self.n2=Node(Pair(2,1))
        self.n3=Node(Pair(3,4))
        self.n4=Node(Pair(1,4))
        self.n5=Node(Pair(3,3))
        self.n6=Node(Pair(4,2))
        self.n7=Node(Pair(1,5))
        self.n8=Node(Pair(3,2))
        self.n9=Node(Pair(1,3))
        
         
        
        self.n1.addChild(self.n2)
        self.n1.addChild(self.n3)
        self.n1.addChild(self.n4)
        

        self.n2.addChild(self.n5)
        self.n2.addChild(self.n6)

        self.n3.addChild(self.n7)
        self.n3.addChild(self.n8)

        self.n4.addChild(self.n9)
        
    
    def test_addChild(self):
        
        
        self.assertTrue(self.n2 in self.n1.children)
        self.assertTrue(self.n3 in self.n1.children)
        self.assertTrue(self.n4 in self.n1.children)
        
        self.assertTrue(self.n5 in self.n2.children)
        self.assertTrue(self.n6 in self.n2.children)
        
        self.assertTrue(self.n7 in self.n3.children)
        self.assertTrue(self.n8 in self.n3.children)
        
        self.assertTrue(self.n9 in self.n4.children)
        
        self.assertFalse(self.n9 in self.n3.children)
        self.assertFalse(self.n2 in self.n3.children)

        self.assertFalse(self.n2 in self.n5.children)
        self.assertFalse(self.n3 in self.n6.children)
        
        
        
        
        
        
    
    def test_hasChild(self):

        self.assertTrue(self.n1.hasChild())        
        self.assertTrue(self.n2.hasChild())
        self.assertTrue(self.n3.hasChild()) 
        self.assertTrue(self.n4.hasChild())
        
        
        self.assertFalse(self.n5.hasChild())
        self.assertFalse(self.n6.hasChild())

        self.assertFalse(self.n7.hasChild())
        self.assertFalse(self.n8.hasChild())
        
        self.assertFalse(self.n9.hasChild())
        
    
    
    def test_getNchildren(self):
        self.assertTrue(self.n1.getNchildren() == 3)
        self.assertTrue(self.n2.getNchildren() == 2)
        self.assertTrue(self.n3.getNchildren() == 2)
        self.assertTrue(self.n4.getNchildren() == 1)
        self.assertTrue(self.n5.getNchildren() == 0)
        self.assertTrue(self.n6.getNchildren() == 0)
        self.assertTrue(self.n7.getNchildren() == 0)
        
        
        
        self.assertFalse(self.n1.getNchildren() == 5)
        self.assertFalse(self.n2.getNchildren() == 1)
        self.assertFalse(self.n3.getNchildren() == 0)
        self.assertFalse(self.n4.getNchildren() == 3)
        self.assertFalse(self.n5.getNchildren() == 4)
        self.assertFalse(self.n6.getNchildren() == 1)
        self.assertFalse(self.n7.getNchildren() == 1)
        
        

        

        
class TestTree(unittest.TestCase):
    def setUp(self):
        self.PairClass = Pair
        self.buildTree = Tree
        
        
        
        
    # check_buildTree validates __init__, buildTree and insert together: __init__ calls buildTree,
    # which calls insert for every element of the given list, one by one and without sorting the list.
    # Therefore a tree that satisfies the checks below cannot have been produced by a broken insert
    # while the other two functions work properly.
        
    def check_buildTree(self,currentNode):
        for each_child in currentNode.children:
            self.assertTrue(each_child.key >= currentNode.key, "{} and {} are not well structured".format(each_child.key, currentNode.key))
            
        for i in range(len(currentNode.children)-1):
            for j in range(i+1,len(currentNode.children)):
                self.assertFalse(currentNode.children[i].key.areComparable(currentNode.children[j].key),"{} and {} are comparable".format(currentNode.children[i].key, currentNode.children[j].key))
        for each_child in currentNode.children:
            self.check_buildTree(each_child)
    
    def check_size(self,tree,list_size):
        self.assertEqual(tree.size-1,list_size)
    
    def test1(self):
        random.seed(0)
        l=[None]*10
        for i in range(0,10):
            l[i] = Pair(random.randrange(0,1000),random.randrange(0,1000))
            
        list_size=len(l)
        tree_to_check=Tree(l)
        self.check_size(tree_to_check,list_size)
        self.check_buildTree(tree_to_check.root)
    
    def test2(self):
        random.seed(0)
        l=[None]*100
        for i in range(0,100):
            l[i] = Pair(random.randrange(0,1000),random.randrange(0,1000))
        
        list_size=len(l)
        tree_to_check=Tree(l)
        self.check_size(tree_to_check,list_size)
        self.check_buildTree(tree_to_check.root)
    
    
    def test3(self):
        random.seed(0)
        l=[None]*1000
        for i in range(0,1000):
            l[i] = Pair(random.randrange(0,1000),random.randrange(0,1000))
            
        list_size=len(l)
        tree_to_check=Tree(l)
        self.check_size(tree_to_check,list_size)
        self.check_buildTree(tree_to_check.root)
        
        
    

In [24]:
# Load the tests from the TestCase class itself rather than from an instance.
suite = unittest.TestLoader().loadTestsFromTestCase(TestNode)
unittest.TextTestRunner().run(suite)

...
----------------------------------------------------------------------
Ran 3 tests in 0.003s

OK


<unittest.runner.TextTestResult run=3 errors=0 failures=0>

In [25]:
# Load the tests from the TestCase class itself rather than from an instance.
suite = unittest.TestLoader().loadTestsFromTestCase(TestTree)
unittest.TextTestRunner().run(suite)

...
----------------------------------------------------------------------
Ran 3 tests in 0.062s

OK


<unittest.runner.TextTestResult run=3 errors=0 failures=0>

In $O()$, what is the running time complexity of the functions you defined?

The \_\_init\_\_() function, which calls buildTree(), has $O(n \log n)$ complexity on average, and so does buildTree(). Every insertion starts from the root, and whenever a node at some level is incomparable with the pair to be inserted, its whole subtree can be ignored, so only the remaining part of the tree is explored. Therefore each insertion made while building the tree from the given list costs $O(\log n)$ on average. There are $n$ insertions, so the total is $O(n \log n)$ on average. In the worst case an insertion has to visit all the nodes one by one. Every insertion then costs $O(n)$, and \_\_init\_\_() and buildTree() have $O(n^2)$ worst-case complexity. In the best case an insertion costs $O(1)$, for instance when the root has only a few children and all of them are incomparable with the new pair, so the pair is attached directly below the root.

The insert() function, which takes a pair as an argument and inserts it at the appropriate place in the tree, has $O(\log n)$ complexity on average, where $n$ is the size of the tree, for the reasons explained above. In the worst case it becomes $O(n)$, because if no node is incomparable with the pair, we may need to visit all the nodes. The best case is $O(1)$, when there is a single element at the first level and it is incomparable with the pair.

The printTree() function has $O(n)$ complexity, where $n$ is the number of nodes in the tree, because it visits every node exactly once.
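
The worst case described above can be made concrete by counting comparisons. The sketch below is an assumption-laden illustration, not the `Tree` class itself: `leq`, `SketchNode`, `insert_counting` and `total_comparisons_for_chain` are hypothetical names, pairs are plain tuples compared componentwise, and the insertion only implements the "descend into the first smaller child, otherwise attach" cases. That is enough for a chain $(0,0), (1,1), \dots$ inserted in increasing order, where the tree degenerates into a path.

```python
def leq(p, q):
    """Assumed partial order: (a, b) <= (c, d) iff a <= c and b <= d."""
    return p[0] <= q[0] and p[1] <= q[1]


class SketchNode:
    """Minimal stand-in for a tree node (hypothetical, for counting only)."""

    def __init__(self, key):
        self.key = key
        self.children = []


def insert_counting(root, pair):
    """Insert pair by descending into the first child that is <= pair.

    Returns the number of child comparisons made. Re-parenting cases are
    deliberately left out, so this is only valid for inputs that never need
    them (such as an increasing chain).
    """
    comparisons = 0
    current = root
    while True:
        next_node = None
        for child in current.children:
            comparisons += 1
            if leq(child.key, pair):
                next_node = child
                break
        if next_node is None:
            current.children.append(SketchNode(pair))
            return comparisons
        current = next_node


def total_comparisons_for_chain(n):
    """Total comparisons needed to insert (0,0), (1,1), ..., (n-1,n-1)."""
    root = SketchNode((-1, -1))
    return sum(insert_counting(root, (i, i)) for i in range(n))


for n in (10, 100, 1000):
    print(n, total_comparisons_for_chain(n))
```

In this simplified model the $i$-th insertion walks past all $i$ pairs already stored, so the total is $n(n-1)/2$ comparisons, which matches the $O(n^2)$ worst case stated above.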

## Question 3.2 (5 marks)

In the class you have defined in Question 3.1, can there be multiple trees that store the same list of pairs? (including after insertions?). Prove your answer.

There can be different trees for the same pairs, depending on the order of the input list, because the insert() function used to build the tree is order-sensitive. When insert() inserts a pair, it searches for an appropriate place starting from the root of the tree and inserts the pair at the first such place it finds. This insertion can in turn affect where the remaining pairs of the list are inserted.

For example, consider two lists that contain the same pairs in a different order:

$l1=[(1,2) , (2,1) , (3,4)]$

$l2=[(2,1) , (1,2)  , (3,4)]$


If we build a tree using $l1$, our function puts (1,2) as the first child of the root and (2,1) as the second child of the root, because they are incomparable. To insert (3,4), insert() first looks at the children of the root. (3,4) is comparable with both of them, so it cannot become their sibling. insert() then looks for an appropriate position at level 2, starting with the children of (1,2), because (1,2) is the first child of the root. ((3,4) is bigger than both (1,2) and (2,1), so either subtree could hold it, but the function inserts it at the first place it finds.) Since (1,2) has no children, the condition that all children are incomparable with (3,4) holds trivially. This is the first case in the insert function, which inserts the new node directly. Therefore, we end up with a tree in which (3,4) is a child of (1,2).

If we build a tree using $l2$, our function puts (2,1) as the first child of the root and (1,2) as the second child of the root. By the same logic as above, the function inserts (3,4) under (2,1), because (2,1) is now the first child of the root, and we end up with a tree in which (3,4) is the child of (2,1) instead of (1,2). Both trees store the same pairs and satisfy the required property, so the tree is not unique.
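
The two insertion orders can be replayed with a small stand-alone sketch. It is a simplification made for illustration only: `leq` and `build_parents` are hypothetical names, pairs are plain tuples compared componentwise, the pairs are assumed to be distinct, and only the "descend into the first smaller child, otherwise attach" rule is implemented, which is all that these two lists exercise.

```python
def leq(p, q):
    """Assumed partial order: (a, b) <= (c, d) iff a <= c and b <= d."""
    return p[0] <= q[0] and p[1] <= q[1]


def build_parents(pairs):
    """Insert pairs in order and return a dict mapping each pair to its parent.

    Each pair descends into the first child that is <= it and is attached
    where no such child exists. Re-parenting is not needed for the example
    lists and is therefore not implemented.
    """
    root = (-1, -1)
    children = {root: []}
    parent = {}
    for pair in pairs:
        current = root
        while True:
            smaller_child = next(
                (c for c in children[current] if leq(c, pair)), None
            )
            if smaller_child is None:
                break
            current = smaller_child
        children[current].append(pair)
        children[pair] = []
        parent[pair] = current
    return parent


l1 = [(1, 2), (2, 1), (3, 4)]
l2 = [(2, 1), (1, 2), (3, 4)]

print(build_parents(l1)[(3, 4)])
print(build_parents(l2)[(3, 4)])
```

Under these assumptions the parent of (3,4) is (1,2) for the first list and (2,1) for the second, which is the difference described in the argument above.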

## Question 3.3 (10 marks)

We now suppose that $T$ is a DAG, a Directed Acyclic Graph (see e.g. [here](https://en.wikipedia.org/wiki/Directed_acyclic_graph)). In the DAG $T$, a child node which stores a pair $(e, f)$ must:
* have as an ancestor all nodes that have a pair $(a, b)$ such that $(a,b)\leq(e,f)$,
* have no other ancestor,
* have no edge to an ancestor $(a,b)$ if there exists a pair $(c,d)$ such that $(a,b)\leq (c,d) \leq(e,f)$.

Design a class (it may be based on the previous one) to allow and build this representation. Also provide a function printTree() which prints the DAG with a similar format.
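
The three conditions above say that an edge goes from $(a,b)$ to $(e,f)$ exactly when $(a,b) \leq (e,f)$ and no other stored pair lies between them. As a brute-force sketch of that rule (not an efficient construction and not the requested class), the hypothetical function `covering_edges` below computes the edge set for a list of distinct, non-negative pairs, assuming the componentwise order and the dummy root $(-1,-1)$.

```python
def leq(p, q):
    """Assumed partial order: (a, b) <= (c, d) iff a <= c and b <= d."""
    return p[0] <= q[0] and p[1] <= q[1]


def covering_edges(pairs):
    """Return the DAG edges (parent, child) for the given pairs.

    An edge (a, e) is kept when a <= e, a != e, and no third stored pair c
    satisfies a <= c <= e. Runs in O(n^3) time, so it is only meant for
    checking small examples.
    """
    # dict.fromkeys removes duplicates while keeping the input order.
    nodes = [(-1, -1)] + list(dict.fromkeys(pairs))
    edges = []
    for a in nodes:
        for e in nodes:
            if a == e or not leq(a, e):
                continue
            has_pair_between = any(
                c != a and c != e and leq(a, c) and leq(c, e) for c in nodes
            )
            if not has_pair_between:
                edges.append((a, e))
    return edges


for edge in covering_edges([(1, 2), (2, 1), (3, 4)]):
    print(edge)
```

For the list $[(1,2), (2,1), (3,4)]$ the node $(3,4)$ receives an edge from both $(1,2)$ and $(2,1)$ but none from the root, which is what distinguishes the DAG from the tree of Question 3.1.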

In [26]:
class Vertex:
    def __init__(self,key):
        self.key = key
        self.connectedTo = []

    def addNeighbor(self,nbr):
        self.connectedTo.append(nbr)

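    def __str__(self):
        return str(self.key)   # the vertex only stores self.key; self.id was never defined

    def getConnections(self):
        return self.connectedTo

    def getId(self):
        return self.key        # the key serves as the vertex identifier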

class Graph:
    def __init__(self):
        self.vertList=[]
        self.size = 0
        
    def addVertex(self,key):
        self.size +=1
        new = Vertex(key)
        self.vertList.append(new)
        return new