# Algorithms and Data Structures - Ivan Lanese module

* Our main topic will be data structures (balanced trees, graphs) and algorithms (greedy, dynamic programming)
* Recursion will be essential for exploring search trees

The naive recursive Fibonacci is extremely slow for n above roughly 35, because it recomputes the same subproblems exponentially many times, while the iterative form computes very large values quickly.

In [13]:
def fib(n):
    # naive recursion, with the convention fib(0) = fib(1) = 1
    if n < 0:
        raise ValueError("fib is not defined for negative numbers")
    if n == 0 or n == 1:
        return 1
    return fib(n-1) + fib(n-2)

fib(10)

89

In [None]:
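def fib(n):
    # iterative version, same convention: fib(0) = fib(1) = 1
    if n < 0:
        raise ValueError("fib is not defined for negative numbers")
    prev, curr = 1, 1  # fib(0), fib(1)
    for _ in range(n - 1):
        # slide the window one step forward
        prev, curr = curr, prev + curr
    return curr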

fib(100)
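
The slowdown of the recursive version can be made concrete by counting calls. The sketch below is not part of the lecture code: `fib_calls` and `fib_memo` are hypothetical helper names, and memoization with `functools.lru_cache` is a swapped-in technique shown only to illustrate that the recursion itself is not the problem, the repeated subproblems are. It keeps the convention fib(0) = fib(1) = 1 used above.

```python
from functools import lru_cache


def fib_calls(n):
    """Return (fib(n), number of calls made by the naive recursion)."""
    if n < 2:
        return 1, 1
    a, calls_a = fib_calls(n - 1)
    b, calls_b = fib_calls(n - 2)
    # one call for this frame plus all the calls of the two subproblems
    return a + b, calls_a + calls_b + 1


@lru_cache(maxsize=None)
def fib_memo(n):
    """Recursive Fibonacci that stores every result, so each n is computed once."""
    if n < 0:
        raise ValueError("fib is not defined for negative numbers")
    if n < 2:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)


value, calls = fib_calls(20)
print(value, calls)   # 10946 21891
print(fib_memo(100))  # answers immediately, unlike the naive version
```

The number of calls grows like the Fibonacci numbers themselves, which is why the naive version becomes unusable around n = 35.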

## Balanced search trees
* A balanced tree is a binary search tree (BST) whose height is kept as small as possible
    * In a BST the left subtree contains elements smaller than the root, the right subtree elements bigger than the root
    * The BST property is useful for lookups and must be maintained by insertions and deletions
* In a BST insertion and deletion have complexity linear in the height of the tree, $O(h)$
    * In a complete BST $h=\Theta(\log{n})$, so insertion and deletion are $O(\log{n})$ with $n =$ number of nodes
* Insertions and deletions can make a complete BST unbalanced
* A tree built by inserting elements in sorted order is maximally unbalanced: it degenerates into a chain with $h=n-1$
    * A way to get a tree that is balanced in expectation is to randomly permute the input data before constructing the tree
* Variants of the BST have been developed to keep the tree as balanced as possible
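
The effect of insertion order on the height can be checked with a minimal, unbalanced BST. This is a sketch under assumptions: the `Node` class and the helpers `insert`, `lookup`, `height` and `build` are illustrative names, not code from the lecture; duplicates are sent to the right subtree, and height is counted in edges (a single node has height 0).

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:  # walk down until a free slot is found: O(h) steps
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def lookup(root, key):
    """Return True if key is in the tree, following the BST property."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node is not None


def height(root):
    """Height in edges, computed level by level (no deep recursion on chains)."""
    depth, level = -1, [root] if root is not None else []
    while level:
        depth += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return depth


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


keys = list(range(200))
print(height(build(keys)))        # 199: sorted input gives a chain

random.Random(0).shuffle(keys)    # fixed seed only to make the run repeatable
print(height(build(keys)))        # much smaller than 199
```

Random permutation only helps when all the data is known in advance; the self-balancing variants keep the height small under arbitrary sequences of insertions and deletions.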

### AVL trees
* AVL trees are almost balanced BSTs that support `insert()`, `delete()` and `lookup()` in $O(\log{n})$ time
* AVL trees introduce a balance factor $\beta(v)$ for each node $v$
    * It is the difference between the heights of the 2 subtrees of the node
    * A tree is balanced if for each node $|\beta| \le 1$
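
A minimal sketch of how the AVL condition can be checked, assuming a hypothetical `Node` class with `left` and `right` attributes. It assumes that an empty subtree has height -1 and a leaf has height 0, so that a node with a single leaf child gets a balance factor of magnitude 1; the function name `check_avl` is illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check_avl(node):
    """Return (height, balanced) for the subtree rooted at node.

    balanced is True when every node in the subtree has a balance factor
    (left height minus right height) of -1, 0 or 1.
    """
    if node is None:
        return -1, True  # assumed convention: empty subtree has height -1
    h_left, ok_left = check_avl(node.left)
    h_right, ok_right = check_avl(node.right)
    beta = h_left - h_right
    return max(h_left, h_right) + 1, ok_left and ok_right and abs(beta) <= 1


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))

print(check_avl(balanced))  # (1, True)
print(check_avl(chain))     # (2, False): the root has balance factor -2
```

Heights and balance are computed in the same traversal, so the check visits each node once. A real AVL implementation would instead store the height (or the balance factor) in each node and update it during insertions and deletions.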

#### Proof that for AVL $h=O(\log{n})$
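* The most unbalanced AVL tree is a Fibonacci tree
    * It puts as much as possible on one side and leaves on the other side only what is needed to keep it AVL, so it has the minimum number of nodes among AVL trees of its height
    * It is defined as a BST of height $h$ having a left subtree of height $h-1$ and a right subtree of height $h-2$, both of them Fibonacci trees
        * $n_h = n_{h-1} + n_{h-2} + 1$
    * I want to prove that a Fibonacci tree of height $h$ has $F_{h+3}-1$ nodes, with $F_n$ being the $n^{th}$ Fibonacci number ($F_1 = F_2 = 1$)
    * Base cases: $h=0$ and $h=1$, since the recurrence refers to the two previous heights
        * For $h=0$ I have 1 node and it satisfies the condition
            * $n_0 = 1 = F_3 - 1$, since $F_3 = 2$
        * For $h=1$ I have 2 nodes, the root and one child
            * $n_1 = 2 = F_4 - 1$, since $F_4 = 3$
    * Induction: for $h \ge 2$, $n_h = n_{h-1} + n_{h-2} + 1$ by construction
        * $n_h = n_{h-1} + n_{h-2} + 1=(F_{h+2}-1)+(F_{h+1}-1)+1=F_{h+3}-1$
        * Since this is true for $h=0$ and $h=1$, it is true for every $h$
    * Since $F_h = \Theta(\phi^h)$ with $\phi \approx 1.618$
        * $n_h = \Theta(\phi^h)$
        * $\log{n_h} = \Theta(h)$
        * $h=\Theta(\log{n_h})$
        * Any AVL tree of height $h$ with $n$ nodes has $n \ge n_h$, so $h = O(\log{n})$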
To be finished from last lecture
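
The identity $n_h = F_{h+3}-1$ can also be checked numerically for small heights. This is only a sanity check, not a replacement for the induction; the function names are illustrative, and it assumes $F_1 = F_2 = 1$ and $n_{-1} = 0$ (the empty tree).

```python
def fib_tree_nodes(h):
    """Nodes of a Fibonacci tree of height h, via n_h = n_{h-1} + n_{h-2} + 1."""
    prev, curr = 0, 1  # n_{-1} = 0 (empty tree), n_0 = 1 (single node)
    for _ in range(h):
        prev, curr = curr, curr + prev + 1
    return curr


def fibonacci(k):
    """k-th Fibonacci number with F_1 = F_2 = 1 (and F_0 = 0)."""
    a, b = 0, 1  # F_0, F_1
    for _ in range(k):
        a, b = b, a + b
    return a


for h in range(6):
    print(h, fib_tree_nodes(h), fibonacci(h + 3) - 1)
# 0 1 1
# 1 2 2
# 2 4 4
# 3 7 7
# 4 12 12
# 5 20 20
```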

## Exercise on recursion

In [2]:
# computing the height of a node recursively has complexity O(n),
# with n the number of nodes in its subtree (each node is visited once)
def h(node):
    # the node does not exist, or it is a leaf
    if node is None or (node.left is None and node.right is None):
        return 0
    else:
        return max(h(node.left), h(node.right)) + 1
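
A small usage sketch for the exercise, assuming a hypothetical `Node` class with `left` and `right` attributes (the notes do not define one). The function `h` is repeated here, with the check for a missing node placed first, only so that the block runs on its own.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def h(node):
    # a missing node and a leaf both have height 0 in this exercise
    if node is None or (node.left is None and node.right is None):
        return 0
    return max(h(node.left), h(node.right)) + 1


leaf = Node()
one_child = Node(left=Node())
three_levels = Node(left=Node(left=Node()), right=Node())

print(h(leaf))          # 0
print(h(one_child))     # 1
print(h(three_levels))  # 2
```

With this convention `h(None)` is also 0, so an empty tree and a single node cannot be told apart; returning -1 for a missing node is a common alternative.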

### B-trees
* When I have a lot of data I cannot keep everything in memory
* I want to minimize disk access, since it is usually a bottleneck
* I cannot read a single byte from a disk: data is read in whole sectors
    * SSDs also have a minimal access unit, the block
