# Foundations

## Algorithmic Analysis

### Asymptotic bounds

**$\Theta$-notation** gives an asymptotically tight bound for a function. $f(n) = \Theta(g(n))$ if there exist positive constants $n_0$, $c_1$, and $c_2$ such that, for all $n > n_0$, $c_1 g(n) \leq f(n) \leq c_2 g(n)$.<br>
**O-notation** gives an asymptotic upper bound for a function.
$f(n) = O(g(n))$ if there exist positive constants $n_0$ and $c$ such that, for all $n > n_0$, $f(n) \leq c g(n)$.<br>
**$\Omega$-notation** gives an asymptotic lower bound for a function. $f(n) = \Omega(g(n))$ if there exist positive constants $n_0$ and $c$ such that, for all $n > n_0$, $f(n) \geq c g(n)$.
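
As a small worked example (the function $f(n) = 3n^2 + 2n$ is chosen here purely for illustration), the constants in the $\Theta$ definition can be found by bounding the lower-order term. For $n \geq 1$ we have $2n \leq 2n^2$, so:

$$3n^2 \;\leq\; 3n^2 + 2n \;\leq\; 3n^2 + 2n^2 = 5n^2$$

Taking $c_1 = 3$, $c_2 = 5$, and $n_0 = 1$ satisfies the definition, so $3n^2 + 2n = \Theta(n^2)$. The same inequalities give $O(n^2)$ from the right-hand side and $\Omega(n^2)$ from the left-hand side.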

#### Growth of Common Functions
For a constant $k > 1$ and sufficiently large $n$: $\log(n) < \log^k(n) < n < n\log(n) < n^k < k^n < n!$
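
A single sample point proves nothing about asymptotic behaviour, but it can serve as a quick sanity check on how these functions compare. The sketch below evaluates each one at one value of `n`; the helper name `growth_values`, the choice of base-2 logarithms, and the defaults `n = 64`, `k = 2` are all assumptions made for illustration.

```python
import math


def growth_values(n: int, k: int = 2) -> dict:
    """Evaluate each common growth function at a single n (log base 2)."""
    log_n = math.log2(n)
    return {
        "log(n)": log_n,
        "log^k(n)": log_n ** k,
        "n": n,
        "n log(n)": n * log_n,
        "n^k": n ** k,
        "k^n": k ** n,
        "n!": math.factorial(n),
    }


# print the functions from slowest to fastest growing at n = 64
for name, value in sorted(growth_values(64).items(), key=lambda kv: kv[1]):
    print(f"{name:>9}: {value:.3g}")
```

At this sample point every polylogarithmic value sits below `n`, and the exponential and factorial values dwarf the polynomial ones.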

## Sorting

### Insertion Sort

Walk over the elements in the input array and iteratively put them into another array in the correct order.

#### In-place sorting
Keep the left side of the array sorted and grow it one element at a time, moving to the right.  This **keeps extra memory constant** and allows a **binary search** to locate the insertion index, but each insertion may still require many **'shift right by 1' operations**, so the worst case remains $O(n^2)$.
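
The binary-search variant mentioned above could look roughly like the following. This is a minimal sketch, assuming the input is a plain Python `list`; the name `binary_insert_sort` is made up for this example, and `bisect_right` from the standard library stands in for a hand-written binary search.

```python
from bisect import bisect_right


def binary_insert_sort(values: list) -> list:
    """In-place insertion sort that finds each insertion index with a binary search."""
    for j in range(1, len(values)):
        v = values[j]
        # binary search the sorted prefix values[0:j];
        # bisect_right places v after any equal items, which keeps the sort stable
        i = bisect_right(values, v, 0, j)
        # shift values[i:j] right by one slot, then drop v into the gap
        values[i + 1:j + 1] = values[i:j]
        values[i] = v
    return values
```

The search costs $O(\log n)$ comparisons per element, but the shift is still linear in the worst case, so only the comparison count improves.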
#### Insert into a linked list
Allocate a new linked list to store the sorted values.  This requires more memory and doesn't allow binary searches during insertion (a linked list has no random access), but it avoids shifting items by 1 during insertion.
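
A rough sketch of the linked-list variant is shown here for comparison. The `Node` class and the function name `linked_insert_sort` are hypothetical, introduced only for this example; the result is flattened back into a Python list at the end so it is easy to inspect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def linked_insert_sort(values: list) -> list:
    """Insert each value into a sorted singly linked list, then return it as a list."""
    head: Optional[Node] = None
    for v in values:
        if head is None or v < head.value:
            # new smallest value becomes the head; nothing needs to shift
            head = Node(v, head)
            continue
        # linear scan for the last node whose value is <= v
        cur = head
        while cur.next is not None and cur.next.value <= v:
            cur = cur.next
        # splice the new node in after cur
        cur.next = Node(v, cur.next)

    out = []
    cur = head
    while cur is not None:
        out.append(cur.value)
        cur = cur.next
    return out
```

Each insertion is a constant-time pointer update once the position is known, but finding that position is a linear scan.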

In [None]:
import numpy as np
_in = np.random.randint(50, size=100)

In [None]:
def _shift_right(array: list, s: int, e: int):
    # going backwards allows us to use the space of the item being inserted as swap
    for i in range(e,s,-1):
        array[i] = array[i-1]
    return array

def insert_sort(values: list, in_place: bool = True):
    # sort a copy when in_place is False so the caller's data is left untouched
    _in = values if in_place else values.copy()
    for j in range(1,len(_in)):
        # find correct location for _in[j] in the array _in[0:j]
        for i in range(0,j):
            if _in[j] < _in[i]:
                v = _in[j]
                _shift_right(_in, i, j)
                _in[i] = v
                break
    return _in


### Merge Sort

Split the input array in half, sort those halves, and then merge the sorted halves together.

The tricky part is the merge logic.  Since the two arrays being merged are already sorted, we can advance incrementally across both arrays and only compare an item to a (likely) small subset of the values in the other array.

I have used this merging logic elsewhere to simplify processes (I think when merging potentially overlapping time series data for Prima).

In [None]:
def merge(a: list, b: list) -> list:
    rval = []
    a_idx = 0
    b_idx = 0

    while a_idx < len(a) and b_idx < len(b):
        if a[a_idx] <= b[b_idx]:  # <= takes equal items from a first, keeping the merge stable
            rval.append(a[a_idx])
            a_idx += 1
        else:
            rval.append(b[b_idx])
            b_idx += 1

    rval += a[a_idx:] if a_idx < len(a) else b[b_idx:]
    return rval


merge([1,2,4,5], [3,4,7,8])

In [None]:
def merge_sort(_in):
    if len(_in) <= 1:  # empty and single-item lists are already sorted
        return _in
    else:
        i = len(_in) // 2
        return merge(merge_sort(_in[:i]), merge_sort(_in[i:]))


assert sorted(_in) == merge_sort(_in.tolist())

# Heaps

In [56]:
import math
import toolz as tz

def _parent_idx(i: int) -> int:
    "Returns the index of the ith element's parent within a 0-indexed binary heap (-1 for the root)"
    return (i - 1) // 2

def _height_at(i: int) -> int:
    "Returns the depth of the element at index i, where the root is at level 0"
    return math.floor(math.log2(i+1))


def _max_heapify(_heap: list[int], idx: int) -> list[int]:
    "Sift the element at index idx up its ancestral branch until the max heap property holds"
    j = idx
    pi = _parent_idx(j)
    while pi >= 0 and _heap[pi] < _heap[j]:
        # swap the element with its smaller parent, then continue from the parent's position
        _heap[pi], _heap[j] = _heap[j], _heap[pi]
        j = pi
        pi = _parent_idx(pi)
    return _heap


def insert_in_heap(_heap: list[int], elem: int) -> list[int]:
    return _max_heapify(_heap + [elem], len(_heap))


def build_max_heap(int_seq: list[int]) -> list[int]:
    _heap = int_seq.copy()
    for i in range(len(_heap)):
        _heap = _max_heapify(_heap, i)
    return _heap

def print_binary_tree(tree: list):
    pad_iter = tz.iterate(lambda x: x*2 + 2, 1) # assume 1 space between leaves, 2 character elements
    _pad = [next(pad_iter) for _ in range(_height_at(len(tree)-1)+2)][-1]  # +1 for 0 indexed height calc, +1 since range() excludes end point 
    s = ""
    prev_height = -1
    for i in range(len(tree)):
        _height = _height_at(i)
        if _height > prev_height:
            s +=  "\n" if _height else ""
            prev_height = _height
            _pad = int((_pad / 2)) - 1
            s += (" "*int(_pad/2))
        else:
            s += (" "*_pad)
        s += str(tree[i])

    print(s)


In [57]:
import numpy as np

_list = np.random.randint(10,100,size=18)

_heap = build_max_heap(_list)
print_binary_tree(_heap)

                       92
           87                      84
     87          64          72          61
  77    44    20    15    58    19    48    47
70 16 37


# Factorial

In [None]:
# Recursive
def fac(n):
    if n <= 1:
        return 1
    else:
        return n * fac(n-1)

# Iterative
import toolz as tz

def facI(n):
    # the initial value 1 makes facI(0) return 1 instead of failing on an empty range
    return tz.reduce(lambda x,y: x*y, range(1,n+1), 1)

def facI2(n):
    v = 1
    for i in range(2,n+1):
        v *= i
    return v

assert facI(6) == facI2(6) == fac(6) == 720

# Fibonacci

In [None]:
# recursive
def fib(n):
    if n == 1:
        return 1
    elif n < 1:
        return 0
    else:
        return fib(n-1) + fib(n-2)

# TODO an iterative solution?
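
One possible iterative answer to the TODO above is sketched here. The name `fib_iter` is an assumption; it follows the same convention as the recursive `fib` (0 for `n < 1`, 1 for `n == 1`) and carries only the last two values forward instead of recursing.

```python
def fib_iter(n: int) -> int:
    """Iterative Fibonacci: returns 0 for n < 1 and 1 for n == 1."""
    prev, curr = 0, 1  # fib(0), fib(1)
    for _ in range(n):
        # slide the window forward one step: (fib(i), fib(i+1)) -> (fib(i+1), fib(i+2))
        prev, curr = curr, prev + curr
    return prev


assert [fib_iter(i) for i in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
```

This runs in linear time with constant extra memory, whereas the plain recursive version recomputes the same subproblems many times over.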

In [None]:
# Largest profit attainable and the buy and sell days
prices = [100, 113, 110, 85, 105, 102, 86, 63, 81, 101, 94, 106, 101, 79, 94, 90, 97]


# Brute force approach
# store highest found
# iterate over prices, i = starting date.  Check profit from selling on any remaining date, store max value

best = (0,0,0)
for i,p in enumerate(prices):
    for j in range(i+1, len(prices)):
        profit = prices[j] - p
        if profit > best[2]:
            best = (i,j,profit)

best
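
The brute-force search above checks every buy/sell pair, which is quadratic in the number of days. A single-pass alternative (a different technique from the one above, sketched under the assumption of exactly one buy followed by one later sell) tracks the cheapest day seen so far. The function name `max_profit` is hypothetical.

```python
def max_profit(prices: list) -> tuple:
    """Return (buy_day, sell_day, profit) for one buy followed by one later sell."""
    best = (0, 0, 0)  # same "no trade" default as the brute-force version
    min_idx = 0       # index of the lowest price seen so far
    for j in range(1, len(prices)):
        # the best sale on day j buys at the cheapest earlier day
        profit = prices[j] - prices[min_idx]
        if profit > best[2]:
            best = (min_idx, j, profit)
        if prices[j] < prices[min_idx]:
            min_idx = j
    return best


sample = [100, 113, 110, 85, 105, 102, 86, 63, 81, 101, 94, 106, 101, 79, 94, 90, 97]
print(max_profit(sample))  # (7, 11, 43)
```

For this price list the result is to buy on day 7 at 63 and sell on day 11 at 106.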