# Algorithms & Data Structures

Nathan Sharp | September 2020

# Lecture 1: Introduction

## Some Syllabus Content

__Advanced data structures:__ Data structures for disjoint sets, with the union-by-rank and path-compression "heuristics".  
__Minimum spanning trees:__ Prim's algorithm (using priority queues); Kruskal's algorithm (using disjoint sets).  
__Graph/Network algorithms:__ Network flows, Ford-Fulkerson algorithm for max flow.  
__Geometric algorithms:__ Convex hull of a set of points in two dimensions; Graham's scan algorithm.  

## Basic Definitions

__Model of Computation:__ An abstract sequential computer, called a _Random Access Machine_ or _RAM_. Uniform cost model.  
__Computational Problem:__ A specification in general terms of _inputs_ and _outputs_ and the desired input/output relationship.  
__Problem Instance:__ A particular collection of inputs for a given problem.  
__Algorithm:__ A general method of solving a problem which can be implemented on a computer. Usually there are many algorithms for a given problem.  
__Program:__ Particular implementation of some algorithm.  

## Additional Definitions

__Running Time:__ number of computation steps performed by the algorithm on a given instance (this depends on our _machine model_).  
__Number of basic arithmetic operations:__ abstract way of only counting the essential computing steps.  
_Note:_ These notions are abstractions of the actual running time, which also depends on factors such as:
 - quality of implementation 
 - quality of code generated by compiler
 - the machine used to execute the program

## Worst Case Running Time

_Note:_ The running time depends on the size of the input, $n$, so it is expressed as a function of $n$.  
__The worst case running time of an algorithm $A$:__ the function $T_{A}: \mathbb{N} \to \mathbb{N}$ where $T_{A}(n)$ is the _maximum_ number of computational steps performed by $A$ on an input of size $n$.

## Average Case Running Time

__The average case running time of an algorithm $A$:__ the function $T_{A}: \mathbb{N} \to \mathbb{N}$ where $T_{A}(n)$ is the _average_ number of computational steps performed by $A$ on an input of size $n$.  
For genuine average case analysis we need to know for each $n$ the probability with which each input turns up. Usually we assume that all inputs of size $n$ are equally likely. 
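
To make the difference between the two measures concrete, here is a minimal sketch that counts comparisons for a left-to-right linear search. Linear search is not part of these notes; it is used only as a simple illustration, and the names `linear_search_steps` and `worst_and_average_steps` are hypothetical. The sketch assumes the target is always present and is equally likely to be at each of the $n$ positions.

```python
def linear_search_steps(items, target):
    """Return the number of comparisons a left-to-right linear search makes."""
    steps = 0
    for item in items:
        steps += 1              # one comparison per element inspected
        if item == target:
            break
    return steps


def worst_and_average_steps(n):
    """Return (worst, average) comparison counts over the n inputs in which
    the target sits at position 0, 1, ..., n-1 (assumed equally likely)."""
    items = list(range(n))
    counts = [linear_search_steps(items, target) for target in items]
    return max(counts), sum(counts) / len(counts)


print(worst_and_average_steps(10))   # (10, 5.5)
```

Under these assumptions the worst case is $n$ comparisons and the average is $(n+1)/2$, so both grow linearly in $n$ but differ by a constant factor.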

## Bounds

__Upper bound:__ $T(n)$ is an upper bound for a problem if there is an algorithm which solves the problem with worst-case running time at most $T(n)$.  
__Average-case bound:__ $T(n)$ is an average-case bound for a problem if there is an algorithm which solves the problem with average-case running time at most $T(n)$.  
__Lower bound:__ $T(n)$ is a lower bound for a problem if _every_ algorithm which solves the problem must use at least $T(n)$ time on some instance of size $n$, for infinitely many $n$.

## Problem Questions

_1. Analyse the asymptotic worst-case running times of the following three algorithms for computing $a^{n} \bmod m$_.

def pow_rem1(a,n,m):
    # with fixed-width integers this overflows even for small a, m and moderate n
    # even without overflow, the intermediate numbers are needlessly large
    r = a                 # c1 = O(1)
    for i in range(n-1):  # c2 = O(n)
        r = r * a         # c3 = O(n)
    return r % m          # c4 = O(1)

# T(n) = O(n)



def pow_rem2(a,n,m):
    # better than pow_rem1
    # no integer overflow (unless m is large)
    # more arithmetically efficient - numbers kept small
    x = a % m             # c1 = O(1)
    r = x                 # c2 = O(1)
    for i in range(n-1):  # c3 = O(n)
        r = (r * x) % m   # c4 = O(n)
    return r              # c5 = O(1)

# T(n) = O(n)

def pow_rem3(a,n,m):
    # no integer overflow (unless a,m large)
    # even less numerical arithmetic
    if n == 1:                       # O(1)
        return a % m                 # O(1)
    elif n % 2 == 0:                 # O(1)
        r = pow_rem3(a, n // 2, m)        # O(log n), as n is halved each call
        return r**2 % m                   # O(1)
    else:
        r = pow_rem3(a, (n - 1) // 2, m)  # O(log n); integer division keeps n an int
        return (r**2 % m) * a % m         # O(1)
# T(n) = O(log n)

# pow_rem1(2,5,3)
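
The bounds above can be checked empirically by counting multiplications. The sketch below is not part of the original exercise: `pow_rem_linear_counted` and `pow_rem_squaring_counted` are hypothetical instrumented variants that mirror `pow_rem2` and `pow_rem3` and return the result together with the number of multiplications performed. Both assume $n \geq 1$ and $m \geq 1$, and both count only multiplications, in line with the uniform cost model. For the recursive version the count follows the recurrence $T(n) = T(\lfloor n/2 \rfloor) + O(1)$, which gives $O(\log n)$.

```python
def pow_rem_linear_counted(a, n, m):
    """Mirror of pow_rem2: returns (a**n % m, number of multiplications)."""
    x = a % m
    r = x
    mults = 0
    for _ in range(n - 1):
        r = (r * x) % m
        mults += 1                      # one multiplication per iteration
    return r, mults


def pow_rem_squaring_counted(a, n, m):
    """Mirror of pow_rem3: returns (a**n % m, number of multiplications)."""
    if n == 1:
        return a % m, 0
    # n // 2 equals (n - 1) // 2 when n is odd, so one call covers both cases
    r, mults = pow_rem_squaring_counted(a, n // 2, m)
    if n % 2 == 0:
        return (r * r) % m, mults + 1           # one squaring
    return (r * r % m) * a % m, mults + 2       # one squaring, one multiply by a


a, n, m = 7, 1000, 13
print(pow_rem_linear_counted(a, n, m)[1])     # 999 multiplications
print(pow_rem_squaring_counted(a, n, m)[1])   # 14 multiplications
# Both results agree with Python's built-in three-argument pow
assert pow_rem_linear_counted(a, n, m)[0] == pow(a, n, m)
assert pow_rem_squaring_counted(a, n, m)[0] == pow(a, n, m)
```

For $n = 1000$ the loop performs $n - 1 = 999$ multiplications, while the recursive version performs 14, which is within the $2\lfloor \log_2 n \rfloor = 18$ bound implied by at most two multiplications per halving.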