In [3]:
# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 2200
# Introduction to Algorithms

## Review & SPARC + Cost Model



## Language-based Cost Model

- Define a language to specify algorithms
- Assign a cost to each expression
- Cost of algorithm is sum of costs for each expression

> For a given expression $e$ [a series of statements], we will analyze the work $W(e)$ and span $S(e)$

## SPARC Example 


<br><br>
<p> <span>\[\begin{array}{l}  
\texttt{let}\\   
~~~~x = 2 + 3\\  
~~~~f (w) = (w * 4, w - 2)\\  
~~~~(y,z) = f(x-1)\\  
\texttt{in}\\   
~~~~x + y + z\\  
\texttt{end}   
\end{array}\]</span></p>
<br><br>
$$x = 2 + 3 = 5$$

$$f(4) \rightarrow (16, 2)$$

$$x + y + z= 5 + 16 + 2 = 23$$
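The evaluation above can be mirrored in Python as a rough sketch. The function name `let_example` is an assumption introduced for illustration: its local definitions play the role of the bindings, and its return value plays the role of the body between `in` and `end`.

```python
def let_example():
    # bindings (the part between `let` and `in`)
    x = 2 + 3

    def f(w):
        return (w * 4, w - 2)

    y, z = f(x - 1)  # f(4) -> (16, 2)

    # body (the part between `in` and `end`)
    return x + y + z


print(let_example())  # 23
```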

<br><br>
**binding**: associate entities (data or code) with identifiers.

<br>

**let expression:**

**let**  
$\:\: b^+$  
**in**  
$\:\:e$  
**end**

Expression $e$ is evaluated using the bindings defined inside **let**.

<br><br>
**expression** *e*: describes a computation  
- **evaluating** an expression produces its value






### Factorial function
<br><br>
<p> <span>\[\begin{array}{l}  
\texttt{let}\\   
~~~~f(i)=\texttt{if}~(i<2) \\
~~~~~~~~~~~~~~~~~~~~~\texttt{then}~i \\
~~~~~~~~~~~~~~~~~\texttt{else}\\
~~~~~~~~~~~~~~~~~~~~~~i*f(i-1) \\
\texttt{in}\\   
~~~~f(5)\\  
\texttt{end}   
\end{array}\]</span></p>
<br><br>

In [2]:
def factorial(i):
    if i < 2:
        return i
    else:
        return i * factorial(i-1)
factorial(5)

120

## Composition [Work & Span]
<center>
<img src="figures/composition.png" width="60%"/>
</center>

####  $(e_1, e_2)$: Sequential Composition

Add work and span 

$W(e_1, e_2) = 1 + W(e_1) + W(e_2)$

$S(e_1, e_2) = 1 + S(e_1)+ S(e_2)$ 

####  $(e_1 || e_2)$: Parallel Composition

Add work but **take the maximum span** 

$W(e_1 || e_2) = 1 + W(e_1) + W(e_2)$

$S(e_1 || e_2) = 1 + \max(S(e_1), S(e_2))$  
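The two composition rules can be written down directly as a small sketch. The `Cost` type and the function names `sequential` and `parallel` are hypothetical names chosen for this illustration; they are not part of SPARC.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    work: int
    span: int


def sequential(c1: Cost, c2: Cost) -> Cost:
    # (e1, e2): add both work and span
    return Cost(1 + c1.work + c2.work, 1 + c1.span + c2.span)


def parallel(c1: Cost, c2: Cost) -> Cost:
    # (e1 || e2): add work, take the maximum span
    return Cost(1 + c1.work + c2.work, 1 + max(c1.span, c2.span))


e1 = Cost(work=10, span=4)
e2 = Cost(work=6, span=6)
print(sequential(e1, e2))  # Cost(work=17, span=11)
print(parallel(e1, e2))    # Cost(work=17, span=7)
```

The work is identical in both cases; only the span benefits from running the two expressions in parallel.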


### SPARC SumList
<p><span class="math display">\[\begin{array}{l}   
\\\\
\mathit{sumList}~a =   
\\  
~~~~~~~~~\texttt{if}~|a| = 1~\texttt{then}   
\\  
~~~~~~~~~~~~~\texttt{return}~a[0]  
\\  
~~~~~~~~~\texttt{else}  \\
~~~~~~~~~~~~\texttt{let}\\   
~~~~~~~~~~~~~~~~(l, r)=\texttt{splitMid}~a \\
~~~~~~~~~~~~~~~~(l',r')=(\mathit{sumList}~l~||~\mathit{sumList}~r)\\
~~~~~~~~~~~~\texttt{in}\\  
~~~~~~~~~~~~~~~~l'+r'\\
~~~~~~~~~~~~\texttt{end}   
\end{array}\]</span></p>


<!-- <p><span class="math display">\[\begin{array}{l}   
\\\\
\mathit{sumList}~a =   
\\  
~~~~~~~~~\texttt{if}~|a| = 1~\texttt{then}   
\\  
~~~~~~~~~~~~~\texttt{return}~a  
\\  
~~~~~~~~~\texttt{else}  \\
~~~~~~~~~~~~\texttt{let}\\   
~~~~~~~~~~~~~~~~(l, r)=\texttt{splitMid}~a \\
~~~~~~~~~~~~~~~~(l',r')=(\mathit{sumList}~l,~\mathit{sumList}~r)\\
~~~~~~~~~~~~\texttt{in}\\  
~~~~~~~~~~~~~~~~l'+r'\\
~~~~~~~~~~~~\texttt{end}   
\end{array}\]</span></p> -->

<center>
<img src="figures/recursion.png" width="80%"/>
</center>

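In [None]:
from concurrent.futures import ThreadPoolExecutor

# Assumed helper: the notebook does not show how in_parallel is defined.
# This sketch runs the two calls in separate threads and returns both results.
def in_parallel(f1, arg1, f2, arg2):
    with ThreadPoolExecutor(max_workers=2) as executor:
        future1 = executor.submit(f1, arg1)
        future2 = executor.submit(f2, arg2)
        return future1.result(), future2.result()

# recursive, serial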
def sum_list_recursive(mylist):    
    print('summing %s' % mylist)
    
    if len(mylist) == 1:
        return mylist[0]
    
    return (
        sum_list_recursive(mylist[:len(mylist)//2]) +
        sum_list_recursive(mylist[len(mylist)//2:])
    )

# recursive, parallel
def sum_list_recursive_parallel(mylist):    
    print('summing %s' % mylist)
    if len(mylist) == 1:
        return mylist[0]
    
    # each thread spawns more threads
    result1, result2 = in_parallel(
        sum_list_recursive_parallel, mylist[:len(mylist)//2],
        sum_list_recursive_parallel, mylist[len(mylist)//2:]
    )
    print('>>>merging %s and %s' % (result1, result2))
    return result1 + result2
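Using the parallel composition rule, the cost of `sumList` can be tallied with a small sketch. It assumes that `splitMid`, the base case, and the final addition together cost one unit per call (an idealization, since Python list slicing copies elements), and that the input length is a power of two. The function name `sum_list_cost` is hypothetical.

```python
def sum_list_cost(n):
    """Return (work, span) for summing n elements, n a power of two."""
    if n == 1:
        return (1, 1)
    w, s = sum_list_cost(n // 2)  # both halves have the same cost
    work = 1 + w + w              # 1 + W(l) + W(r)
    span = 1 + max(s, s)          # 1 + max(S(l), S(r))
    return (work, span)


for n in [1, 2, 4, 8, 16]:
    print(n, sum_list_cost(n))
```

Under these assumptions the work grows as $2n - 1$ while the span grows as $\lg n + 1$.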


## Recurrences


Recurrences are a way to capture the behavior of recursive algorithms.

Key ingredients: 

- Base case ($n = c$): constant time 
- Inductive case ($n > c$): recurse on smaller instance and use output to compute solution

Recursion is also a way of viewing how an algorithm executes: an iterative specification can often be reframed as a recursive one.



In [2]:
def selection_sort(L):
    for i in range(len(L)):
        print(L)
        m = L.index(min(L[i:]), i)  # search from i so equal values in the sorted prefix are skipped
        L[i], L[m] = L[m], L[i]
    return L
                   
selection_sort([2, 1, 4, 3, 9])

[2, 1, 4, 3, 9]
[1, 2, 4, 3, 9]
[1, 2, 4, 3, 9]
[1, 2, 3, 4, 9]
[1, 2, 3, 4, 9]


[1, 2, 3, 4, 9]

In [3]:
def selection_sort_recursive(L):
    print('L=%s' % L)
    if len(L) <= 1:  # also handles the empty list
        return L
    else:
        m = L.index(min(L))
        L[0], L[m] = L[m], L[0]
        return [L[0]] + selection_sort_recursive(L[1:])
    
selection_sort_recursive([2, 1, 999, 4, 3])

L=[2, 1, 999, 4, 3]
L=[2, 999, 4, 3]
L=[999, 4, 3]
L=[4, 999]
L=[999]


[1, 2, 3, 4, 999]

Are these the same algorithm? Can we give a SPARC specification?

<p><span class="math display">\[\begin{array}{l}  
\mathit{selectionsort}~~L = 
\\  
~~~~~~~~~\texttt{if}~|L| = 1~\texttt{then}    
\\  
~~~~~~~~~~~~~\texttt{return}~~L  
\\  
~~~~~~~~~\texttt{else}
\\
~~~~~~~~~~~~\texttt{let}\\
~~~~~~~~~~~~~~~m = \texttt{minimum element in}~~L\\
~~~~~~~~~~~~\texttt{in}\\ 
~~~~~~~~~~~~~~~\texttt{Cons}(m, (\mathit{selectionsort}~~\langle x \mid x\in L~~\texttt{and}~~x\neq m \rangle)) \\
~~~~~~~~~~~~\texttt{end} 
\end{array}\]</span></p>



What is the work and why?

$\begin{eqnarray}
W(n) &=& W(n-1) + n \\
 &=& W(n-2) + (n-1) + n = W(n-2)+2n-1 \\
&\vdots&
\end{eqnarray}$


$\begin{eqnarray}
W(n) &=& \sum_{i=1}^n i  \\
&=& \frac{n(n+1)}{2}  = \Theta(n^2)\\
&\in& O(n^2).
\end{eqnarray}$
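The closed form can be checked numerically against the recurrence. This is a minimal sketch that assumes a base case of $W(1) = 1$ and a cost of exactly $n$ for finding the minimum of $n$ elements; the function name `selection_sort_work` is introduced only for this check.

```python
def selection_sort_work(n):
    """Evaluate W(n) = W(n-1) + n with W(1) = 1, iteratively."""
    work = 1
    for size in range(2, n + 1):
        work += size
    return work


for n in [1, 2, 10, 100]:
    # second and third columns should agree
    print(n, selection_sort_work(n), n * (n + 1) // 2)
```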


The recurrence for Selection Sort is relatively simple. What if we have multiple recursive calls and split the input, as *divide-and-conquer* algorithms do?

We'll look at methods to solve recurrences in order to obtain big-O bounds for recursive algorithms.

We will:
- Get intuition for recurrences by looking at the recursion tree. 

- Develop the **brick** method to quickly state asymptotic bounds on a recurrence by looking at the shape of the tree.

Let's look at the specification and recurrence for Merge Sort: 

<p><span class="math display">\[\begin{array}{l}  
\mathit{mergeSort}~a =  
\\   
~~~~\texttt{if}~|a| \leq 1~\texttt{then}  
\\   
~~~~~~~~a  
\\  
~~~~\texttt{else}  
\\   
~~~~~~~~\texttt{let}  
\\  
~~~~~~~~~~~~(l,r) = \mathit{splitMid}~a  
\\   
~~~~~~~~~~~~(l',r') = (\mathit{mergeSort}~l \mid\mid{} \mathit{mergeSort}~r)  
\\  
~~~~~~~~\texttt{in}  
\\   
~~~~~~~~~~~~\mathit{merge} (l',r')  
\\  
~~~~~~~~\texttt{end}  
\end{array}\]</span></p>

Suppose that the merging step can be done with $O(n)$ work and $O(\log n)$ span. Then the recurrence for the work is: 

$ \begin{equation}
W(n) = \begin{cases}
  O(1)= c_b, & \text{if $n=1$} \\
  2W(n/2) + O(n) = 2W(n/2) + c_1n + c_2, & \text{otherwise} 
  \end{cases}
\end{equation}$

How do we solve this recurrence to obtain $W(n) = O(n\log n)$?





![alttext](figures/mergesort_tree.png)

The recursion tree for Merge Sort has linear work at every level. There are a logarithmic number of levels, and the $n$ leaves contribute constant work each, so we obtain an asymptotic bound of $O(n\log n)$ for the work.
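For reference, here is a sequential Python sketch of the Merge Sort specification. It is an illustration only: the two recursive calls run one after the other rather than in parallel, and this `merge` has $O(n)$ work but also $O(n)$ span, so it does not achieve the $O(\log n)$ merge span assumed in the analysis.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list with O(n) work."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # at most one of these still has elements left
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2                 # splitMid
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    return merge(left, right)


print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```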

## Solving Recurrences with the Tree Method 

<br>


<div>size at level $i$</div> <div style="text-align: right"> cost at level $i$ </div>

![merge-tree.jpg](figures/merge-tree.jpg)

### Recipe: 
1. Determine the cost of each level $i$ to be $c_i$ ($i$ starts at $0$).
2. Determine the number of levels $h$.
3. Cost = $\sum_{i=0}^{h} c_i$
  - This last step usually involves using properties of series
  
<br>

E.g., for merge sort:

- level $i$ contains $2^i$ nodes
- each node at level $i$ costs $c_1 \frac{n}{2^i} + c_2$
- so, each level costs $2^i * (c_1 \frac{n}{2^i} + c_2) = c_1n + 2^i c_2$
- since each level reduces size by half, the levels run from $i = 0$ to $i = \log_2n$
- so, total cost of tree is:

$$W(n) = \sum_{i=0}^{\log_2n} (c_1n + 2^i c_2)$$

To solve this, we'll make use of bounds for **geometric series**. 

For $\alpha > 1$: 
$\:\:\: \sum_{i=0}^n \alpha^i <\frac{\alpha}{\alpha - 1}\cdot\alpha^n$

e.g., $\sum_{i=0}^{\log_2n} 2^i < \frac{2}{1} * 2^{\log_2n} = 2n$

For $0 < \alpha < 1$: 
$\:\:\: \sum_{i=0}^n \alpha^i < \sum_{i=0}^\infty \alpha^i = \frac{1}{1-\alpha}$

e.g., $\sum_{i=0}^{\log_2n} \frac{1}{2^i} < 2$
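Both bounds are easy to check numerically. The sketch below uses exact rational arithmetic (`fractions.Fraction`) so that sums very close to the bound are not rounded up to it; the helper name `geometric_sum` is an assumption made for this example.

```python
from fractions import Fraction


def geometric_sum(alpha, n):
    """Return alpha^0 + alpha^1 + ... + alpha^n exactly."""
    return sum(alpha ** i for i in range(n + 1))


# alpha > 1: the sum is dominated by its last term
alpha = Fraction(2)
n = 10
print(geometric_sum(alpha, n), alpha / (alpha - 1) * alpha ** n)  # 2047 2048

# 0 < alpha < 1: every finite sum stays below 1 / (1 - alpha)
alpha = Fraction(1, 2)
for n in [0, 1, 10, 60]:
    assert geometric_sum(alpha, n) < 1 / (1 - alpha)
```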


<br> plugging in...

$$W(n) = \sum_{i=0}^{\log_2n} (c_1 n + 2^i c_2)$$

$$= \sum_{i=0}^{\log_2n}c_1 n + \sum_{i=0}^{\log_2n} 2^i c_2$$

$$= c_1n \sum_{i=0}^{\log_2n} 1 + c_2 \sum_{i=0}^{\log_2n} 2^i$$

$$< c_1n (\log_2n + 1) + 2 c_2 n$$

$$\in O(n \lg n)$$
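The level-by-level sum can be compared with a direct evaluation of the recurrence. This sketch assumes $n$ is a power of two and picks the base case $c_b = c_1 + c_2$ so that the leaf level of the tree matches the recurrence exactly; the constants and function names are arbitrary choices for illustration.

```python
def work_recurrence(n, c1, c2):
    """W(n) = 2 W(n/2) + c1*n + c2, with W(1) = c1 + c2."""
    if n == 1:
        return c1 + c2
    return 2 * work_recurrence(n // 2, c1, c2) + c1 * n + c2


def work_level_sum(n, c1, c2):
    """Sum of (c1*n + 2^i * c2) over levels i = 0 .. log2(n)."""
    levels = n.bit_length() - 1  # log2(n) when n is a power of two
    return sum(c1 * n + (2 ** i) * c2 for i in range(levels + 1))


c1, c2 = 3, 5
for k in range(11):
    n = 2 ** k
    assert work_recurrence(n, c1, c2) == work_level_sum(n, c1, c2)
    print(n, work_level_sum(n, c1, c2))
```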

What about the span?

![alttext](figures/tree.png)


The recurrence for the span of Mergesort is:

$ \begin{equation}
S(n) = \begin{cases}
  c_3, & \text{if $n=1$} \\
  S(n/2) + c_4 \lg n, & \text{otherwise} 
  \end{cases}
\end{equation}$


Since the two recursive calls run in parallel and all nodes on a level have the same cost, the span follows a single root-to-leaf path. Dropping the constants, we have that

$ \begin{align}
S(n) & = \sum_{i=0}^{\lg n} \lg\frac{n}{2^i}\\
& = \sum_{i=0}^{\lg n} (\lg n - i)\\
& = \sum_{i=0}^{\lg n} (\lg n) - \sum_{i=1}^{\lg n} i\\
& = \lg n * (\lg n+1)  - \frac{1}{2}\lg n * (\lg n+1) \:\: (\hbox{using}\:\:\sum_{i=1}^n i = \frac{n(n+1)}{2})\\
& = \frac{1}{2}\lg^2 n + \frac{1}{2}  \lg n\\
& \in O(\lg^2 n)\\
\end{align}$
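The closed form for the span can be checked against the recurrence. To match the summation, which drops constants, this sketch assumes $c_3 = 0$ and $c_4 = 1$, and it assumes $n$ is a power of two. The function name `span_recurrence` is hypothetical.

```python
def span_recurrence(n):
    """S(n) = S(n/2) + lg n with S(1) = 0, for n a power of two."""
    if n == 1:
        return 0
    lg_n = n.bit_length() - 1
    return span_recurrence(n // 2) + lg_n


for k in range(11):
    n = 2 ** k
    closed_form = k * (k + 1) // 2  # (lg^2 n + lg n) / 2
    assert span_recurrence(n) == closed_form
    print(n, span_recurrence(n))
```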


### One More Recurrence
    
<br>
$ \begin{equation}
W(n) = \begin{cases}
  c_b, & \text{if $n=1$} \\
  2W(n/2) + O(n^2), & \text{otherwise} 
  \end{cases}
\end{equation}$

What is the asymptotic runtime?

![alttext](figures/tree.png)



$$W(n) = 2W(n/2) + c_1n^2 + c_2$$

<img width="70%" src="figures/n_squared.png"/>

$= \sum_{i=0}^{\lg n} (c_1 \frac{n^2}{2^i} + 2^i c_2)$

$= c_1 n^2 \sum_{i=0}^{\lg n} \frac{1}{2^i} + c_2 \sum_{i=0}^{\lg n} 2^i$

$< 2 c_1 n^2 + 2 c_2 n$

$\in O(n^2)$
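Listing the cost of each level makes the geometric decrease visible. This is a sketch with assumed constants $c_1 = 1$ and $c_2 = 0$ and $n$ a power of two; the function name `level_costs` is introduced only for this illustration.

```python
def level_costs(n, c1=1, c2=0):
    """Cost of each level of the tree for W(n) = 2 W(n/2) + c1*n^2 + c2."""
    levels = n.bit_length() - 1  # log2(n) when n is a power of two
    return [c1 * n * n // 2 ** i + 2 ** i * c2 for i in range(levels + 1)]


costs = level_costs(16)
print(costs)       # [256, 128, 64, 32, 16]
print(sum(costs))  # 496, below the bound 2 * 16^2 = 512
```

The root contributes more than half of the total, which is why the bound is a constant multiple of the root cost $n^2$.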

What if the branching factor is not 2?

$$W(n) = 4 W \Big(\frac{n}{2}\Big) + O(n)$$

**costs**

- level 0: $c_1n + c_2$
- level 1: $4(c_1 \frac{n}{2} + c_2)$
- level 2: $16(c_1 \frac{n}{4} + c_2)$
- level $i$ ?

$$4^i(c_1 \frac{n}{2^i} + c_2)$$

<br>

There are still $\lg n$ levels, so $W(n)$ is:

<br>

$$= c_1n \sum_{i=0}^{\lg n} \Big(\frac{4}{2}\Big)^i + c_2 \sum_{i=0}^{\lg n} 4^i$$

$$< 2 c_1 n^2 + \frac{4}{3} c_2 4^{\lg n}$$

$$(\hbox{since} \:\:\ \sum_{i=0}^n \alpha^i  < \frac{\alpha}{\alpha - 1}\cdot\alpha^n)$$

$$= 2 c_1 n^2 + \frac{4}{3} c_2 2^{\lg n} 2^{\lg n}$$

$$= 2 c_1 n^2 + \frac{4}{3} c_2 n^2$$

$$\in O(n^2)$$
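With a branching factor of 4 the level costs grow instead of shrink. The sketch below lists them under the assumed constants $c_1 = 1$ and $c_2 = 0$, with $n$ a power of two; the function name `level_costs_branch4` is hypothetical.

```python
def level_costs_branch4(n, c1=1, c2=0):
    """Cost of each level of the tree for W(n) = 4 W(n/2) + c1*n + c2."""
    levels = n.bit_length() - 1  # log2(n) when n is a power of two
    return [4 ** i * (c1 * n // 2 ** i + c2) for i in range(levels + 1)]


costs = level_costs_branch4(16)
print(costs)       # [16, 32, 64, 128, 256]
print(sum(costs))  # 496, below the bound 2 * 16^2 = 512
```

Here the leaves dominate: the last level alone costs $n^2$, so the total is again a constant multiple of $n^2$ even though each node does only linear work.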

## Summary 

- Algorithm Comparison [**Worst Case**, Asymptotic Analysis] $\mathcal{O}(),~~ \Omega(), ~~\Theta()$ -> Limit Method
- Parallelism -> Speedup -> Dependency [**Work $T_1$ \& Span $T_\infty$**]
- Divide and Conquer -> Greedy Scheduling
- Functional Language [SPARC]
- Language-based Work-Span model

