# Introduction to Computation and Python Programming

## Lecture 7

### Today
----------

- Algorithmic Complexity

### Computational Complexity

How long will the following function take to run?

```python
def f(i):
    """Assumes i is an int and i > 0"""
    answer = 1
    while i >= 1:
        answer *= i
        i -= 1
    return answer
```


### How to Measure

- measure with a **timer**
- **count** the operations
- abstract notion of **order of growth**



### Timing

- use the `time` module
- see code
- GOAL: to evaluate different algorithms
- running time **varies between algorithms** **&#x2611;**
- running time **varies between Python implementations** **&#x2612;**
- running time **varies between computers** **&#x2612;**
- running time is **not predictable** based on small inputs **&#x2612;**
- time varies for different inputs, but timing alone cannot express a relationship between input size and time **&#x2612;**
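
A minimal sketch of the timing approach, assuming `time.perf_counter` from the standard library's `time` module as the timer. The lecture's own timing code is not shown here, so the helper name `time_call` is hypothetical. The printed values differ between machines and Python implementations, which is the weakness listed above.

```python
import time

def f(i):
    """Assumes i is an int and i > 0"""
    answer = 1
    while i >= 1:
        answer *= i
        i -= 1
    return answer

def time_call(func, arg):
    """Hypothetical helper: return (result, seconds elapsed) for func(arg)."""
    start = time.perf_counter()
    result = func(arg)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Time the same function on inputs of increasing size.
for n in (10, 100, 1000):
    _, seconds = time_call(f, n)
    print(f"f({n}) took {seconds:.6f} s")
```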


### Counting Operations

- see code
- Assume each line of code takes one unit of time
- Then running time of this function is:
\begin{equation*}
1000 + x + 2x^2
\end{equation*}
---
- f(10) = 1210
- f(1000) = 2002000
- For small values of x the constant term dominates; for large values the $2x^2$ term dominates
---
- GOAL: to evaluate different algorithms
- count **depends on algorithm** **&#x2611;**
- count **depends on implementations** **&#x2612;**
- count **independent of computers** **&#x2611;**
- no clear definition of **which operations** to count **&#x2612;**
- count varies for different inputs, and we can derive a relationship between input size and count **&#x2611;**
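
The counted function itself is not shown in these notes, so the following is only a sketch of a function whose step count would match $1000 + x + 2x^2$ under the one-unit-per-step assumption. The name `count_steps` and the loop structure are assumptions chosen to reproduce the formula.

```python
def count_steps(x):
    """Hypothetical function: return the number of unit-time steps executed."""
    steps = 0
    for _ in range(1000):      # constant part: 1000 steps
        steps += 1
    for _ in range(x):         # linear part: x steps
        steps += 1
    for _ in range(x):         # quadratic part: x * x iterations,
        for _ in range(x):
            steps += 2         # each counted as 2 steps
    return steps

for x in (10, 1000):
    print(x, count_steps(x))
# prints:
# 10 1210
# 1000 2002000
```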

### We need a better way

- timing and counting **evaluate implementations**
- timing **evaluates machines**
<br><br>
- want to **evaluate algorithm**
- want to **evaluate scalability**
- want to **evaluate in terms of input size**

### A better way
- Focus on counting but ignore small variations in implementation (does a loop have 3 or 5 operations?)
- Focus on how long the algorithm takes on very large inputs
- In the example, do we care whether the inner loop takes $x^2$ or $2x^2$ steps?
- We should probably look for a more efficient algorithm
---
Rules of Thumb:
- If the running time is the sum of multiple terms, keep the one with the largest growth rate, and drop the others
- If the remaining term is a product, drop any constant factors
---
This is called **"Big O"** notation
- Asymptotic upper bound on the growth of the function (called **order of growth**)
    - e.g. $f(x) \in O(x^2)$ means that the function f grows no faster than a quadratic polynomial $x^2$, in an asymptotic sense
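
As a worked check of the rules of thumb, the counting example $1000 + x + 2x^2$ can be bounded directly. For $x \ge 1$ we have $1000 \le 1000x^2$ and $x \le x^2$, so:
\begin{equation*}
1000 + x + 2x^2 \le 1000x^2 + x^2 + 2x^2 = 1003x^2
\end{equation*}
Hence $1000 + x + 2x^2 \in O(x^2)$. The constant 1003 is just one convenient choice; any larger constant works as well.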
   

### Types of Orders of Growth or Complexity Classes

![complexity classes](diagrams/complexity-classes.png)

|Complexity Classes||
|---|:---|
|$O(1)$|Constant Time|
|$O(\log n)$|Logarithmic Time|
|$O(n)$|Linear Time|
|$O(n \log n)$|Log-Linear Time|
|$O(n^c)$|Polynomial Time|
|$O(c^n)$|Exponential Time|
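
To get a feel for how differently these classes grow, the sketch below evaluates one representative function per class for a few input sizes. The dictionary `classes` and the choice $c = 2$ for both $n^c$ and $c^n$ are assumptions made for illustration.

```python
import math

# One representative step-count function per complexity class.
# c = 2 is an arbitrary choice for the polynomial and exponential rows.
classes = {
    "O(1)": lambda n: 1,
    "O(log n)": lambda n: math.log2(n),
    "O(n)": lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)": lambda n: n ** 2,
    "O(2^n)": lambda n: 2 ** n,
}

for n in (10, 20, 40):
    row = ", ".join(f"{name}: {fn(n):,.0f}" for name, fn in classes.items())
    print(f"n = {n} -> {row}")
```

Even at n = 40 the exponential entry is already above one trillion, while the log-linear entry is still in the hundreds.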

### Combining Complexity Classes

#### Law of Addition for O()
- used with **sequential** statements
- $O(f(n)) + O(g(n)) = O(f(n) + g(n))$
- e.g. <br>
```python
for i in range(n):
    print('a')
for j in range(n*n):
    print('b')
```
- $O(n) + O(n*n) = O(n+n^2) = O(n^2)$
---
#### Law of multiplication for O()
- used with **nested** statements / loops
- $O(f(n)) * O(g(n)) = O(f(n)*g(n))$
- e.g. <br>
```python
for i in range(n):
    for j in range(n):
        print('a')
```
- $O(n) * O(n) = O(n*n) = O(n^2)$
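
Both laws can be checked by counting loop iterations instead of printing. This is a small sketch; the function names `sequential_steps` and `nested_steps` are hypothetical.

```python
def sequential_steps(n):
    """Count iterations of two loops that run one after the other."""
    steps = 0
    for i in range(n):
        steps += 1
    for j in range(n * n):
        steps += 1
    return steps              # n + n^2 iterations in total

def nested_steps(n):
    """Count iterations of one loop nested inside another."""
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += 1
    return steps              # n * n iterations in total

for n in (10, 100):
    print(n, sequential_steps(n), nested_steps(n))
# prints:
# 10 110 100
# 100 10100 10000
```

As n grows, the two counts differ only by the lower-order term n, which is why both are $O(n^2)$.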

### Complexity Growth

![Complexity Growth](diagrams/complexity-growth.png)



### Linear Complexity

Simple iterative loop algorithms are typically linear in complexity

```python
def linear_search(L, e):
    for i in range(len(L)):
        if e == L[i]:
            return True
    return False
```

- must look through all elements to decide it's not there
- $O(len(L))$ for the loop * $O(1)$ to test if e == L[i]
- Overall complexity is $O(n)$ where $n$ is $len(L)$

### Sorted List - Linear Search

```python
def linear_search_sorted(L, e):
    for i in range(len(L)):
        if L[i] == e:
            return True
        if L[i] > e:
            return False
    return False
```

- only needs to look until it reaches a number greater than e
- $O(len(L))$ for the loop * $O(1)$ to test if e == L[i]
- overall complexity is still $O(n)$, where $n$ is $len(L)$, because the worst case is no different from the unsorted case
- although order of growth is the same, run time may differ for the two search methods
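
The difference in actual work can be made visible by counting how many elements each search examines. The sketch below uses instrumented variants of the two functions; the `_count` names and the tuple return value are assumptions added for illustration.

```python
def linear_search_count(L, e):
    """Return (found, number of elements examined) for a plain scan."""
    for i in range(len(L)):
        if e == L[i]:
            return True, i + 1
    return False, len(L)

def linear_search_sorted_count(L, e):
    """Same, but stop once an element greater than e is seen.
    Assumes L is sorted in ascending order."""
    for i in range(len(L)):
        if L[i] == e:
            return True, i + 1
        if L[i] > e:
            return False, i + 1
    return False, len(L)

L = list(range(0, 2000, 2))                   # 1000 even numbers: 0, 2, ..., 1998
print(linear_search_count(L, 5))              # (False, 1000)
print(linear_search_sorted_count(L, 5))       # (False, 4)
print(linear_search_sorted_count(L, 1999))    # (False, 1000)
```

The sorted search stops early for a small missing value, but a missing value larger than every element still forces a full scan, so the worst case is unchanged.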


### Quadratic Complexity

Nested loops, i.e. loops that have loops in them
<br>
e.g. determine if one list is a subset of a second, i.e. every element of the first appears in the second (assume no duplicates)

```python
def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True
```

- outer loop is executed len(L1) times
- each iteration executes the inner loop up to len(L2) times, with a constant number of operations per pass
- $O(len(L1)*len(L2))$
- worst case when L1 and L2 are the same length and all of the elements of L1 are in L2
- $O(len(L1)^2)$
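
The quadratic growth can be observed by counting the `e1 == e2` comparisons. The sketch below is an instrumented variant of `isSubset`; the name `subset_comparisons` and the tuple return value are assumptions for illustration.

```python
def subset_comparisons(L1, L2):
    """Return (is_subset, number of e1 == e2 comparisons performed)."""
    comparisons = 0
    for e1 in L1:
        matched = False
        for e2 in L2:
            comparisons += 1
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False, comparisons
    return True, comparisons

# Same-length lists where every element of L1 is in L2.
for n in (10, 100, 1000):
    L = list(range(n))
    print(n, subset_comparisons(L, L))
# prints:
# 10 (True, 55)
# 100 (True, 5050)
# 1000 (True, 500500)
```

Here the count is $n(n+1)/2$ because of the early `break`, and multiplying n by 10 multiplies the count by roughly 100, as expected for $O(n^2)$.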


### Another example of Quadratic Complexity

Find the intersection of two lists and return a list with each element appearing only once

```python
def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    # now to dedupe
    res = []
    for e in tmp:
        if not (e in res):
            res.append(e)
    return res
```

- first nested loop takes $len(L1)*len(L2)$ steps
- second loop takes at most $len(L1)$ steps (assuming no duplicates within each list, so `tmp` is no longer than L1)
- determining if an element is in `res` might take up to $len(L1)$ steps on each pass
- if we assume lists are roughly of the same length, then
    - $O(len(L1)^2)$
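
A short usage sketch of `intersect`, repeated here so the example runs on its own. The input lists are made up for illustration; the second call shows why the dedupe step is needed when the second list contains a repeated value.

```python
def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    # now to dedupe
    res = []
    for e in tmp:
        if not (e in res):
            res.append(e)
    return res

print(intersect([1, 2, 3, 4], [3, 4, 5, 6]))   # [3, 4]
print(intersect([1, 2, 3], [2, 2, 3]))         # [2, 3]
```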