## Measuring Performance
- Two main resources of interest 
    - Running time: How long algorithm takes
    - Space: memory requirement
---
### Input Size
- Running time depends on input size
    - Larger array takes longer to sort
- Measure time efficiency as function of **input size**
    - input size - `n`
    - Running time `t(n)`
---
#### Example 1: SIM vs Aadhar Cards
- Background
    - `n` ~ $10^{9}$: Number of people in India
    - `n` ~ $10^{9}$: Number of SIM cards in India
    - For every SIM, check whether it has a matching Aadhar
    - for loop on SIM cards (`n`)
        - Nested for loop on Aadhar cards (`n`)
        - Check if SIM No. == Phone No. on Aadhar
        - If no matching Aadhar found, fraud SIM
- Naive algorithm: `t(n)` ~ $n^{2}$
    - Running a nested for loop
- Clever algorithm *(binary search)* : `t(n)` ~ $n \log_{2} n$
    - Sort the Aadhar phone numbers once, then binary search for each SIM
    - $\log_{2} n$: number of times you need to divide `n` by 2 to reach 1
    - $\log_{2}(n)\ =\ k\ \Rightarrow \ n\ =\ 2^{k}$
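
A minimal sketch of the two approaches, assuming the SIM numbers and the phone numbers recorded on Aadhar cards are available as plain Python lists. The function names and the sample data are hypothetical. The second version sorts the Aadhar phone numbers once and then relies on binary search from the standard `bisect` module.

```python
from bisect import bisect_left

def fraud_sims_naive(sims, aadhar_phones):
    """Nested loops: about n * n comparisons in the worst case."""
    fraud = []
    for sim in sims:
        found = False
        for phone in aadhar_phones:
            if sim == phone:
                found = True
                break
        if not found:
            fraud.append(sim)
    return fraud

def fraud_sims_binary(sims, aadhar_phones):
    """Sort once, then binary search per SIM: about n * log2(n) comparisons."""
    phones = sorted(aadhar_phones)
    fraud = []
    for sim in sims:
        pos = bisect_left(phones, sim)  # leftmost position where sim could sit
        if pos == len(phones) or phones[pos] != sim:
            fraud.append(sim)
    return fraud

# Hypothetical sample data
sims = [9001, 9002, 9003, 9004]
aadhar_phones = [9004, 9001, 9003]
print(fraud_sims_naive(sims, aadhar_phones))   # [9002]
print(fraud_sims_binary(sims, aadhar_phones))  # [9002]
```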
---
### Orders of Magnitude
- Describes how fast a function grows as its input grows
- It is typically expressed in terms of powers of the input size (`n`)
- Example: A function with order of magnitude $n^{2}$ has a quadratic growth rate
---
#### Ignore Constants
- When comparing `t(n)` focus on orders of magnitude
    - Ignore constant factors
- `f(n)` = $n^{3}$ eventually grows faster than `g(n)` = $5000n^{2}$
    - After `n = 5000` `f(n)` overtakes `g(n)` 
    - At `n = 5000` both become $5000^{3}$
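
The crossover point can be checked directly. This is only a numeric illustration of the two functions named above, evaluated at a few hand-picked values of `n`.

```python
def f(n):
    return n ** 3

def g(n):
    return 5000 * n ** 2

# Columns: n, f(n) < g(n), f(n) == g(n), f(n) > g(n)
for n in (10, 4999, 5000, 5001, 10 ** 6):
    print(n, f(n) < g(n), f(n) == g(n), f(n) > g(n))
```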
---
### Asymptotic complexity
We look for the asymptotic complexity
- We are not interested in a fixed `n`
- We are interested in what happens when `n` becomes large
- Typical growth functions
    - Is `t(n)` proportional to $\log n, \dots, n^{2}, n^{3}, \dots, 2^{n}$?
        - Note: $\log n$ means $\log_{2} n$ by default
    - Logarithmic, polynomial, exponential
    <img src=attachment:image.png height= 500 width = 500>
---
### Measuring Running Time
- Analysis should be independent of underlying hardware
    - Don't use actual time
    - Measure in terms of **Basic Operations**
- Typical basic operations
    - Compare two values
    - Assign a value to variable
---
#### Counting Basic Operations
Swapping values
```python 
# Method 1
(x, y) = (y, x)
# Method 2 - what Method 1 does implicitly, using a temporary variable
t = x
x = y
y = t
```
- We need not be precise when counting basic operations
- We assume Method 1 has the same number of operations as Method 2
- In general, we just count the number of statements
---
### Which Inputs to Consider
- Performance varies across input instances
    - By luck, the list to be sorted may already be sorted
    - Or the element to be found may be in the middle (binary search)
- Ideally we want the "Average" behaviour
    - But this is difficult to compute
    - Need probability distribution, then find *Expected* value
- Instead we consider **Worst Case** input
    - Input that forces algorithm to take longest time
    - Upper bound on worst case gives an overall guarantee on performance
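
As a small illustration of how much the cost can vary between inputs, the sketch below counts comparisons made by a plain left-to-right scan. The helper `count_comparisons` is hypothetical and exists only for this example.

```python
def count_comparisons(values, target):
    """Scan left to right and return how many elements were compared."""
    comparisons = 0
    for x in values:
        comparisons += 1
        if x == target:
            break
    return comparisons

data = list(range(1, 1001))
best = count_comparisons(data, 1)     # target is the first element
worst = count_comparisons(data, -1)   # target is absent: every element is checked
print(best, worst)                    # 1 1000
```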

## Comparing Orders of Magnitude
- How do we compare functions w.r.t. orders of magnitude?
    - By examining how their growth rates differ as the input size approaches $\infty$
---
### Big-O Notation
- The letter "O" in "big O" stands for "order" or "order of magnitude"
- $f(x)$ is $O(g(x))$ means that $f(x)$ grows no faster than $g(x)$
- We use Big-O notation to classify algorithms based on the number of operations or comparisons they use
- For large values of $x$: $x^{2},3x^{2}+5x, 6x^{2} + 100...$ are very similar
- So we consider them as the same order: **O($x^{2}$)**
---
### Upper Bounds
- An upper bound is an upper limit on the growth rate of a function
- $f(x)$ is said to be $O(g(x))$ if we can find constants $c$ and $x_{0}$ such that $c\ \cdot\ g(x)$ is an upper bound for $f(x)$ for $x$ beyond $x_{0}$
- Note: Choice of $c$ and $x_{0}$ is not unique
- $f(x)\le cg(x)$ for every $x \ge x_{0}$
<img src=attachment:image.png width=400 height = 400 style="display:inline"> <img src=attachment:image-2.png width=400 height = 400 style="display:inline">
---
#### Complexities
In increasing order of complexity, these are:
- Constant (`1`)
- Logarithmic (`log x`)
- Linear (`x`)
- Linearithmic (`x log x`)
- Polynomial (`x^c` for `c>1`)
- Exponential (`c^x` for `c>1`)
---
#### Example
Show $3x^{2}\ +\ 25$ is $O(x^{2})$
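##### Method 1: Take a convenient arbitrary number
- Take $x\ =\ 5$
- $c\cdot x^{2} \ge\ 3(5^{2})\ +\ 25$
- $c\cdot 25 \ge\ 100$
- $\Rightarrow c\ge\ 4$
- $\Rightarrow$ $c\ =\ 4$ and $x_{0}\ =\ 5$
- Check: $3x^{2}\ +\ 25 \le 4x^{2}$ exactly when $x^{2} \ge 25$, so the bound holds for every $x \ge 5$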

##### Method 2: Bump up each term to highest power
- For $x \ge 1$: $3x^{2}\ +\ 25\ \le\ 3x^{2}\ +\ 25x^2$
- $\Rightarrow\ 3x^{2}\ +\ 25\ \le\ 28x^2$
- $c \cdot (x^2) \ge \ 28x^2$
- $\Rightarrow \ c\ =\ 28, \ x_{0}\ =\ 1$

##### Method 2: Another bump up example
Show $100n\ +\ 5$ is $O(n^{2})$
- For $n \ge 1$: $100n\ +\ 5\ \le\ 100n\ +\ 5n$
- $\Rightarrow\ 100n\ +\ 5\ \le\ 105n$
- $c \cdot (n^2) \ge \ 105n\ \Rightarrow\ 105 n^2 \ge \ 105n$
- $\Rightarrow \ c\ =\ 105, \ n_{0}\ =\ 1$
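
The constants found in these examples can be sanity-checked numerically. The helper `bound_holds` below is a hypothetical name, and it only tests integers in a finite range, so it supports the algebra above without replacing it.

```python
def bound_holds(f, g, c, x0, upto=10_000):
    """Check f(x) <= c * g(x) for every integer x in [x0, upto]."""
    return all(f(x) <= c * g(x) for x in range(x0, upto + 1))

def square(x):
    return x ** 2

print(bound_holds(lambda x: 3 * x**2 + 25, square, c=4, x0=5))    # True
print(bound_holds(lambda x: 3 * x**2 + 25, square, c=28, x0=1))   # True
print(bound_holds(lambda x: 100 * x + 5, square, c=105, x0=1))    # True
# With c = 4 the bound fails if we start too early (x = 1 gives 28 > 4)
print(bound_holds(lambda x: 3 * x**2 + 25, square, c=4, x0=1))    # False
```

The last line shows why the choice of $x_{0}$ matters: the same $c$ can work from one starting point and fail from an earlier one.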

##### Method 3: Estimating
- Given two polynomial functions, the one with the lower power is $O()$ of the one with the higher power
- For more complex cases, like comparing $n^3$ and $n^2 \log n$
    - $n^3\ =\ n^2 \cdot n$ and $n^2 \log n\ =\ n^2 \cdot \log n$
    - $\log n$ grows slower than $n$
    - So $n^2 \log n$ is $O(n^3)$, but $n^3$ is not $O(n^2 \log n)$

## Calculating Complexity
We calculate complexity for two types of programs
- Iterative Program
- Recursive Program
---
### Example 1: Find maximum element in a list
```python
def maxElement(L):
    maxval = L[0]
    for i in range(len(L)):
        if L[i] > maxval:
            maxval = L[i]
    return maxval
```
- Input size is the length of list (`n`)
- Loop scans all elements
- Always takes `n` steps
- Overall time is $O(n)$
---
### Example 2: Check whether list has duplicates
```python
def noDuplicates(L):
    for i in range(len(L)):
        for j in range(i+1, len(L)):
            if L[i] == L[j]:
                return False
    return True
```
- Input size is `n`
- Worst case: No duplicates, both loops run fully
- Time is
    - The inner loop runs $(n-1)$ times in the first pass, then $(n-2)$, then $(n-3)$...
    - $(n-1)\ +\ (n-2)\ +...+\ 1\ = n(n-1)/2$
    - $\Rightarrow \frac{n^2}{2} \ -\ \frac{n}{2}$
    - Focus on orders of magnitude... $n^2$
- Overall time is $O(n^2)$
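
The count can be confirmed by instrumenting the same pair of loops. The function `count_pair_checks` is a hypothetical variant written for this check. It returns the number of comparisons instead of a boolean.

```python
def count_pair_checks(L):
    """Same loops as noDuplicates, but returns the number of comparisons made."""
    checks = 0
    for i in range(len(L)):
        for j in range(i + 1, len(L)):
            checks += 1
            if L[i] == L[j]:
                return checks   # stop at the first duplicate, as noDuplicates does
    return checks

for n in (5, 10, 100):
    distinct = list(range(n))   # worst case: no duplicates
    print(n, count_pair_checks(distinct), n * (n - 1) // 2)
```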
---
### Example 3: Matrix multiplication
```python
def matrixMultiply(A, B):
    (m, n, p) = (len(A), len(B), len(B[0]))
    
    C = [[0 for _ in range(p)] for _ in range(m)]
    
    for i in range(m): # M loop
        for j in range(p): # P loop
            for k in range(n): # N loop
                C[i][j] += A[i][k] * B[k][j]
    return C
```
- Overall time is $O(mnp)$
    - If both are `n*n` matrices then $O(n^3)$
- **TRICK** to calculate complexity
    - Start by inner-most loop and go upwards
    - N loop runs `n` times
    - N loop is controlled by P loop which runs N loop `p` times
    - Now total number of times N loop runs is $p\cdot n$
    - P loop controlled by M loop and runs it `m` times
    - So N loop runs $m\cdot p \cdot n$ times
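
The inner-most-loop reasoning can be checked with a counter in place of the arithmetic. The function `count_inner_steps` is a hypothetical helper that keeps only the loop structure of `matrixMultiply`.

```python
def count_inner_steps(m, n, p):
    """Count how often the innermost statement of the triple loop executes."""
    steps = 0
    for i in range(m):          # M loop
        for j in range(p):      # P loop
            for k in range(n):  # N loop
                steps += 1
    return steps

print(count_inner_steps(2, 3, 4))   # 2 * 3 * 4 = 24
print(count_inner_steps(5, 5, 5))   # 5 ** 3 = 125
```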
---
### Example 4: Number of bits in binary representation of `n`
```python
def numberOfBits(n):
    count = 1
    while(n > 1):
        count += 1
        n = n // 2
    return count
```
- $\log n$ steps for `n` to reach 1
- The number of steps is logarithmic in the value `n`
- NOTE: For number theoretic problems, input size is the number of digits
- The number of digits of `n` is proportional to $\log n$, so this algorithm is linear in input size
    - if n = 100, input size is 3
    - if n = 1300, input size is 4
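
To see the difference between the value `n` and its size, the sketch below prints the number of decimal digits next to the number of halving steps. `halving_steps` is a hypothetical helper that mirrors the loop above, and `int.bit_length()` is Python's built-in bit count, used here as a cross-check.

```python
def halving_steps(n):
    """Number of loop iterations numberOfBits performs for n >= 1."""
    steps = 0
    while n > 1:
        steps += 1
        n = n // 2
    return steps

# Columns: n, decimal digits (input size), halving steps, bits reported by Python
for n in (1, 8, 100, 1300, 10 ** 6):
    digits = len(str(n))
    print(n, digits, halving_steps(n), n.bit_length())
```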
---
### Example 5:  Towers of Hanoi
##### Rules
- Three pegs A, B, C
- Move `n` disks from A to B, use C as transit peg
- Never put a larger disk on a smaller one

##### Recursive Solution
- Move `n - 1` disks from A to C, use B as transit peg
- Move largest disk from A to B
- Move `n - 1` disks from C to B, use A as transit peg
```python
def tower_hanoi(n, source, destination, helper):
    if n==1:
        print("Move disk ", n, " from ", source, " to ", destination)
        return
    # We are just printing, returning nothing
    tower_hanoi(n-1, source, helper, destination)
    print("Move disk ", n, " from ", source, " to ", destination)
    tower_hanoi(n-1, helper, destination, source)
```

##### Recurrence 
- $M(n)$: number of moves needed to move `n` disks
- $M(1)\ =\ 1$
- $M(n)\ =\ M(n-1)\ +\ 1\ +\ M(n-1)\ =\ 2M(n-1)\ +\ 1$

##### Unwind and Solve
- $M(n)\ =\ 2M(n-1)\ +\ 1 $
- $=\ 2(2M(n-2)+1)\ +\ 1 \ =\ 2^2M(n-2)\ +\ (2+1) $
- $=\ 2^2(2M(n-3)+1)\ +\ (2+1) \ =\ 2^3M(n-3)\ +\ (4+2+1)$
- $. . . $
- $=\ 2^kM(n-k)\ +\ (2^k-1) $
- $. . . $
- $=\ 2^{n-1}M(1)\ +\ (2^{n-1}-1) $
- $=\ 2^{n-1}\ +\ 2^{n-1}\ -\ 1\ =\ 2^n\ -\ 1$
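
The closed form can be compared against a direct count. `count_moves` is a hypothetical helper that follows the same recursive structure as the solution above but returns the number of moves instead of printing them.

```python
def count_moves(n):
    """Number of moves made by the recursive solution for n disks (n >= 1)."""
    if n == 1:
        return 1
    # Move n-1 disks aside, move the largest disk, move the n-1 disks back on top
    return count_moves(n - 1) + 1 + count_moves(n - 1)

for n in range(1, 11):
    print(n, count_moves(n), 2 ** n - 1)
```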

## Searching a List
- Naive Search
- Binary Search
---

### Naive Search
```python
def naivesearch(v, l):
    for x in l:
        if(x == v):
            return True
    return False
```
---

#### Analysis
- Is value `v` present in list `l`
- Input size `n` is size of `l`
- Worst case: `v` not in `l`
- Worst case complexity $O(n)$
---

### Binary Search
- Given, list is sorted in ascending order
```python
def binarysearch(v, l):
    if l == []:
        return False
    
    m = len(l) // 2
    
    if v == l[m]:
        return True
    
    if v < l[m]:
        return binarysearch(v, l[:m])
    else:
        return binarysearch(v, l[m+1:])
```
---

#### Efficiency - Method 1
- Number of calls?
    - Each call halves the interval to search
    - Stop when the interval becomes empty (when `v` not in `l`)
- $logn$: Number of times to divide `n` by 2 to reach 1
    - $logn\ +\ 1$: Extra 1 for $1//2\ =\ 0$ (happens when `v` not in `l`)
    - We ignore the $1$ and thus...
- Time Complexity: $O(logn)$
---

#### Efficiency - Method 2
- $T(n)$: time taken to search a list of length `n`
    - if `n = 0`, we exit so $T(0)\ =\ 1$
    - if `n > 0`, $T(n)\ =\ T(n//2)\ +\ 1$
    - *Why 1 is added above?*
        - Each call has 4 Basic operations
        - 3 comparing (`if` statements) and 1 assigning (`m`)
        - We collapse 4 to 1, as we focus on orders of magnitude and not constants
- **Recurrence** for $T(n)$
    - What is **recurrence**?
        - A recurrence relation typically consists of two parts
        - A base case
        - And a recursive case
    - Base case: $T(0)\ =\ 1$
    - Recursive case: $T(n)\ =\ T(n//2)\ +\ 1 ,\ n>0 $
- Solve by **Unwinding**
    - Take the definition and solve it again and again until you reach the base case
    - When solving by **Unwinding** you do not need to know the algorithm itself, you just need to know the recurrence
        - $T(n)\ =\ T(n//2)\ +\ 1$
        - $      =\ (T(n//4)\ +\ 1)\ +\ 1\ =\ T(n//2^2)\ +\ 2$
        - $      =\ .\ .\ .$
        - $      =\ T(n//2^k)\ +\ k$
        - $      =\ T(1)\ +\ k,\ \text{for}\ k\ =\ \log n$
        - $      =\ (T(0)\ +\ 1)\ +\ logn$
        - $      =\ 2\ +\ logn$
    - Time Complexity: $O(logn)$
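
One caveat worth noting: the recurrence counts each call as one step, but in Python a slice such as `l[:m]` copies the elements it selects. The sketch below is an alternative formulation, not the version above. It tracks the search interval with two indices so nothing is copied, and it also reports how many intervals were inspected. The name `binarysearch_steps` is hypothetical.

```python
def binarysearch_steps(v, l):
    """Iterative binary search on a sorted list.

    Returns (found, steps), where steps is the number of intervals inspected.
    """
    lo, hi = 0, len(l)      # current search interval is l[lo:hi]
    steps = 0
    while lo < hi:
        steps += 1
        m = (lo + hi) // 2
        if v == l[m]:
            return True, steps
        if v < l[m]:
            hi = m          # continue in the left half
        else:
            lo = m + 1      # continue in the right half
    return False, steps

data = list(range(0, 2048, 2))      # 1024 sorted even numbers
print(binarysearch_steps(512, data))
print(binarysearch_steps(-1, data))
```

For a list of 1024 elements, no search inspects more than $\log_{2} 1024\ +\ 1\ =\ 11$ intervals.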

## Selection Sort
---
#### Algorithm
- Repeatedly find the minimum (or maximum) and append to sorted list
- Avoid using second list to append elements
    - Swap the elements instead
    - Using a second list is duplication of list which has some overhead
- Assume `L[:i]` is sorted
---
#### Code
```python
def SelectionSort(L):
    n = len(L)
    # If list is empty
    if (n < 1):
        return L
    for i in range(n):
        # We assume L[:i] is sorted
        mpos = i
        # mpos: Position of the minimum element in L[i:]
        for j in range(i+1, n):
            if L[j] < L[mpos]:
                # L[j] is the smallest value seen so far in L[i:]
                mpos = j
        # Exchange L[i] and L[mpos]
        (L[i], L[mpos])  = (L[mpos], L[i])
        # Now L[:i+1] is sorted
    return L
```
---
#### Efficiency
- Outer loop iterates `n` times, and iteration `i` takes about `n-i` steps
    - $T(n)\ =\ n\ +\ (n-1)+\ ... +\ 1$ (sum of first `n` numbers)
    - $T(n)\ =\ n(n+1)/2$
- $T(n)$ is $O(n^2)$
- NOTE: Even if the input list `L` is sorted $T(n)$ is still $O(n^2)$
    - All cases take $O(n^2)$ 
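
A quick way to see that the input order does not matter is to count only the comparisons `L[j] < L[mpos]`. The helper below is a hypothetical instrumented copy of the algorithm. It works on a copy of its argument. The comparison count is $n(n-1)/2$ for every input of length `n`, which is the same order of magnitude as the step count above.

```python
def selection_sort_comparisons(L):
    """Run selection sort on a copy of L and return the number of comparisons."""
    L = list(L)
    n = len(L)
    comparisons = 0
    for i in range(n):
        mpos = i
        for j in range(i + 1, n):
            comparisons += 1
            if L[j] < L[mpos]:
                mpos = j
        L[i], L[mpos] = L[mpos], L[i]
    return comparisons

n = 100
print(selection_sort_comparisons(range(n)))         # already sorted input
print(selection_sort_comparisons(range(n, 0, -1)))  # reverse sorted input
print(n * (n - 1) // 2)                             # all three values are 4950
```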

## Insertion Sort
---
#### Algorithm
- Start building a new sorted list
- Pick the next element and insert it to correct position
- **Iterative Formulation**
    - Assume `L[:i]` is sorted
    - Insert `L[i]` in `L[:i]`
    - *Updates* the list in place (you can also create a new list, but too much overhead)
- **Recursive Formulation**
    - Inductively sort `L[:i]` using `Isort()`
    - Insert `L[i]` in sorted `L[:i]` using `Insert()`
    - *Creates* and returns a new list, list provided as argument remains the same
    - Less efficient than Iterative approach, takes more time
---
#### Code: Iterative
```python
def InsertionSort(L):
    n = len(L)
    if (n < 1):
        return L
    for i in range(n):
        # Assume L[:i] is sorted
        # Move L[i] left until it reaches its correct position
        j = i
        while(j>0 and L[j] < L[j-1]):
            (L[j], L[j-1])  = (L[j-1], L[j])
            j -= 1
        # Now L[:i+1] is sorted
    return L
```
---
#### Code: Recursive
```python
def Insert(L, v):
    n = len(L)
    if n<1:
        return [v]
    if v >= L[-1]:
        return (L + [v])
    else:
        return (Insert(L[:-1], v) + L[-1:])
    
def Isort(L):
    n = len(L)
    if n<1:
        return L
    L = Insert(Isort(L[:-1]), L[-1])
    return L
```
---
#### Efficiency: Iterative
- $T(n)$ is $O(n^2)$
    - $T(n)\ =\ 0\ +\ 1\ +...+\ (n-1) $
    - $T(n)\ =\ n(n-1)/2$
---
#### Efficiency: Recursive
- For input size `n` let
    - Time taken by `Insert` be $TI(n)$
    - Time taken by `Isort()` be $TS(n)$
- For $TI(n)$
    - $TI(0)\ =\ 1$
    - $TI(n)\ =\ TI(n-1)\ +\ 1$
    - Unwind to get $TI(n)\ =\ n\ +\ 1$, which is $O(n)$
- For $TS(n)$
    - $TS(0)\ =\ 1$
    - $TS(n)\ =\ TS(n-1)\ +\ TI(n-1)$
    - Unwind to get $1\ +\ (1\ +\ 2\ +...+\ n)$
- $TS(n)$ is $O(n^2)$
---
#### Summary
- Unlike selection sort, not all cases take time $n^2$
- If a list is already sorted, each insert takes just 1 step
- Overall time is then $O(n)$
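
The gap between the best and worst case shows up clearly when the swaps are counted. `insertion_sort_swaps` is a hypothetical instrumented copy of the iterative version. It sorts a copy of its argument and returns only the swap count.

```python
def insertion_sort_swaps(L):
    """Run iterative insertion sort on a copy of L and return the number of swaps."""
    L = list(L)
    swaps = 0
    for i in range(len(L)):
        j = i
        while j > 0 and L[j] < L[j - 1]:
            L[j], L[j - 1] = L[j - 1], L[j]
            swaps += 1
            j -= 1
    return swaps

n = 100
print(insertion_sort_swaps(range(n)))         # sorted input: 0 swaps
print(insertion_sort_swaps(range(n, 0, -1)))  # reversed input: n(n-1)/2 = 4950 swaps
```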

## Merge Sort
---
#### Merge Function
- Combine two sorted lists `A` and `B` into `C`
- If `A` is empty, copy `B` into `C`
- If `B` is empty, copy `A` into `C`
- Otherwise, compare the first elements of `A` and `B`
    - Move the smaller one to `C`
- Repeat till `A` and `B` both become empty

---
#### Code: merge Function
```python
def merge(A, B):
    (m, n) = (len(A), len(B))
    
    (C, i, j, k) = ([], 0, 0, 0)
    '''
    i, j: keeps track of current element of list A, B respectively
    k: tracks for total elements in C, helps to stop the loop
    C: target list / merge list
    '''
    
    while (k < m + n): # Stop when the number of elements in C == the number of elements in A & B
        
        # if A is empty, copy B to C
        if (i == m): # Condition is True when i (starting from 0) reaches m
            C.extend(B[j:])
            k += (n-j) # increase k by the number of elements added
        
        # if B is empty, copy A to C
        elif (j == n): # Condition is True when j (starting from 0) reaches n
            C.extend(A[i:])
            k += (m-i) # increase k by the number of elements added
        
        # if neither of them is empty, then compare elements
        elif (A[i] < B[j]):
            C.append(A[i])
            (i, k) = (i+1, k+1) # increase both i and k
        
        else:
            C.append(B[j])
            (j, k) = (j+1, k+1)
    
    return C
```
---

#### Merge Sort Function
- Sorts `A` into `B`, both of length `n`
- If $n \le 1$, nothing to be done
- Otherwise
    - Sort `A[:n//2]` into `L`
    - Sort `A[n//2:]` into `R`
    - Merge `L` and `R` into `B`
    
---
#### Code: mergesort Function
```python
def mergesort(A): # A is the list to sort
    n = len(A)
    if n <= 1: # a list with at most one element is already sorted
        return A
    
    '''
    Recursively call mergesort until it reaches the base case
    On each call L and R halves
    '''
    L = mergesort(A[: n//2]) 
    R = mergesort(A[n//2 :])
    
    B = merge(L, R)
    
    return B
```
---
Dry run of above function
<img src="attachment:IMG_20230630_150100_028.jpg" alt="Image" style="width:500px; height:700px;">

# Extra Notes

##### Raising recursion limit in python
- Python limits the maximum recursion depth, which by default is **1000**
```python
# This raises RecursionError once the depth crosses the limit
def recursive_function():
    recursive_function()

recursive_function()
```

- You can set a recursion limit up to $2^{31} - 1$ in Python (the highest usable limit is platform dependent)
```python
import sys
sys.setrecursionlimit(2**31 - 1)
```
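
A more cautious pattern is to raise the limit only as far as needed and restore it afterwards. The sketch below assumes a simple hypothetical function `depth` that recurses `n` levels. It first shows the `RecursionError` under the current limit and then the same call succeeding after a modest increase.

```python
import sys

def depth(n):
    """Recurse n levels deep and return n."""
    if n == 0:
        return 0
    return 1 + depth(n - 1)

original_limit = sys.getrecursionlimit()
print(original_limit)

# Just past the current limit: Python raises RecursionError
try:
    depth(original_limit + 100)
    overflowed = False
except RecursionError:
    overflowed = True
print(overflowed)

# Raise the limit by a modest amount, run the same call, then restore it
sys.setrecursionlimit(original_limit + 1000)
print(depth(original_limit + 100))
sys.setrecursionlimit(original_limit)
```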

##### Python
- Python can perform around $10^7$ operations per second
``` python
import time
a = time.perf_counter()
for i in range(10**7):
    i = None
b = time.perf_counter()
print(b - a)  # 0.7494662999961292
```

```python
l = [1, 2, 4, 1, 2, 9, 3, 2]
print(l[-1]) # returns the element
print(l[-1:]) # returns the element inside a list
```

```python
def noReturn():
    return # You can use an empty return
noReturn()
```

```python
l = [1, 2]
l.extend([3, 4]) # extend expects an iterable
print(l)
```

## Playground