## Resources:  
* Competitive programming at AtCoder  
* Introduction to Algorithms 3rd edition  

## What are algorithms?  
* Methods and procedures for solving mathematical or analytical problems. 

# Linear and Binary Search  
Think of the following problem:  
> You want to find out the age of Ms. A.  
> You know that she is in the range of 20~35 (inclusive).  
> With a limited number of questions allowed, can you correctly guess the age of A?  

In [2]:
# Linear Search 
import random
import math
def lin_guess(ages, A):
    i = 0 # how many guesses it took
    for age in ages:
        i += 1
        if age == A:
            return "It took {} guesses to guess that A's age was {}".format(i, age)
# testing
ages = range(20, 36)
A = random.randint(20, 35) # A's true age; randint includes both endpoints
print("A is {} \n".format(A), lin_guess(ages, A))


A is 24 
 It took 5 guesses to guess that A's age was 24


In [3]:
# Binary Search
def bin_guess(ages, A):
    mn = ages[0]
    mx = ages[-1]
    i = 0
    found = False
    while not found:
        mid = (mn + mx) // 2
        if A < mid:
            mx = mid
        elif A > mid:
            mn = mid + 1
        else:
            found = True
        i += 1
    return "It took {} guesses to guess that A's age was {}".format(i, mid)

# testing 
ages = range(20, 36)
A = random.randint(20, 35) # A's true age; randint includes both endpoints
print("A is {} \n".format(A), bin_guess(ages, A))

A is 35 
 It took 5 guesses to guess that A's age was 35


# Depth-first Search  
Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node (selecting some arbitrary node as the root node in the case of a graph) and explores as far as possible along each branch before backtracking.  
Take the following problem:  
$$
\begin{array}{ccc}
& 2 & 1 \\
\times & \Box & 2 \\
\hline 
\Box & 3 & \Box \\
\Box & \Box & \\
\hline
\Box & 4 & \Box
\end{array}
$$
Fill in the boxes with digits so that the equation makes sense.  
In a depth-first approach, you assume a value for the first box, then a value for the second box, and so on until you hit a contradiction. At that point you return to the preceding box, try its next value, and continue. 
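
The box-filling procedure can be sketched in Python. This is only a minimal sketch on a hypothetical toy puzzle, not the multiplication puzzle above: fill three boxes with strictly increasing digits that sum to 20. The names `fill_boxes` and `is_consistent` are illustrative assumptions.

```python
def fill_boxes(n_boxes, is_consistent, partial=()):
    """Depth-first search over digit assignments, one box at a time."""
    if not is_consistent(partial):
        return None  # contradiction found: backtrack to the preceding box
    if len(partial) == n_boxes:
        return partial  # every box is filled and no rule is violated
    for digit in range(10):
        # assume a value for the next box and go deeper before trying the next digit
        solution = fill_boxes(n_boxes, is_consistent, partial + (digit,))
        if solution is not None:
            return solution
    return None  # no digit works here, so the caller must change its own box


# Hypothetical toy puzzle: three strictly increasing digits that sum to 20.
def is_consistent(partial):
    increasing = all(a < b for a, b in zip(partial, partial[1:]))
    if len(partial) == 3:
        return increasing and sum(partial) == 20
    return increasing and sum(partial) <= 20


print(fill_boxes(3, is_consistent))
```

Because digits are tried in ascending order, the first solution the search reaches is `(3, 8, 9)`. The consistency check runs on partial assignments, which is what lets the search abandon a branch early instead of filling every box first.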

# Breadth-first Search  
Breadth-first search (BFS) is an algorithm for searching a tree data structure for a node that satisfies a given property. It starts at the tree root and explores all nodes at the present depth prior to moving on to the nodes at the next depth level. Extra memory, usually a queue, is needed to keep track of the child nodes that were encountered but not yet explored.  
In a breadth-first search, you consider all immediately available options first, then all the immediate options for each of those options, and so on. 
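
The level-by-level order can be sketched with a queue. The graph below and the name `bfs_order` are hypothetical and chosen only for illustration.

```python
from collections import deque


def bfs_order(graph, root):
    """Return the nodes reachable from root in breadth-first order."""
    visited = {root}
    queue = deque([root])  # nodes encountered but not yet explored
    order = []
    while queue:
        node = queue.popleft()  # always explore the oldest encountered node first
        order.append(node)
        for child in graph[node]:
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return order


# Hypothetical example graph given as an adjacency list.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": []}
print(bfs_order(graph, "A"))
```

For this graph the visiting order is `['A', 'B', 'C', 'D', 'E']`: both neighbours of `A` are explored before any node two steps away.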

# Computational Complexity  
When an algorithm's run time is roughly proportional to $P(N)$, we call that algorithm's computational complexity $O(P(N))$.  
Suppose an algorithm's run time $T(N)$ is 
$$T(N) = 3N^2 + 5N + 100$$ 
The computational complexity is then determined by the highest-order term with its coefficient dropped, so it is $O(N^2)$.  
Constant factors are dropped in the same way. For example, an algorithm that prints all even values in range(1, N) needs a for-loop that makes roughly $\frac{N}{2}$ iterations, yet its computational complexity is still $O(N)$. 

```
// C++ skeleton code for printing the even numbers from 1 to N
#include <iostream> 
using namespace std;

int main() {
    int N;
    cin >> N; // cin ("character input") reads a value from standard input into the declared variable N
    
    for (int i = 2; i <= N; i += 2) {
        cout << i << endl; // cout writes to standard output; endl ends the line ("\n") and flushes the stream
    }
}
```

Think of the following problem:
> Given N points $(x_i, y_i)$ for $i = 0, 1, \dots, N-1$, determine the distance between the two closest points.  


```
// C++ skeleton code for coordinate distance minimization

#include <iostream>
#include <vector>
#include <cmath>
using namespace std;

// function to calculate coordinate distance
double calc_dist(double x1, double y1, double x2, double y2) {
    return sqrt(pow(x1 - x2, 2) + pow(y1 - y2, 2)); // pow is used for exponents, from cmath
}

int main() {
    //receive input data
    int N; cin >> N;
    vector<double> x(N), y(N);
    for (int i = 0; i < N; i++) cin >> x[i] >> y[i]; // filling up vectors
    
    // initialize minimum 
    double min_dist = 10000000000.0;
    
    // search
    for (int i = 0; i < N; i++) {
        for (int j = i + 1; j < N; j++) { //covers all combinations
            double dist_i_j = calc_dist(x[i], y[i], x[j], y[j]);
                                         
            if (dist_i_j < min_dist) {
                min_dist = dist_i_j;
            }
        }
    }
    //output answer
    cout << min_dist << endl;
}
```

In the above problem,  
* when i = 0, j loops N - 1 times 
* when i = 1, j loops N - 2 times  
$\vdots$
* when i = N - 2, j loops 1 time
* when i = N - 1, j loops 0 times  
Thus, 
\begin{align} 
T(N) &= (N-1) + (N-2) + \dots + 1 + 0 \\
&= \sum_{k=0} ^{N-1} k  \\
&= \sum_{k=1}^N k - N \\
&= \frac{1}{2}N(N+1) - N = \frac{1}2N^2 - \frac{1}2N
\end{align}
This implies that $T(N) = O(N^2)$.
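
As a quick sanity check, the closed form can be compared with a direct count of the inner-loop iterations. The helper name `count_inner_iterations` is an assumption introduced for this sketch.

```python
def count_inner_iterations(n):
    """Count how often the body of the double loop over pairs (i, j) runs."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1
    return count


for n in (1, 5, 100):
    # compare the measured count with the closed form N^2/2 - N/2 = N(N-1)/2
    print(n, count_inner_iterations(n), n * (n - 1) // 2)
```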

## Landau's O  
Let $T(N)$ and $P(N)$ be functions defined on the non-negative integers $\mathbb{N}_0$. Then $T(N) = O(P(N))$ means that there exist a real constant $c > 0$ and an $m \in \mathbb{N}_0$ such that, for all $N > m$, the following holds:
$$ \left|\frac{T(N)}{P(N)}\right| \leq c$$
  
Let's reexamine the case of $T(N) = 3N^2 + 5N + 100$:  
\begin{align*}
\frac{T(N)}{N} &= 3N + 5 + \frac{100}N \dots (1) \\
\frac{T(N)}{N^2} &= 3 + \frac{5}{N} + \frac{100}{N^2} \dots (2)
\end{align*}
As $N \to \infty$, $(1) \to \infty$ and cannot be bounded above by any constant $c$, but $(2) \to 3$ and is bounded above by $c = 4$ for sufficiently large $N$. Thus, $T(N) = O(N^2)$ but not $O(N)$. 
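
The two ratios can also be inspected numerically. This is a small illustrative sketch; the function name `T` simply mirrors the formula above.

```python
def T(n):
    return 3 * n**2 + 5 * n + 100


for n in (10, 100, 1000, 10000):
    # T(N)/N keeps growing, while T(N)/N^2 settles near 3
    print(n, T(n) / n, T(n) / n**2)
```

For this particular $T(N)$, the second ratio is at most 4 from $N = 13$ onward, so $m = 12$ and $c = 4$ would satisfy the definition.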


## Caution  
* In practice, especially when dealing with small $N$, the coefficient can have considerable influence on run-time, and thus the algorithm with the lowest computational complexity may not always be the fastest.  
* O-notation is usually quoted for the worst case: even on the worst possible input, the run time stays within the stated bound.  
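
To illustrate the first point, consider two hypothetical cost functions: one linear with a large coefficient and one quadratic with coefficient 1. Both functions and their coefficients are assumptions made up for this sketch and do not measure any real algorithm.

```python
def cost_linear(n):
    return 100 * n  # hypothetical O(N) algorithm with a large coefficient


def cost_quadratic(n):
    return n * n  # hypothetical O(N^2) algorithm with coefficient 1


for n in (10, 50, 100, 200):
    # a tie (here at n = 100) is reported as "linear"
    cheaper = "quadratic" if cost_quadratic(n) < cost_linear(n) else "linear"
    print(n, cost_linear(n), cost_quadratic(n), cheaper)
```

Under these assumed costs, the quadratic algorithm is cheaper for every $N < 100$, even though its complexity class is worse.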


# Exercises

1. Compute the Landau representation of the functions below:  
    1. $T_1(N) = 1000N$ <div style="text-align: right"> Ans. $O(N)$ </div>

    2. $T_2(N) = 5N^2 + 10N + 7$ <div style="text-align: right"> Ans. $O(N^2)$ </div>

    3. $T_3(N) = 4N^2 + 3N\sqrt{N}$  <div style="text-align: right">Ans. $O(N^2)$</div>

    4. $T_4(N) = N\sqrt{N} + 5N\log N$  <div style="text-align: right">Ans. $O(N\sqrt{N})$</div>
  
    5. $T_5(N) = 2^N + N^{2019}$  <div style="text-align: right">Ans. $O(2^N)$</div>

2. Express the run time of the following function in Landau notation: 


In [None]:
import math
def is_prime(N):
    res = True
    if N <= 1:
        res = False
    fac = 2
    while (fac ** 2) <= N:  # <= so that perfect squares such as 4 and 9 are rejected
        if not N % fac:
            res = False
            break
        fac += 1
    return res

# test case
print(is_prime(3), is_prime(5), is_prime(8))

True True False


$$\text{Ans. } O(\sqrt{N})$$

3. Using binary search, show that A's age can be determined within $k$ guesses when $0 \le age < 2^k$.  
  
    Ans. With the binary search algorithm, each guess at least halves the number of possible values the target can take. Since there are $2^k$ possible values to start with, only one option remains after $k$ guesses ($2^k / 2^k = 1$). 

4. Let A's age be in $[0, N)$. Show that the binary search method will reveal A's true age in $O(\log N)$ guesses.  
  
    Ans. We have already shown that for $0 \le age < 2^k$, the binary search method reveals the answer in at most $k$ guesses. Choose the smallest $k$ such that $N \le 2^k$. Then $k = \lceil \log_2 N \rceil$, hence $O(\log N)$.
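
The logarithmic bound can be checked empirically. The sketch below counts yes/no questions of the form "is the age smaller than `mid`?", a slight variation on `bin_guess`. The name `questions_needed` is an assumption introduced here.

```python
def questions_needed(age, n):
    """Count the yes/no questions needed to pin down age in [0, n)."""
    lo, hi = 0, n  # the remaining candidates are lo, lo + 1, ..., hi - 1
    questions = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        questions += 1  # ask: "is the age smaller than mid?"
        if age < mid:
            hi = mid
        else:
            lo = mid
    return questions


for n in (16, 10, 1000):
    worst = max(questions_needed(age, n) for age in range(n))
    # (n - 1).bit_length() equals the ceiling of log2(n) for n >= 1
    print(n, worst, (n - 1).bit_length())
```

The worst case over all ages matches $\lceil \log_2 N \rceil$ for each tested $N$: 4 questions for 16 values, 4 for 10 values, and 10 for 1000 values.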

5. Show that $\sum_{n=1}^N \frac{1}{n} = O(\log N)$.  

    Ans. $\log N= \log N - \log 1 = \int_1^N \frac{1}{x}\,dx$.  
    Since $\frac{1}{x}$ is decreasing, each term with $n \ge 2$ satisfies $\frac{1}{n} \le \int_{n-1}^{n} \frac{1}{x}\,dx$, so 
    $$\sum_{n=1}^N \frac{1}{n} \le 1 + \int_1^N \frac{1}{x}\,dx = 1 + \log N$$
    Thus, the sum is $O(\log N)$.
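
The harmonic sum can be checked numerically against $\log N$ and $1 + \log N$, the bounds that follow from comparing the sum with the integral of $\frac{1}{x}$. The helper name `harmonic` is an assumption for this sketch.

```python
import math


def harmonic(n):
    """Return 1/1 + 1/2 + ... + 1/n."""
    return sum(1 / k for k in range(1, n + 1))


for n in (1, 10, 1000, 100000):
    # the sum stays between log(N) and 1 + log(N)
    print(n, math.log(n), harmonic(n), 1 + math.log(n))
```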
