# Greedy Algorithm
Recall that in [dynamic programming](./dynamic_programming.ipynb), we traverse a tree of choices to find our desired solution.
At each node, we compare the results from every branch to decide which one to take.
What if we somehow knew in advance which branch leads to the correct solution?
We could skip the other branches entirely, which can greatly improve the time complexity of our algorithm.
An algorithm that commits to a single branch in this way is called a **greedy algorithm**.

However, whether we can solve a problem greedily depends heavily on the nature of the problem.
Applying a greedy algorithm to a dynamic programming question often leads to a wrong answer.
Thus, it is imperative to prove the correctness of a greedy algorithm before relying on it.

## Proving correctness
We can use the **local swap** technique to prove the correctness of a greedy algorithm.

### Local swap
Suppose we are given a problem where we must find an optimal solution $O$ such that $f(O)$ is minimum. (Flip the inequalities correspondingly if we need a maximum instead.)

The procedure is as follows:
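1. Consider the solution $G$ that our (greedy) algorithm produces.
2. We need to prove that $f(G) = f(O)$.
3. For any optimal solution $O$, show that we can gradually change it into $G$ without increasing $f$ at any step.
    1. Define a distance $dist(S)$ of any solution $S$ from $G$, taking non-negative integer values, with $dist(S) = 0$ if and only if $S = G$.
    2. Suppose, for contradiction, that $G$ is not optimal, and let $O \neq G$ be the optimal solution with the smallest distance.
    3. We now try to produce another solution $O^*$ such that $f(O^*) < f(O)$, or $f(O^*) = f(O)$ and $dist(O^*) < dist(O)$.
    4. If this is always possible, we have a contradiction: the first case contradicts the optimality of $O$, and the second yields an optimal solution closer to $G$ than $O$, contradicting the choice of $O$. Hence $G$ is optimal and our greedy algorithm is correct.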

### Example
#### Chapter arrangement
Consider arranging $n$ chapters within a book.
Each chapter consists of a number of pages $P_1, \dots, P_n$.
Suppose that we have a simple reader which, to read the $k$-th chapter, must first read through all the chapters from $1$ to $k-1$.
Hence, the cost to read the $k$-th chapter is simply
$$
Cost(k) = \sum _{i=1} ^k P_i
$$

Our goal is to minimize the average cost of accessing any chapter.

---

The greedy solution is to arrange the chapters in ascending order of page count.
This fits our intuition: placing the short chapters first means that every later chapter incurs the least possible overhead from the chapters before it.
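To make the claim concrete, here is a minimal sketch that compares the greedy arrangement against every possible arrangement for a small input. The helper names `total_cost` and `greedy_order` and the page counts are hypothetical, chosen only for illustration. Since the number of chapters is fixed, the sketch compares total cost rather than average cost.

```python
from itertools import accumulate, permutations


def total_cost(pages):
    """Sum of Cost(k) over all chapters, i.e. the sum of all prefix sums."""
    return sum(accumulate(pages))


def greedy_order(pages):
    """Arrange the chapters in ascending order of page count."""
    return sorted(pages)


pages = [30, 10, 20]  # hypothetical page counts

# Brute force over all arrangements (only feasible for tiny inputs).
best = min(permutations(pages), key=total_cost)

print(total_cost(pages))                # 130 for the original order
print(total_cost(greedy_order(pages)))  # 100 for the greedy order
print(total_cost(best))                 # 100, matching the greedy order
```

The brute-force check only confirms the claim on this one input; the proof is still needed for the general case.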

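Since $n$ is fixed, minimizing the average cost is the same as minimizing the total cost.
We wish to prove that, for an arrangement $q = (i_1, \dots, i_n)$ of the chapters, the total cost
$$
f(q) = P_{i_1} + (P_{i_1} + P_{i_2}) +(P_{i_1} + P_{i_2} + P_{i_3}) + \dots + (P_{i_1} + \dots + P_{i_n})
$$
is minimized when $q$ is chosen such that $P_{i_1} \leq P_{i_2} \leq \dots \leq P_{i_n}$.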

We consider the permutation $G$ produced by our greedy algorithm, which satisfies
$$
P_{i_1} \leq P_{i_2} \leq \dots \leq P_{i_n}
$$

For any permutation $O = (o_1, \dots, o_n)$, we define $dist(O)$ as the number of inversions, that is, the number of position pairs $i,j$ such that $i < j$ and $P_{o_i} > P_{o_j}$.
Since $G$ is sorted in ascending order, it has no inversions, so the number of inversions measures how far a permutation is from $G$.

Clearly `dist` is finite, and $dist(O) = 0$ if and only if $O=G$ (treating chapters with equal page counts as interchangeable, since swapping them does not change the cost).

Hence, any optimal solution $O \neq G$ has $dist(O) > 0$, which means it contains an inversion.
Suppose that $O$ arranges the chapters in the order
$$
P_{o_1}, P_{o_2}, \dots, P_{o_i}, \dots, P_{o_j}, \dots, P_{o_n}
$$
Take an inverted pair, that is, positions $i < j$ with $P_{o_i} > P_{o_j}$. We can produce $O^*$ by swapping the locations of $P_{o_i}$ and $P_{o_j}$.

$O^*$ then has the form
$$
P_{o_1}, P_{o_2}, \dots, P_{o_j}, \dots, P_{o_i}, \dots, P_{o_n}
$$
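Now, we compare the cost of this permutation with that of $O$.
The chapters before position $i$ cost the same in both, and so do the chapters from position $j$ onwards, because the total number of pages up to and including them is unchanged.
For each position from $i$ to $j-1$, however, the pages read now include $P_{o_j}$ in place of the larger $P_{o_i}$, so the cost at that position drops by $P_{o_i} - P_{o_j}$. Summing over these $j - i$ positions,
$$
f(O^*) - f(O) = -(j - i)(P_{o_i} - P_{o_j}) < 0
$$
This means we have produced a better permutation than the optimal solution, which is a contradiction.
Hence, it must be that $O = G$.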
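The swap argument can also be watched in action. The sketch below repeatedly swaps one inverted pair and records the total cost after each swap. The function names and the sample page counts are hypothetical, and the code only illustrates the argument on one input rather than proving it.

```python
from itertools import accumulate, combinations


def total_cost(pages):
    """Sum of Cost(k) over all chapters, i.e. the sum of all prefix sums."""
    return sum(accumulate(pages))


def swap_first_inversion(order):
    """Return a copy with the first inverted pair swapped, or None if sorted."""
    for i, j in combinations(range(len(order)), 2):  # all positions i < j
        if order[i] > order[j]:
            swapped = list(order)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            return swapped
    return None


def improve_until_sorted(order):
    """Swap inversions until none remain; return final order and cost history."""
    order = list(order)
    costs = [total_cost(order)]
    while True:
        nxt = swap_first_inversion(order)
        if nxt is None:
            return order, costs
        order = nxt
        costs.append(total_cost(order))


final, costs = improve_until_sorted([40, 10, 30, 20])  # hypothetical page counts
print(final)  # [10, 20, 30, 40]
print(costs)  # [270, 240, 230, 210, 200]
```

Every swap lowers the cost, and the process can only stop at the ascending order, which is exactly the greedy arrangement $G$.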

### Scheduling classes
Suppose that we are given $n$ classes, defined by their starting times $S_1, \dots, S_n$ and ending times $E_1, \dots, E_n$.

The task is to find the largest possible subset $X$ of $\{1, \dots, n\}$ such that for any pair of distinct $i,j$ in $X$, either $S_i \geq E_j$ or $S_j \geq E_i$.
In other words, find the largest set of classes such that no two classes occur at the same time.

#### Exploration
Suppose that we always pick the class with the shortest duration first.
This is easily shown to be the wrong approach by the following input.
```
[   a     ][      b     ]
        [  c  ]
```
Our algorithm would pick only $c$, when the optimal solution is actually $a$ and $b$.

What about picking the class that begins first?
This fails when the class that starts first has a very long duration, as shown below.
```
[     a     ]
  [ b ] [ c ]
```
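Both counterexamples can be reproduced with a short sketch. The concrete times below are assumptions chosen to match the two diagrams (in the second one, $a$ is assumed to start slightly before $b$), and `pick_by`, `duration` and `start_time` are hypothetical helper names. Classes are represented as `(start, end)` tuples.

```python
def compatible(x, y):
    """Two (start, end) classes do not clash if one starts once the other has ended."""
    return x[0] >= y[1] or y[0] >= x[1]


def pick_by(classes, key):
    """Scan the classes in the order given by `key`, keeping each one that fits."""
    chosen = []
    for c in sorted(classes, key=key):
        if all(compatible(c, d) for d in chosen):
            chosen.append(c)
    return chosen


def duration(c):
    return c[1] - c[0]


def start_time(c):
    return c[0]


# Assumed times for the first diagram: a = (0, 5), b = (5, 10), c = (4, 6).
example_1 = [(0, 5), (5, 10), (4, 6)]
# Assumed times for the second diagram: a = (0, 10), b = (1, 4), c = (6, 10).
example_2 = [(0, 10), (1, 4), (6, 10)]

print(pick_by(example_1, duration))    # [(4, 6)], but a and b together give 2
print(pick_by(example_2, start_time))  # [(0, 10)], but b and c together give 2
```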

Then what about picking the class that ends first?
When we try to construct counterexamples, we cannot seem to find one.
This is a hint that the approach might be correct.
Hence, the next step is to formally prove that it is indeed correct.

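Suppose that the classes chosen by an optimal solution $O$ and by our greedy algorithm $G$, each listed in order of ending time, are as below:
$$
O = \{ o_1, \dots, o_m \} \\
G = \{ g_1, \dots, g_k \}
$$

We define $dist(O)$ as $m - i + 1$, where $i$ is the smallest index such that $o_i \neq g_i$ (and 0 if $G = O$), so that a solution agreeing with $G$ on a longer prefix is closer to $G$.

We now consider the optimal $O \neq G$ with the smallest distance, and let $i$ be the first index where it differs from $G$.
Such a $g_i$ exists: $o_i$ does not clash with $o_{i-1} = g_{i-1}$, so the greedy algorithm still had a class available to pick.
Now, we know that $g_i \neq o_i$.
By our greedy algorithm, we know that $E_{g_i} \leq E_{o_i}$, since among the classes that do not clash with $g_1, \dots, g_{i-1}$ (which include $o_i$) we always picked the one that ended first.
Now, notice that we can produce $O^*$, as below
$$
O = \{ o_1, \dots, o_{i-1}, o_i, o_{i+1}, \dots, o_m \} \\
O^* = \{ o_1, \dots, o_{i-1}, g_i, o_{i+1}, \dots, o_m \}
$$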

This has to be a valid arrangement. First, $g_i$ does not clash with $o_{i-1} = g_{i-1}$, because $G$ is valid. Second, since $E_{g_i} \leq E_{o_i}$ (by greediness) and $S_{o_{i+1}} \geq E_{o_i}$ (by validity of $O$), we see that $S_{o_{i+1}} \geq E_{o_i} \geq E_{g_i}$, so $g_i$ does not clash with $o_{i+1}$ or any later class either. Hence $O^*$ must be valid.

Pictorially, it looks as below
``` 
O    [ o_1 ] ... [o_{i-1}] [   o_i   ] [o_{i+1}]  ...
G    [ g_1 ] ... [o_{i-1}] [ g_i ]   ...
O*   [ o_1 ] ... [o_{i-1}] [ g_i ]     [o_{i+1}] ...
```

Hence, we have found another optimal solution, of the same size as $O$, that is closer to $G$ than $O$, leading to a contradiction.
Following the local swap procedure defined previously, this completes our proof.
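
The earliest-ending rule itself is short to implement. The following is a minimal sketch, assuming each class is a `(start, end)` tuple with `start < end`. The names `schedule` and `brute_force_size` and the sample input are hypothetical, and the brute-force check is included only to compare sizes on a small input.

```python
from itertools import combinations


def schedule(classes):
    """Greedy: scan by ending time, keep each class that starts after the last pick ends."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(classes, key=lambda c: c[1]):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


def brute_force_size(classes):
    """Size of the largest clash-free subset, trying every subset (exponential)."""
    for r in range(len(classes), 0, -1):
        for subset in combinations(classes, r):
            if all(a[0] >= b[1] or b[0] >= a[1]
                   for a, b in combinations(subset, 2)):
                return r
    return 0


classes = [(0, 5), (5, 10), (4, 6), (0, 10), (1, 4), (6, 10)]  # hypothetical input
picked = schedule(classes)
print(picked)                     # [(1, 4), (4, 6), (6, 10)]
print(brute_force_size(classes))  # 3, the same size as the greedy answer
```

Because the classes are scanned in order of ending time, only the end of the most recent pick needs to be remembered, so the sketch runs in $O(n \log n)$ time, dominated by the sort.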