**CS560 - Algorithms and Their Analysis**
<br>
Date: **28 October 2020**
<br>

Title: **Lecture 8**
<br>
Speaker: **Dr. Shota Tsiskaridze**
<br>
Teaching Assistant: **Levan Sanadiradze**

Bibliography:
<br> 
 **Chapter 16.1 - 16.2**. Cormen, Thomas H. and Leiserson, Charles Eric and Rivest, Ronald Linn and Stein, Clifford Seth, *Introduction to Algorithms, 3rd Edition*, MIT Press, 2009
 


<h1 align="center">Greedy Algorithms</h1>


- Algorithms for **optimization problems** typically **go through** a **sequence of steps**, with a set of choices at each step. 


- For many optimization problems, **using dynamic programming** to determine the best choices **is overkill**; simpler, more efficient algorithms will do.


- A **greedy algorithm** is one such simpler algorithm: it always **makes the choice** that **looks best at the moment**. 

<h1 align="center">An Activity-Selection Problem</h1>

- Suppose we have a **set** $S = \{a_1, a_2, ..., a_n\}$ of $n$ proposed **activities** that wish to use a **resource**, such as a **lecture hall**.


- The **lecture hall** can serve **only one activity at a time**.


- Each **activity** $a_i$ has a **start time** $s_i$ and a **finish time** $f_i$, where $0 \leq s_i < f_i < \infty$, and takes place during the half-open time interval $[s_i, f_i)$.


- Activities $a_i$ and $a_j$ are **compatible** if the intervals $[s_i, f_i)$ and $[s_j, f_j)$ **do not overlap**, i.e. when $s_i \geq f_j$ or $s_j \geq f_i$.


- In the **activity-selection problem**, we wish to **select a maximum-size subset** of **mutually compatible activities**.


- **Note**: We assume that the activities are **sorted** in **monotonically increasing order** of **finish time**:

  $$f_1 \leq f_2 \leq f_3 \leq \cdots \leq f_{n-1} \leq f_n.$$
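
The definitions above translate almost directly into code. The following is a minimal sketch, assuming each activity is represented as a `(start, finish)` tuple; the helper names `compatible` and `sort_by_finish` are hypothetical and are not part of the lecture code.

```python
def compatible(a, b):
    """Return True if the half-open intervals [s_a, f_a) and [s_b, f_b) do not overlap."""
    (s_a, f_a), (s_b, f_b) = a, b
    return s_a >= f_b or s_b >= f_a


def sort_by_finish(activities):
    """Return the activities in monotonically increasing order of finish time."""
    return sorted(activities, key=lambda a: a[1])


# [3, 5) and [5, 7) only touch at t = 5, so they are compatible;
# [1, 4) and [3, 5) share the interval [3, 4), so they are not.
print(compatible((3, 5), (5, 7)))                # True
print(compatible((1, 4), (3, 5)))                # False
print(sort_by_finish([(5, 7), (1, 4), (3, 5)]))  # [(1, 4), (3, 5), (5, 7)]
```

Because the intervals are half-open, an activity may start at exactly the moment another one finishes.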


<h1 align="center">Example</h1>


- Let's consider an example of a set $S$ of activities:

|  $i$  | 1 | 2 | 3 | 4 | 5 | 6 | 7  | 8  | 9  | 10 | 11 |
|:-----:|:-:|:-:|:-:|:-:|:-:|:-:|:--:|:--:|:--:|:--:|:--:|
| $s_i$ | 1 | 3 | 0 | 5 | 3 | 5 | 6  | 8  | 8  | 2  | 12 |
| $f_i$ | 4 | 5 | 6 | 7 | 9 | 9 | 10 | 11 | 12 | 14 | 16 |


- For this example, the subset $\{a_4, a_9, a_{11}\}$ consists of **mutually compatible activities**.


- However, it is not a maximum-size subset, since the subset $\{a_1, a_4, a_{8}, a_{11}\}$ is larger. In fact, $\{a_1, a_4, a_{8}, a_{11}\}$ is a **largest subset of mutually compatible activities**.


- **Another largest subset** is $\{a_2, a_4, a_{9}, a_{11}\}$.
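
The claims about this example can be checked by exhaustive search. The sketch below is only a sanity check under the assumption that trying all subsets is acceptable for $n = 11$; it is not an efficient solution, and the function names are hypothetical.

```python
from itertools import combinations

# Example data from the table above; index 0 is a placeholder so that
# activity a_i is stored at position i.
s = [None, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [None, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]


def mutually_compatible(indices):
    """True if every pair of the given activities is compatible."""
    return all(s[i] >= f[j] or s[j] >= f[i]
               for i, j in combinations(indices, 2))


def largest_compatible_size():
    """Size of a largest mutually compatible subset, found by brute force."""
    ids = range(1, len(s))
    for size in range(len(s) - 1, 0, -1):
        if any(mutually_compatible(c) for c in combinations(ids, size)):
            return size
    return 0


print(mutually_compatible([4, 9, 11]))     # True
print(mutually_compatible([1, 4, 8, 11]))  # True
print(mutually_compatible([2, 4, 9, 11]))  # True
print(largest_compatible_size())           # 4
```

The search examines up to $2^{n}$ subsets, which is why a better approach is needed for larger inputs.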



- **Question**: How do we solve this problem?

<h1 align="center">Dynamic Programming Solution</h1>

- We can easily verify that the activity-selection problem exhibits **optimal substructure**.


- Let us denote by $S_{ij}$ the **set of activities** that **start after activity** $a_i$ **finishes** and that **finish before activity** $a_j$ **starts**.


- Let us denote by $A_{ij}$ the **maximum set of mutually compatible activities** in $S_{ij}$.


- If $a_k $ is some activity in $A_{ij}$, i.e. $a_k \in A_{ij}$, then we are left with **two subproblems**:

  - **Finding mutually compatible activities** in the set $S_{ik}$.
  
  - **Finding mutually compatible activities** in the set $S_{kj}$.
  
  
- Let $A_{ik} = A_{ij} \cap S_{ik}$ and $A_{kj} = A_{ij} \cap S_{kj}$, so that:

  - $A_{ik}$ contains the activities in $A_{ij}$ that finish before $a_k$ starts;
    
  - $A_{kj}$ contains the activities in $A_{ij}$ that start after $a_k$ finishes.


- Thus, we have $A_{ij} = A_{ik} \cup \{a_k\} \cup A_{kj}$, and so the **maximum-size** set $A_{ij}$ of **mutually compatible activities** in $S_{ij}$ consists of $|A_{ij}| = |A_{ik}| + |A_{kj}| + 1$ activities.


- The usual **cut-and-paste argument** shows that the **optimal solution** $A_{ij}$ must also **include optimal solutions** to the **two subproblems** for $S_{ik}$ and $S_{kj}$.


- This way of **characterizing optimal substructure** suggests that we **might solve** the **activity-selection problem** by **dynamic programming**:

  $$c[i,j] = 
  \left\{\begin{matrix}
  0 & \text{ if } S_{ij} = \varnothing \\
  \max_{a_k \in S_{ij}} \{c[i, k] + c[k, j] + 1\} & \text{ if } S_{ij} \neq \varnothing
  \end{matrix}\right.
  ,$$
  
  where $c[i,j]$ denotes the **size of an optimal solution** for the set $S_{ij}$.
  
  
- We could then **develop a recursive algorithm** and **memoize it**, or we could work **bottom-up** and **fill in table entries**. **We will do this in the next seminar!**
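
As a rough illustration of the recurrence for $c[i,j]$, here is one possible memoized version. It is a sketch under stated assumptions rather than the seminar's solution: the name `dp_activity_selector` is hypothetical, and the two sentinel activities ($a_0$ finishing at time $0$ and $a_{n+1}$ starting at infinity) are added only so that $S_{0,n+1}$ is the whole set $S$.

```python
from functools import lru_cache
import math


def dp_activity_selector(s, f):
    """Return the size of a largest set of mutually compatible activities.

    s and f hold the start and finish times of a_1..a_n, sorted by finish
    time, with an unused entry at index 0 (as in the example table).
    """
    n = len(s) - 1
    # Sentinels (an assumption of this sketch): a_0 finishes at time 0 and
    # a_{n+1} starts at infinity, so S_{0,n+1} contains every activity.
    start = [0] + list(s[1:]) + [math.inf]
    finish = [0] + list(f[1:]) + [math.inf]

    @lru_cache(maxsize=None)
    def c(i, j):
        best = 0  # value when S_ij is empty
        # Because of the sort order, every a_k in S_ij satisfies i < k < j.
        for k in range(i + 1, j):
            if start[k] >= finish[i] and finish[k] <= start[j]:
                best = max(best, c(i, k) + c(k, j) + 1)
        return best

    return c(0, n + 1)


s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(dp_activity_selector(s, f))  # 4
```

There are $O(n^2)$ subproblems and each tries $O(n)$ choices of $a_k$, so this version runs in $O(n^3)$ time.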

<h1 align="center">Making the Greedy Choice</h1>

- What do we mean by the **greedy choice** for the **activity-selection problem**? 


- Intuition suggests that we **should choose** an **activity** that **leaves** the **resource available** for as **many other activities** as **possible**.
  

- Now, of the activities we end up choosing, one of them **must be the first one to finish**.


- Thus, we **choose the activity** in $S$ with the **earliest finish time**, since that would leave the resource available for as many of the activities as possible.


- **Note**: Choosing the first activity to finish is **not the only way** to think of **making a greedy choice** for this problem. **We will explore other ways in the next seminar!**


- After making the greedy choice, we have **only one remaining subproblem** to solve: **finding activities that start after $a_1$ finishes**. 
  

- Why don’t we have to **consider activities** that **finish before** $a_1$ **starts**? 

  We have that $s_1 < f_1$, and $f_1$ is the **earliest finish time** of **any activity**, and therefore **no activity can have a finish time less than or equal to** $s_1$. 
  
  Thus, **all activities** that are compatible with activity $a_1$ must **start after** $a_1$ **finishes**.
  
  
- Let $S_k = \{a_i \in S: f_k \leq s_i \}$ be the set of activities that **start after activity $a_k$ finishes**.


- If we make the **greedy choice** of **activity** $a_1$, then $S_1$ remains as the **only subproblem** to solve.


- **Optimal substructure tells** us that **if** $a_1$ is in the **optimal solution**, **then** an **optimal solution** to the **original problem** consists of **activity** $a_1$ and **all the activities** in an **optimal solution** to the **subproblem** $S_1$.


- **One big question remains**: Is the **greedy choice** in which we **choose the first activity to finish** always part of some optimal solution?


- **Theorem**:

  - Consider any **nonempty subproblem** $S_k$, and let $a_m$ be an **activity** in $S_k$ with the **earliest finish time**. 

  - Then $a_m$ is **included** in **some maximum-size** subset of **mutually compatible activities** of $S_k$.
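
The theorem follows from an exchange argument: take any maximum-size subset $A_k$ of mutually compatible activities in $S_k$ and let $a_j$ be its activity with the earliest finish time. If $a_j \neq a_m$, replace $a_j$ by $a_m$; since $f_m \leq f_j$, the new set is still mutually compatible and has the same size. The sketch below illustrates this swap on the example data. The helper names are hypothetical, and the code demonstrates the idea on one instance rather than proving it.

```python
# Example data; index 0 is a placeholder so activity a_i is at position i.
s = [None, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [None, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]


def mutually_compatible(indices):
    """True if the activities, taken in finish-time order, never overlap."""
    ordered = sorted(indices, key=lambda i: f[i])
    return all(s[b] >= f[a] for a, b in zip(ordered, ordered[1:]))


def exchange(solution, m):
    """Swap the earliest-finishing activity of `solution` for a_m."""
    first = min(solution, key=lambda i: f[i])
    return sorted((set(solution) - {first}) | {m})


optimal = [2, 4, 9, 11]          # a maximum-size set that does not contain a_1
swapped = exchange(optimal, 1)   # a_1 has the earliest finish time in S
print(swapped)                       # [1, 4, 9, 11]
print(mutually_compatible(swapped))  # True
print(len(swapped) == len(optimal))  # True
```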

<h1 align="center">A Recursive Greedy Algorithm</h1>

- We can write a **straightforward**, **recursive procedure** to solve the activity-selection problem:

In [1]:
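import numpy as np

def recursiveActivitySelector(s, f, k, n):
    """Return the indices of a maximum-size set of compatible activities in S_k."""
    # Skip the activities that start before a_k finishes.
    m = k + 1
    while m < n and s[m] < f[k]:
        m = m + 1
    if m < n:
        # a_m is the first activity in S_k to finish: select it, recurse on S_m.
        return np.insert(recursiveActivitySelector(s, f, m, n), 0, m)
    else:
        # S_k is empty; an integer array keeps the result as indices, not floats.
        return np.empty(0, dtype=int)

- In order to start, we **add the fictitious activity** $a_0$ with $f_0 = 0$, so that the subproblem $S_0$ is the entire set of activities $S$:

In [2]:
s = [0, 1, 3, 0, 5, 3, 5,  6,  8,  8,  2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]

A = recursiveActivitySelector(s, f, 0, len(s))
A

array([ 1,  4,  8, 11])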

- Assuming that the **activities** have **already been sorted by finish times**, the running time of the call `recursiveActivitySelector()` is $\Theta(n)$.

<img src="images/L8_RAS.png" width="800" alt="Example" />


<h1 align="center">An Iterative Greedy Algorithm</h1>

- We can easily **convert** our **recursive procedure** to an **iterative one**.

In [3]:
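def greedyActivitySelector(s, f):
    n = len(s)                # s[0], f[0] belong to the fictitious activity a_0
    A = [1]                   # a_1 finishes first, so it is always selected
    k = 1                     # index of the most recent addition to A
    for m in range(2, n):
        if s[m] >= f[k]:      # a_m starts after a_k finishes, so it is compatible
            A.append(m)
            k = m
    return A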

In [4]:
s = [0, 1, 3, 0, 5, 3, 5,  6,  8,  8,  2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]

A = greedyActivitySelector(s,f)
A

[1, 4, 8, 11]

- The procedure works as follows:

  The **variable $k$ indexes** the **most recent addition** to $A$, corresponding to the **activity** $a_k$ in the **recursive version**.
  
  Since we consider the **activities in order of monotonically increasing finish time**, $f_k$ is **always** the **maximum finish time** of **any activity** in $A$:
  
  $$f_k = \max \{f_i :  a_i \in A\}.$$
  
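  **Lines 3-4**: Select **activity** $a_1$, initialize $A$ to **contain just this activity**, and initialize $k$ to **index this activity**.

  **Lines 5-8**: Find the **earliest activity** in $S_k$ to finish. The loop considers each activity $a_m$ in turn and adds it to $A$ if it is compatible with all previously selected activities, i.e. if $s_m \geq f_k$.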

- Assuming that the **activities** have **already been sorted by finish times**, the running time of the call `greedyActivitySelector()` is $\Theta(n)$.
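
When the input is not already sorted, the activities can be sorted by finish time first, which brings the total running time to $O(n \log n)$. The following is a minimal sketch of that variant, assuming activities are given as `(start, finish)` tuples; the name `select_activities` is hypothetical and the function returns the chosen tuples rather than indices.

```python
def select_activities(activities):
    """Greedy activity selection for unsorted (start, finish) pairs."""
    chosen = []
    last_finish = 0  # valid because all start times are >= 0
    # Sorting by finish time dominates the cost: O(n log n).
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen


# The lecture example, deliberately given in reverse order.
activities = [(12, 16), (2, 14), (8, 12), (8, 11), (6, 10), (5, 9),
              (3, 9), (5, 7), (0, 6), (3, 5), (1, 4)]
print(select_activities(activities))
# [(1, 4), (5, 7), (8, 11), (12, 16)]
```

The selected intervals correspond to $a_1$, $a_4$, $a_8$ and $a_{11}$, matching the output of the two procedures above.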

<h1 align="center">Elements of the Greedy Strategy</h1>

- In the **process** that we followed to develop a **greedy algorithm** we **went through** the following **steps**:

  1. **Determine** the **optimal substructure** of the problem.
  
  2. **Develop** a **recursive solution**.
  
  3. **Show** that if we make the **greedy choice**, then only **one subproblem remains**.

  4. **Prove** that it is **always safe** to **make the greedy choice**.

  5. **Develop** a **recursive algorithm** that **implements the greedy strategy**.

  6. **Convert** the **recursive algorithm** to an **iterative algorithm**.


- In going through these steps, we saw in great detail the **dynamic-programming underpinnings** of a **greedy algorithm**.


- **More generally**, we **design greedy algorithms** according to the following **sequence of steps**:

  1. **Cast** the **optimization problem** as one in which we **make a choice** and are **left with one subproblem** to solve.

  2. **Prove** that the **greedy choice is always safe**, i.e. there is always an optimal solution to the original problem that makes the greedy choice.
  
  3. **Demonstrate optimal substructure** by showing that, having made the greedy choice, what remains is a subproblem with the property that if we combine an optimal solution to the subproblem with the greedy choice we have made, we arrive at an optimal solution to the original problem.

<h1 align="center">Key Ingredients of Greedy Algorithms</h1>


- How can we tell **whether a greedy algorithm will solve a particular optimization problem**? 

  **No way works all the time**, but the **greedy-choice property** and **optimal substructure** are the **two key ingredients**. 
  
  If we **can demonstrate** that the **problem has these properties**, then we are well **on the way** to **developing a greedy algorithm** for it.
  
  

- The **greedy-choice property**: 

  We can **assemble** a **globally optimal solution** by **making locally optimal (greedy) choices**. 
  
  In other words, when we are considering which choice to make, we **make the choice** that looks best in the current problem, **without considering results from subproblems**.
  


- The **optimal substructure**: 

  An **optimal solution** to the problem **contains within** it **optimal solutions** to **subproblems**.
  
  This property is a **key ingredient** of assessing the applicability of **dynamic programming as well as greedy algorithms**.
  
  

<h1 align="center">Greedy vs. Dynamic Programming</h1>


- Because **both** the **greedy** and **dynamic-programming** strategies **exploit optimal substructure**, you might be **tempted to generate** a **dynamic-programming solution** to a problem when a **greedy solution** suffices.


- Conversely, you might **mistakenly think** that a **greedy solution works** when in fact a **dynamic-programming solution is required**.



- Let's consider the following **two problems**:

  - **0-1 Knapsack Problem**:

    - A **thief** robbing a store finds $n$ **items**. 

    - The $i$-th item is **worth** $p_i$ **dollars** and **weighs** $w_i$ **pounds**, where $p_i$ and $w_i$ are integers.   

    - The **thief wants** to **take as valuable a load as possible**, but **he can carry at most** $W$ **pounds** in his knapsack, for some **integer** $W$ . 

    - **Which items should he take**?

  - **Fractional Knapsack Problem**:

    - The **setup is the same**, but the **thief can take fractions of items**, rather than having to make a **binary** (**0-1**) **choice for each item**.



- Although the **problems are similar**, a **greedy strategy** **solves** the **fractional knapsack problem** but **does not solve** the **0-1 knapsack problem**.


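- For the fractional problem, the **greedy strategy** is to compute the **value per pound** $p_i / w_i$ of each item and to take as much as possible of the item with the greatest value per pound, then of the next one, and so on, until the knapsack is full. To demonstrate that this **greedy strategy does not work for the 0-1 knapsack problem**, let's consider the following instance with knapsack capacity $W = 50$ pounds:

|  $i$  |  1 |  2  |  3  |
|:-----:|:--:|:---:|:---:|
| $p_i$ | 60 | 100 | 120 |
| $w_i$ | 10 |  20 |  30 |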


- The **greedy strategy** would **take item 1 first**, since the **value per pound** of **item 1** is **6 dollars** per pound, which is greater than the **value per pound** of either **item 2** (**5 dollars** per pound) or **item 3** (**4 dollars** per pound).


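- However, the **optimal solution** takes **items 2** and **3**, leaving item 1 behind, for a total value of **220 dollars**. Any solution that includes item 1 is worth at most **180 dollars** (items 1 and 3).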


<img src="images/L8_FKP.png" width="900" alt="Example" />
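
The numbers in this example can be reproduced with a short script. The sketch below is illustrative only: the function names are hypothetical, and `zero_one_dp` uses the standard capacity-indexed dynamic program for the 0-1 problem, which is not developed in this lecture.

```python
def by_value_per_pound(p, w):
    """Return (value, weight) pairs sorted by decreasing value per pound."""
    return sorted(zip(p, w), key=lambda x: x[0] / x[1], reverse=True)


def fractional_greedy(p, w, W):
    """Greedy by value per pound, allowing fractions of items."""
    total = 0.0
    for value, weight in by_value_per_pound(p, w):
        take = min(weight, W)           # take as much of this item as fits
        total += value * take / weight
        W -= take
        if W == 0:
            break
    return total


def zero_one_greedy(p, w, W):
    """Same greedy order, but each item is taken whole or not at all."""
    total = 0
    for value, weight in by_value_per_pound(p, w):
        if weight <= W:
            total += value
            W -= weight
    return total


def zero_one_dp(p, w, W):
    """Optimal 0-1 value via dynamic programming over capacities 0..W."""
    best = [0] * (W + 1)
    for value, weight in zip(p, w):
        # Iterate capacities downwards so each item is used at most once.
        for cap in range(W, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[W]


p, w, W = [60, 100, 120], [10, 20, 30], 50
print(fractional_greedy(p, w, W))  # 240.0
print(zero_one_greedy(p, w, W))    # 160
print(zero_one_dp(p, w, W))        # 220
```

In the fractional case the greedy strategy fills the remaining 20 pounds with two thirds of item 3, whereas in the 0-1 case item 3 no longer fits once items 1 and 2 have been taken, which leaves 20 pounds of capacity unused.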


<h1 align="center">End of Lecture</h1>