# Greedy Algorithm  

**Def**: A greedy algorithm makes the locally optimal choice at each step in the hope of reaching an overall optimal solution to the entire problem. However, this does not guarantee a global optimum. 



## Paying with coins:
> Pay 1299 yen with as few coins as possible 

Greedy approach:
* Start with the largest coin, 500 yen. We can use 2. 1000 yen. 
* Then move to 100 yen. Use 2. 1200 yen.
* Then move to 50 yen. Use 1. 1250 yen. 
* Then move to 10 yen. Use 4. 1290 yen. 
* Then move to 5 yen. Use 1. 1295 yen. 
* Finally move to 1 yen. Use 4. 1299 yen. 

In [1]:
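def greedyCoinChanging(M, k):
    """
    Inputs:
      M (lst): available denominations
      k (int): amount to be paid
    Returns:
      list of (denomination, count) pairs, largest denomination first
    """
    result = []
    # Take as many coins of each denomination as fit, largest first.
    for coin in sorted(M, reverse=True):
        result.append((coin, k // coin))
        k %= coin  # amount still to be paid
    
    return result

print(greedyCoinChanging([500, 100, 50, 10, 5, 1], 1299))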



[(500, 2), (100, 2), (50, 1), (10, 4), (5, 1), (1, 4)]
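
Greedy happens to be optimal for the yen denominations, but as the definition notes, this is not guaranteed in general. The following is a minimal sketch that compares the largest-first rule with a dynamic-programming optimum. The coin system `[1, 3, 4]` and the function names `greedy_coin_count` and `optimal_coin_count` are hypothetical and chosen only for illustration.

```python
def greedy_coin_count(denominations, amount):
    """Number of coins the largest-first rule uses, or None if it cannot pay exactly."""
    count = 0
    for coin in sorted(denominations, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None


def optimal_coin_count(denominations, amount):
    """Fewest coins needed, via dynamic programming over every amount from 0 to `amount`."""
    INF = float("inf")
    best = [0] + [INF] * amount  # best[v] = fewest coins that sum to v
    for value in range(1, amount + 1):
        for coin in denominations:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount] if best[amount] != INF else None


# Hypothetical coin system (not yen): greedy picks 4 + 1 + 1, but 3 + 3 is better.
print(greedy_coin_count([1, 3, 4], 6))   # 3
print(optimal_coin_count([1, 3, 4], 6))  # 2

# For the yen denominations, both approaches agree on 1299 yen.
yen = [500, 100, 50, 10, 5, 1]
print(greedy_coin_count(yen, 1299), optimal_coin_count(yen, 1299))  # 14 14
```

The dynamic program costs $O(k \cdot |M|)$ time for amount $k$, so the greedy rule is only a safe shortcut when the coin system is known to make it optimal.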


## Interval Scheduling Problem  

Suppose we want to maximize $f(x)$, and suppose that for every candidate $x$ we can construct an $x'$ that satisfies some condition $P$ and for which 
$$f(x') \ge f(x)$$ 
Then it suffices to search only among the candidates that satisfy $P$: at least one maximizer of $f$ lies in that subset. 



The Interval Scheduling Problem:
>The idea is we have a collection of jobs (tasks) to schedule on some machine, and each job $i$ has a given start time $s_i$ and a given finish time $t_i$. If two jobs overlap, we can’t schedule them both. Our goal is to schedule as many jobs as possible on our machine.

**The Greedy solution**: always add the job with the *earliest end time* among those that do not overlap the jobs already chosen.  

Why? Because choosing the interval with the earliest end time leaves the most room for the remaining intervals.  
E.g. Suppose the earliest end time among the remaining intervals is $x$. Then the time left for other intervals is $[x, \infty)$. If we instead choose an interval with end time $y$, the time left is $[y, \infty)$. Since $x \le y$, $[y, \infty)$ cannot hold more intervals than $[x, \infty)$. Thus, the heuristic holds.
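
To see why the ordering key matters, here is a small sketch that runs the same greedy scan under three different rules. The helper `greedy_schedule` and the two job lists are hypothetical, built only to break the alternative rules. Jobs that merely touch (one finishes exactly when the other starts) are treated as compatible.

```python
def greedy_schedule(jobs, key):
    """Scan jobs in the order given by `key`, keeping each job that fits with those already kept."""
    chosen = []
    for s, t in sorted(jobs, key=key):
        # Two jobs are compatible when one finishes no later than the other starts.
        if all(t <= cs or ct <= s for cs, ct in chosen):
            chosen.append((s, t))
    return chosen


# Hypothetical instances, each a list of (start, finish) pairs.
long_first = [(0, 10), (1, 2), (3, 4), (5, 6)]
short_middle = [(0, 5), (4, 6), (5, 10)]

earliest_start = lambda job: job[0]
shortest = lambda job: job[1] - job[0]
earliest_finish = lambda job: job[1]

print(len(greedy_schedule(long_first, earliest_start)))     # 1
print(len(greedy_schedule(long_first, earliest_finish)))    # 3
print(len(greedy_schedule(short_middle, shortest)))         # 1
print(len(greedy_schedule(short_middle, earliest_finish)))  # 2
```

Earliest start gets stuck on one long job, and shortest duration picks a short job that blocks two others. Earliest finish avoids both traps on these instances.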


>Proof by induction:  
>To prove correctness, we will prove the following invariant: at every step, the solution produced by the algorithm so far is a subset of the jobs scheduled in some optimal solution (i.e., it can be extended to an optimal solution without removing any already-scheduled jobs). We can prove this by induction.  


>This invariant is clearly true at the start, when no jobs have yet been scheduled by the algorithm.  


>Now, assume it is true after $i$ jobs have been scheduled, and let $S$ be an optimal solution that includes the $i$ jobs scheduled by the algorithm so far. Let $j$ be the job the algorithm schedules next. If $j \in S$, then we are done—our induction is maintained.  

>If $j \notin S$, let $j' \in S$ be the job in $S$ with the earliest finish time that is not one of the $i$ jobs scheduled by the algorithm. 
>Notice that job $j'$ must be a candidate job since it does not conflict with any other job in $S$ and $S$ includes all the $i$ jobs scheduled by the algorithm so far. 
>Therefore it must be the case that the finish time of $j'$ is greater than or equal to the finish time of $j$.  
>This in turn means we can modify $S$ to maintain our invariant by removing $j'$ from it and adding $j$. 
>In particular, $j$ does not conflict with any job in $S$ that starts after $j'$ finishes (since $j$ finishes earlier) and $j$ does not conflict with any job in $S$ that finishes before $j'$ starts (since by definition of $j'$, all such jobs belong to the set of $i$ jobs scheduled by the algorithm, and $j$ does not conflict with any of them).  

>So, our invariant is true all the way through the algorithm’s operation, which means that when it halts, it must have found an optimal solution (if it is a strict subset of an optimal solution, then there will still exist at least one candidate job).

In [1]:
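S = [1, 3, 6, 7, 9]    # start times
T = [3, 8, 7, 10, 11]  # finish times
def scheduling_problem(S, T):
    cnt = 0
    selected_t = float("-inf")  # finish time of the most recently selected job
    jobs = sorted(zip(S, T), key=lambda x: x[1])  # sort by finish time, O(N log N)
    for s, t in jobs:
        # A job fits if it starts no earlier than the last selected job finishes.
        if s >= selected_t:
            selected_t = t
            cnt += 1
    return cnt

print(scheduling_problem(S,T))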

3


The running time is $O(N\log N)$, dominated by the sort; the scan over the sorted jobs is $O(N)$.
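
The optimality proof can be sanity-checked empirically. The sketch below compares the earliest-finish greedy with an exhaustive search on small random instances. The function names, the random seed, and the instance sizes are arbitrary choices for illustration. Passing such a check does not replace the proof.

```python
from itertools import combinations
import random


def greedy_count(jobs):
    """Earliest-finish-time greedy; jobs are (start, finish) pairs."""
    count, last_finish = 0, float("-inf")
    for s, t in sorted(jobs, key=lambda job: job[1]):
        if s >= last_finish:
            count, last_finish = count + 1, t
    return count


def brute_force_count(jobs):
    """Size of the largest set of pairwise compatible jobs, found by trying every subset."""
    def compatible(subset):
        return all(a[1] <= b[0] or b[1] <= a[0] for a, b in combinations(subset, 2))

    for size in range(len(jobs), 0, -1):
        if any(compatible(subset) for subset in combinations(jobs, size)):
            return size
    return 0


random.seed(0)  # arbitrary seed, only so the run is repeatable
for _ in range(200):
    jobs = []
    for _ in range(random.randint(0, 7)):
        s = random.randint(0, 20)
        jobs.append((s, s + random.randint(1, 6)))  # every job has positive length
    assert greedy_count(jobs) == brute_force_count(jobs)
print("greedy matched brute force on all random instances")
```

The exhaustive search is exponential in the number of jobs, so it is only usable as a checker for tiny inputs.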

## Multiple Array  

> There is an integer sequence $A_0, \dots, A_{N-1}$ consisting of $N$ terms, and $N$ buttons. When the $i$-th ($0 \le i \le N-1$) button is pressed, the values of the $i+1$ terms $A_0$ through $A_i$ are all incremented by $1$.  

>There is also another integer sequence $B_0, \dots, B_{N-1}$ where all elements are greater than or equal to $1$. Takahashi will push the buttons some number of times so that for every $i$, $A_i$ will be a multiple of $B_i$.  

>Find the minimum number of times Takahashi must press the buttons. 

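In this problem, let $D_0, D_1, \dots, D_{N-1}$ be the number of times each button $i$ is pressed. Pressing button $j$ increments $A_i$ for every $i \le j$, so the conditions for solving this problem are:

* $A_0 + (D_0 + D_1 + \cdots + D_{N-1}) =$ multiple of $B_0$ 
* $A_1 + (D_1 + D_2 + \cdots + D_{N-1}) =$ multiple of $B_1$ 
* $\vdots$
* $A_{N-1} + D_{N-1} =$ multiple of $B_{N-1}$ 

Only the last condition involves a single unknown, so we fix $D_{N-1}$ first. The values $D_{N-1}$ can take are:

* If $A_{N-1}$ is already a multiple of $B_{N-1}$:
    + $D_{N-1} = 0, B_{N-1}, 2B_{N-1}, \dots$
* Otherwise, let $A_{N-1} \bmod B_{N-1} = r$:
    + $D_{N-1} = B_{N-1} - r, 2B_{N-1} - r, \dots$

Since increasing $D_{N-1}$ beyond what is needed only adds presses, we take the smallest value: $0$ or $B_{N-1} - r$. With $D_{N-1}$ fixed, the same argument applies to $D_{N-2}$ using $A_{N-2} + D_{N-1}$ in place of $A_{N-2}$, and so on back to $D_0$. At index $i$, the remainder must be taken of $A_i + (D_{i+1} + \cdots + D_{N-1})$, not of $A_i$ alone. 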

In [None]:
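def mularray(A, B):
    pushes = 0  # total presses so far; each of them also increments A[i]
    N = len(A)
    for i in range(N-1, -1, -1):
        # Remainder of the current value of A[i], including earlier presses.
        r = (A[i] + pushes) % B[i]
        if r:
            pushes += (B[i] - r)
    return pushes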

$O(N)$ time.
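
The back-to-front reasoning can be checked by recording the individual press counts $D_i$ and applying them literally. The following is a sketch under the assumption that button $i$ increments $A_0$ through $A_i$. The names `press_plan` and `simulate` and the sample values are hypothetical.

```python
def press_plan(A, B):
    """Return D, where D[i] is how often button i is pressed (button i increments A[0..i])."""
    D = [0] * len(A)
    total = 0  # presses of buttons i+1 .. N-1, all of which also increment A[i]
    for i in range(len(A) - 1, -1, -1):
        r = (A[i] + total) % B[i]
        D[i] = (B[i] - r) % B[i]  # 0 when the value is already a multiple of B[i]
        total += D[i]
    return D


def simulate(A, D):
    """Apply the presses one button at a time and return the final sequence."""
    final = list(A)
    for i, presses in enumerate(D):
        for j in range(i + 1):
            final[j] += presses
    return final


# Hypothetical small instance.
A, B = [3, 2, 9], [5, 7, 4]
D = press_plan(A, B)
final = simulate(A, D)
print(D, sum(D))  # [2, 2, 3] 7
print(final)      # [10, 7, 12]
assert all(a % b == 0 for a, b in zip(final, B))
```

Here the naive per-element remainders would sum to 10 presses, while tracking the running total gives 7. The simulation is $O(N^2)$ and is meant only as a checker.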