# Greedy algorithms

A greedy algorithm solves an optimization problem by making the locally best choice at each step. That choice leaves a single subproblem, and the solution is the greedy choice combined with an optimal solution to that subproblem. For a greedy algorithm to apply to a problem, the following must hold:
- the solution can be cast as a current best choice (the greedy choice) plus one remaining subproblem
- greedy-choice property: some (global) optimal solution to the problem contains the (local) greedy choice, leaving one subproblem
- optimal substructure: an optimal solution is achieved by combining the greedy choice with an optimal solution of the remaining subproblem

#### Example
Given tasks $S=\{a_0,a_1,\dots,a_{n-1}\}$ with start times $s=\{s_0,s_1,\dots,s_{n-1}\}$ and finish times $f=\{f_0,f_1,\dots,f_{n-1}\}$, find the maximum number of tasks that can be scheduled so that they are mutually compatible, i.e. no two selected tasks overlap. The code in this section represents each task as a `(start, finish)` pair and assumes the list of tasks is sorted by finish time.
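The compatibility condition can be made concrete with a small check. This is a minimal sketch: the names `compatible` and `all_compatible` are illustrative, and it assumes, as the code in this section does, that a task may start exactly when another one finishes.

```python
from itertools import combinations

def compatible(a, b):
    """Two (start, finish) tasks are compatible when one finishes
    no later than the other starts."""
    return a[1] <= b[0] or b[1] <= a[0]

def all_compatible(tasks):
    """True when every pair of tasks in the selection is compatible."""
    return all(compatible(a, b) for a, b in combinations(tasks, 2))

print(all_compatible([(1, 4), (5, 7), (8, 12)]))  # True
print(all_compatible([(1, 4), (3, 5)]))           # False
```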

In [1]:
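def max_n(s: list, i: int, j: int) -> int:
    '''Maximum number of compatible tasks among s[i:], given that task j
    was the last one selected (j == -1 means nothing is selected yet).'''
    if i == len(s):
        return 0
    # task i overlaps the last selected task, so it has to be skipped
    if j != -1 and s[i][0] < s[j][1]:
        return max_n(s, i+1, j)

    # otherwise take the better of skipping task i and selecting it
    return max(max_n(s, i+1, j), max_n(s, i+1, i) + 1)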

In [2]:
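def max_n_2(s: list, i: int, j: int) -> int:
    '''Maximum number of compatible tasks that start after task i finishes
    and finish before task j starts (i and j act as boundary tasks).'''

    if j < i:
        return 0

    # try every task k that fits between i and j as the split point;
    # excluding i and j keeps a zero-length boundary task from selecting itself
    return max([max_n_2(s, i, k) + 1 + max_n_2(s, k, j)
                for k in range(len(s))
                if k not in (i, j) and s[k][0] >= s[i][1] and s[k][1] <= s[j][0]] + [0])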

In [3]:
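def memo(s: list):
    '''Bottom-up version of max_n: table[i, j] is the maximum number of compatible
    tasks among s[i:] when task j was the last one selected (j == -1: none).'''
    k = len(s)
    table = {}
    sol = {}
    # base case: no tasks left to consider
    for j in range(-1, k):
        table[k, j] = 0

    for i in range(k-1, -1, -1):
        for j in range(-1, i):
            if j != -1 and s[i][0] < s[j][1]:
                # task i overlaps task j and has to be skipped
                table[i, j], sol[i, j] = table[i+1, j], False
            else:
                # better of skipping task i and selecting it (ties select it)
                table[i, j], sol[i, j] = max((table[i+1, j], False), (table[i+1, i]+1, True))

    # walk the recorded decisions to recover the selected tasks, numbered from 1
    events = []
    j = -1
    for i in range(k):
        if sol[i, j]:
            events.append(i+1)
            j = i

    return table[0, -1], events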

In [4]:
s = [(1,4),(3,5),(0,6),(5,7),(3,9),(5,9),(6,10),(7,11),(8,12),(2,14),(12,16)]
print(f'maximum number of compatible tasks with the naive recursive method {max_n(s,0,-1)}')
# s_aux = [(-2,-2)]+s+[(100,101)]
# k = len(s_aux)
# print(f'maximum number of compatible task with the naive recursive method {max_n_2(s_aux,0,k-1)}')
print(f'maximum number of compatible tasks with the bottom-up method {memo(s)[0]}, and the tasks in order {memo(s)[1]}')

maximum number of compatible tasks with the naive recursive method 4
maximum number of compatible tasks with the bottom-up method 4, and the tasks in order [1, 4, 8, 11]


In [5]:
def greedy(s: list):
    '''Greedy selection; assumes s is sorted by finish time.
    Returns the selected tasks, numbered 1,2,...,k.'''
    # number of elements
    k = len(s)
    if k == 0:
        return []
    # make the first greedy choice: the task that finishes first
    j = 0
    # record the greedily chosen tasks, numbered 1,2,...,k
    tasks = [j+1]

    for i in range(1, k):
        # find the next task compatible with the last greedy choice
        if s[i][0] >= s[j][1]:
            # make it the next greedy choice
            j = i
            tasks.append(j+1)

    return tasks


In [6]:
s = [(1,4),(3,5),(0,6),(5,7),(3,9),(5,9),(6,10),(7,11),(8,12),(2,14),(12,16)]
print(f'the maximum number of compatible elements with the greedy method is {len(greedy(s))} and those elements are {greedy(s)}')

the maximum number of compatible elements with the greedy method is 4 and those elements are [1, 4, 8, 11]
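The greedy pass relies on the tasks being ordered by finish time, which holds for the sample list. As a hedged sketch, the hypothetical helper `greedy_unsorted` below (not part of the original notebook) sorts the task indices by finish time first and reports the chosen tasks by their 1-based position in the input list.

```python
def greedy_unsorted(s):
    """Greedy task selection for (start, finish) pairs given in any order.

    Returns the 1-based positions (in the input list) of the chosen tasks,
    in the order they are scheduled.
    """
    # visit the tasks in order of increasing finish time
    order = sorted(range(len(s)), key=lambda i: s[i][1])
    chosen = []
    last_finish = None
    for i in order:
        start, finish = s[i]
        # the first task is always taken; later ones must not overlap the last choice
        if last_finish is None or start >= last_finish:
            chosen.append(i + 1)
            last_finish = finish
    return chosen

shuffled = [(12, 16), (3, 5), (8, 12), (5, 7), (1, 4)]
print(greedy_unsorted(shuffled))  # [5, 4, 3, 1]
```

Sorting costs $O(n \log n)$, after which the selection itself is a single linear pass.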


### Huffman code
Given a text with $N$ total characters from an alphabet $C=\{c_1,c_2,\dots,c_n\}$ with frequencies $F=\{f_1,f_2,\dots, f_n\}$ in the text, find a binary character code, i.e. a map that assigns each character a binary string (codeword), that is prefix-free, i.e. no codeword is a prefix of another codeword, and that minimizes the total length of the encoded text in bits.
- fixed-length encoding: each character is encoded by a codeword of the same length, which is $\lceil \log_2 n \rceil$ bits
- Huffman code: a variable-length code that represents more frequent characters with shorter codewords
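To put a number on the fixed-length baseline, the sketch below computes $\lceil \log_2 n \rceil$ with integer arithmetic and the resulting total length. The helper name `fixed_length_bits` and the frequency table are illustrative assumptions, not part of the original notebook.

```python
def fixed_length_bits(n: int) -> int:
    """Bits per codeword for an alphabet of n >= 1 characters,
    i.e. ceil(log2(n)), computed without floating point."""
    return (n - 1).bit_length()

# illustrative frequencies for a six-character alphabet
freqs = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}

bits = fixed_length_bits(len(freqs))
total = bits * sum(freqs.values())
print(bits, total)  # 3 300
```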

In [7]:
import heapq

class node:

    def __init__(self, char: str, freq: int):

        self.left = None
        self.right = None
        self.parent = None
        self.char = char
        self.freq = freq


def hoffman_encoding(c: list[str], f: list[int]) -> tuple[int, node]:
    '''Build the Huffman tree and return (total frequency, root node).'''
    n = len(c)
    # heap entries are (frequency, tie-breaker, node); the unique tie-breaker
    # keeps heapq from comparing node objects when two frequencies are equal
    nodes = [(freq, order, node(char, freq)) for order, (char, freq) in enumerate(zip(c, f))]
    heapq.heapify(nodes)
    for i in range(0, n-1):
        # greedy choice: merge the two least frequent subtrees
        _, _, node1 = heapq.heappop(nodes)
        _, _, node2 = heapq.heappop(nodes)

        new_node = node(node1.char+node2.char, node1.freq+node2.freq)
        new_node.left = node1
        new_node.right = node2
        node1.parent = new_node
        node2.parent = new_node

        heapq.heappush(nodes, (new_node.freq, n+i, new_node))

    freq, _, root = heapq.heappop(nodes)
    return freq, root

def leaves(root, codeword=''):
    '''Print every leaf character with its codeword (left edge 0, right edge 1).'''
    if root.left:
        leaves(root.left, codeword+'0')

    if root.right:
        leaves(root.right, codeword+'1')

    # a leaf has no children
    if root.left is None and root.right is None:
        print(root.char, codeword)

    return

In [8]:
c = ['a','b','c','d','e','f']
f = [45,13,12,16,9,5]
f_root, root = hoffman_encoding(c,f)

In [9]:
leaves(root)

a 0
c 100
b 101
f 1100
e 1101
d 111
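
The printed codewords can be checked against the problem statement. The sketch below copies the codewords and frequencies from the cells above into dictionaries; the helpers `is_prefix_free`, `encode` and `decode` are illustrative names, not part of the original notebook. It verifies that the code is prefix-free, computes the total encoded length, and round-trips a short string.

```python
codes = {'a': '0', 'c': '100', 'b': '101', 'f': '1100', 'e': '1101', 'd': '111'}
freqs = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}

def is_prefix_free(codes):
    """True when no codeword is a prefix of another codeword."""
    words = sorted(codes.values())
    # in sorted order, a codeword that prefixes any other also prefixes its successor
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

def encode(text, codes):
    """Concatenate the codewords of the characters in text."""
    return ''.join(codes[ch] for ch in text)

def decode(bits, codes):
    """Decode a bit string; prefix-freeness makes the first match the right one."""
    inverse = {word: ch for ch, word in codes.items()}
    out, current = [], ''
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ''
    return ''.join(out)

total_bits = sum(freqs[ch] * len(codes[ch]) for ch in codes)
print(is_prefix_free(codes))                 # True
print(total_bits)                            # 224
print(decode(encode('face', codes), codes))  # face
```

For these frequencies the variable-length code needs 224 bits, compared with 300 bits for a 3-bit fixed-length code over the same 100 characters.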
