Greedy Algorithms
-----------------
Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit. Problems in which a locally optimal choice at each step also leads to a globally optimal solution are therefore the best fit for the greedy approach. Standard greedy algorithms include:

    1. Kruskal’s Minimum Spanning Tree (MST)
    2. Prim’s Minimum Spanning Tree (MST)
    3. Dijkstra’s Shortest Path
    4. Huffman Coding

##### Activity Selection Problem
You are given n activities with their start and finish times. Select the maximum number of activities that can be performed by a single person, assuming that a person can only work on a single activity at a time. 

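    1) Sort the activities according to their finishing time.
    2) Select the first activity from the sorted array and print it.
    3) For each remaining activity in the sorted array, select it and print it if its start time is greater than or equal to the finish time of the previously selected activity.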


In [1]:
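def printMaxActivities(s, f):
    # s[k] and f[k] are the start and finish times of activity k;
    # the activities are assumed to be already sorted by finish time
    n=len(f)
    print("The following activities are selected")

    # The first activity is always selected
    i=0
    print(i, end=' ')

    # Consider the remaining activities
    for j in range(1, n):
        # Select activity j if it starts at or after the finish
        # time of the previously selected activity i
        if s[j] >= f[i]:
            print(j, end=' ')
            i=j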

s=[1,3,0,5,8,5]
f=[2,4,6,7,9,9]
printMaxActivities(s,f)


The following activities are selected
0 1 3 4 
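
The cell above relies on its input already being sorted by finish time, so step 1 is never exercised. As a minimal sketch of all three steps together, the hypothetical helper `select_activities` below (not part of the original notebook) sorts the activity indices itself and returns the selection instead of printing it. The input is the same set of activities, deliberately shuffled.

```python
def select_activities(start, finish):
    """Return the indices of a maximum-size set of compatible activities."""
    # Step 1: visit activity indices in order of finish time.
    order = sorted(range(len(finish)), key=lambda k: finish[k])

    selected = []
    last_finish = None
    for k in order:
        # Steps 2 and 3: take the first activity, then every activity that
        # starts at or after the finish time of the last one taken.
        if last_finish is None or start[k] >= last_finish:
            selected.append(k)
            last_finish = finish[k]
    return selected


# Same six activities as in the cell above, given in unsorted order.
start = [5, 1, 8, 0, 3, 5]
finish = [9, 2, 9, 6, 4, 7]
print(select_activities(start, finish))  # [1, 4, 5, 2]
```

The returned indices refer to positions in the caller's original lists, so no separate pre-sorting step is needed.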

##### 1. Kruskal’s Minimum Spanning Tree (MST)
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest remaining edge and check whether it forms a cycle with the spanning tree formed so far. If no cycle is formed, include the edge; otherwise, discard it.
3. Repeat step 2 until there are (V-1) edges in the spanning tree.

In [2]:

class Graph:

    def __init__(self, vertices):
        self.V=vertices
        self.graph=[]

    def addEdge(self, u, v, w):
        self.graph.append([u,v,w])

    # A utility function to find the set (root) of an element i
    # (uses path compression technique)
    def find(self, parent, i):
        if parent[i]!=i:
            # Point i directly at the root so later lookups are shorter
            parent[i]=self.find(parent, parent[i])
        return parent[i]
    
    # A function that does union of two sets of x and y
    # (uses union by rank)
    def union(self, parent, rank, x, y):
        xroot=self.find(parent, x)
        yroot=self.find(parent, y)

        # Attach smaller rank tree under root of
        # high rank tree (Union by Rank)
        if rank[xroot]<rank[yroot]:
            parent[xroot]=yroot
        elif rank[xroot]>rank[yroot]:
            parent[yroot]=xroot
        else:
            parent[yroot]=xroot
            rank[xroot]+=1
    
    def KruskalMST(self):

        result=[]
        i=0 # index of the next edge in the sorted edge list
        e=0 # number of edges added to result[] so far

        # Step 1: Sorting all the edges in non-decreasing order of their weight
        # If we are not allowed to change the given graph, we can create a copy of graph
        self.graph = sorted(self.graph, key=lambda item: item[2])

        parent=[]
        rank=[]

        # Create V subsets with single elements
        for node in range(self.V):
            parent.append(node)
            rank.append(0)
        
        # Number of edges to be taken is equal to V-1
        # (stop early if the edges run out, i.e. the graph is disconnected)
        while e < self.V - 1 and i < len(self.graph):
            # Step 2: Pick the smallest edge and increment the index for next iteration
            u, v, w = self.graph[i]
            i=i+1
            x=self.find(parent, u)
            y=self.find(parent, v)

            # If including this edge doesn't cause a cycle, include it in result
            # and increment the edge count for the next edge
            if x!=y:
                e=e+1
                result.append([u,v,w])
                self.union(parent, rank, x, y)
            
        minimumCost = 0
        print("Edges in the constructed MST")
        for u,v,weight in result:
            minimumCost+=weight
            print("%d -- %d == %d" % (u,v,weight))
        print("Minimum Spanning Tree", minimumCost)


g=Graph(4)
g.addEdge(0,1,10)
g.addEdge(0,2,6)
g.addEdge(0,3,5)
g.addEdge(1,3,15)
g.addEdge(2,3,4)

g.KruskalMST()


Edges in the constructed MST
2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10
Minimum Spanning Tree 19
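
To make the cycle check in step 2 concrete, here is a small trace of the same five edges. It is only a sketch: the standalone `find` below uses path halving and the union is done without ranks, both simplifications chosen for brevity rather than taken from the class above. The loop also visits every edge instead of stopping after V-1 acceptances, so that the rejected edges are visible.

```python
def find(parent, i):
    """Return the root of i's set, shortening the path as it goes."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i


edges = [(0, 1, 10), (0, 2, 6), (0, 3, 5), (1, 3, 15), (2, 3, 4)]
parent = list(range(4))  # every vertex starts in its own set

decisions = []
for u, v, w in sorted(edges, key=lambda edge: edge[2]):
    root_u, root_v = find(parent, u), find(parent, v)
    if root_u != root_v:
        # Different sets: the edge joins two components, so no cycle.
        parent[root_u] = root_v
        decisions.append((u, v, w, "accept"))
    else:
        # Same set: u and v are already connected, so the edge closes a cycle.
        decisions.append((u, v, w, "reject"))

for decision in decisions:
    print(decision)
# (2, 3, 4, 'accept')
# (0, 3, 5, 'accept')
# (0, 2, 6, 'reject')
# (0, 1, 10, 'accept')
# (1, 3, 15, 'reject')
```

The three accepted edges are the ones printed in the MST output, with total weight 4 + 5 + 10 = 19.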


##### 2. Prim’s Minimum Spanning Tree (MST)
Prim’s algorithm starts with an empty spanning tree. The idea is to maintain two sets of vertices: the first contains the vertices already included in the MST, and the second contains the vertices not yet included. At every step, the algorithm considers all the edges that connect the two sets and picks the one with the minimum weight. After picking the edge, it moves the endpoint that was outside the MST into the set of MST vertices.


In [3]:
import sys

class Graph:

    def __init__(self, vertices):
        self.V=vertices
        self.graph=[[0 for column in range(vertices)] for row in range(vertices)]

    # A utility function to print the constructed MST stored in parent[]
    def printMST(self, parent):
        print("Edge \tWeight")
        for i in range(1, self.V):
            print(parent[i], "-", i, "\t", self.graph[i][parent[i]])
    
    # A utility function to find the vertex with minimum key value,
    # from the set of vertices not yet included in the MST
    def minKey(self, key, mstSet):

        min = sys.maxsize

        for v in range(self.V):
            if key[v] < min and mstSet[v]==False:
                min=key[v]
                min_index=v
            
        return min_index
    
    # Function to construct and print MST for a graph
    # represented using adjacency matrix representation
    def primMST(self):

        # key values used to pick minimum weight edge in cut
        key = [sys.maxsize] * self.V
        # array to store constructed MST
        parent = [None] * self.V

        key[0] = 0
        mstSet = [False] * self.V
        parent[0] = -1

        for cout in range(self.V):
            
            # Pick the minimum key vertex from the set of vertices not yet included in the MST.
            u = self.minKey(key, mstSet)

            # Put the picked vertex in the MST
            mstSet[u] = True

            # Update the key value and parent of each vertex adjacent to the picked
            # vertex only if the connecting edge weighs less than the current key
            # and the vertex is not yet in the MST
            for v in range(self.V):

                if self.graph[u][v] > 0 and mstSet[v] == False and key[v] > self.graph[u][v]:
                    key[v] = self.graph[u][v]
                    parent[v] = u
            
        self.printMST(parent)

g = Graph(5)
g.graph = [[0,2,0,6,0],
           [2,0,3,8,5],
           [0,3,0,0,7],
           [6,8,0,0,9],
           [0,5,7,9,0]]

g.primMST()



Edge 	Weight
0 - 1 	 2
1 - 2 	 3
0 - 3 	 6
1 - 4 	 5
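
The "pick the minimum weight edge between the two sets" step can also be driven by a priority queue. The sketch below is one possible variant, not the notebook's method: the name `prim_heap` is hypothetical, it uses the standard-library `heapq` module, and it keeps the adjacency-matrix convention that 0 means "no edge".

```python
import heapq


def prim_heap(matrix):
    """Return (total_weight, edges) of an MST grown from vertex 0."""
    n = len(matrix)
    in_mst = [False] * n
    in_mst[0] = True

    # Candidate edges leaving the tree, stored as (weight, from, to).
    heap = [(matrix[0][v], 0, v) for v in range(n) if matrix[0][v] > 0]
    heapq.heapify(heap)

    total, edges = 0, []
    while heap and len(edges) < n - 1:
        w, u, v = heapq.heappop(heap)
        if in_mst[v]:
            continue  # both endpoints are already in the tree
        in_mst[v] = True
        total += w
        edges.append((u, v, w))
        # The newly added vertex contributes new candidate edges.
        for x in range(n):
            if matrix[v][x] > 0 and not in_mst[x]:
                heapq.heappush(heap, (matrix[v][x], v, x))
    return total, edges


graph = [[0, 2, 0, 6, 0],
         [2, 0, 3, 8, 5],
         [0, 3, 0, 0, 7],
         [6, 8, 0, 0, 9],
         [0, 5, 7, 9, 0]]

total, edges = prim_heap(graph)
print(total)  # 16
print(edges)  # [(0, 1, 2), (1, 2, 3), (1, 4, 5), (0, 3, 6)]
```

The edges match the table printed above; only the order differs, because the heap yields them by increasing weight among the current candidates.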


##### 3. Dijkstra’s Shortest Path
Dijkstra’s algorithm is very similar to Prim’s algorithm for minimum spanning tree. Like Prim’s MST, we generate an SPT (shortest-path tree) with a given source as the root. We maintain two sets: one contains the vertices included in the shortest-path tree, and the other contains the vertices not yet included. At every step of the algorithm, we find the vertex in the second set (not yet included) that has the minimum distance from the source.


In [4]:
import sys

class Graph():

    def __init__(self, vertices):
        self.V = vertices
        self.graph = [[0 for column in range(vertices)] for row in range(vertices)]

    def printSolution(self, dist):
        print("Vertex \tDistance from Source")
        for node in range(self.V):
            print(node, "\t", dist[node])

    def minDistance(self, dist, sptSet):

        min = sys.maxsize

        for u in range(self.V):
            if dist[u] < min and sptSet[u] == False:
                min = dist[u]
                min_index = u
        
        return min_index

    def dijkstra(self, src):

        dist = [sys.maxsize] * self.V
        dist[src] = 0
        sptSet = [False] * self.V

        for cout in range(self.V):

            # Pick the minimum distance vertex from
            # the set of vertices not yet processed.
            # x is always equal to src in first iteration
            x = self.minDistance(dist, sptSet)
  
            # Put the minimum distance vertex in the
            # shortest path tree
            sptSet[x] = True
  
            # Update dist value of the adjacent vertices
            # of the picked vertex only if the current
            # distance is greater than new distance and
            # the vertex is not in the shortest path tree
            for y in range(self.V):
                if (self.graph[x][y] > 0 and sptSet[y] == False and
                        dist[y] > dist[x] + self.graph[x][y]):
                    dist[y] = dist[x] + self.graph[x][y]
  
        self.printSolution(dist)

g = Graph(9)
g.graph = [[0, 4, 0, 0, 0, 0, 0, 8, 0],
        [4, 0, 8, 0, 0, 0, 0, 11, 0],
        [0, 8, 0, 7, 0, 4, 0, 0, 2],
        [0, 0, 7, 0, 9, 14, 0, 0, 0],
        [0, 0, 0, 9, 0, 10, 0, 0, 0],
        [0, 0, 4, 14, 10, 0, 2, 0, 0],
        [0, 0, 0, 0, 0, 2, 0, 1, 6],
        [8, 11, 0, 0, 0, 0, 1, 0, 7],
        [0, 0, 2, 0, 0, 0, 6, 7, 0]]

g.dijkstra(0)

Vertex 	Distance from Source
0 	 0
1 	 4
2 	 12
3 	 19
4 	 21
5 	 11
6 	 9
7 	 8
8 	 14
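
As with Prim’s algorithm, the "find the closest vertex not yet included" step can be handled by a priority queue. The following is a sketch under assumptions: `dijkstra_heap` is a hypothetical name, it uses the standard-library `heapq` module, and it keeps the same matrix convention (0 means "no edge", all weights positive). Unreachable vertices are left at infinity.

```python
import heapq


def dijkstra_heap(matrix, src):
    """Return the list of shortest distances from src to every vertex."""
    n = len(matrix)
    dist = [float("inf")] * n
    dist[src] = 0
    heap = [(0, src)]  # (best distance found so far, vertex)

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already processed
        for v in range(n):
            w = matrix[u][v]
            if w > 0 and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


graph = [[0, 4, 0, 0, 0, 0, 0, 8, 0],
         [4, 0, 8, 0, 0, 0, 0, 11, 0],
         [0, 8, 0, 7, 0, 4, 0, 0, 2],
         [0, 0, 7, 0, 9, 14, 0, 0, 0],
         [0, 0, 0, 9, 0, 10, 0, 0, 0],
         [0, 0, 4, 14, 10, 0, 2, 0, 0],
         [0, 0, 0, 0, 0, 2, 0, 1, 6],
         [8, 11, 0, 0, 0, 0, 1, 0, 7],
         [0, 0, 2, 0, 0, 0, 6, 7, 0]]

print(dijkstra_heap(graph, 0))  # [0, 4, 12, 19, 21, 11, 9, 8, 14]
```

With an adjacency matrix each pop still scans a full row, so the heap mainly pays off once the graph is stored as adjacency lists.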


##### 4. Huffman Coding
Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, where the length of each assigned code depends on the frequency of the corresponding character. The most frequent character gets the shortest code and the least frequent character gets the longest code.


In [5]:
# A Huffman Tree Node
class node:
    def __init__(self, freq, symbol, left=None, right=None):
        # frequency of symbol
        self.freq = freq
 
        # symbol name (character)
        self.symbol = symbol
 
        # node left of current node
        self.left = left
 
        # node right of current node
        self.right = right
 
        # tree direction (0/1)
        self.huff = ''

def printNodes(node, val=''):
    # huffman code for current node
    newVal = val + str(node.huff)

    # if node is not a leaf node, then traverse inside it
    if node.left:
        printNodes(node.left, newVal)
    if node.right:
        printNodes(node.right, newVal)
    
    # if node is a leaf node, then display its huffman code
    if not node.left and not node.right:
        print(node.symbol, " -> ", newVal)
    
chars = ['a','b','c','d','e','f']
freq = [ 5, 9, 12, 13, 16, 45]
nodes = []

# converting characters and frequencies into huffman tree nodes
for x in range(len(chars)):
    nodes.append(node(freq[x], chars[x]))

while len(nodes) > 1:
    
    # sort all the nodes in ascending order based on their frequency
    nodes = sorted(nodes, key=lambda x: x.freq)

    # pick 2 smallest nodes
    left = nodes[0]
    right = nodes[1]
 
    # assign directional value to these nodes
    left.huff = 0
    right.huff = 1
 
    # combine the 2 smallest nodes to create
    # new node as their parent
    newNode = node(left.freq+right.freq, left.symbol+right.symbol, left, right)
 
    # remove the 2 nodes and add their
    # parent as new node among others
    nodes.remove(left)
    nodes.remove(right)
    nodes.append(newNode)
 
# Huffman Tree is ready!
printNodes(nodes[0])



f  ->  0
c  ->  100
d  ->  101
a  ->  1100
b  ->  1101
e  ->  111
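
A quick calculation shows what these codes buy. The sketch below copies the codes from the output above and treats the frequencies as occurrence counts, which is an assumption about how the numbers are meant to be read. It compares the total encoded length against a fixed-length encoding (3 bits are needed to distinguish 6 symbols) and checks that no code is a prefix of another.

```python
freq = {'a': 5, 'b': 9, 'c': 12, 'd': 13, 'e': 16, 'f': 45}
codes = {'f': '0', 'c': '100', 'd': '101',
         'a': '1100', 'b': '1101', 'e': '111'}

# Total bits when each symbol is written with its Huffman code.
huffman_bits = sum(freq[ch] * len(codes[ch]) for ch in freq)

# Total bits when every symbol uses the same 3-bit code.
fixed_bits = 3 * sum(freq.values())

# Prefix property: no code may be the start of a different code,
# otherwise the bit stream could not be decoded unambiguously.
prefix_free = all(
    not other.startswith(code)
    for code in codes.values()
    for other in codes.values()
    if other != code
)

print(huffman_bits, fixed_bits, prefix_free)  # 224 300 True
```

Under that reading, the variable-length codes need 224 bits where the fixed-length encoding needs 300.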


##### +) Fractional Knapsack Problem
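In the fractional knapsack problem, items can be broken into smaller pieces, so a fraction of an item may be placed in the knapsack to maximize its total value. The greedy approach takes items in decreasing order of value per unit weight and, when the next item no longer fits, takes only the fraction of it that fills the remaining capacity.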

In [6]:
class ItemValue:
  
    """Item Value DataClass"""
  
    def __init__(self, wt, val, ind):
        self.wt = wt
        self.val = val
        self.ind = ind
        # value per unit weight; true division keeps fractional ratios distinct
        self.cost = val / wt
  
    def __lt__(self, other):
        return self.cost < other.cost
  
# Greedy Approach
  
  
class FractionalKnapSack:
  
    """Time Complexity O(n log n)"""
    @staticmethod
    def getMaxValue(wt, val, capacity):
        """function to get maximum value """
        iVal = []
        for i in range(len(wt)):
            iVal.append(ItemValue(wt[i], val[i], i))
  
        # sorting items by value-to-weight ratio, highest first
        iVal.sort(reverse=True)
  
        totalValue = 0
        for i in iVal:
            curWt = int(i.wt)
            curVal = int(i.val)
            if capacity - curWt >= 0:
                capacity -= curWt
                totalValue += curVal
            else:
                fraction = capacity / curWt
                totalValue += curVal * fraction
                capacity = int(capacity - (curWt * fraction))
                break
        return totalValue
  
  
# Driver Code
if __name__ == "__main__":
    wt = [10, 40, 20, 30]
    val = [60, 40, 100, 120]
    capacity = 50
  
    # Function call
    maxValue = FractionalKnapSack.getMaxValue(wt, val, capacity)
    print("Maximum value in Knapsack =", maxValue)

Maximum value in Knapsack = 240.0
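
The same greedy rule fits in a few lines when written as a plain function. This is a sketch, not a replacement for the class above: `fractional_knapsack` is a hypothetical name, and weights are assumed to be positive. For the driver data the ratios are 6, 1, 5 and 4, so the items of weight 10 and 20 are taken whole and 20/30 of the item of weight 30 fills the rest: 60 + 100 + 80 = 240.

```python
def fractional_knapsack(weights, values, capacity):
    """Return the maximum value obtainable when items may be split."""
    # Rank items by value per unit weight, best first.
    items = sorted(zip(weights, values),
                   key=lambda item: item[1] / item[0],
                   reverse=True)

    total = 0.0
    for weight, value in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)    # whole item, or whatever still fits
        total += value * take / weight  # value is proportional to the part taken
        capacity -= take
    return total


print(fractional_knapsack([10, 40, 20, 30], [60, 40, 100, 120], 50))  # 240.0

# Ratios 1.5 and 1.9 must stay distinct for the ranking to be right.
print(fractional_knapsack([2, 10], [3, 19], 10))  # 19.0
```

The second call shows why the ratio has to be a true quotient: both ratios truncate to 1 as integers, and a ranking that took the lighter item first would reach only 3 + 19 * 8 / 10 = 18.2.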
