# HOMEWORK BATCH 3 - Algorithmic Design A.A 2020-2021 
## By: Andres Bermeo Marinelli
### DSSC

1) Implement the binary heap-based version of Dijkstra's algorithm.

To implement the binary-heap version of Dijkstra's algorithm we reuse the **binheap** implementation written in class, modified to handle this specific case. In particular, the **min_order()** function now compares the **key** of the vertices. A few other small changes make the heap structure better suited to this problem (details are in the binheap file, where every change is commented). We also import **inf** from math to represent infinite keys, and **defaultdict** from collections, a dictionary that creates a default value for missing keys, which is convenient when collecting paths.

In [1]:
from binheap import binheap
from math import inf
from collections import defaultdict
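
As a small illustration of why these two imports are convenient, the sketch below shows the behaviour relied on later. The variable name `preds` is only an illustrative assumption and does not appear in the homework code.

```python
from math import inf
from collections import defaultdict

# inf compares greater than any finite number, so it can mark a vertex as "not reached yet".
print(10**12 < inf)    # True
# Adding a finite edge weight to inf still gives inf.
print(inf + 5 == inf)  # True

# defaultdict(list) creates an empty list the first time a missing key is accessed.
preds = defaultdict(list)
preds["d"].append("c")
print(dict(preds))     # {'d': ['c']}
```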


In this implementation of Dijkstra's algorithm we represent graphs as adjacency lists. To do so, we define a **Vertex** class, the basic building block of the graph structure. Each vertex has an id, a key that defaults to infinity, a visited marker that records whether the node has already been extracted by the algorithm, and a prev attribute that stores the predecessor of the node so that a path can be reconstructed. It also has a heap_idx, which records the position the node occupies in the heap. This is needed later, when we have to decrease a key in the binary heap.

Getters and setters for these attributes are also implemented.

In [2]:
class Vertex:  
    def __init__(self, name, key = inf):
        self.id = name                   
        self.key = key
        self.visited= False   #strictly used to see if node was visited in dijkstra's algorithm.
        self.prev = self      #strictly used to memorize previous node in graph in dijkstras algorithm. 
        self.heap_idx = None  #strictly used to identify node with position in heap.

    def set_key(self, key):
        self.key = key

    def get_key(self):
        return self.key
    
    def __repr__(self):
        return str(self.id)

    def set_visited(self):
        self.visited = True
    
    def set_prev(self, node):
        self.prev = node
    
    def get_prev(self):
        return self.prev
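
To illustrate why storing a heap position is useful, here is a minimal, self-contained sketch of a min-heap that tracks where each item sits. It is not the modified **binheap** from class (that file is not shown here); the class name `IndexedMinHeap` and its `pos` dictionary are hypothetical, and `pos` plays the role that `heap_idx` plays on a **Vertex**.

```python
from math import inf


class IndexedMinHeap:
    """Hypothetical min-heap that remembers the current index of every item."""

    def __init__(self, keys):
        # keys: dict mapping item name -> key
        self.heap = [[key, name] for name, key in keys.items()]
        self.pos = {entry[1]: i for i, entry in enumerate(self.heap)}
        for i in range(len(self.heap) // 2 - 1, -1, -1):
            self._sift_down(i)

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        # Keep the stored positions in sync with every swap.
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0 and self.heap[i][0] < self.heap[(i - 1) // 2][0]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] < self.heap[smallest][0]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def decrease_key(self, name, new_key):
        i = self.pos[name]  # constant-time lookup instead of a linear search
        self.heap[i][0] = new_key
        self._sift_up(i)

    def remove_minimum(self):
        self._swap(0, len(self.heap) - 1)
        key, name = self.heap.pop()
        del self.pos[name]
        if self.heap:
            self._sift_down(0)
        return name, key


h = IndexedMinHeap({"a": 0, "b": inf, "c": inf, "d": inf})
h.decrease_key("d", 2)
print(h.remove_minimum())  # ('a', 0)
print(h.remove_minimum())  # ('d', 2)
```

Without the stored position, each key decrease would first have to search the heap array for the item, which costs time linear in the heap size.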


Now we implement the graph structure, which connects pairs of vertices (or nodes) and assigns a weight to the edge between them. The graph has two attributes: a **vertex** dictionary, which contains the vertices of the graph keyed by vertex.id, and a **graph** dictionary, which stores for each vertex its outgoing edges as a list of tuples (destination vertex, weight). Further utility methods add a connection, return the neighbors of a vertex, return all the vertices, and so on.


In [3]:
class Graph:  # a graph is a set of vertices joined by weighted, directed edges
    def __init__(self):
        self.vertex = {}    # vertices stored in the graph, keyed by vertex id
        self.graph = {}     # adjacency lists: vertex id -> list of (destination vertex, weight)
    
    def __repr__(self):
        return str(self.graph)



    def add_vertex(self, name, key=inf):
        vertx = Vertex(name, key)
        self.vertex[name]= vertx
        self.graph[name]=[]
    
            
    def add_connection(self, src, dest, weight):
        if src not in self.graph.keys():
            self.add_vertex(src)
        if dest not in self.graph.keys():
            self.add_vertex(dest)
        self.graph[src].append((self.vertex[dest],weight)) #tuple of (vertex, weight)
    
    def get_neighbors(self, node):
        if isinstance(node, Vertex):
            return self.graph[node.id]
        else:
            return self.graph[node]

    def get_vertices(self):
        return self.vertex.values()

    def get_vertex(self,node):
        if isinstance(node, Vertex):
            return self.vertex[node.id]
        else:
            return self.vertex[node]
            
    def set_key(self, node:Vertex, num:int):
        return node.set_key(num)
    

Below we show an example of the initialization of a graph. There is no need to call *add_vertex()* explicitly: defining the edges directly creates any missing vertices. We build a graph which represents the figure below:


![Title](./e1_1.png)




In [4]:
g = Graph()

In [5]:
g.add_connection('a', 'b', 1)
g.add_connection('a', 'c', 4)
g.add_connection('b', 'c', 2)
g.add_connection('b', 'd', 6)
g.add_connection('c', 'd', 3)



In [6]:
g

{'a': [(b, 1), (c, 4)], 'b': [(c, 2), (d, 6)], 'c': [(d, 3)], 'd': []}

As we can see, the graph is correctly represented. For example, vertex **a** is connected to vertices **b** and **c** with weights 1 and 4, respectively.

Now that we can correctly represent graphs, we implement Dijkstra's algorithm.

In [7]:
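# an auxiliary function to get the shortest path.
# paths maps each vertex id to its predecessor Vertex; the result maps each
# vertex id to the list of predecessor ids, walking back towards src.
def path_getter(paths, src: Vertex):
    newpaths = defaultdict(list)
    for i in paths.keys():
        tmp = paths[i]
        newpaths[i].append(tmp.id)
        # An unreachable vertex is its own predecessor: stop there instead of looping forever.
        while tmp is not src and paths[tmp.id] is not tmp:
            tmp = paths[tmp.id]
            newpaths[i].append(tmp.id)
    return newpaths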

In [8]:
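def dijkstra(g, src):
    dist = {}                    # final shortest distance for each vertex id
    paths = defaultdict(list)    # predecessor of each vertex id on its shortest path
    g.get_vertex(src).set_key(0) # the source is at distance 0 from itself
    g.get_vertex(src).set_prev(g.get_vertex(src))  # the source is its own predecessor

    BH = binheap(list(g.get_vertices()))

    for num, node in enumerate(BH._A): #identify node with heap index
        node.heap_idx = num

    while not BH.is_empty():
        node = BH.remove_minimum()       # extract the unvisited vertex with the smallest key
        node.set_visited()

        dist[node.id] = node.get_key()   # its key is now final
        paths[node.id] = node.get_prev()
        for neighbor, weight in g.get_neighbors(node.id):
            if neighbor.visited:         # already finalized, skip
                continue
            if dist[node.id] + weight < neighbor.get_key():  # relaxation step
                neighbor.set_prev(g.get_vertex(node.id))
                BH.decrease_key(neighbor.heap_idx, dist[node.id] + weight)

    return dist, paths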

We now run the algorithm on the graph shown above. Taking **a** as the source, we expect the following shortest distances and paths:

* **a**: 0 (source node)
* **b**: 1, a->b
* **c**: 3, a->b->c
* **d**: 6, a->b->c->d
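
These expected values can be cross-checked independently. The sketch below is not the binheap-based implementation of this homework: it uses Python's standard `heapq` module with lazy deletion (stale entries are skipped instead of calling a decrease-key operation) on a plain dictionary adjacency list. The names `dijkstra_heapq` and `adj` are assumptions made for illustration.

```python
import heapq
from math import inf


def dijkstra_heapq(adj, src):
    """Cross-check only: lazy-deletion Dijkstra on a dict of (neighbor, weight) lists."""
    dist = {v: inf for v in adj}
    dist[src] = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry: a shorter distance to u was already found
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist


# Same edges as the example graph above.
adj = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}
print(dijkstra_heapq(adj, "a"))  # {'a': 0, 'b': 1, 'c': 3, 'd': 6}
```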

In [9]:
dist, paths = dijkstra(g,'a')
newp = path_getter(paths,g.get_vertex('a'))  # get paths out of dict of previous nodes for each node.
print(dist)
print('\n')
print(newp)

{'a': 0, 'b': 1, 'c': 3, 'd': 6}


defaultdict(<class 'list'>, {'a': ['a'], 'b': ['a'], 'c': ['b', 'a'], 'd': ['c', 'b', 'a']})


Note that each list in the paths dictionary holds the predecessors of the key vertex and must be read from right to left. For key **d**, the list ['c', 'b', 'a'] reads as "a -> b -> c -> d", which is as expected. Both the shortest distances and the paths are correct.
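
If a source-to-destination ordering is preferred, the predecessor lists can be reversed and the destination appended. This is a minimal sketch working on the values printed above; the helper `forward_path` is hypothetical and not part of the homework code.

```python
# Predecessor lists copied from the output printed above.
newp = {"a": ["a"], "b": ["a"], "c": ["b", "a"], "d": ["c", "b", "a"]}


def forward_path(node, chain, src):
    """Return the path from src to node, given node's predecessor list."""
    if node == src:
        return [src]
    return list(reversed(chain)) + [node]


forward = {node: forward_path(node, chain, "a") for node, chain in newp.items()}
print(" -> ".join(forward["d"]))  # a -> b -> c -> d
```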

We now try to run the algorithm on a slightly more complicated example.

![Title](./es22.png)


Setting the source to **a**, we expect the following:

* **a**: 0, (source)
* **b**: 4,  a -> b
* **c**: 9,  a -> c
* **d**: 19, a -> b -> d
* **e**: 25, a -> b -> d -> e
* **f**: 11, a -> c -> f
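
The entries where more than one route competes can be checked by hand from the edge weights defined in the cell below. Writing $d(v)$ for the shortest distance from **a** to $v$ (a notation introduced here only for this check):

$$
\begin{aligned}
d(c) &= \min(9,\; 4 + 10) = 9 \\
d(f) &= \min(14,\; 9 + 2) = 11 \\
d(d) &= \min(4 + 15,\; 9 + 11) = \min(19,\; 20) = 19 \\
d(e) &= d(d) + 6 = 25
\end{aligned}
$$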

In [10]:
d = Graph()

In [11]:
d.add_connection('a', 'b', 4)  
d.add_connection('a', 'c', 9)
d.add_connection('a', 'f', 14)
d.add_connection('b', 'c', 10)
d.add_connection('b', 'd', 15)
d.add_connection('c', 'd', 11)
d.add_connection('c', 'f', 2)
d.add_connection('d', 'e', 6)
d.add_connection('e', 'f', 9)

In [12]:
dist, paths = dijkstra(d,'a')
newp = path_getter(paths,d.get_vertex('a')) 
print(dist)
print('\n')
print(newp)

{'a': 0, 'b': 4, 'c': 9, 'f': 11, 'd': 19, 'e': 25}


defaultdict(<class 'list'>, {'a': ['a'], 'b': ['a'], 'c': ['a'], 'f': ['c', 'a'], 'd': ['b', 'a'], 'e': ['d', 'b', 'a']})


As we can see, the results are correct. 

In the algorithm itself, we first build the heap, which has $\Theta(|V|)$ complexity. Then, at each iteration of the while loop, we extract a node in $O(\log|V|)$ time and scan its adjacency list, updating each neighbor's key with a decrease-key operation that costs $O(\log|V|)$. Over the whole run there are $|V|$ extractions and at most $|E|$ key updates, so the cost is $|V|\,O(\log|V|) + |E|\,O(\log|V|)$. The remaining assignment operations are $\Theta(1)$ each. Adding all the contributions gives a complexity of $O((|V|+|E|)\log|V|)$, as expected.
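
The sum of the contributions can be written out explicitly. Here $T$ is just a label, introduced for this summary, for the total running time:

$$
T(|V|,|E|) = \underbrace{\Theta(|V|)}_{\text{build heap}} + \underbrace{|V|\cdot O(\log|V|)}_{\text{extractions}} + \underbrace{|E|\cdot O(\log|V|)}_{\text{key updates}} = O\big((|V|+|E|)\log|V|\big)
$$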