# [CptS 215 Data Analytics Systems and Algorithms](https://github.com/gsprint23/cpts215)
[Washington State University](https://wsu.edu)

[Gina Sprint](http://eecs.wsu.edu/~gsprint/)
# Dijkstra's Algorithm

Learner objectives for this lesson:
* Understand the problem of finding the shortest path between two vertices
* Implement and analyze Dijkstra's algorithm


## Acknowledgments
Content used in this lesson is based upon information in the following sources:
* [Miller and Ranum](http://interactivepython.org/runestone/static/pythonds/index.html)

## Shortest Path Problem
For many applications, it is important to find the shortest path between two vertices in a graph. Examples of such applications include:
* Planning a travel route from an origin to a destination
* Transferring data from a server to a client over the internet

The shortest path in an unweighted graph might be defined as the path with the fewest edges (number of hops). We solved this problem using a breadth-first search. For a weighted graph, the shortest path might be defined as the path with the lowest total edge weight.

Note: We can use Dijkstra's algorithm to find the shortest path for unweighted graphs by setting the edge weight of each edge to be uniform (e.g. all ones).
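
As a reminder of the unweighted case, here is a minimal sketch of a breadth-first search that counts hops from an origin. The function name `bfs_hops` and the adjacency dictionary `adj` are assumptions made for this illustration; they are not part of the lesson's `Graph` class.

```python
from collections import deque

def bfs_hops(adj, origin):
    """Return the fewest number of edges from origin to each reachable vertex."""
    hops = {origin: 0}
    frontier = deque([origin])
    while frontier:
        curr = frontier.popleft()
        for nbr in adj[curr]:
            if nbr not in hops:  # the first visit uses the fewest hops
                hops[nbr] = hops[curr] + 1
                frontier.append(nbr)
    return hops

# hypothetical unweighted, directed graph as an adjacency dictionary
adj = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_hops(adj, "A"))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```

Running Dijkstra's algorithm on the same graph with every edge weight set to 1 would produce the same numbers as path distances.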

## Dijkstra's Algorithm
Edsger Dijkstra solved the shortest path problem with his algorithm that determines the shortest path from an origin vertex to each vertex in a graph. To do this, the algorithm keeps track of each vertex's shortest path *distance* from the origin vertex, each vertex's previous vertex (predecessor) along the shortest path, and the unvisited vertices in a priority queue. For the priority queue, priority is given to the vertices with the smallest path distance from the origin vertex. Let's take a look at the algorithm in pseudocode:

1. Initialize all vertices'
    * Distances to infinity (a number larger than any realistic distance)
    * Predecessors to 0 (meaning "no predecessor")
1. Set the origin vertex's distance to 0
1. Enqueue all vertices to the priority queue `unvisitedQ` 
1. While `unvisitedQ` is not empty:
    1. `currV` = dequeue `unvisitedQ`
    1. For each adjacent vertex `adjV` of `currV`
        1. `newDistance` = path distance from the origin to `adjV` through `currV` (`currV`'s path distance plus the weight of the edge from `currV` to `adjV`)
        1. If `newDistance` is smaller than `adjV` path distance
            1. Update `adjV` path distance
            1. Set `adjV` predecessor to `currV`
            1. Update `unvisitedQ` with `adjV`'s new priority (`newDistance`)
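
The pseudocode above can be sketched compactly with Python's standard `heapq` module. This is one possible rendering, not the implementation developed later in this lesson: `heapq` has no operation for updating a priority, so the sketch pushes a new `(distance, vertex)` entry whenever a distance improves and skips stale entries when they are dequeued. For the same reason it enqueues only the origin at the start instead of every vertex. The function name `dijkstra` and the dictionary-of-dictionaries graph `adj` are assumptions for illustration.

```python
import heapq

def dijkstra(adj, origin):
    """adj maps vertex -> {neighbor: weight}; weights must be non-negative."""
    dist = {v: float("inf") for v in adj}
    pred = {v: None for v in adj}
    dist[origin] = 0
    unvisited = [(0, origin)]  # min heap of (distance, vertex) entries
    while unvisited:
        d, curr = heapq.heappop(unvisited)
        if d > dist[curr]:
            continue  # stale entry: a shorter path to curr was already found
        for nbr, weight in adj[curr].items():
            new_dist = d + weight
            if new_dist < dist[nbr]:
                dist[nbr] = new_dist
                pred[nbr] = curr
                heapq.heappush(unvisited, (new_dist, nbr))
    return dist, pred

# small hypothetical directed graph
adj = {
    "A": {"B": 3, "C": 7},
    "B": {"C": 5, "D": 1},
    "C": {"D": 9},
    "D": {"C": 2},
}
dist, pred = dijkstra(adj, "A")
print(dist)  # {'A': 0, 'B': 3, 'C': 6, 'D': 4}
print(pred)  # {'A': None, 'B': 'A', 'C': 'D', 'D': 'B'}
```

Pushing duplicates trades a slightly larger heap for much simpler code; the heap never holds more than one entry per edge plus the origin.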
            
### Example
For the following graph from Miller and Ranum:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/dijkstras_example.png" width="400">
Let's walk through Dijkstra's algorithm step by step with u as the starting vertex.

1. Initially
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: 0
        * w's predecessor: 0
        * x's predecessor: 0
        * y's predecessor: 0
        * z's predecessor: 0
    * `unvisitedQ`: front(u, 0), (v, inf), (w, inf), (x, inf), (y, inf), (z, inf)back
1. Dequeue (u, 0)
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: u
        * w's predecessor: u
        * x's predecessor: u
        * y's predecessor: 0
        * z's predecessor: 0
    * `unvisitedQ`: front(x, 1), (v, 2), (w, 5), (y, inf), (z, inf)back
<img src="http://interactivepython.org/runestone/static/pythonds/_images/dijkstraa.png" width="300">
1. Dequeue (x, 1)
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: u
        * w's predecessor: x
        * x's predecessor: u
        * y's predecessor: x
        * z's predecessor: 0
    * `unvisitedQ`: front(v, 2), (y, 2), (w, 4), (z, inf)back
<img src="http://interactivepython.org/runestone/static/pythonds/_images/dijkstrab.png" width="300">
1. Dequeue (v, 2)
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: u
        * w's predecessor: x
        * x's predecessor: u
        * y's predecessor: x
        * z's predecessor: 0
    * `unvisitedQ`: front(y, 2), (w, 4), (z, inf)back
<img src="http://interactivepython.org/runestone/static/pythonds/_images/dijkstrac.png" width="300">    
1. Dequeue (y, 2)
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: u
        * w's predecessor: y
        * x's predecessor: u
        * y's predecessor: x
        * z's predecessor: y
    * `unvisitedQ`: front(w, 3), (z, 3)back
<img src="http://interactivepython.org/runestone/static/pythonds/_images/dijkstrad.png" width="300">   
1. Dequeue (w, 3)
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: u
        * w's predecessor: y
        * x's predecessor: u
        * y's predecessor: x
        * z's predecessor: y
    * `unvisitedQ`: front(z, 3)back
<img src="http://interactivepython.org/runestone/static/pythonds/_images/dijkstrae.png" width="300">   
1. Dequeue (z, 3)
    * Predecessors:
        * u's predecessor: 0
        * v's predecessor: u
        * w's predecessor: y
        * x's predecessor: u
        * y's predecessor: x
        * z's predecessor: y
    * `unvisitedQ`: Empty
<img src="http://interactivepython.org/runestone/static/pythonds/_images/dijkstraf.png" width="300">   
(images from [http://interactivepython.org/runestone/static/pythonds/Graphs/DijkstrasAlgorithm.html](http://interactivepython.org/runestone/static/pythonds/Graphs/DijkstrasAlgorithm.html))
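
Once the queue is empty, the predecessor table is enough to recover every shortest path: start at the target and follow predecessors back to the origin. The sketch below hard-codes the final predecessors from the walkthrough, using `None` in place of 0. The helper name `path_to` is an assumption for illustration.

```python
def path_to(pred, target):
    """Follow predecessor links back from target to the origin."""
    path = []
    curr = target
    while curr is not None:
        path.append(curr)
        curr = pred[curr]
    path.reverse()  # links were collected target-first
    return path

# final predecessors from the walkthrough (None plays the role of 0)
pred = {"u": None, "v": "u", "w": "y", "x": "u", "y": "x", "z": "y"}
for target in "vwxyz":
    print(target, "->".join(path_to(pred, target)))
# v u->v
# w u->x->y->w
# x u->x
# y u->x->y
# z u->x->y->z
```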

## Implementation
In order to implement Dijkstra's algorithm, we need a priority queue implementation, which we have from our min heap code! We will however need to adjust our `BinaryHeap` code in a few ways:
* Store vertex (key) and path distance (value) pairs
* Add a `decrease_key(item_tuple)` method to decrease the path distance value associated with a key, thus moving the key closer to the front of the queue (it now has higher priority). 

We also need to augment our `Vertex` class to store the path distance and a link to the path predecessor.

### `BinaryHeap`

In [6]:
class BinaryHeap:
    '''
    min heap of (key, value) tuples ordered by value (here, the path distance)
    '''
    def __init__(self):
        '''
        heap_list[0] = 0 is a dummy value (not used)
        '''
        self.heap_list = [0]
        self.size = 0
        
    def __str__(self):
        '''
        return the underlying list (including the dummy value at index 0) as a string
        '''
        return str(self.heap_list)
    
    def __len__(self):
        '''
        return the number of items in the heap
        '''
        return self.size
    
    def __contains__(self, item):
        '''
        in operator: True if the exact (key, value) tuple is in the heap
        '''
        return item in self.heap_list
    
    def is_empty(self):
        '''
        compare the size attribute to 0
        '''
        return self.size == 0
    
    def find_min(self):
        '''
        the smallest item is at the root node (index 1)
        '''
        if self.size > 0:
            min_val = self.heap_list[1]
            return min_val
        return None
        
    def insert(self, item_tuple):
        '''
        append the item to the end of the list (maintains complete tree property)
        violates the heap order property
        call percolate up to move the new item up to restore the heap order property
        '''
        self.heap_list.append(item_tuple)
        self.size += 1
        self.percolate_up(self.size)
        
    def del_min(self):
        '''
        min item in the tree is at the root
        replace the root with the last item in the list (maintains complete tree property)
        violates the heap order property
        call percolate down to move the new root down to restore the heap property
        '''
        min_val = self.heap_list[1]
        self.heap_list[1] = self.heap_list[self.size]
        self.size = self.size - 1
        self.heap_list.pop()
        self.percolate_down(1)
        return min_val

    def min_child(self, index):
        '''
        return the index of the smallest child
        if there is no right child, return the left child
        if there are two children, return the smallest of the two
        '''
        if index * 2 + 1 > self.size:
            return index * 2
        else:
            if self.heap_list[index * 2][1] < self.heap_list[index * 2 + 1][1]:
                return index * 2
            else:
                return index * 2 + 1
            
    def build_heap(self, alist):
        '''
        build a heap from a list of (key, value) tuples to establish complete tree property
        starting with the first non leaf node 
        percolate each node down to establish heap order property
        '''
        index = len(alist) // 2
        self.size = len(alist)
        self.heap_list = [0] + alist[:]
        while (index > 0):
            self.percolate_down(index)
            index -= 1
        
    def percolate_up(self, index):
        '''
        compare the item at index with its parent
        if the item is less than its parent, swap!
        continue comparing until we hit the top of tree
        (can stop once an item is swapped into a position where it is greater than its parent)
        '''
        while index // 2 > 0:
            if self.heap_list[index][1] < self.heap_list[index // 2][1]:
                temp = self.heap_list[index // 2]
                self.heap_list[index // 2] = self.heap_list[index]
                self.heap_list[index] = temp
            index //= 2
            
    def percolate_down(self, index):
        '''
        compare the item at index with its smallest child
        if the item is greater than its smallest child, swap!
        continue while there are children to compare with
        (can stop once an item is swapped into a position where it is less than both children)
        '''
        while (index * 2) <= self.size:
            mc = self.min_child(index)
            if self.heap_list[index][1] > self.heap_list[mc][1]:
                temp = self.heap_list[index]
                self.heap_list[index] = self.heap_list[mc]
                self.heap_list[mc] = temp
            index = mc
            
    def decrease_key(self, item_tuple):
        '''
        decrease the value (path distance) associated with a key
        first, find the index of the key with a linear search
        replace the tuple at that index with the updated tuple
        a smaller value can only violate the heap order property with respect to the parent
        call percolate up to move the updated tuple up to restore the heap order property
        if the key is not in the heap (e.g. it was already removed), do nothing
        '''
        key = item_tuple[0]
        index = -1
        for i in range(1, len(self.heap_list)):
            tup = self.heap_list[i]
            if tup[0] == key:
                index = i
                break
        if index == -1:
            return
        self.heap_list[index] = item_tuple
        self.percolate_up(index)

# code to test out the BinaryHeap modifications
h = BinaryHeap()
print(h)
h.insert(("A", 10))
print(h.find_min())
h.insert(("B", 13))
print(h)
h.insert(("C", 9))
print(h)
h.decrease_key(("B", 12))
print(h)
print(len(h))

[0]
('A', 10)
[0, ('A', 10), ('B', 13)]
[0, ('C', 9), ('B', 13), ('A', 10)]
[0, ('C', 9), ('B', 12), ('A', 10)]
3


### Updated `Vertex` and `Graph`

In [2]:
class Vertex:
    '''
    keep track of the vertices to which it is connected, and the weight of each edge
    '''
    def __init__(self, key, distance=0, predecessor=None):
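        '''
        key identifies the vertex
        distance is the path distance from the origin vertex
        predecessor is the previous vertex along the shortest path
        '''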
        self.ID = key
        self.distance = distance
        self.predecessor = predecessor
        self.connected_to = {}

    def add_neighbor(self, neighbor, weight=0):
        '''
        add a connection from this vertex to another
        '''
        self.connected_to[neighbor] = weight

    def __str__(self):
        '''
        returns the IDs of all of the vertices in the adjacency list, as represented by the connected_to instance variable
        '''
        return str(self.ID) + ' connected to: ' + str([x.ID for x in self.connected_to])

    def get_connections(self):
        '''
        returns the vertices adjacent to this vertex
        '''
        return self.connected_to.keys()

    def get_ID(self):
        '''
        returns this vertex's key
        '''
        return self.ID

    def get_weight(self, neighbor):
        '''
        returns the weight of the edge from this vertex to the vertex passed as a parameter
        '''
        return self.connected_to[neighbor]
    
    def get_distance(self):
        '''
        returns the current shortest path distance from the origin vertex
        '''
        return self.distance
    
    def get_predecessor(self):
        '''
        returns the previous vertex along the shortest path (None if there is none)
        '''
        return self.predecessor
    
    def set_distance(self, dist):
        '''
        sets the shortest path distance from the origin vertex
        '''
        self.distance = dist
        
    def set_predecessor(self, pred):
        '''
        sets the previous vertex along the shortest path
        '''
        self.predecessor = pred
    
class Graph:
    '''
    contains a dictionary that maps vertex names to vertex objects. 
    '''
    def __init__(self):
        '''
        start with an empty graph
        '''
        self.vert_list = {}
        self.num_vertices = 0
        
    def __str__(self):
        '''
        returns one line per edge, formatted as (from, to: weight)
        '''
        edges = ""
        for vert in self.vert_list.values():
            for vert2 in vert.get_connections():
                edges += "(%s, %s: %d)\n" %(vert.get_ID(), vert2.get_ID(), vert.get_weight(vert2))
        return edges

    def add_vertex(self, key, distance=0, predecessor=None):
        '''
        adding vertices to a graph 
        '''
        self.num_vertices = self.num_vertices + 1
        new_vertex = Vertex(key, distance, predecessor)
        self.vert_list[key] = new_vertex
        return new_vertex

    def get_vertex(self, n):
        '''
        returns the vertex object with key n, or None if it is not in the graph
        '''
        if n in self.vert_list:
            return self.vert_list[n]
        else:
            return None

    def __contains__(self, n):
        '''
        in operator
        '''
        return n in self.vert_list

    def add_edge(self, f, t, cost=0):
        '''
        connecting one vertex to another
        '''
        if f not in self.vert_list:
            nv = self.add_vertex(f)
        if t not in self.vert_list:
            nv = self.add_vertex(t)
        self.vert_list[f].add_neighbor(self.vert_list[t], cost)

    def get_vertices(self):
        '''
        returns the names of all of the vertices in the graph
        '''
        return self.vert_list.keys()

    def __iter__(self):
        '''
        supports iterating over the vertex objects with a for loop
        '''
        return iter(self.vert_list.values())

### Dijkstra's Algorithm Implementation

In [3]:
import sys # for the "maxsize" of an int for representing infinity

def dijkstras_algorithm(aGraph, start):
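    '''
    computes the shortest path distance and predecessor of every vertex in aGraph, starting from the vertex start
    assumes every vertex's distance was initialized to "infinity" (e.g. sys.maxsize)
    assumes all edge weights are non-negative
    '''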
    pq = BinaryHeap()
    start.set_distance(0)
    pq.build_heap([(v, v.get_distance()) for v in aGraph])
    while not pq.is_empty():
        curr_tuple = pq.del_min()
        currV = curr_tuple[0]
        for adjV in currV.get_connections():
            new_dist = currV.get_distance() + currV.get_weight(adjV)
            if new_dist < adjV.get_distance():
                adjV.set_distance(new_dist)
                adjV.set_predecessor(currV)
                pq.decrease_key((adjV, new_dist))

def display_dijkstra_results(g, origin_vertex):
    '''
    display the shortest paths and their distance
    '''
    for v in g:
        print("distance from %s to %s: %d" %(origin_vertex.ID, v.ID, v.distance))
        path = []
        currV = v
        # a predecessor of None means there is no path from the origin to this vertex
        while currV != origin_vertex and currV.get_predecessor() is not None:
            path.insert(0, currV)
            currV = currV.get_predecessor()
        print("\t", origin_vertex.ID, end="")
        for vert in path:
            print("->%s" %(str(vert.ID)), end="")
        print()

### Example 1
Build the zyBooks Dijkstra's graph example:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/dijkstras_example_zy.png" width="250">

In [4]:
# zybooks Dijkstra's example
g = Graph()
g.add_vertex("A", sys.maxsize)
g.add_vertex("B", sys.maxsize)
g.add_vertex("C", sys.maxsize)
g.add_vertex("D", sys.maxsize)

g.add_edge("A", "B", 3)
g.add_edge("A", "C", 7)
g.add_edge("B", "C", 5)
g.add_edge("B", "D", 1)
g.add_edge("C", "D", 9)
g.add_edge("D", "C", 2)
print(g)

origin_vertex = g.get_vertex("A")
dijkstras_algorithm(g, origin_vertex)
display_dijkstra_results(g, origin_vertex)

(B, C: 5)
(B, D: 1)
(D, C: 2)
(C, D: 9)
(A, C: 7)
(A, B: 3)

distance from A to B: 3
	 A->B
distance from A to D: 4
	 A->B->D
distance from A to C: 6
	 A->B->D->C
distance from A to A: 0
	 A


### Example 2
Build the Miller and Ranum Dijkstra's graph example:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/dijkstras_example.png" width="400">

In [5]:
# build the Miller and Ranum Dijkstra's example
g = Graph()
g.add_vertex("u", sys.maxsize)
g.add_vertex("v", sys.maxsize)
g.add_vertex("x", sys.maxsize)
g.add_vertex("w", sys.maxsize)
g.add_vertex("y", sys.maxsize)
g.add_vertex("z", sys.maxsize)

g.add_edge("u", "v", 2)
g.add_edge("v", "u", 2)

g.add_edge("u", "x", 1)
g.add_edge("x", "u", 1)

g.add_edge("u", "w", 5)
g.add_edge("w", "u", 5)

g.add_edge("v", "x", 2)
g.add_edge("x", "v", 2)

g.add_edge("x", "w", 3)
g.add_edge("w", "x", 3)

g.add_edge("x", "y", 1)
g.add_edge("y", "x", 1)

g.add_edge("w", "y", 1)
g.add_edge("y", "w", 1)

g.add_edge("w", "z", 5)
g.add_edge("z", "w", 5)

g.add_edge("z", "y", 1)
g.add_edge("y", "z", 1)

print(g)

origin_vertex = g.get_vertex("u")
dijkstras_algorithm(g, origin_vertex)
display_dijkstra_results(g, origin_vertex)

(x, v: 2)
(x, u: 1)
(x, w: 3)
(x, y: 1)
(v, x: 2)
(v, u: 2)
(z, w: 5)
(z, y: 1)
(u, x: 1)
(u, v: 2)
(u, w: 5)
(w, x: 3)
(w, u: 5)
(w, y: 1)
(w, z: 5)
(y, x: 1)
(y, w: 1)
(y, z: 1)

distance from u to x: 1
	 u->x
distance from u to v: 2
	 u->v
distance from u to z: 3
	 u->x->y->z
distance from u to u: 0
	 u
distance from u to w: 3
	 u->x->y->w
distance from u to y: 2
	 u->x->y


## Algorithm Analysis
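Dijkstra's single-source shortest path algorithm has a runtime that depends on the data structure used for the priority queue. An implementation using a binary heap has a runtime of $\mathcal{O}(V \log V + E \log V) = \mathcal{O}((V + E) \log V)$, where $V$ is the number of vertices in the graph and $E$ is the number of edges: each of the $V$ dequeues and each of the at most $E$ priority updates costs $\mathcal{O}(\log V)$. This bound assumes the heap can locate a key in constant time; the `decrease_key` method above searches the heap list linearly, so each update costs $\mathcal{O}(V)$ instead. If the priority queue is implemented using an unsorted list, the runtime is $\mathcal{O}(V^{2} + E)$.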
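
The $\mathcal{O}(\log V)$ cost per priority update depends on finding a key in the heap without scanning it. One common way to get there, sketched below under stated assumptions, is to keep a dictionary from each key to its current index and update it on every swap. The class `IndexedMinHeap`, its `pos` dictionary, and its method names are hypothetical and separate from the lesson's `BinaryHeap`; the sketch is also 0-indexed, with no dummy value.

```python
class IndexedMinHeap:
    """Min heap of [priority, key] entries plus a key -> index map."""
    def __init__(self):
        self.heap = []  # list of [priority, key]
        self.pos = {}   # key -> index of that key's entry in self.heap

    def __len__(self):
        return len(self.heap)

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i][0] >= self.heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] < self.heap[smallest][0]:
                    smallest = child
            if smallest == i:
                break
            self._swap(i, smallest)
            i = smallest

    def insert(self, key, priority):
        self.heap.append([priority, key])
        self.pos[key] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def pop_min(self):
        self._swap(0, len(self.heap) - 1)  # move the min to the end
        priority, key = self.heap.pop()
        del self.pos[key]
        if self.heap:
            self._sift_down(0)
        return key, priority

    def decrease_key(self, key, priority):
        i = self.pos[key]           # constant-time lookup, no linear scan
        self.heap[i][0] = priority
        self._sift_up(i)            # at most one swap per tree level

h = IndexedMinHeap()
for key, priority in [("A", 10), ("B", 13), ("C", 9)]:
    h.insert(key, priority)
h.decrease_key("B", 5)
order = [h.pop_min() for _ in range(len(h))]
print(order)  # [('B', 5), ('C', 9), ('A', 10)]
```

The extra dictionary costs $\mathcal{O}(V)$ space, and every swap does two more dictionary writes, which does not change the asymptotic cost of any heap operation.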

Final note: Dijkstra's algorithm may not find the shortest path for some vertices if the graph has negative edge weights, so the algorithm should not be used if a negative edge weight exists.
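
A three-vertex example makes the failure concrete. The sketch below is a self-contained variant written for this illustration (the function name `dijkstra_finalizing` and the graph are assumptions); like the algorithm described in this lesson, it treats a vertex's distance as final once the vertex has been dequeued.

```python
import heapq

def dijkstra_finalizing(adj, origin):
    """Dijkstra's algorithm that never revisits a dequeued vertex."""
    dist = {v: float("inf") for v in adj}
    dist[origin] = 0
    done = set()
    queue = [(0, origin)]
    while queue:
        d, curr = heapq.heappop(queue)
        if curr in done:
            continue
        done.add(curr)  # curr's distance is now treated as final
        for nbr, weight in adj[curr].items():
            if nbr not in done and d + weight < dist[nbr]:
                dist[nbr] = d + weight
                heapq.heappush(queue, (dist[nbr], nbr))
    return dist

# hypothetical graph with one negative edge: C -> B has weight -4
adj = {"A": {"B": 2, "C": 5}, "B": {}, "C": {"B": -4}}
print(dijkstra_finalizing(adj, "A")["B"])  # 2, but A->C->B costs 5 + (-4) = 1
```

B is dequeued with distance 2 before C is processed, so the cheaper route through C is never considered.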

## Practice Problems

### 1
<img src="https://upload.wikimedia.org/wikipedia/commons/5/5f/CPT-Graphs-undirected-weighted.svg" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/5/5f/CPT-Graphs-undirected-weighted.svg](https://upload.wikimedia.org/wikipedia/commons/5/5f/CPT-Graphs-undirected-weighted.svg)) 

Trace the execution of Dijkstra's algorithm for the above graph to find the shortest path from Dunwich to all other cities. 

### 2
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png" width="300">
(image from [https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png](https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/CPT-Graphs-directed-weighted.svg/200px-CPT-Graphs-directed-weighted.svg.png)) 

Trace the execution of Dijkstra's algorithm for the above graph to find the shortest path from Dunwich to all other cities. 