The file contains an adjacency list representation of an undirected weighted graph with 200 vertices labeled 1 to 200. Each row consists of the node tuples that are adjacent to that particular vertex along with the length of that edge. For example, the 6th row has 6 as the first entry indicating that this row corresponds to the vertex labeled 6. The next entry of this row "141,8200" indicates that there is an edge between vertex 6 and vertex 141 that has length 8200. The rest of the pairs of this row indicate the other vertices adjacent to vertex 6 and the lengths of the corresponding edges.

Your task is to run Dijkstra's shortest-path algorithm on this graph, using 1 (the first vertex) as the source vertex, and to compute the shortest-path distances between 1 and every other vertex of the graph. If there is no path between a vertex v and vertex 1, we'll define the shortest-path distance between 1 and v to be 1000000.

You should report the shortest-path distances to the following ten vertices, in order: 7,37,59,82,99,115,133,165,188,197. You should encode the distances as a comma-separated string of integers. So if you find that all ten of these vertices except 115 are at distance 1000 away from vertex 1 and 115 is 2000 distance away, then your answer should be 1000,1000,1000,1000,1000,2000,1000,1000,1000,1000. Remember the order of reporting DOES MATTER, and the string should be in the same order in which the above ten vertices are given. The string should not contain any spaces. Please type your answer in the space provided.
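As a small sketch of the encoding step described above, the helper below turns a distance map into the required string. The names `format_answer`, `TARGETS`, and `UNREACHABLE` are hypothetical and not part of the assignment; the sketch assumes the distance map is a dict that either omits unreachable vertices or stores `float('inf')` for them.

```python
TARGETS = [7, 37, 59, 82, 99, 115, 133, 165, 188, 197]
UNREACHABLE = 1000000  # distance reported when no path from vertex 1 exists

def format_answer(dist, targets=TARGETS, unreachable=UNREACHABLE):
    """Encode distances to `targets` as a comma-separated string without spaces."""
    values = []
    for v in targets:
        d = dist.get(v, float('inf'))  # a missing vertex counts as unreachable
        values.append(unreachable if d == float('inf') else int(d))
    return ",".join(str(x) for x in values)

# The example from the text: every vertex at distance 1000 except 115 at 2000.
example = {v: 1000 for v in TARGETS}
example[115] = 2000
print(format_answer(example))
```

Keeping the order of `TARGETS` fixed in one place makes it harder to report the distances in the wrong order.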

IMPLEMENTATION NOTES: This graph is small enough that the straightforward O(mn) time implementation of Dijkstra's algorithm should work fine. OPTIONAL: For those of you seeking an additional challenge, try implementing the heap-based version. Note this requires a heap that supports deletions, and you'll probably need to maintain some kind of mapping between vertices and their positions in the heap.

# Dijkstra's algorithm

```python
def Dijkstra(G, s):
  """ Find the shortest path from node s to every node in G.
  Return: shortest-path map for all nodes
  """
  
  X = set() # nodes already explored
  A = {}    # the final shortest-path map
  
  # initialization
  X.add(s)
  A[s] = 0
  n = len(G)
  
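  while len(X) != n:
    # among all edges (v, w) with v in X and w not in X,
    # choose (vs, ws) such that A[vs] + L(vs, ws) is the smallest;
    # G[v] is a list of (w, L(v, w)) pairs
    dist, ws = min((A[v] + d, w) for v in X for w, d in G[v] if w not in X)
    X.add(ws)
    A[ws] = dist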
  
  return A
```

Alternative while loop:
```python
def Dijkstra(G, s):
  A = {}      # the final shortest-path map
  B = {s: 0}  # tentative distances of the frontier nodes
  
  while B:
    v = min(B, key=B.get)  # frontier node with the smallest B[v]
    A[v] = B.pop(v)
    
    for w, d in G[v]:      # edges (v, w) of length d
      if w not in A:
        distance = A[v] + d
        if w not in B or distance < B[w]:
          B[w] = distance
  
  return A
```
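To sanity-check the frontier-map loop by hand, here is a minimal self-contained sketch run on a tiny graph. The function name `dijkstra_frontier` and the 4-node graph `G_small` are made up for illustration; the graph uses the same node-to-list-of-pairs layout as the rest of this notebook.

```python
def dijkstra_frontier(G, s):
    """Frontier-map variant: G maps node -> list of (neighbor, length) pairs."""
    A = {}       # final shortest-path distances
    B = {s: 0}   # tentative distances of frontier nodes
    while B:
        v = min(B, key=B.get)   # closest frontier node
        A[v] = B.pop(v)
        for w, d in G[v]:
            if w not in A and (w not in B or A[v] + d < B[w]):
                B[w] = A[v] + d
    return A

# Hypothetical 4-node undirected graph (each edge listed in both directions).
G_small = {
    1: [(2, 1), (3, 4)],
    2: [(1, 1), (3, 2), (4, 6)],
    3: [(1, 4), (2, 2), (4, 3)],
    4: [(2, 6), (3, 3)],
}
print(dijkstra_frontier(G_small, 1))
```

Here vertex 3 is reached more cheaply through vertex 2 (1 + 2 = 3) than over the direct edge of length 4, and vertex 4 is reached through vertex 3 (3 + 3 = 6) rather than through vertex 2 (1 + 6 = 7).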

In [1]:
DEBUG = True

# Read the graph from the file with name filename
def read_graph(filename):
    G = {}
    with open(filename, 'r') as f:  # close the file when done
        for line in f:
            ls = line.split()
            if not ls:              # skip blank lines
                continue
            key = int(ls[0])
            G[key] = []
            for pair in ls[1:]:
                node, distance = pair.split(',')
                G[key].append((int(node), int(distance)))
    
    if DEBUG:
        print(G)
    return G

def Dijkstra(G, s):
    """ Find the shortest path from node s to every node in G.
    Return: shortest-path map for all nodes
    """
    
    X = set() # nodes already explored
    A = {}    # the final shortest-path map
    
    X.add(s)
    A[s] = 0
    n = len(G)
    
    while len(X) != n:
        ws = None
        dist_min = float('Infinity')
        for v in X:
            for w, d in G[v]:
                if w not in X:
                    dist = A[v] + d
                    if dist < dist_min:
                        ws = w
                        dist_min = dist
        if ws is None:  # no edge leaves X: the remaining nodes are unreachable
            break
        X.add(ws)
        A[ws] = dist_min

    return A

In [2]:
# timer grabbed from 
# https://stackoverflow.com/questions/7370801/measure-time-elapsed-in-python
from timeit import default_timer as timer
class benchmark(object):
    def __init__(self, msg, fmt="%0.3g"):
        self.msg = msg
        self.fmt = fmt

    def __enter__(self):
        self.start = timer()
        return self

    def __exit__(self, *args):
        t = timer() - self.start
        print(("%s : " + self.fmt + " seconds") % (self.msg, t))
        self.time = t

In [3]:
DEBUG = False
G = read_graph("dijkstraData.txt")

with benchmark("Naive implementation O(m * n)") as r:
    A = Dijkstra(G, 1)

targets = [7,37,59,82,99,115,133,165,188,197]
print(",".join(str(A[v]) for v in targets))

Naive implementation O(m * n) : 0.114 seconds


# Notes on Heap

1. Here we only consider the binary heap, a complete binary tree that satisfies the heap-ordering property.

2. The heap property (p is the parent of node n):
   a. min-heap: key[n] >= key[p] for every non-root node n, so the root (node 0) holds the minimum key.
   b. max-heap: key[n] <= key[p] for every non-root node n, so the root (node 0) holds the maximum key.

3. There are two ways to express a heap: tree representation and array representation
   For example,
                         4         level 0
                        / \
                       /   \
                      4     8      level 1
                     / \   / \
                    6   9 10 12    level 2
                   /
                  10               level 3
   The above tree is stored in an array:
   ```python
   heap = [4, 4, 8, 6, 9, 10, 12, 10]
   ```
   It is easy to check that the parent of ```heap[i]``` is ```heap[(i-1)//2]``` (for ```i > 0```, using integer division) and the children of ```heap[i]``` are ```heap[2*i+1]``` and ```heap[2*i+2]```, where ```i``` is the array index.

4. Implementation of Insert (given a key k): <br>
   a. Append k at end of last level.
   ```python
   heap.append(k)
   ```
   b. Bubble up k until heap property is restored.
   ```python
   child = len(heap) - 1
   parent = (child - 1) // 2
   while child > 0 and heap[child] < heap[parent]:  # stop at the root
       heap[child], heap[parent] = heap[parent], heap[child]
       child = parent
       parent = (child - 1) // 2
   ```

5. Implementation of Extract-Min: <br>
   a. Delete root and move the last leaf to root.
   ```python
   heap[0] = heap[-1]
   heap.pop()
   ```
   b. Iteratively bubble down until heap property has been restored.
   ```python
   parent = 0
   while True:
       child = 2 * parent + 1                  # left child
       if child >= len(heap):
           break                               # parent is a leaf
       if child + 1 < len(heap) and heap[child + 1] < heap[child]:
           child += 1                          # right child is smaller
       if heap[parent] <= heap[child]:
           break                               # heap property restored
       heap[child], heap[parent] = heap[parent], heap[child]
       parent = child
   ```
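The insert and extract-min steps above can be put together as a small runnable sketch on a plain list of keys. The function names `is_min_heap`, `heap_insert`, and `heap_extract_min` are hypothetical helpers for illustration; they work on bare keys, unlike a heap that also tracks node positions.

```python
def is_min_heap(heap):
    """True if every non-root element is >= its parent."""
    return all(heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap)))

def heap_insert(heap, k):
    """Append k, then bubble it up while it is smaller than its parent."""
    heap.append(k)
    child = len(heap) - 1
    while child > 0:
        parent = (child - 1) // 2
        if heap[parent] <= heap[child]:
            break
        heap[child], heap[parent] = heap[parent], heap[child]
        child = parent

def heap_extract_min(heap):
    """Remove and return the root, then bubble the moved leaf down."""
    smallest = heap[0]
    last = heap.pop()
    if heap:                      # the popped leaf replaces the root
        heap[0] = last
        parent = 0
        while True:
            child = 2 * parent + 1
            if child >= len(heap):
                break             # parent is a leaf
            if child + 1 < len(heap) and heap[child + 1] < heap[child]:
                child += 1        # pick the smaller of the two children
            if heap[parent] <= heap[child]:
                break             # heap property restored
            heap[child], heap[parent] = heap[parent], heap[child]
            parent = child
    return smallest

heap = [4, 4, 8, 6, 9, 10, 12, 10]   # the example array from item 3
assert is_min_heap(heap)
heap_insert(heap, 5)
assert is_min_heap(heap)
drained = [heap_extract_min(heap) for _ in range(len(heap))]
print(drained)
```

Repeatedly extracting the minimum drains the keys in sorted order, which is a convenient end-to-end check of both operations.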

In [4]:
import collections
class binary_heap:
    """ Binary heap for Dijkstra's algorithm
    array -- a list of node-distance pair
    """
    
    def __init__(self, array=None):
        array = array or []    # avoid a shared mutable default argument
        self.data = []         # node-distance pairs
        self.indices = {}      # map node to its index
        self.size = len(array) # the number of nodes currently in the heap
        
        if self.size != 0:
            array = sorted(array, key = lambda x: x[1])
            for i, pair in enumerate(array):
                node, distance = pair
                self.data.append(pair)
                self.indices[node] = i
        
        return
    
    def __repr__(self):
        return "Key value:\n" + str(self.data) + "\nNode locataion:\n" + str(self.indices)
    
    def __contains__(self, node):
        return node in self.indices
    
    def is_empty(self):
        return self.size == 0
    
    def validate_index(self, i):
        """ Check if index i lies in bound. """
        if i < 0 or i >= self.size:
            print("Index i = {0}, size of heap: {1}".format(i, self.size))
            raise ValueError("Index out of range.")
        return
    
    def parent_index(self, i):
        """ Return the parent index of child i. """
        self.validate_index(i)
        return (i - 1) // 2 if i != 0 else 0
    
    def children_indices(self, i):
        """ Return the children indices of parent i. """
        self.validate_index(i)
        return 2 * i + 1, 2 * i + 2
    
    def is_leaf(self, i):
        """ Check if index i is a leaf or not. """
        self.validate_index(i)
        c1, c2 = self.children_indices(i)
        return c1 >= self.size and c2 >= self.size
    
    def one_child(self, i):
        """ Check if node i has exactly one child. """
        self.validate_index(i)
        c1, c2 = self.children_indices(i)
        return c1 < self.size and c2 >= self.size
    
    def min_value_child(self, i):
        """ Return the child index of parent i with the smaller value. """
        self.validate_index(i)
        c1, c2 = self.children_indices(i)
        
        c = None
        if c2 < self.size: # node i has two children
            c = c1
            if self.data[c1][1] > self.data[c2][1]:
                c = c2
        else:
            if c1 < self.size: # node i has only one child
                c = c1
            else:              # node i is a leaf
                c = i
        return c
    
    def up_heapify(self, i):
        """ Bubble up from index i. """
        self.validate_index(i)
        ic = i
        ip = self.parent_index(ic)
        while self.data[ic][1] < self.data[ip][1]:
            node_c, node_p = self.data[ic][0], self.data[ip][0]
            self.data[ic], self.data[ip] = self.data[ip], self.data[ic] # swap data
            self.indices[node_c], self.indices[node_p] = ip, ic         # update index map
            ic = ip
            ip = self.parent_index(ic)
        return
    
    def down_heapify(self, i):
        """ Bubble down from index i. """
        self.validate_index(i)
        ip = i
        ic = self.min_value_child(ip)
        while self.data[ic][1] < self.data[ip][1]:
            node_c, node_p = self.data[ic][0], self.data[ip][0]
            self.data[ic], self.data[ip] = self.data[ip], self.data[ic] # swap data
            self.indices[node_c], self.indices[node_p] = ip, ic         # update index map
            ip = ic
            ic = self.min_value_child(ip)
        return
    
    def insert(self, pair):
        self.data.append(pair)
        self.indices[pair[0]] = self.size
        self.size += 1
        self.up_heapify(self.size - 1)
        return
    
    def get_min(self):
        return self.data[0]
    
    def get_value(self, node):
        if node in self.indices:
            i = self.indices[node]
            return self.data[i][1]
        else:
            raise ValueError("Requested node not in the heap!")
    
    def extract_min(self):
        if self.size == 0:
            raise ValueError("Cannot extract min from an empty heap!")
        
        m_node, m_value = self.get_min()
        self.data[0] = self.data[-1]
        self.data.pop()
        self.size -= 1
        self.indices.pop(m_node)
        if self.size > 0:
            self.indices[self.data[0][0]] = 0
            self.down_heapify(0)
        return (m_node, m_value)
    
    def update_node_value(self, node, value):
        if node in self.indices:
            i = self.indices[node]
            self.data[i] = (node, value)
            
            ip = self.parent_index(i)
            ic = self.min_value_child(i)
            if value < self.data[ip][1]:
                self.up_heapify(i)
            elif value > self.data[ic][1]:
                self.down_heapify(i)
        else:
            raise ValueError("Requested node not in the heap!")
        return

In [5]:
# some tests of binary_heap
h = binary_heap([(1,0),(2,10),(3,4)])
print(h)
print(h.is_empty())
h.insert((5,1))
print(h)
h.extract_min()
print(h)
h.update_node_value(2, 8)
print(h)
h.update_node_value(5, 6)
print(h)
h.update_node_value(2, 1)
print(h)
print(h.is_leaf(0))

Key value:
[(1, 0), (3, 4), (2, 10)]
Node locataion:
{1: 0, 2: 2, 3: 1}
False
Key value:
[(1, 0), (5, 1), (2, 10), (3, 4)]
Node locataion:
{1: 0, 2: 2, 3: 3, 5: 1}
Key value:
[(5, 1), (3, 4), (2, 10)]
Node locataion:
{2: 2, 3: 1, 5: 0}
Key value:
[(5, 1), (3, 4), (2, 8)]
Node locataion:
{2: 2, 3: 1, 5: 0}
Key value:
[(3, 4), (5, 6), (2, 8)]
Node locataion:
{2: 2, 3: 0, 5: 1}
Key value:
[(2, 1), (5, 6), (3, 4)]
Node locataion:
{2: 0, 3: 2, 5: 1}
False


In [6]:
def Dijkstra_heap(G, s):
    """ Find the shortest path from node s to every node in G.
    Return: shortest-path map for all nodes
    """
    
    A = {s: 0}            # the final shortest-path map
    Q = binary_heap()     # nodes not in A
    
    # initialize heap with infinite distance
    for node in G.keys():
        if node != s:
            A[node] = float('Infinity')
        Q.insert((node, A[node]))

    while not Q.is_empty():
        u, ud = Q.extract_min()
        for v, d in G[u]:
            greedy = A[u] + d
            if greedy < A[v]:
                A[v] = greedy
                Q.update_node_value(v, greedy)

    return A

def Dijkstra_heap2(G, s):
    """ Find the shortest path from node s to every node in G.
    Return: shortest-path map for all nodes
    """
    
    A = {}                # the final shortest-path map
    Q = binary_heap()     # frontier nodes only
    Q.insert((s, 0))
    
    for node in G.keys():
        A[node] = float('Infinity')
    A[s] = 0

    while not Q.is_empty():
        u, ud = Q.extract_min()
        for v, d in G[u]:
            greedy = A[u] + d
            if greedy < A[v]:
                A[v] = greedy
                if v in Q:    # already on the frontier: decrease its key
                    Q.update_node_value(v, greedy)
                else:
                    Q.insert((v, greedy))
    return A

In [7]:
with benchmark("Heap implementation 1 O[m * log(n)]") as r:
    A = Dijkstra_heap(G, 1)

with benchmark("Heap implementation 2 O[m * log(n)]") as r:
    A = Dijkstra_heap2(G, 1)
    
targets = [7,37,59,82,99,115,133,165,188,197]
print(",".join(str(A[v]) for v in targets))

Heap implementation 1 O[m * log(n)] : 0.0152 seconds
Heap implementation 2 O[m * log(n)] : 0.0256 seconds
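
As an independent cross-check of the results, here is a sketch of a different technique: Dijkstra with the standard-library `heapq` module and lazy deletion. Instead of updating a node's key in place, it pushes a new entry and skips stale ones when they are popped, so no vertex-to-position map is needed. The name `dijkstra_lazy` and the graph `G_small` are hypothetical; the function assumes the same graph layout that `read_graph` returns, and it leaves unreachable vertices out of the result.

```python
import heapq

def dijkstra_lazy(G, s):
    """Dijkstra with heapq and lazy deletion of stale heap entries.

    G maps node -> list of (neighbor, length) pairs.
    """
    dist = {s: 0}      # best distance found so far
    done = set()       # nodes whose distance is final
    pq = [(0, s)]      # (distance, node) entries, possibly stale
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue   # stale entry: u was already finalized
        done.add(u)
        for v, w in G[u]:
            nd = d + w
            if v not in done and nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Hypothetical 4-node undirected graph (each edge listed in both directions).
G_small = {
    1: [(2, 1), (3, 4)],
    2: [(1, 1), (3, 2), (4, 6)],
    3: [(1, 4), (2, 2), (4, 3)],
    4: [(2, 6), (3, 3)],
}
print(dijkstra_lazy(G_small, 1))
```

The trade-off is that the heap can hold up to one entry per edge rather than one per vertex, which is usually acceptable for a graph of this size.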
