# MIT 6.006 Part 2

This week I will finish the introduction to algorithms course I started last week.

## Single-Source Shortest Path Problem

Often we want to find the shortest path from one vertex to another. Brute force is not an option: the number of distinct paths between two vertices can grow exponentially with the number of vertices, so even on a relatively small graph, enumerating every path would take a long time (many years).

If a graph is weighted it means there is a cost to go from one vertex to another.

A graph can have negative weights. For example, when modelling a taxi's journeys, a negative weight could represent money earned on a trip and a positive weight could represent money spent on fuel.

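If a graph has a negative cycle (a cycle whose weights sum to less than zero), then the shortest path to any vertex reachable from the source through that cycle is undefined, because going around the cycle once more always lowers the cost.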
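
To make the effect of a negative cycle concrete, here is a minimal sketch. The cycle, its weights, and the cost of reaching it are made-up values chosen only for illustration.

```python
# Hypothetical cycle B -> C -> D -> B with total weight 2 + (-6) + 1 = -3.
cycle_weights = [2, -6, 1]
cost_to_reach_cycle = 4  # assumed cost of some path from the source A to B


def cost_after_laps(laps):
    """Cost of reaching B from A after going around the cycle `laps` times."""
    return cost_to_reach_cycle + laps * sum(cycle_weights)


for laps in range(4):
    # Each extra lap lowers the cost by 3, so there is no minimum.
    print(laps, cost_after_laps(laps))
```

Every additional lap produces a cheaper walk to B, which is why no shortest path exists.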

Single-source shortest path algorithms start by setting the cost of the starting node to 0 and the cost of every other node to infinity.

Then we select an edge (u, v) and relax it:

If d[v] > d[u] + w(u, v) then we update d[v] to d[u] + w(u, v). We also set π[v] = u.

Here d holds the current distance estimate from the source to each node, and π holds each node's parent on the best path found so far.

We repeat this until d[v] <= d[u] + w(u, v) holds for every edge (u, v).

At that point no edge can improve any estimate, so d has converged to the actual shortest-path distances from the source.
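
The relaxation loop described above can be sketched as follows. This is only an illustration: the function name, the edge-list representation, and the example graph are my own assumptions, and it picks edges in a fixed order rather than in any clever one.

```python
def relax_until_converged(vertices, edges, source):
    """Repeatedly relax edges until no distance estimate can be improved.

    `edges` is a list of (u, v, weight) tuples. Assumes no negative cycle is
    reachable from `source`; otherwise the loop would never terminate.
    """
    d = {v: float('inf') for v in vertices}
    parent = {v: None for v in vertices}  # plays the role of π
    d[source] = 0

    changed = True
    while changed:
        changed = False
        for u, v, w in edges:
            # Relax the edge (u, v) if going through u is cheaper.
            if d[v] > d[u] + w:
                d[v] = d[u] + w
                parent[v] = u
                changed = True
    return d, parent


# Hypothetical example graph.
vertices = ['s', 'a', 'b']
edges = [('s', 'a', 4), ('s', 'b', 1), ('b', 'a', 2)]
d, parent = relax_until_converged(vertices, edges, 's')
print(d)       # {'s': 0, 'a': 3, 'b': 1}
print(parent)  # {'s': None, 'a': 'b', 'b': 's'}
```

The estimate for `a` first becomes 4 via the direct edge and is then lowered to 3 via `b`, which shows why a single pass over the edges is not always enough in general.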

### Optimal Substructure

Theorem: Subpaths of shortest paths are shortest paths.

### Triangle Inequality

Theorem: For all u, v, x ∈ V we have:

δ(u, v) <= δ(u, x) + δ(x, v)

where δ(u, v) is the weight of a shortest path from u to v.

## Dijkstra's Algorithm

Only works for graphs in which every edge has a non-negative weight.

The algorithm finds the shortest path from a given vertex to every other vertex in the graph.

### The Algorithm

1. Set the distance of the starting vertex to 0 and of all other vertices to infinity, and add every vertex to a priority queue keyed by distance.
2. Pop the vertex with the smallest distance from the priority queue and call it v.
3. For each vertex adjacent to v, update its distance if the path through v is cheaper.
4. Repeat steps 2-3 until the priority queue is empty.

![Dijkstra's algorithm](https://i.stack.imgur.com/90Qwu.png)

Using the above graph as an example, I will find the shortest path from A to every other vertex, including F.

In [43]:
import heapq

class Graph:
    '''
    Represents a single weighted graph as an adjacency list.
    '''
    def __init__(self, *nodes):
        # Creating the adjacency list: node -> {neighbour: weight}.
        self.graph = {node: {} for node in nodes}
        
    def add_edge(self, u, v, weight):
        self.graph[u][v] = weight
        
g = Graph('A', 'B', 'C', 'D', 'E', 'F')
g.add_edge('A', 'B', 3)
g.add_edge('A', 'C', 5)
g.add_edge('A', 'D', 9)
g.add_edge('B', 'A', 3)
g.add_edge('C', 'A', 5)
g.add_edge('D', 'A', 9)
g.add_edge('B', 'C', 3)
g.add_edge('B', 'E', 7)
g.add_edge('B', 'D', 4)
g.add_edge('C', 'B', 3)
g.add_edge('C', 'D', 2)
g.add_edge('C', 'E', 6)
g.add_edge('C', 'F', 8)
g.add_edge('D', 'C', 2)
g.add_edge('D', 'B', 4)
g.add_edge('D', 'E', 2)
g.add_edge('D', 'F', 2)
g.add_edge('E', 'D', 2)
g.add_edge('E', 'C', 6)
g.add_edge('E', 'B', 7)
g.add_edge('E', 'F', 5)
g.add_edge('F', 'D', 2)
g.add_edge('F', 'E', 2)
g.add_edge('F', 'C', 8)

'''
Implementing Dijkstra's Algorithm in Python
'''
def dijkstra(gr, start_point):
    # Best-known distance and predecessor for every vertex.
    distances = {vertex: {'distance': float('inf'), 'prev_node': None} for vertex in gr.graph}
    distances[start_point]['distance'] = 0

    # Min-heap of (distance, vertex) pairs. Only the start point is reachable so far.
    priority_queue = [(0, start_point)]

    visited_nodes = set()

    while priority_queue:
        current_distance, current_node = heapq.heappop(priority_queue)

        # Skip stale entries left behind for vertices that are already finalised.
        if current_node in visited_nodes:
            continue
        visited_nodes.add(current_node)

        # Update the values of the adjacent nodes if the route through current_node is cheaper.
        for node, weight in gr.graph[current_node].items():
            new_distance = current_distance + weight
            if node not in visited_nodes and distances[node]['distance'] > new_distance:
                distances[node]['distance'] = new_distance
                distances[node]['prev_node'] = current_node
                heapq.heappush(priority_queue, (new_distance, node))

    return distances

shortest_distances = dijkstra(g, 'A')

for node, info in shortest_distances.items():
    print(node, info)

A {'distance': 0, 'prev_node': None}
B {'distance': 3, 'prev_node': 'A'}
C {'distance': 5, 'prev_node': 'A'}
D {'distance': 7, 'prev_node': 'B'}
E {'distance': 9, 'prev_node': 'D'}
F {'distance': 9, 'prev_node': 'D'}
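
The output above gives distances and predecessors but not the path itself. As a small sketch, the path from A to F can be recovered by following the `prev_node` pointers backwards. The helper name `reconstruct_path` is my own, and the result dictionary is copied in by hand so that the snippet runs on its own.

```python
def reconstruct_path(distances, destination):
    """Follow the 'prev_node' pointers from the destination back to the source.

    Assumes the destination is reachable; an unreachable vertex would simply
    yield a one-element path containing only itself.
    """
    path = []
    node = destination
    while node is not None:
        path.append(node)
        node = distances[node]['prev_node']
    path.reverse()
    return path


# The result printed above, copied in so the snippet is self-contained.
shortest_distances = {
    'A': {'distance': 0, 'prev_node': None},
    'B': {'distance': 3, 'prev_node': 'A'},
    'C': {'distance': 5, 'prev_node': 'A'},
    'D': {'distance': 7, 'prev_node': 'B'},
    'E': {'distance': 9, 'prev_node': 'D'},
    'F': {'distance': 9, 'prev_node': 'D'},
}

print(reconstruct_path(shortest_distances, 'F'))  # ['A', 'B', 'D', 'F']
print(shortest_distances['F']['distance'])        # 9
```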


## Optimizing Dijkstra's Algorithm

Dijkstra's algorithm can be optimized so that in practice it runs a lot faster.

The speedup techniques below do not change the worst-case behaviour, but they make the algorithm faster in practice by reducing the number of vertices visited.

### Single Source Single Target

One way to improve the performance of Dijkstra's algorithm is to stop as soon as the destination node is popped from the priority queue. At that point its distance is final, and we do not need to know the most efficient route to the other nodes.
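
A minimal sketch of this early exit is shown below. It is not tied to the `Graph` class above: the function name `shortest_distance`, the plain dict-of-dicts graph, and the example weights are assumptions made for illustration.

```python
import heapq


def shortest_distance(graph, source, target):
    """Dijkstra's algorithm that stops as soon as `target` is popped.

    `graph` maps each vertex to a dict of {neighbour: non-negative weight}.
    Returns float('inf') if `target` cannot be reached.
    """
    distances = {source: 0}
    queue = [(0, source)]
    visited = set()

    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        if node == target:
            return dist  # the target's distance is final once it is popped
        visited.add(node)
        for neighbour, weight in graph[node].items():
            new_dist = dist + weight
            if new_dist < distances.get(neighbour, float('inf')):
                distances[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))

    return float('inf')


# Hypothetical graph for illustration.
graph = {
    's': {'a': 1, 'b': 4},
    'a': {'b': 2, 't': 6},
    'b': {'t': 1},
    't': {'far': 10},
    'far': {},
}
print(shortest_distance(graph, 's', 't'))  # 4
```

In this example the vertex `far` is never expanded, because the search returns when `t` comes off the queue.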

### Bi-Directional Search

Using a single target destination we can speed up Dijkstra's algorithm even further.

In bi-directional search we alternate between a forward search from the source and a backward search from the destination.
We stop when one search removes a node from its queue that the other search has already visited.
That node is not necessarily on the shortest path, so after stopping we find the node x which has the minimum value of df(x) + db(x), where df and db are the forward and backward distance estimates.

### Goal-Directed Search or A* Algorithm

In goal-directed search we add a heuristic function that estimates the remaining distance to the destination, so that the search prefers nodes that are likely to be part of the path to the destination.
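
As a rough sketch of the idea, here is the common textbook form of A* on a small grid, which may differ from the formulation used in the lecture. The grid, the `a_star` function, and the choice of Manhattan distance as the heuristic are all assumptions made for this example.

```python
import heapq


def a_star(grid, start, goal):
    """Length of the shortest path between two cells of a grid, or None.

    `grid` is a list of strings where '#' marks a wall. Each move to one of
    the four neighbouring cells costs 1.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates the remaining cost here.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    # Entries are (cost so far + heuristic estimate, cost so far, cell).
    queue = [(heuristic(start), 0, start)]
    visited = set()

    while queue:
        _, cost, cell = heapq.heappop(queue)
        if cell == goal:
            return cost
        if cell in visited:
            continue  # stale queue entry
        visited.add(cell)
        row, col = cell
        for nr, nc in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != '#':
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float('inf')):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)))
    return None


grid = [
    "....",
    ".##.",
    "....",
]
print(a_star(grid, (0, 0), (2, 3)))  # 5
```

The only change from Dijkstra's algorithm is the priority: cells are ordered by cost so far plus the estimated remaining cost, which pulls the search towards the goal.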

In [94]:
# Implementing Bi-Directional Search
def bi_directional_search(gr, source, destination):
    # The backward search walks the same adjacency list as the forward search,
    # so this assumes every edge can be travelled in both directions at the same cost.
    forward_distances = {vertex: {'distance': float('inf'), 'prev_node': None} for vertex in gr.graph}
    backward_distances = {vertex: {'distance': float('inf'), 'prev_node': None} for vertex in gr.graph}

    forward_distances[source]['distance'] = 0
    backward_distances[destination]['distance'] = 0

    # Min-heaps of (distance, vertex) pairs, one per search direction.
    forward_priority_queue = [(0, source)]
    backward_priority_queue = [(0, destination)]

    visited_forward = set()
    visited_backward = set()

    is_forward = True

    while forward_priority_queue and backward_priority_queue:
        if is_forward:
            queue, distances = forward_priority_queue, forward_distances
            visited, other_visited = visited_forward, visited_backward
        else:
            queue, distances = backward_priority_queue, backward_distances
            visited, other_visited = visited_backward, visited_forward

        current_distance, current_node = heapq.heappop(queue)

        # Skip stale entries for vertices this search has already finalised.
        if current_node in visited:
            continue

        # Stop as soon as one search pops a vertex the other search has visited.
        if current_node in other_visited:
            break

        visited.add(current_node)

        # Relax every edge leaving the current vertex.
        for node, weight in gr.graph[current_node].items():
            if distances[node]['distance'] > current_distance + weight:
                distances[node]['distance'] = current_distance + weight
                distances[node]['prev_node'] = current_node
                heapq.heappush(queue, (current_distance + weight, node))

        is_forward = not is_forward

    # The shortest path passes through the vertex x that minimises df(x) + db(x).
    return min(forward_distances[vertex]['distance'] + backward_distances[vertex]['distance']
               for vertex in gr.graph)

print(bi_directional_search(g, 'A', 'F'))

9


## Bellman-Ford Algorithm

## Dynamic Programming

In [40]:
# Explanation of some dynamic programming.