<font color='red'>
    
This is a Jupyter Notebook demo for the shortest path lab, ENGRI 1101 by Sander Aarts, 2020.
    
Meta notes are in red. These are not intended for students.
    
</font>

# The Shortest Path Problem 

### Objectives
 - Introduce the shortest path problem
 - Show students the inner workings of a combinatorial algorithm
 - Demonstrate the usefulness of sensitivity analysis in problem solving
 
### Brief description
In this lab, we review some motivation and observations behind Dijkstra’s algorithm for shortest path computation, and analyze how sensitive the solution to the shortest path problem is to changes in the input data.

## Part 0: Load the necessary packages

This lab should run once you have downloaded the lab folder and opened the Jupyter notebook. If any of the packages below is not installed, the import cell will raise an error.

In [1]:
import pandas as pd
import numpy as np
import math
import itertools
import networkx as nx
from bokeh.io import output_notebook
output_notebook()

## Part 1: The shortest path problem and Dijkstra’s algorithm

You are a pizza delivery person for Good Pizza, Inc. Good Pizza is located on Cornell’s campus and guarantees delivery in 30 minutes or less anywhere in Tompkins County. In order to maximize the amount of your tips, you want to deliver as many pizzas as you can and deliver them as fast as you can. Therefore, you would like to know the quickest way of getting from the pizza shop to various parts of Tompkins County. Because of speed limits and various geographical obstacles (of which there are many in Tompkins County), it is not always best to take the route that covers the shortest distance. For instance, to get to Groton from the Northeast, it is quicker to go through Freeville than to try to go straight to Groton because Route 38 goes directly from Freeville to Groton and has a speed limit of 55 mph.
To help you find the quickest routes, you have the attached map, which contains approximate travel times for going directly from one location to another by the shortest-distance route. The approximate travel time is called the $\texttt{cost}$.

You can view the graph by first loading the files $\texttt{tompkins_nodes.csv}$ and $\texttt{tompkins_links.csv}$ from the $\texttt{data}$ folder. The data is kept in pandas dataframes. To view the data as tables, run the cells below.

In [2]:
# load nodes (pd.read_csv already returns a DataFrame)
dfn = pd.read_csv('data/tompkins_nodes.csv')
# load links
dfl = pd.read_csv('data/tompkins_links.csv')

print('Loaded %d nodes and %d edges.' % (dfn.shape[0], dfl.shape[0]))

Loaded 18 nodes and 42 edges.


In [3]:
dfn.head() # inspect the nodes

Unnamed: 0,name,lat,lon,x,y
0,Trumansburg,42.540556,-76.66,0,8
1,Enfield,42.435278,-76.626667,2,5
2,Newfield,42.354444,-76.600278,4,2
3,West Hill,42.466523,-76.539525,8,6
4,Cayuga Heights,42.466389,-76.486111,12,7


In [4]:
dfl.head() # inspect the edges

Unnamed: 0,start,end,cost
0,Trumansburg,Enfield,25
1,Trumansburg,West Hill,20
2,Enfield,West Hill,9
3,Enfield,Newfield,10
4,Newfield,West Hill,10


To get a better feel for the graph, you should plot it. Running the cell below generates an interactive plot. You can view the edges and associated costs by mousing over edges. Clicking a node lets you highlight all edges associated with it. Plot the graph and familiarize yourself with the nodes and edges.

In [5]:
from graph_tools import plotNetworkTompkins
plotNetworkTompkins(dfn, dfl)

<font color='red'>
The names are to be placed more visibly later.   
</font>

Your pizza place is located on Campus, so Campus is our source node s.

**Q1:** What is the node that is closest to campus (not considering the campus itself)? In case of ties, choose any of them. 

**A:**

**Q2:** Can there be a quicker route to get to this destination (call it point X) than just to go directly there from Campus? Why or why not? (Hint: You can click on the Campus node to highlight its adjacent edges.)

**A:**

**Q3:** Explain why the predecessor (“prev(X)” from class) of X must be the Campus.

**A:**

From now on, let X be downtown (X has ID 1). This is one of the two nodes that are closest to Campus. Since the predecessor of X is Campus, we can mark the edge between Campus and X. How do we find the next edge to add to the tree (i.e., the next edge to “illuminate”)? One way is as follows. Compute the travel times for all routes that either go directly from Campus to a destination or go from Campus to point X and then directly to a destination. Take the shortest of all these routes and add it to the tree. For instance, in this case, we consider the following routes. Node IDs are in parentheses.

<br>
<font color='red'>
    [insert table here]
</font>
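
As a rough illustration of this step, the sketch below builds the candidate list in plain Python. It assumes that X (downtown) is the node named 'The Commons' in the data, and it uses edge costs read off the solver output in Part 2, so only edges that lie on some shortest path are included; the table for the lab may contain more rows. The names `direct`, `from_x`, and `candidates` are made up for this example.

```python
# Costs of going directly from Campus (assumption: read off the Part 2 output).
direct = {
    "The Commons": 5, "Varna": 5, "East Hill": 6,
    "Slaterville": 6, "Northeast": 10, "Fall Creek": 10,
}
# Costs of going directly from X = The Commons (partial list, same assumption).
from_x = {"Newfield": 15, "West Hill": 7, "Cayuga Heights": 7}

x = "The Commons"
marked = {"Campus", x}

# One candidate per unmarked destination: (total cost, route).
candidates = {}
for dest, cost in direct.items():
    if dest not in marked:
        candidates[dest] = (cost, ["Campus", dest])
for dest, cost in from_x.items():
    if dest in marked:
        continue
    total = direct[x] + cost  # Campus -> X -> dest
    if dest not in candidates or total < candidates[dest][0]:
        candidates[dest] = (total, ["Campus", x, dest])

# The cheapest candidate is the next route added to the tree.
next_dest = min(candidates, key=lambda d: candidates[d][0])
print(next_dest, candidates[next_dest])
```

With these costs the cheapest candidate is the direct route from Campus to Varna, with cost 5.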

**Q4:** Campus to Varna is the shortest route on this list. Can there be a shorter route to Varna than the one that we just added? Why or why not?

**A:**

Now we update the table: if two entries have the same destination we drop the more expensive one (e.g., Campus-Fall Creek takes 10 units of time while Campus-Downtown-Fall Creek takes 9, so we keep the latter). Then for each destination that can be reached from Varna, we compute the length of the path using only nodes already marked, with Varna being the second-to-last node on the path. Fill in the missing entries in the table below and choose the next edge to be added to the shortest path tree.

<br>
<font color='red'>
    [insert table here]
</font>

Dijkstra’s algorithm continues in this manner. In the next step, we compute the travel times to destinations not already marked that are adjacent to the most recently marked node, using routes that involve only already marked nodes as intermediate steps.
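
The marking procedure can also be written compactly with a priority queue. The following is a minimal sketch, not the implementation behind the lab software: it keeps tentative labels in a heap instead of a hand-written table, and the function name `dijkstra` and the dictionary-of-dictionaries graph format are choices made for this example. The demo graph contains only the five links displayed by `dfl.head()`, treated as two-way roads (an assumption), with Trumansburg as the source.

```python
import heapq

def dijkstra(adj, source):
    """Return (dist, prev) for a graph given as {node: {neighbor: cost}}."""
    dist = {source: 0}   # best known travel time to each node
    prev = {}            # predecessor of each node on its best known route
    marked = set()       # nodes whose label is final
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in marked:
            continue     # stale heap entry, u was already finalized
        marked.add(u)
        for v, cost in adj[u].items():
            new_d = d + cost
            if v not in dist or new_d < dist[v]:
                dist[v] = new_d
                prev[v] = u
                heapq.heappush(heap, (new_d, v))
    return dist, prev

# The five links shown by dfl.head(), stored in both directions.
edges = [
    ("Trumansburg", "Enfield", 25),
    ("Trumansburg", "West Hill", 20),
    ("Enfield", "West Hill", 9),
    ("Enfield", "Newfield", 10),
    ("Newfield", "West Hill", 10),
]
adj = {}
for a, b, c in edges:
    adj.setdefault(a, {})[b] = c
    adj.setdefault(b, {})[a] = c

dist, prev = dijkstra(adj, "Trumansburg")
print(dist)
print(prev)
```

On this small graph the label of Newfield is 30, reached through West Hill (20 + 10) rather than through Enfield (25 + 10).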

**Q5:** Do the next two iterations of the algorithm and write down the nodes that get marked in each of these iterations. It will help to look at the map plotted in an earlier cell.

**A:**

Now trace the execution of Dijkstra’s algorithm with **the software** up to the point that you have already computed by hand. (Press B and click on Campus to set it as the starting node s, then go step by step by pressing N. At any point you can hover your mouse over a particular node to check its information.)
The program uses colors to distinguish different types of information. Explain when each of the following colors is used (be sure to mention all uses of a particular color, such as nodes in the graph, edges in the graph).

 - blue
 - red
 - ...
 
 <br>
 <font color='red'>
 The software can be made as just another interactive graph. It would sit in a cell about here.    
    
 The 'software' should terminate with a highlighted shortest path tree.
</font>

How can you read out the shortest path from Campus to other places in Tompkins County from the shortest path tree?

<br>

## Part 2: Get the shortest path tree using a built-in solver.

<br>

<font color='red'>
   
 - Networkx is a library for dealing with graphs and graph algorithms in Python. There may well be other / better options.
    
 - To the best of my knowledge, ortools in Python does not have a built-in shortest path algorithm
    
 - In this section students find and plot a shortest path tree using networkx
</font>

Next, your goal is to let your computer do the work. Using the 'networkx' package loaded in Part 0, define a graph from the links dataframe from earlier.

Recall that edges (see dfl) are defined by the 'start' node and the 'end' node. Below we load the data as a graph by specifying (1) that our data sits in dfl, (2) that edges start at nodes in the 'start' column, (3) that edges end at nodes in the 'end' column, and (4) that edge costs are in the 'cost' column. That is:

$$\texttt{G = nx.from_pandas_edgelist(<dataframe of edges>, <start col name>, <end col name>, <cost col name>)} $$

In [6]:
# load networkx model from edge dataset
G = nx.from_pandas_edgelist(dfl, 'start', 'end', 'cost')
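
To see what this call produces, here is a small self-contained sketch that builds a graph from only the five links shown by `dfl.head()`. It spells out the keyword names `source`, `target`, and `edge_attr` of `nx.from_pandas_edgelist`; the names `links` and `G_small` are made up for this example. By default the result is an undirected `nx.Graph`, so each link can be traversed in both directions.

```python
import pandas as pd
import networkx as nx

# The five rows displayed by dfl.head() earlier in the notebook.
links = pd.DataFrame({
    "start": ["Trumansburg", "Trumansburg", "Enfield", "Enfield", "Newfield"],
    "end": ["Enfield", "West Hill", "West Hill", "Newfield", "West Hill"],
    "cost": [25, 20, 9, 10, 10],
})

# Same call as in the lab, with the arguments named explicitly.
G_small = nx.from_pandas_edgelist(
    links, source="start", target="end", edge_attr="cost"
)

n_nodes = G_small.number_of_nodes()
n_edges = G_small.number_of_edges()

# Edge attributes can be looked up from either endpoint.
forward = G_small["Enfield"]["West Hill"]["cost"]
backward = G_small["West Hill"]["Enfield"]["cost"]

# Travel time of the quickest route within this small graph.
t_to_newfield = nx.dijkstra_path_length(
    G_small, "Trumansburg", "Newfield", weight="cost"
)
print(n_nodes, n_edges, forward, backward, t_to_newfield)
```

Passing `weight="cost"` matters: without it, networkx looks for an edge attribute named `weight` and, when that is missing, counts every edge as cost 1.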

In [7]:
# solve shortest path problem
out = nx.single_source_dijkstra(G, source='Campus', weight='cost')

The cell above runs the solver on the Tompkins instance G that we defined earlier. The cell below prints the time it takes to traverse the quickest path from Campus to each location. Do the values agree with what you found before?

In [8]:
# print values
#pd.DataFrame(out[0].values(), index=out[0].keys()).rename(columns={0:'cost'}) # fancy version
out[0] # raw output

{'Campus': 0,
 'The Commons': 5,
 'Varna': 5,
 'East Hill': 6,
 'Slaterville': 6,
 'Northeast': 10,
 'Fall Creek': 10,
 'Caroline': 10,
 'West Hill': 12,
 'Cayuga Heights': 12,
 'Danby': 12,
 'Freeville': 13,
 'Dryden': 15,
 'Lansing': 15,
 'Newfield': 20,
 'Enfield': 21,
 'Groton': 27,
 'Trumansburg': 32}

In [9]:
out[1] # raw tree

{'Campus': ['Campus'],
 'The Commons': ['Campus', 'The Commons'],
 'East Hill': ['Campus', 'East Hill'],
 'Danby': ['Campus', 'East Hill', 'Danby'],
 'Varna': ['Campus', 'Varna'],
 'Slaterville': ['Campus', 'Slaterville'],
 'Northeast': ['Campus', 'Northeast'],
 'Fall Creek': ['Campus', 'Fall Creek'],
 'Newfield': ['Campus', 'The Commons', 'Newfield'],
 'West Hill': ['Campus', 'The Commons', 'West Hill'],
 'Cayuga Heights': ['Campus', 'The Commons', 'Cayuga Heights'],
 'Freeville': ['Campus', 'Varna', 'Freeville'],
 'Dryden': ['Campus', 'Varna', 'Dryden'],
 'Caroline': ['Campus', 'Slaterville', 'Caroline'],
 'Lansing': ['Campus', 'Northeast', 'Lansing'],
 'Groton': ['Campus', 'Northeast', 'Groton'],
 'Trumansburg': ['Campus', 'The Commons', 'West Hill', 'Trumansburg'],
 'Enfield': ['Campus', 'The Commons', 'West Hill', 'Enfield']}
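
Each list above is a full path starting at Campus. The same information can be stored more compactly as one predecessor per node, which is the prev label from class. The sketch below uses four of the paths printed above and a hypothetical helper `path_to` (not part of networkx or of the lab's `graph_tools`) to convert paths to predecessors and then walk the predecessors back to Campus.

```python
# Four of the paths printed by out[1] above.
paths = {
    "The Commons": ["Campus", "The Commons"],
    "West Hill": ["Campus", "The Commons", "West Hill"],
    "Trumansburg": ["Campus", "The Commons", "West Hill", "Trumansburg"],
    "Varna": ["Campus", "Varna"],
}

# The predecessor of a node is the second-to-last node on its path.
prev = {node: path[-2] for node, path in paths.items()}

def path_to(prev, target, source="Campus"):
    """Rebuild the route to `target` by following predecessors back to `source`."""
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    return route[::-1]

route = path_to(prev, "Trumansburg")
print(prev)
print(route)
```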

Next, we want to visualize the shortest path tree. To do so we first add shortest path information to the node and link dataframes.

In [10]:
from graph_tools import getTree

In [11]:
# add node labels (out[0]) to node dataframe 
dfn['label'] = dfn['name'].map(out[0])
# add an 'in shortest-path-tree' indicator variable
tree_dict = getTree(out, dfl) # get dictionary
dfl['in_tree'] = dfl.index.map(tree_dict)
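
`getTree` lives in the lab's `graph_tools` module and its source is not shown here. As a rough idea of what such a helper might do, the sketch below marks every link that appears as a consecutive pair of nodes on one of the solver's paths. The function name `tree_edge_flags`, its arguments, and the boolean return values are assumptions for illustration, not the actual `getTree` interface.

```python
def tree_edge_flags(paths, edges):
    """Map each edge index to True if the edge lies on one of the given paths.

    paths: {node: [source, ..., node]}, as in out[1]
    edges: iterable of (index, start, end) tuples
    """
    tree_edges = set()
    for path in paths.values():
        # Consecutive nodes on a shortest path form a tree edge.
        for u, v in zip(path, path[1:]):
            tree_edges.add(frozenset((u, v)))  # undirected: order is ignored
    return {
        idx: frozenset((start, end)) in tree_edges
        for idx, start, end in edges
    }

# Four of the paths printed by out[1], and the five links from dfl.head().
paths = {
    "West Hill": ["Campus", "The Commons", "West Hill"],
    "Newfield": ["Campus", "The Commons", "Newfield"],
    "Trumansburg": ["Campus", "The Commons", "West Hill", "Trumansburg"],
    "Enfield": ["Campus", "The Commons", "West Hill", "Enfield"],
}
edges = [
    (0, "Trumansburg", "Enfield"),
    (1, "Trumansburg", "West Hill"),
    (2, "Enfield", "West Hill"),
    (3, "Enfield", "Newfield"),
    (4, "Newfield", "West Hill"),
]

flags = tree_edge_flags(paths, edges)
print(flags)
```

For a dataframe like `dfl`, tuples of this shape can be obtained with `dfl[['start', 'end']].itertuples(name=None)`, and the resulting dictionary can be passed to `dfl.index.map` as in the cell above.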

In [12]:
from graph_tools import plotTreeTompkins

In [13]:
plotTreeTompkins(dfn, dfl)

<font color='red'>
   
 - We could add edge tooltips so students can verify that branches of the tree are indeed shortest paths.
    
 - This should agree with the output of the 'interactive solver'
</font>