# Traveling Salesman Problem (TSP)

**Objectives**

- Introduce students to a real-world problem solved by OR practitioners
- Demonstrate the use of heuristics to obtain good solutions to optimization problems
- Give students an appreciation of the difficulty of solving optimization problems exactly

**Reading:** Read Handout 2 on the traveling salesman problem.

**Brief description:** Finding an optimal solution to a Traveling Salesman Problem, and proving that it is, in fact, an optimal solution, is a difficult task. In practice, when a feasible solution to a difficult problem needs to be provided quickly, one often resorts to using heuristics, i.e., procedures for generating feasible solutions, or improving existing ones, that can be executed quickly and, hopefully, produce a pretty good result. In this lab we will consider several such heuristic procedures for the TSP.

*TSP VLSI instances adapted from [TSPLIB](http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/) and [Bonn Institute](http://www.math.uwaterloo.ca/tsp/vlsi/index.html)*

<font color='blue'> <b>Solutions are shown in blue.</b> </font> <br>
<font color='red'> <b>Instructor comments are shown in red.</b> </font>

<font color='red'>A tool we might want to use: [TSP DIY](https://www.math.uwaterloo.ca/tsp/app/diy.html)  </font>

## Jupyter Notebook Introduction

The labs for this course will be done through Jupyter Notebooks. Each lab will be distributed to you via a `.zip` file that contains a Jupyter Notebook (which has file extension `.ipynb`) and any other files necessary to run that lab. These may include supplementary Python code, images, or data.

In this first lab, we will begin with a brief introduction to Jupyter Notebooks. The notebook consists of cells of two main types: text (Markdown) and Code. This notebook consists only of text cells up until this point. The text is written in a lightweight markup language called Markdown. It allows you to do things like make tables and mathematical equations. Right now, you are seeing the Markdown text after it has been compiled. Double-click the text `CLICK HERE` to see the Markdown before it is compiled.

`CLICK HERE` Now that you have double-clicked, you should be able to type in this cell. Try it out:



To compile this text, press CTRL+Enter, or press Shift+Enter to compile and move to the next cell.

Here are some other cool things you can do in Markdown!

- **Bold** and *italics*
- Equations: 
    - $3x_1 + 4x_2 = 10$ 
    - $\frac{2}{3} + \frac{2}{3} = \frac{4}{3}$
    - $2^x = 4$
- Tables:

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| a |   |   |   |   | 
| b |   |   |   |   | 

And now to the most important cell of a Jupyter Notebook: the Code cell! We can use a code cell to write and run Python code. Let's look at an example. As with compiling a Markdown cell, we can use either CTRL+Enter to run the code in a Code cell or Shift+Enter to run it and move to the next cell.

In [1]:
# This is a code cell
# We use # to make a comment in a code cell
a = 7

The cell above creates a variable called `a` and sets its value to 7. This variable and its value carry over to the other cells of the notebook. To the left of a code cell, we can see a number in brackets if the cell has been run. If it has not been run, there is no number. The number indicates the order in which the cells were run. If a variable is on the last line of a code cell, its value will be output under the cell.

In [2]:
a

7

Here are a few examples of different operations in Python.

In [3]:
# addition and multiplication
b = 3
c = a + b
print(c)
d = a * b
print(d)

# lists
array = [1,2,3]
print(array[1])

array # recall, since this is on the last line, its value will be printed

10
21
2


[1, 2, 3]

As we do more and more in Jupyter Notebooks throughout the rest of these labs, we will introduce new tools and functionality. **NOTE:** It is important to mention that this course does not require any previous programming experience nor is it a programming course. The programming in this course will be limited to small changes to code and writing mathematical models. 

On to the lab!

## Part I: Solving TSP Manually

In [4]:
# Imports -- make sure you run this cell!
# This cell contains multiple import statements. Each import statement gives the notebook access to pre-bundled
# python code that serves some functionality. The first import statement imports all of the python code in the
# file tsp.py. You can open this file and examine it if you so choose!

from tsp import *
from bokeh.io import output_notebook
output_notebook()

In lecture, you learned about the traveling salesman problem (TSP). The input to this problem is a set of cities along with the distances between them. Our goal is to find a path that visits every city exactly once, starts and ends at the same city, and minimizes the total distance traveled. We call a path like this a tour.

Consider the following TSP instance consisting of 23 US cities. Fun fact: this instance comes from the Beyoncé *On the Run II Tour*.

In [5]:
# This cell imports a CSV file called us_cities_23.csv. We will often import CSV files.
# The data from the CSV file is put in the variable nodes.
# Using display(), we can see the contents of this file: each city (node) and its (x,y) position.
nodes = pd.read_csv('data/us_cities_23.csv', index_col=0)
display(nodes)

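|   | name | x | y |
|---|---|---|---|
| 0 | 0.0 | 751.5 | 1235.8 |
| 1 | 1.0 | 634.6 | 200.4 |
| 2 | 2.0 | 701.4 | -100.2 |
| 3 | 3.0 | 784.9 | -267.2 |
| 4 | 4.0 | 985.3 | -367.4 |
| 5 | 5.0 | 1753.5 | 768.2 |
| 6 | 6.0 | 1920.5 | -801.6 |
| 7 | 7.0 | 2338.0 | -868.4 |
| 8 | 8.0 | 2404.8 | -1102.2 |
| 9 | 9.0 | 2187.7 | -400.8 |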


Before we try to solve the TSP instance, we need to generate the distances between each pair of the 23 cities. The function `distance_matrix` (described in `tsp.py`) does just this! The entry $(i,j)$ in the distance matrix gives the distance from city $i$ to city $j$.

**NOTE:** The function `distance_matrix` takes a parameter called `manhattan`. If `manhattan` is true, the Manhattan distance is computed. Otherwise, the Euclidean distance is computed. For a city located at $(a,b)$ and a city at $(c,d)$, the Manhattan distance is $|c-a| + |d-b|$ (horizontal distance plus vertical distance). The Euclidean distance is $\sqrt{(c-a)^2 + (d-b)^2}$ (straight-line distance).
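As a small illustration of the two formulas, the sketch below computes both distances for a pair of points and builds a tiny distance matrix. The helper names `manhattan_distance`, `euclidean_distance`, and `simple_distance_matrix` are hypothetical; this is not the implementation in `tsp.py`, which may differ.

```python
import math

def manhattan_distance(p, q):
    """Horizontal plus vertical distance between p = (a, b) and q = (c, d)."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def euclidean_distance(p, q):
    """Straight-line distance between p = (a, b) and q = (c, d)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def simple_distance_matrix(points, manhattan=False):
    """Entry (i, j) is the distance from point i to point j (hypothetical helper)."""
    dist = manhattan_distance if manhattan else euclidean_distance
    return [[dist(p, q) for q in points] for p in points]

points = [(0, 0), (3, 4), (3, 0)]
print(manhattan_distance(points[0], points[1]))  # 7
print(euclidean_distance(points[0], points[1]))  # 5.0
print(simple_distance_matrix(points, manhattan=True))
```

Note that the Manhattan distance is never smaller than the Euclidean distance between the same two points.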

In [6]:
G = distance_matrix(nodes, manhattan=False)
display(pd.DataFrame(G))

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 1041.978296 | 1336.939045 | 1503.371065 | 1620.158227 | 1105.73675 | 2348.94865 | 2635.268466 | 2863.502207 | 2177.413603 | ... | 1938.711049 | 1491.451186 | 1901.52803 | 1947.537933 | 1669.331866 | 1732.298011 | 1688.682993 | 1568.378079 | 1941.801432 | 2295.93116 |
| 1 | 1041.978296 | 1.1e-05 | 307.932785 | 491.161735 | 667.373456 | 1254.724691 | 1630.197169 | 2010.946295 | 2197.811366 | 1665.401168 | ... | 2025.318276 | 1367.463875 | 1474.525836 | 1787.212122 | 1750.316157 | 1554.535895 | 1620.158227 | 1555.970466 | 1209.453616 | 1488.081517 |
| 2 | 1336.939045 | 307.932785 | 0.0 | 186.711676 | 389.865425 | 1364.196822 | 1406.473167 | 1807.924445 | 1976.252909 | 1516.393105 | ... | 2052.945611 | 1380.050655 | 1369.807257 | 1752.227157 | 1792.821031 | 1531.035744 | 1626.6863 | 1585.444912 | 1002.139157 | 1236.251269 |
| 3 | 1503.371065 | 491.161735 | 186.711676 | 0.0 | 224.054011 | 1417.829016 | 1255.058054 | 1665.401168 | 1822.443692 | 1409.147544 | ... | 2046.482624 | 1375.496187 | 1292.821631 | 1711.566698 | 1798.645046 | 1501.143298 | 1613.344467 | 1586.763662 | 867.114433 | 1069.973574 |
| 4 | 1620.158227 | 667.373456 | 389.865425 | 224.054011 | 0.0 | 1371.0283 | 1031.081316 | 1442.497241 | 1598.408987 | 1202.863799 | ... | 1922.531969 | 1262.590686 | 1110.393034 | 1556.955966 | 1690.251212 | 1359.589247 | 1487.050372 | 1476.416012 | 648.510694 | 848.419171 |
| 5 | 1105.73675 | 1254.724691 | 1364.196822 | 1417.829016 | 1371.0283 | 0.0 | 1578.657987 | 1737.843437 | 1980.55241 | 1247.032734 | ... | 857.573863 | 435.482721 | 955.992955 | 855.457398 | 576.573395 | 662.97107 | 588.304759 | 464.007629 | 1276.212208 | 1638.728718 |
| 6 | 2348.94865 | 1630.197169 | 1406.473167 | 1255.058054 | 1031.081316 | 1578.657987 | 0.0 | 422.810229 | 570.006009 | 481.70165 | ... | 1621.276735 | 1204.254126 | 684.7 | 1150.240914 | 1523.274368 | 1101.693823 | 1275.884372 | 1356.303226 | 425.112597 | 259.253717 |
| 7 | 2635.268466 | 2010.946295 | 1807.924445 | 1665.401168 | 1442.497241 | 1737.843437 | 422.810229 | 0.0 | 243.15567 | 491.161735 | ... | 1559.461702 | 1316.760848 | 781.873954 | 1105.358354 | 1536.490758 | 1145.503143 | 1306.448162 | 1414.776848 | 806.629016 | 668.0 |
| 8 | 2863.502207 | 2197.811366 | 1976.252909 | 1822.443692 | 1598.408987 | 1980.55241 | 570.006009 | 243.15567 | 0.0 | 734.230461 | ... | 1781.507283 | 1559.908731 | 1024.569158 | 1336.104371 | 1772.168245 | 1386.502351 | 1545.449436 | 1655.659932 | 988.690776 | 771.098878 |
| 9 | 2177.413603 | 1665.401168 | 1516.393105 | 1409.147544 | 1202.863799 | 1247.032734 | 481.70165 | 491.161735 | 734.230461 | 0.0 | ... | 1147.44921 | 826.270367 | 291.652962 | 676.298455 | 1077.117635 | 668.0 | 836.50165 | 935.796239 | 576.573395 | 697.612392 |


Now we can find a tour manually. Running the cell below will generate a visual of the 23 cities. Click on the cities one at a time to create a tour. Clicking on the last node will automatically complete the tour. In the lower-left, you will see the cost of the tour update as you create it. In the lower-right, you will see the tour.

In [7]:
plot_create_tour(nodes, G, width=600, height=375, show_us=True)

**Q1:** What was the smallest tour cost you found?

**A:** <font color='blue'> No less than 8356.8</font>

## Part II: TSP Heuristics

In this part of the lab, we will consider 4 different TSP heuristics. A heuristic aims to find a good feasible solution to a problem although it is not guaranteed to be optimal. Before we move on, we will abstract the TSP to finding an optimal tour on a set of *nodes* rather than cities. These nodes could represent anything. 

- **Random Neighbor:** Start at some node. Randomly select one of the nodes which has not been visited to visit next. Continue doing so until all nodes have been visited. Return to the start.
- **Nearest Neighbor:** Start at some node. Visit the closest unvisited node next (if there are multiple closest nodes, choose one randomly). Continue doing so until all nodes have been visited. Return to the start.
- **Nearest Insertion:** Start with a “tour” on two of the nodes (e.g., the closest pair of nodes). Find the unvisited node closest to any node currently in the tour. Insert the node into the tour at the best place (if there are multiple closest nodes, choose one to add randomly). Continue doing so until all nodes are in the tour.
- **Furthest Insertion:** Start with a “tour” on two of the nodes (e.g., the closest pair of nodes). Find the node whose smallest distance to a node already in the tour is maximized. Insert the node into the tour at the best place (if there are multiple furthest nodes, choose one to add randomly).
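To make the nearest neighbor description above concrete, here is a minimal sketch of it. The function name `nearest_neighbor_sketch`, its `start` and `seed` parameters, and the small example matrix are assumptions made for illustration; the `nearest_neighbor` function in `tsp.py` may be implemented differently.

```python
import random

def nearest_neighbor_sketch(G, start=0, seed=None):
    """Build a tour by repeatedly moving to the closest unvisited node.

    G is a square distance matrix (list of lists). Ties for the closest
    node are broken randomly, as in the description above.
    """
    rng = random.Random(seed)
    unvisited = set(range(len(G))) - {start}
    tour = [start]
    while unvisited:
        current = tour[-1]
        best = min(G[current][j] for j in unvisited)
        candidates = [j for j in sorted(unvisited) if G[current][j] == best]
        nxt = rng.choice(candidates)
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)  # return to the start to close the tour
    return tour

# Hypothetical 4-node instance with no ties, so the result is deterministic.
G_example = [[0, 1, 4, 3],
             [1, 0, 2, 5],
             [4, 2, 0, 1],
             [3, 5, 1, 0]]
print(nearest_neighbor_sketch(G_example))  # [0, 1, 2, 3, 0]
```

Random neighbor follows the same loop with `nxt` chosen uniformly from all unvisited nodes instead of from the closest ones.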

**Q2:** Which heuristic do you expect to perform the best? Which do you expect to perform the worst?

**A:** <font color='blue'> Will vary. Maybe random neighbor worst and nearest insertion best.</font>

To compare the heuristics, we will use a simple 6x8 grid of nodes. The cell below creates this instance. Note that we will initially use the Manhattan distance.

In [8]:
# nodes is the list of nodes and their position and G is the distance matrix
nodes, G = tsp_grid_instance(6,8,manhattan=True)

Let's use random neighbor (a terrible heuristic) to get a baseline for the length of a tour. The function `random_neighbor` will run the random neighbor heuristic and the function `plot_tour` will plot the tour and its cost in the lower-left.

In [9]:
tour = random_neighbor(G)
# tour is an ordered list of the nodes starting and ending at the same node
print(tour)
plot_tour(nodes, G, tour)

[0, 18, 35, 25, 4, 13, 36, 41, 45, 11, 21, 2, 47, 26, 40, 30, 16, 19, 38, 5, 29, 3, 15, 46, 14, 20, 17, 10, 32, 22, 39, 8, 42, 23, 12, 44, 37, 34, 28, 24, 9, 31, 6, 43, 1, 27, 33, 7, 0]
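The cost of a tour like the one printed above is the sum of the distances between consecutive nodes in the list. A minimal sketch, assuming `G` is indexable as `G[i][j]`; the name `tour_cost_sketch` and the 3-node matrix are hypothetical, and the notebook's own `tour_cost` in `tsp.py` may differ:

```python
def tour_cost_sketch(G, tour):
    """Sum the distances along consecutive pairs of nodes in the tour."""
    return sum(G[i][j] for i, j in zip(tour, tour[1:]))

# Hypothetical 3-node instance.
G_small = [[0, 2, 9],
           [2, 0, 6],
           [9, 6, 0]]
print(tour_cost_sketch(G_small, [0, 1, 2, 0]))  # 2 + 6 + 9 = 17
```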


**Q3:** Does this look like a good tour to you? Run it a few times and see what the average tour cost is.

**A:** <font color='blue'>No. Average around 230.</font>

Now, let's look at the nearest neighbor heuristic. This time, we will use the function `plot_tsp_heuristic` to see every iteration of the heuristic. We can move through iterations with the `Next` and `Previous` buttons. The tour cost will update in the bottom-left. `done.` will appear in the bottom-right when the heuristic has finished.

In [10]:
tour = plot_tsp_heuristic(nodes, G, heuristic='nearest_neighbor', initial=0)

**Q4:** As you iterate through, examine the "choices" made by the algorithm at each step. What does it do well? What does it do poorly?

**A:** <font color='blue'>It does better than random neighbor because it often moves to a node close to the one it is currently on. However, it can essentially box itself out of certain regions of the graph. In the end, it often has to make lengthy jumps to get nodes it missed along the way. </font>

**Q5:** Run this a few times. Do you get the same tour every time? Why or why not?

**A:** <font color='blue'>No, because if there are multiple choices for the closest node, one is chosen randomly.</font>

Now, let's look at the nearest insertion heuristic.

In [11]:
tour = plot_tsp_heuristic(nodes, G, heuristic='nearest_insertion', initial=[0,1,0])
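The "insert at the best place" step of the insertion heuristics can be read as: try every gap between consecutive tour nodes and pick the one that increases the tour cost the least. The sketch below illustrates this under that assumption. The names `best_insertion_position` and `nearest_insertion_sketch` are hypothetical, ties are broken by lowest index rather than randomly, and the `nearest_insertion` function in `tsp.py` may work differently.

```python
def best_insertion_position(G, tour, k):
    """Index i such that inserting k between tour[i] and tour[i+1] adds the least cost."""
    def added_cost(i):
        a, b = tour[i], tour[i + 1]
        return G[a][k] + G[k][b] - G[a][b]
    return min(range(len(tour) - 1), key=added_cost)

def nearest_insertion_sketch(G, initial=(0, 1, 0)):
    """Grow a tour by inserting the unvisited node closest to the current tour."""
    tour = list(initial)
    remaining = set(range(len(G))) - set(tour)
    while remaining:
        # Unvisited node whose distance to some node in the tour is smallest.
        k = min(sorted(remaining), key=lambda j: min(G[j][t] for t in tour))
        i = best_insertion_position(G, tour, k)
        tour.insert(i + 1, k)
        remaining.remove(k)
    return tour

# Hypothetical instance: corners of a unit square with Manhattan distances.
G_square = [[0, 1, 2, 1],
            [1, 0, 1, 2],
            [2, 1, 0, 1],
            [1, 2, 1, 0]]
tour_square = nearest_insertion_sketch(G_square)
print(tour_square)
```

Furthest insertion would change only the selection of `k`: take the node whose smallest distance to the tour is largest (`max` in place of the outer `min`).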

**Q6:** Run this a few times. How does it compare to the previous heuristics?

**A:** <font color='blue'> This is the best heuristic yet. By starting with a small tour and expanding it, the boxing out issue nearest neighbor experienced is reduced. </font>

Now, let's look at the furthest insertion heuristic.

In [12]:
tour = plot_tsp_heuristic(nodes, G, heuristic='furthest_insertion', initial=[0,len(G)-1,0])

**Q7:** Run this a few times. How does it compare to the previous heuristics?

**A:** <font color='blue'> This heuristic is comparable to nearest insertion, although there may be certain circumstances where one would be more likely to outperform the other. </font>

To compare the heuristics further, let's run each on the 6x8 grid, say, 250 times.

**Q8:** Now that you have seen each heuristic, which do you think will do the best and which will do the worst?

**A:** <font color='blue'> Random neighbor will do the worst and either nearest or furthest insertion will do the best. </font>

In [13]:
n = 250
random_neighbor_total = 0
nearest_neighbor_total = 0
nearest_insertion_total = 0
furthest_insertion_total = 0
for i in range(n):
    random_neighbor_total += tour_cost(G, random_neighbor(G))
    nearest_neighbor_total += tour_cost(G, nearest_neighbor(G))
    nearest_insertion_total += tour_cost(G, nearest_insertion(G))
    furthest_insertion_total += tour_cost(G, furthest_insertion(G))
print("Heuristic Averages:")
print("Random Neighbor: %s" % (random_neighbor_total / n))
print("Nearest Neighbor: %s" % (nearest_neighbor_total / n))
print("Nearest Insertion: %s" % (nearest_insertion_total / n))
print("Furthest Insertion: %s" % (furthest_insertion_total / n))

Heuristic Averages:
Random Neighbor: 224.688
Nearest Neighbor: 63.6
Nearest Insertion: 60.632
Furthest Insertion: 58.104


**Q9:** What were the results? Was this what you expected?

**A:** <font color='blue'> Random neighbor did significantly worse. Furthest insertion did the best, with nearest insertion and nearest neighbor not too much worse. This was what I expected. </font>

Let's look at a 9x9 grid using the Euclidean distance now. Run each of the cells below to see each heuristic executed on the new instance.

In [14]:
nodes, G = tsp_grid_instance(9,9,manhattan=False)

In [15]:
plot_tour(nodes, G, random_neighbor(G))

In [16]:
tour = plot_tsp_heuristic(nodes, G, heuristic='nearest_neighbor', initial=0)

In [17]:
tour = plot_tsp_heuristic(nodes, G, heuristic='nearest_insertion', initial=[0,1,0])

In [18]:
tour = plot_tsp_heuristic(nodes, G, heuristic='furthest_insertion', initial=[0,len(G)-1,0])

**Q10:** How did the results compare to the 6x8 grid using Manhattan distances?

**A:** <font color='blue'> Similar. </font>

Again, let's run each heuristic numerous times and see how they compare.

In [19]:
n = 100
random_neighbor_total = 0
nearest_neighbor_total = 0
nearest_insertion_total = 0
furthest_insertion_total = 0
for i in range(n):
    random_neighbor_total += tour_cost(G, random_neighbor(G))
    nearest_neighbor_total += tour_cost(G, nearest_neighbor(G))
    nearest_insertion_total += tour_cost(G, nearest_insertion(G))
    furthest_insertion_total += tour_cost(G, furthest_insertion(G))
print("Heuristic Averages:")
print("Random Neighbor: %s" % (random_neighbor_total / n))
print("Nearest Neighbor: %s" % (nearest_neighbor_total / n))
print("Nearest Insertion: %s" % (nearest_insertion_total / n))
print("Furthest Insertion: %s" % (furthest_insertion_total / n))

Heuristic Averages:
Random Neighbor: 385.32279071262246
Nearest Neighbor: 101.35433216508982
Nearest Insertion: 87.7074442746385
Furthest Insertion: 84.8893996235872


**Q11:** How did the results compare to the 6x8 grid using Manhattan distances?

**A:** <font color='blue'> Similar. </font>

## Part III: Improving Tours: 2-OPT

In **Part II**, we used heuristics to create TSP tours. However, we can also use heuristics to try to improve tours we have already found. We will examine a tour improvement heuristic called 2-OPT in this part. 2-OPT looks for pairs of edges which can be reconnected to strictly improve the tour cost. (Note: once a pair of edges is removed, there is only one other way to reconnect the pieces into a single tour.) It continues in this fashion until no more improvements can be made. Let's start by running 2-OPT after the nearest neighbor heuristic (using a 5x5 Euclidean distance example).
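The reconnection step can be sketched as reversing the part of the tour between the two chosen edges. The code below is a minimal illustration, assuming a symmetric distance matrix and a tour list that starts and ends at the same node. The name `two_opt_sketch` and the unit-square example are hypothetical, and the `two_opt` function in `tsp.py` may choose its moves in a different order.

```python
import math

def two_opt_sketch(G, tour):
    """Apply improving 2-OPT moves until none remain (assumes G is symmetric)."""
    tour = list(tour)
    n = len(tour) - 1  # number of edges in the closed tour
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[j + 1]
                # Replace edges (a, b) and (c, d) with (a, c) and (b, d).
                delta = G[a][c] + G[b][d] - G[a][b] - G[c][d]
                if delta < -1e-12:  # accept strict improvements only
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

# Hypothetical instance: corners of a unit square, Euclidean distances.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
G_corners = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in corners] for p in corners]
crossing_tour = [0, 2, 1, 3, 0]  # uses both diagonals, so it crosses itself
print(two_opt_sketch(G_corners, crossing_tour))  # [0, 1, 2, 3, 0]
```

Each accepted move strictly lowers the tour cost, so the loop cannot cycle and must stop after finitely many moves.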

In [20]:
# First, we run the nearest neighbor heuristic to get an initial tour
nodes, G = tsp_grid_instance(5,5,manhattan=False)
tour = nearest_neighbor(G)
plot_tour(nodes, G, tour)

Run the following cell to generate a visualization of 2-OPT. In each iteration, 2 edges will be highlighted red and 2 will be highlighted blue. The red edges indicate the current position and the blue edges indicate the position in which they will be reconnected.

In [21]:
plot_two_opt(nodes, G, list(tour));

**Q12:** Run 2-OPT a few times. Do you get the same result every time? Why or why not?

**A:** <font color='blue'> Yes. There is no randomness in this algorithm.</font>

Let's run 2-OPT a few times on a slightly larger grid.

In [22]:
nodes, G = tsp_grid_instance(9,9,manhattan=False)
tour = nearest_neighbor(G)
tour = plot_two_opt(nodes, G, tour)

**Q13:** After running 2-OPT, do you ever get a tour which crosses itself? When using Euclidean distances, is this even possible? Explain why or why not.

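**A:** <font color='blue'> No, this is not possible. Suppose a tour crosses itself, and consider the two edges that cross. Reconnecting them the other way removes the crossing, and by the triangle inequality the two new edges are together strictly shorter than the two crossing edges. So an improving move still exists and 2-OPT has not been run to completion. It follows that, with Euclidean distances, 2-OPT always terminates with a tour that has no crossings. </font>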

Let's compare the heuristics with and without executing 2-OPT. While we are at it, we will use the function `solve_tsp` to get an optimal tour.

In [23]:
nodes, G = tsp_grid_instance(6,6,manhattan=False)
tour = solve_tsp(G)
plot_tour(nodes, G, tour)

In [24]:
n = 50
nearest_neighbor_total = 0
nearest_insertion_total = 0
furthest_insertion_total = 0
nearest_neighbor_2_total = 0
nearest_insertion_2_total = 0
furthest_insertion_2_total = 0
optimal_total = 0
for i in range(n):
    nearest_neighbor_total += tour_cost(G, nearest_neighbor(G))
    nearest_insertion_total += tour_cost(G, nearest_insertion(G))
    furthest_insertion_total += tour_cost(G, furthest_insertion(G))
    nearest_neighbor_2_total += tour_cost(G, two_opt(G, nearest_neighbor(G)))
    nearest_insertion_2_total += tour_cost(G, two_opt(G, nearest_insertion(G)))
    furthest_insertion_2_total += tour_cost(G, two_opt(G, furthest_insertion(G)))
    optimal_total += tour_cost(G, solve_tsp(G))
print("Nearest Neighbor: %s" % (nearest_neighbor_total / n))
print("Nearest Neighbor + 2-OPT: %s" % (nearest_neighbor_2_total / n))
print("Nearest Insertion: %s" % (nearest_insertion_total / n))
print("Nearest Insertion + 2-OPT: %s" % (nearest_insertion_2_total / n))
print("Furthest Insertion: %s" % (furthest_insertion_total / n))
print("Furthest Insertion + 2-OPT: %s" % (furthest_insertion_2_total / n))
print("Optimal: %s" % (optimal_total / n))

Nearest Neighbor: 44.62606765974404
Nearest Neighbor + 2-OPT: 37.99348592421335
Nearest Insertion: 38.69050679116014
Nearest Insertion + 2-OPT: 37.85462512589236
Furthest Insertion: 37.66500706654747
Furthest Insertion + 2-OPT: 37.59873289656778
Optimal: 36.0


**Q14:** Compare each heuristic's performance before and after 2-OPT. Compare them to the optimal.

**A:** <font color='blue'> The nearest neighbor heuristic improved the most from 2-OPT. Both nearest and furthest insertion were often not improved by 2-OPT and came very close to the optimal. After running 2-OPT, the nearest neighbor heuristic was *near* optimal as well. </font>
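One way to make this comparison concrete is the percentage gap between each average cost and the optimal cost. The sketch below uses the averages from the sample output above; your own averages will vary from run to run, and the names `sample_averages` and `percent_gap` are hypothetical.

```python
# Averages copied from the sample output above (values vary between runs).
optimal_cost = 36.0
sample_averages = {
    "Nearest Neighbor": 44.62606765974404,
    "Nearest Neighbor + 2-OPT": 37.99348592421335,
    "Nearest Insertion": 38.69050679116014,
    "Nearest Insertion + 2-OPT": 37.85462512589236,
    "Furthest Insertion": 37.66500706654747,
    "Furthest Insertion + 2-OPT": 37.59873289656778,
}

def percent_gap(cost, optimal):
    """How far a cost is above the optimal cost, as a percentage of the optimal."""
    return 100 * (cost - optimal) / optimal

gaps = {name: percent_gap(cost, optimal_cost) for name, cost in sample_averages.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.1f}% above optimal")
```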

For fun, let's go back to the 23 US city example. Let's run 2-OPT on the tour you created in **Part I** (or a new one if you would like). To do this, you will need to define the tour as follows:

In [25]:
nodes = pd.read_csv('data/us_cities_23.csv', index_col=0)
G = distance_matrix(nodes, manhattan=False)
plot_create_tour(nodes, G, width=600, height=375, show_us=True)

**Q15:** Set the `tour` variable to be the tour you manually created.

In [26]:
# We can define a tour like this:
tour = [0,2,1,3,4,22,21,6,8,7,9,10,15,16,18,11,12,13,17,19,20,14,5,0]

# After manually creating a tour, you can copy the list associated with that tour from the bottom-right

# TODO: Define your tour.

### BEGIN SOLUTION
tour = [0,2,1,3,4,22,21,6,8,7,9,10,15,16,18,11,12,13,17,19,20,14,5,0]
### END SOLUTION

Run 2-OPT!

In [27]:
plot_two_opt(nodes, G, list(tour), width=600, height=375, show_us=True);

**Q16:** Did 2-OPT improve your tour? By how much?

**A:** <font color='blue'> (Based on example tour) Yes. It went from 9592.0 to 8356.8. </font>

Now, let's look at an optimal solution!

In [28]:
plot_tour(nodes, G, solve_tsp(G), width=600, height=375, show_us=True);

**Q17:** Was your tour optimal before or after 2-OPT?

**A:** <font color='blue'> (Based on example tour) It was not optimal before 2-OPT but it became optimal afterwards! </font>

## Part IV: TSP Application: VLSI Drilling

<font color='red'> In progress. </font>

In [29]:
nodes = pd.read_csv('data/xqg237.csv', index_col=0)
G = distance_matrix(nodes, manhattan=False)
plot_tour(nodes, G, solve_tsp(G))

## Part V: TSP Application: VLSI Etching

<font color='red'> In progress. </font>

In [30]:
nodes = pd.read_csv('data/xqf131_etching.csv', index_col=0)
G = distance_matrix(nodes, manhattan=False, vlsi=True)
plot_vlsi_tour(nodes,G, solve_tsp(G))