# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('../rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})




# CMPS 2200
# Introduction to Algorithms

## Traveling Salesperson Problem [Optional]


Consider a slight variant of the MST problem:

Given a weighted graph $G=(V,E)$, find a minimum-weight tour that visits each node exactly once and then returns to the origin node.
 - every node is visited
 - no edges are repeated

<center>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/11/GLPK_solution_of_a_travelling_salesman_problem.svg/480px-GLPK_solution_of_a_travelling_salesman_problem.svg.png"/>
</center>

Often, we assume the graph is *complete* (fully connected) and the edge weights are the distances between cities.

<br>

How does this differ from the MST problem?

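- TSP solution has one more edge than MST solution (a cycle instead of a tree)

- Removing any one edge from a tour leaves a spanning tree (a path), so the tour weighs at least as much as the MST when edge weights are non-negative

- Therefore, weight(MST solution) $\le$ weight(TSP solution)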


Thus, MST solution provides a lower bound on the TSP solution.
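The lower bound can be checked on a tiny instance. The following is a minimal sketch, assuming points in the plane with Euclidean distances; the helper names `dist`, `mst_weight`, and `brute_force_tsp_weight` are introduced here for illustration only, and the brute-force search is only feasible for a handful of points.

```python
import itertools
import math

def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def mst_weight(points):
    """Weight of an MST of the complete graph on `points` (Prim's algorithm, O(n^2))."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge connecting each vertex to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], dist(points[u], points[v]))
    return total

def brute_force_tsp_weight(points):
    """Weight of an optimal tour, found by trying every ordering (exponential work)."""
    n = len(points)
    best = math.inf
    for perm in itertools.permutations(range(1, n)):
        order = (0,) + perm    # fix the start vertex to avoid counting rotations
        w = sum(dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))
        best = min(best, w)
    return best

square = [(0, 0), (0, 1), (1, 1), (1, 0)]
mst_w = mst_weight(square)
tsp_w = brute_force_tsp_weight(square)
print(mst_w, tsp_w)  # 3.0 4.0
```

On the unit square the MST uses three sides and the optimal tour uses all four, so the bound holds with room to spare.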

Can we also use MST to find an upper bound?

### Euclidean TSP

Variant of TSP where the triangle inequality holds for every triple of vertices $u, v, x$:

$w(u,v) + w(v,x) \ge w(u,x)$

where all weights are non-negative.
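As a small illustration, the condition can be tested directly on a weight function. This is a sketch only: the function name `satisfies_triangle_inequality`, the tolerance `tol`, and the two example weight functions are assumptions made here, not part of the lecture.

```python
import itertools
import math

def satisfies_triangle_inequality(w, vertices, tol=1e-12):
    """True if w(u, v) + w(v, x) >= w(u, x) for every ordered triple of distinct vertices."""
    return all(
        w(u, v) + w(v, x) >= w(u, x) - tol   # tol absorbs floating-point rounding
        for u, v, x in itertools.permutations(vertices, 3)
    )

# Euclidean distances between points in the plane satisfy the inequality.
points = [(0, 0), (3, 0), (3, 4), (1, 1)]
euclid = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
print(satisfies_triangle_inequality(euclid, points))     # True

# An arbitrary weight table need not: going 0 -> 1 -> 2 costs 2, but 0 -> 2 costs 5.
table = {frozenset((0, 1)): 1, frozenset((1, 2)): 1, frozenset((0, 2)): 5}
lookup = lambda u, v: table[frozenset((u, v))]
print(satisfies_triangle_inequality(lookup, [0, 1, 2]))  # False
```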

Consider a MST solution for the graph:

<center>
    <img src="figures/tsp1.jpg"/>
</center>

<br>

How could we convert this tree into a tour for TSP?

<br><br>


We need to determine an order to visit the nodes in the MST solution.

Let's try depth-first search:

<center>
    <img src="figures/tsp2.jpg"/>
</center>

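This is called the **Euler tour** of the tree $T$:

 - a closed walk that traverses every edge of $T$ exactly twice, once in each direction (equivalently, a cycle that visits every edge exactly once after each tree edge is doubled).
 - Since $T$ spans the graph, the Euler tour will visit every vertex at least once, but possibly multiple times.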

<br>

This is close to a TSP solution, but: 

- it visits each edge twice

- $(d,f)$ should have an edge

- The weight of the Euler tour is equal to twice the MST weight (since we visit each edge twice).
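The Euler tour can be sketched with a short recursive DFS. The tree and edge weights below are hypothetical (they are not read off the figure), and the names `euler_tour` and `walk_weight` are introduced here for illustration.

```python
def euler_tour(tree, root):
    """Vertices in the order a DFS walks them, including every return to a parent."""
    tour = [root]

    def visit(u, parent):
        for v in tree[u]:
            if v != parent:
                tour.append(v)   # walk down the edge (u, v)
                visit(v, u)
                tour.append(u)   # walk back up the same edge

    visit(root, None)
    return tour

def walk_weight(walk, weight):
    """Total weight of the consecutive edges along a walk."""
    return sum(weight[frozenset(e)] for e in zip(walk, walk[1:]))

# Hypothetical spanning tree given as adjacency lists, with made-up edge weights.
tree = {'a': ['b', 'c'], 'b': ['a', 'd', 'e'], 'c': ['a'],
        'd': ['b'], 'e': ['b', 'f'], 'f': ['e']}
weight = {frozenset('ab'): 2, frozenset('ac'): 3, frozenset('bd'): 1,
          frozenset('be'): 4, frozenset('ef'): 2}

tour = euler_tour(tree, 'a')
print(tour)  # ['a', 'b', 'd', 'b', 'e', 'f', 'e', 'b', 'a', 'c', 'a']
print(walk_weight(tour, weight), 2 * sum(weight.values()))  # 24 24
```

Every tree edge appears once on the way down and once on the way back, which is why the walk weighs exactly twice the tree.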


How can we convert this to a proper solution to TSP?

**idea**: 

Follow the DFS order, but whenever the Euler tour would return to an already visited vertex, skip ahead directly to the next unvisited vertex.

<center>
    <img src="figures/tsp3.jpg"/>
</center>

The red edges above are called *shortcut edges*.

Because of the triangle inequality, we know that each shortcut edge is no longer than the path it replaces:

$w(f,c) \le w(\langle f,e,b,a,c \rangle)$
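Shortcutting amounts to keeping only the first occurrence of each vertex in the Euler tour and then returning to the start. A minimal sketch follows; the function name `shortcut` and the example walk are assumptions for illustration and are not taken from the figure.

```python
def shortcut(euler):
    """Keep the first visit to each vertex of an Euler tour, then close the cycle."""
    seen = set()
    order = []
    for v in euler:
        if v not in seen:      # skip vertices we have already visited
            seen.add(v)
            order.append(v)
    order.append(euler[0])     # return to the origin
    return order

# Euler tour of a hypothetical tree with edges a-b, a-c, b-d, b-e, e-f.
euler = ['a', 'b', 'd', 'b', 'e', 'f', 'e', 'b', 'a', 'c', 'a']
print(shortcut(euler))  # ['a', 'b', 'd', 'e', 'f', 'c', 'a']
```

In this example the walk $\langle f,e,b,a,c \rangle$ collapses to the single shortcut edge $(f,c)$.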
  
<br>
  
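Since we know:


  - $weight(MST) \le weight(TSP)$,
  
  - $weight(Euler) = 2 \cdot weight(MST)$, and

  - $weight(shortcut) \le weight(Euler)$, because shortcutting never increases the weight
  
then we know

 - $weight(TSP) \le weight(shortcut) \le 2 \cdot weight(MST) \le 2 \cdot weight(TSP)$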
 
 
###  Thus, we now have a polynomial-time algorithm that finds a TSP tour no worse than 2 times the optimal solution!
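Putting the pieces together, here is a hedged end-to-end sketch of the MST-based approximation, assuming points in the plane with Euclidean distances. It uses Prim's algorithm for the MST and a DFS preorder of the tree (which is the shortcut Euler tour). All function names are illustrative, and the brute-force optimum is included only to check the bound on a small random instance.

```python
import itertools
import math
import random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def prim_mst(points):
    """Return (children lists of the MST rooted at vertex 0, MST weight)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [None] * n
    children = {i: [] for i in range(n)}
    best[0], total = 0.0, 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            d = dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    return children, total

def preorder(children, root=0):
    """DFS preorder of the tree: the Euler tour with repeated vertices skipped."""
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order

def tour_weight(points, order):
    n = len(order)
    return sum(dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))

def approx_tsp(points):
    children, mst_w = prim_mst(points)
    order = preorder(children)
    return order, tour_weight(points, order), mst_w

def optimal_tsp_weight(points):
    """Brute force over all orderings; only usable for very small inputs."""
    rest = range(1, len(points))
    return min(tour_weight(points, (0,) + p) for p in itertools.permutations(rest))

random.seed(0)
points = [(random.random(), random.random()) for _ in range(8)]
order, approx_w, mst_w = approx_tsp(points)
opt_w = optimal_tsp_weight(points)
print(mst_w <= opt_w <= approx_w <= 2 * mst_w)
```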
 

For NP-hard problems such as TSP, approximations are often the best we can do.
 

## Divide-and-Conquer?

What intuition can we get from the fact that this problem is in 2D?


    
<center>
    <img src="figures/eTSP_harder_sol.jpg"/>
</center>

Since points that are "clustered" can possibly be dealt with first, how about a divide-and-conquer approach? How would that work?



We can split the input using a "cut" through the plane that separates the input points into two equal parts. Then, recursively solve eTSP for each smaller point set. 

How do we combine smaller solutions into larger ones?



We need to make sure that two tours can be combined into the best possible single tour.

    
<center>
    <img src="figures/eTSP_merge.jpg"/>
</center>

To do this, we can try all possible ways to merge each tour by rerouting across the cut and back and choose the least costly. This yields the following algorithm:


<p><span class="math display">\[\begin{array}{l}  
\mathit{eTSP}~(P) =  
\\  
~~~~\texttt{if}~|P|<2~\texttt{then}  
\\  
~~~~~~~~\texttt{raise}~\mathit{TooSmall}  
\\  
~~~~\texttt{else if}~|P| = 2~\texttt{then}  
\\  
~~~~~~~~\left\langle\, (P[0],P[1]),(P[1],P[0]) \,\right\rangle  
\\  
~~~~\texttt{else}  
\\  
~~~~~~~~\texttt{let}  
\\  
~~~~~~~~~~~~(P_\ell, P_r) = \mathit{split}~P~\texttt{along the longest dimension}  
\\  
~~~~~~~~~~~~(L, R) = (\mathit{eTSP}~P_\ell) \mid\mid{} (\mathit{eTSP}~P_r)  
\\  
~~~~~~~~~~~~(c,(e,e')) = \mathit{minVal}_{\mathit{first}} \left\{ (\mathit{swapCost}(e,e'),(e,e')) : e \in L, e' \in R \right\}  
\\  
~~~~~~~~\texttt{in}  
\\  
~~~~~~~~~~~~~~~~\mathit{swapEdges}~(\mathit{append}~(L,R),e,e')  
\\  
~~~~~~~~\texttt{end}  
\end{array}\]</span></p>

<p>The function $\mathit{minVal}_{\mathit{first}}$ uses the first value of the pairs to find the minimum, and returns the (first) pair with that minimum. The function $\mathit{swapEdges}(E,e,e')$ finds the edges $e$ and $e'$ and swaps the endpoints. As there are two ways to swap, it picks the cheaper one.</p>

   

**Correctness**: Does this algorithm compute a tour? Does this algorithm compute a minimum-cost tour?
    
We can show by induction that this algorithm always produces a tour. 
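The tour property can also be checked experimentally with a sequential Python sketch of the pseudocode. Several choices here are assumptions rather than the lecture's method: a tour is stored as a list of points in visiting order instead of a list of edges, three points are treated as an extra base case so that no recursive call receives a single point, and the two recursive calls run one after the other rather than in parallel.

```python
import math
import random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def split(points):
    """Cut the points into two halves along the dimension with the larger extent."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    axis = 0 if max(xs) - min(xs) >= max(ys) - min(ys) else 1
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return pts[:mid], pts[mid:]

def etsp(points):
    n = len(points)
    if n < 2:
        raise ValueError("TooSmall: need at least two points")
    if n <= 3:
        return list(points)          # any order of 2 or 3 points is a tour
    left, right = split(points)
    L, R = etsp(left), etsp(right)
    best = None
    for i in range(len(L)):          # edge (a, b) of the left tour
        a, b = L[i], L[(i + 1) % len(L)]
        for j in range(len(R)):      # edge (c, d) of the right tour
            c, d = R[j], R[(j + 1) % len(R)]
            removed = dist(a, b) + dist(c, d)
            # Two ways to reconnect: a-d with c-b, or a-c with d-b.
            for flip, added in ((False, dist(a, d) + dist(c, b)),
                                (True, dist(a, c) + dist(d, b))):
                if best is None or added - removed < best[0]:
                    best = (added - removed, i, j, flip)
    _, i, j, flip = best
    l_rot = L[i + 1:] + L[:i + 1]    # starts at b, ends at a
    r_rot = R[j + 1:] + R[:j + 1]    # starts at d, ends at c
    if flip:
        r_rot.reverse()              # now starts at c, ends at d
    return l_rot + r_rot

def tour_weight(order):
    n = len(order)
    return sum(dist(order[i], order[(i + 1) % n]) for i in range(n))

random.seed(1)
points = [(random.random(), random.random()) for _ in range(50)]
tour = etsp(points)
print(sorted(tour) == sorted(points))  # every point appears exactly once
```

The check confirms that the output is a tour; it says nothing about whether that tour has minimum cost.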

### However, the combine step does not necessarily produce a minimum cost tour!

<br>

Actually, we currently do not know of any polynomial-work algorithm that solves this problem exactly. The best known exact algorithms still require super-polynomial work in the worst case.


**Work/Span**:

This algorithm has two recursive calls that each operate on $n/2$ points. To combine the solutions we must check $O(n^2)$ ways to cross the cut and choose the best. This requires $O(n^2)$ work and $O(\log n)$ span.

So we have that the work is $W(n) = 2W(n/2) + O(n^2).$ This is a root-dominated recurrence, and thus $W(n) = O(n^2)$. 
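As a sketch of why the root dominates, assume the combine step costs at most $c\,n^2$ for some constant $c$ (introduced here for illustration) and that $n$ is a power of 2. Level $i$ of the recursion tree has $2^i$ subproblems of size $n/2^i$, so

$$
\begin{aligned}
W(n) &\le \sum_{i=0}^{\lg n} 2^i \cdot c \left(\frac{n}{2^i}\right)^2
      = c\,n^2 \sum_{i=0}^{\lg n} \frac{1}{2^i}
      < 2\,c\,n^2 .
\end{aligned}
$$

The per-level cost halves at each level, so the total is within a constant factor of the cost at the root.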

The span is $S(n) = S(n/2) + O(\log n)$. This is a balanced recurrence with $\lg n$ levels, and so $S(n) = O(\log^2 n)$.
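Under the same kind of assumption (the combine step has span at most $c \lg m$ on a subproblem of size $m$, with $c$ a constant introduced for illustration and $n$ a power of 2), the span recurrence unrolls along a single root-to-leaf path:

$$
\begin{aligned}
S(n) &\le \sum_{i=0}^{\lg n} c \,\lg\!\left(\frac{n}{2^i}\right)
      = c \sum_{i=0}^{\lg n} (\lg n - i)
      = c \cdot \frac{\lg n \,(\lg n + 1)}{2}
      = O(\log^2 n).
\end{aligned}
$$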
