# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 2200
# Introduction to Algorithms

## Euclidean TSP and MCSS


Today's agenda:

- Divide-and-Conquer with `reduce`
- Euclidean TSP
- Maximum Contiguous Subsequence Sum

Recall that we gave a divide-and-conquer algorithm for `reduce`:

$reduce \: f \: id \: a =
\begin{cases}
id & \hbox{if} \: |a| = 0\\
a[0] & \hbox{if} \: |a| = 1\\
f(reduce \: f \: id \: (a[0 \ldots \lfloor \frac{|a|}{2} \rfloor - 1]), \\ \:\:\:reduce \: f \: id \: (a[\lfloor \frac{|a|}{2} \rfloor \ldots |a|-1])) & \hbox{otherwise}
\end{cases}
$

What happens when $f$ is the method for combining solutions? 

`reduce(merge, [], list(map(singleton, [1,3,6,4,8,7,5,2])))`

This is Merge Sort! Can all divide-and-conquer algorithms be implemented with `reduce`?
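To make the call above concrete, here is a minimal sketch of the divide-and-conquer `reduce` from the recurrence, together with illustrative versions of `singleton` and `merge`. The bodies of these helpers are assumptions written for this example (the notes only name them), and this `reduce` is a local function, not `functools.reduce`.

```python
def reduce(f, id_, a):
    """Divide-and-conquer reduce, following the recurrence above."""
    if len(a) == 0:
        return id_
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    # The two recursive calls are independent and could run in parallel.
    return f(reduce(f, id_, a[:mid]), reduce(f, id_, a[mid:]))


def singleton(x):
    """Wrap a value in a one-element list (assumed helper)."""
    return [x]


def merge(left, right):
    """Merge two sorted lists into one sorted list (assumed helper)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result


sorted_list = reduce(merge, [], list(map(singleton, [1, 3, 6, 4, 8, 7, 5, 2])))
print(sorted_list)  # [1, 2, 3, 4, 5, 6, 7, 8]
```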


No. The divide-and-conquer framework is much more general than `reduce`. For example, `reduce` cannot be used when we wish to split the input into three or more parts, or into parts of unequal size.

## The Euclidean Traveling Salesperson Problem

In the Euclidean Traveling Salesperson Problem (eTSP), you are given a set of $n$ 2D points. The goal is to find a "tour" of the points with minimum cost. That is, we must construct a sequence of all the points (i.e., a sequence of 2D points) that begins and ends with the same point such that:

- every point is visited exactly once (except the starting point) 
- the sum of distances between adjacent points is minimized

This is an incredibly widespread and useful problem -- consider all the various kinds of routing problems that are solved every day. For a simple example, think of Amazon/USPS/UPS package deliveries.

Which solution is better?

<br><p> 
 ![eTSP_simple.jpg](eTSP_simple.jpg)
<br><p> 

Given an input with $n$ points, how many possible solutions are there?

## Brute-Force?

If we take a brute-force approach to this problem, what is the solution space and how can we search it?

There are $n!$ possible solutions, and we must check the cost of each by summing the $n$ distances along the tour. This can be done with $O(n)$ work and $O(\log n)$ span. So we can solve eTSP with $O(n\cdot n!)$ work and $O(\log n)$ span.
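The brute-force search can be sketched sequentially as follows. The names `dist`, `tour_cost`, and `etsp_brute_force` are hypothetical, points are assumed to be `(x, y)` tuples, and a tour is represented as an ordering of the points with an implicit edge from the last point back to the first.

```python
import math
from itertools import permutations


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def tour_cost(tour):
    """Sum of distances between adjacent points, including the edge
    that closes the tour from the last point back to the first."""
    n = len(tour)
    return sum(dist(tour[i], tour[(i + 1) % n]) for i in range(n))


def etsp_brute_force(points):
    """Check every ordering of the points and keep the cheapest tour."""
    return min(permutations(points), key=tour_cost)


square = [(0, 0), (1, 1), (0, 1), (1, 0)]
best = etsp_brute_force(square)
print(best, tour_cost(best))
```

Even this tiny example enumerates $4! = 24$ orderings; the count grows factorially with the number of points.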

This is good span, but an astronomical amount of work. What if we had more points?

<br><p> 
 ![eTSP_harder.jpg](eTSP_harder.jpg)
<br><p> 

$16!$ is about $2 \times 10^{13}$, so even with very few points the brute-force approach is not tractable!

Is the brute-force algorithm work-efficient?


## Divide-and-Conquer?

What intuition can we get about the fact that this problem is in 2D?


<br><p> 
 ![eTSP_harder.jpg](eTSP_harder.jpg)
<br><p> 

Since points that are "clustered" can be dealt with first, how about a divide-and-conquer approach? How would that work?



We can split the input using a "cut" through the plane that separates the input points into two equal parts. Then, recursively solve eTSP for each smaller point set. 

How do we combine smaller solutions into larger ones?



We need to make sure that two tours can be combined into the best possible single tour.

<br><p> 
 ![eTSP_merge.jpg](eTSP_merge.jpg)
<br><p> 

To do this, we can try all possible ways to merge the two tours by rerouting across the cut and back, and choose the least costly. This yields the following algorithm:


<p><span class="math display">\[\begin{array}{l}  
\mathit{eTSP}~(P) =  
\\  
~~~~\texttt{if}~|P|<2~\texttt{then}  
\\  
~~~~~~~~\texttt{raise}~\mathit{TooSmall}  
\\  
~~~~\texttt{else if}~|P| = 2~\texttt{then}  
\\  
~~~~~~~~\left\langle\, (P[0],P[1]),(P[1],P[0]) \,\right\rangle  
\\  
~~~~\texttt{else}  
\\  
~~~~~~~~\texttt{let}  
\\  
~~~~~~~~~~~~(P_\ell, P_r) = \mathit{split}~P~\texttt{along the longest dimension}  
\\  
~~~~~~~~~~~~(L, R) = (\mathit{eTSP}~P_\ell) \mid\mid{} (\mathit{eTSP}~P_r)  
\\  
~~~~~~~~~~~~(c,(e,e')) = \mathit{minVal}_{\mathit{first}} \left\{ (\mathit{swapCost}(e,e'),(e,e')) : e \in L, e' \in R \right\}  
\\  
~~~~~~~~\texttt{in}  
\\  
~~~~~~~~~~~~~~~~\mathit{swapEdges}~(\mathit{append}~(L,R),e,e')  
\\  
~~~~~~~~\texttt{end}  
\end{array}\]</span></p>

<p>The function $\mathit{minVal}_{\mathit{first}}$ uses the first value of the pairs to find the minimum, and returns the (first) pair with that minimum. The function $\mathit{swapEdges}(E,e,e')$ finds the edges $e$ and $e'$ and swaps the endpoints. As there are two ways to swap, it picks the cheaper one.</p>
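The pseudocode above can be sketched in Python roughly as follows. This is an illustration under several assumptions, not the course's reference implementation: a tour is stored as a list of points in cyclic order rather than a list of edges, all function names are hypothetical, the recursive calls run sequentially, and the base case returns any input of at most three points as-is (every ordering of three or fewer points is already a cheapest tour) instead of raising `TooSmall`.

```python
import math


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def tour_cost(tour):
    n = len(tour)
    return sum(dist(tour[i], tour[(i + 1) % n]) for i in range(n))


def split_longest(points):
    """Split the points into two halves along the dimension of larger extent."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    axis = 0 if max(xs) - min(xs) >= max(ys) - min(ys) else 1
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return pts[:mid], pts[mid:]


def merge_tours(L, R):
    """Remove one edge from each tour and reconnect the two paths in the
    cheapest way, trying every pair of edges and both reconnections."""
    best = None
    for i in range(len(L)):
        u, v = L[i], L[(i + 1) % len(L)]          # edge e in L
        for j in range(len(R)):
            x, y = R[j], R[(j + 1) % len(R)]      # edge e' in R
            removed = dist(u, v) + dist(x, y)
            straight = dist(u, y) + dist(v, x) - removed
            crossed = dist(u, x) + dist(v, y) - removed
            for cost, flip in ((straight, False), (crossed, True)):
                if best is None or cost < best[0]:
                    best = (cost, i, j, flip)
    _, i, j, flip = best
    left_path = L[i + 1:] + L[:i + 1]      # path v ... u
    right_path = R[j + 1:] + R[:j + 1]     # path y ... x
    if flip:
        right_path = right_path[::-1]      # path x ... y
    return left_path + right_path


def etsp(points):
    if len(points) <= 3:
        return list(points)
    left, right = split_longest(points)
    return merge_tours(etsp(left), etsp(right))


square = [(0, 0), (0, 1), (1, 0), (1, 1)]
tour = etsp(square)
print(tour, tour_cost(tour))
```

The swap cost here is the total length of the two new edges minus the total length of the two removed edges, which is one natural reading of $\mathit{swapCost}$.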



**Correctness**: Does this algorithm compute a tour? Does this algorithm compute a minimum-cost tour?
    
We can show by induction that this algorithm always produces a tour. 

However, the combine step does not necessarily produce a minimum cost tour! (Can you think of a counterexample?)

Unfortunately, we currently do not know of any polynomial-work algorithm to solve this problem. In fact, the brute-force algorithm is essentially the best we can do. (We'll get to this in more detail at the end of the semester.)

What we do know how to do is efficiently compute an *approximation* to the optimal solution: a tour whose cost is within a factor of $(1+\epsilon)$ of optimal. For any fixed $\epsilon > 0$, the running time is polynomial in $n$.

The divide-and-conquer algorithm above is a heuristic: it is not guaranteed to return an optimal tour, nor even an approximation to one. But it does work well in practice.

**Work/Span**:

This algorithm has two recursive calls that each operate on $n/2$ points. To combine the solutions we must check $O(n^2)$ ways to cross the cut and pick the best. This requires $O(n^2)$ work and $O(\log n)$ span.

So we have that the work is $W(n) = 2W(n/2) + O(n^2).$ This is a root-dominated recurrence, and thus $W(n) = O(n^2)$. 

The span is $S(n) = S(n/2) + O(\log n)$. This is a balanced recurrence with $\lg n$ levels, and so $S(n) = O(\log^2 n)$.
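As a quick check on both bounds, the recurrences can be unrolled level by level. This sketch assumes $n$ is a power of 2 and writes $c$ for the constant hidden in the $O(\cdot)$ terms:

$$W(n) \le \sum_{i=0}^{\lg n} 2^i \cdot c\left(\frac{n}{2^i}\right)^2 = c\,n^2 \sum_{i=0}^{\lg n} \frac{1}{2^i} < 2c\,n^2 = O(n^2)$$

$$S(n) \le \sum_{i=0}^{\lg n} c \lg \frac{n}{2^i} = c \sum_{i=0}^{\lg n} (\lg n - i) = c\,\frac{\lg n\,(\lg n + 1)}{2} = O(\log^2 n)$$

In the work sum the level costs shrink geometrically, so the root dominates. In the span sum each of the $\lg n$ levels contributes at most $c \lg n$.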


## Maximum Contiguous Subsequence Sum

<p>Given a sequence of integers, the <strong><em>Maximum Contiguous Subsequence Problem</em></strong> (<span class="sans-serif">MCS</span>) requires finding the contiguous subsequence of the sequence with maximum total sum:
    $$\textsf{MCS}{}\,(a) = \arg\max_{0 \leq i,j < |a|} \left( {\left( \sum_{k=i}^j a[k]  \right)} \right).  $$ 
    
We define the sum of an empty sequence to be $-\infty$.</p>

Example: For $a = \langle 1, -2, 0, 3, -1, 0, 2, -3 \rangle$ a maximum contiguous subsequence (MCS) is $\langle 3, -1, 0, 2 \rangle$. Another is $\langle 0, 3, -1, 0, 2 \rangle.$

This is similar to Problem 3 on [HW2](https://classroom.github.com/a/M6svXppx). How?

Let's take a brute-force approach to this problem. What is the solution space, and how long does it take to evaluate it?



We must consider every contiguous subsequence and compute the sum of each. There are $O(n^2)$ contiguous subsequences. To compute the sum of each contiguous subsequence we need $O(n)$ work and $O(\log n)$ span. Thus the brute-force approach takes $O(n^3)$ work and $O(\log n)$ span.
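A sequential sketch of this brute-force approach is shown below. The function name `mcss_brute_force` is hypothetical, and it returns only the maximum sum, not the subsequence itself.

```python
import math


def mcss_brute_force(a):
    """Try every contiguous subsequence a[i..j] and keep the largest sum."""
    best = -math.inf  # sum of the empty sequence, by the convention above
    n = len(a)
    for i in range(n):
        for j in range(i, n):
            # Summing a[i..j] costs O(n) work, giving O(n^3) overall.
            best = max(best, sum(a[i:j + 1]))
    return best


print(mcss_brute_force([1, -2, 0, 3, -1, 0, 2, -3]))  # 4
```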

Can we do better using divide-and-conquer?

As usual, let's start by dividing the input into two equal parts and recursively finding the solution. If the MCS is within either part entirely, then in the combine step we just need to return the subsequence with the larger sum.

But what if the MCS spans the two halves?

![mcss_combine.jpg](mcss_combine.jpg)

Example: $a = \langle 1, -2, 0, 3, -1, 0, 2, -3 \rangle$



<p><span class="math display">\[\begin{array}{l}  
\mathit{MCSSDC}~a =  
\\  
~~~~\texttt{if}~ |a| = 0~\texttt{then}  
\\  
~~~~~~~~{-\infty}{}  
\\  
~~~~\texttt{else if}~|a| = 1 ~\texttt{then}  
\\   
~~~~~~~~a[0]  
\\  
~~~~\texttt{else}  
\\   
~~~~~~~~\texttt{let}  
\\   
~~~~~~~~~~~~(b, c)  = \mathit{splitMid}~a  
\\   
~~~~~~~~~~~~(m_b, m_c) = \left( \mathit{MCSSDC}~b \ ||\ \mathit{MCSSDC}~c \right)  
\\   
~~~~~~~~~~~~m_{bc} = \mathit{bestAcross}~(b, c)  
\\   
~~~~~~~~\texttt{in}  
\\   
~~~~~~~~~~~~\max\{m_b, m_c, m_{bc}\}  
\\   
~~~~~~~~\texttt{end}  
\end{array}\]</span></p>
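The pseudocode can be sketched in Python as below. The notes so far do not define $\mathit{bestAcross}$, so the version here is an assumption: it takes the largest sum of a non-empty suffix of `b` plus the largest sum of a non-empty prefix of `c`, which is the usual way to handle a subsequence that spans the midpoint. The function names are hypothetical and the recursive calls run sequentially.

```python
import math
from itertools import accumulate


def best_across(b, c):
    """Best sum of a subsequence that ends in b and starts in c (assumed
    definition): max suffix sum of b plus max prefix sum of c."""
    best_suffix = max(accumulate(reversed(b)))
    best_prefix = max(accumulate(c))
    return best_suffix + best_prefix


def mcss_dc(a):
    if len(a) == 0:
        return -math.inf
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    b, c = a[:mid], a[mid:]          # both halves are non-empty here
    m_b, m_c = mcss_dc(b), mcss_dc(c)
    m_bc = best_across(b, c)
    return max(m_b, m_c, m_bc)


print(mcss_dc([1, -2, 0, 3, -1, 0, 2, -3]))  # 4
```

On the running example, the two halves are $\langle 1, -2, 0, 3 \rangle$ and $\langle -1, 0, 2, -3 \rangle$: the best suffix sum of the first is 3, the best prefix sum of the second is 1, and their total of 4 beats the best sums found within either half (3 and 2).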