# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 2200
# Introduction to Algorithms

## Computational Complexity and NP-Hardness


Going back to our discussions of asymptotic complexity, we said that we were interested in algorithms for computational problems that:

- Did polynomial work in the input size

- Leveraged concurrency to achieve low span (i.e., parallel speedup)

We have studied different algorithmic paradigms to try to achieve these two goals. But when is this actually possible?

What problems are solvable with polynomial work? Of those, which allow us to achieve a good parallel speedup?

Or, which problems **aren't** solvable with polynomial work? Perhaps we could just avoid or approximate these, instead of trying to find efficient algorithms.

The field of **computational complexity** tries to characterize problems by resource complexity (e.g., work, span, space).


### Checking versus Solving ###

Another interesting thing about these problems that we can't seem to solve efficiently is that, given a candidate solution, we can check very quickly whether it is correct. We just don't know how to produce a correct solution efficiently.

Likewise, we can check solutions to TSP and Knapsack (stated as decision problems) with polynomial work.

- The class of problems for which we can **check**, with polynomial work, whether a provided candidate solution is correct for the given input is known as $\mathcal{NP}$ (nondeterministic polynomial).

- $\mathcal{P}$ is the class of problems for which we can compute solutions directly with polynomial work.
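
To make "checking" concrete, here is a minimal sketch of a checker for the decision version of TSP: given a distance matrix, a proposed tour, and a cost budget, it confirms that the tour is valid and cheap enough. The function name `check_tsp_tour`, the matrix `dist`, and the budget are illustrative assumptions, not part of the course code. The check does $O(n \log n)$ work, even though no polynomial-work algorithm is known for finding such a tour.

```python
def check_tsp_tour(dist, tour, budget):
    """Return True if `tour` visits every city exactly once and costs at most `budget`."""
    n = len(dist)
    # A valid tour must be a permutation of the cities 0..n-1.
    if sorted(tour) != list(range(n)):
        return False
    # Add up the edge weights around the cycle, including the edge back to the start.
    cost = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return cost <= budget


# Hypothetical 4-city instance with symmetric distances.
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]

print(check_tsp_tour(dist, [0, 1, 2, 3], 7))  # cost 1 + 2 + 1 + 3 = 7, within budget
print(check_tsp_tour(dist, [0, 2, 1, 3], 7))  # cost 4 + 2 + 5 + 3 = 14, over budget
```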

Now we know that $\mathcal{P}\subseteq\mathcal{NP}$: for any problem in $\mathcal{P}$, we can check a candidate solution efficiently by simply constructing the correct solution ourselves with polynomial work.

But does $\mathcal{P} = \mathcal{NP}$? Or more informally, do we need substantially more work to solve a problem than to check a solution to it?



### Reduction ###

Interestingly, many of these problems reduce to one another.

- A problem $\mathcal{X}$ is *polynomial-work reducible* to a problem $\mathcal{Y}$ if we can perform i) an input transformation from $\mathcal{X}$ to $\mathcal{Y}$ and ii) an output transformation from $\mathcal{Y}$ to $\mathcal{X}$, both with polynomial work. This shows that $\mathcal{Y}$ is "at least as hard" as $\mathcal{X}$, because an algorithm for $\mathcal{Y}$ then yields an algorithm for $\mathcal{X}$ (with an additional polynomial amount of work).


- A problem $\mathcal{X}$ in $\mathcal{NP}$ is $\mathcal{NP}$-complete if every other problem in $\mathcal{NP}$ can be transformed (or reduced) into $\mathcal{X}$ in polynomial work. It is not known whether every problem in $\mathcal{NP}$ can be solved with polynomial work.
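
As an illustration of the two transformations in a reduction, the sketch below reduces Subset Sum (playing the role of $\mathcal{X}$) to 0/1 Knapsack (playing the role of $\mathcal{Y}$). Subset Sum is chosen here purely as an example, and it assumes positive integers and a nonnegative target. The function names are hypothetical, and `knapsack_best_value` is a brute-force stand-in for "any Knapsack solver", used only as a black box.

```python
from itertools import combinations


def knapsack_best_value(weights, values, capacity):
    """Stand-in solver for Y (0/1 Knapsack): brute force over all item subsets."""
    best = 0
    items = range(len(weights))
    for k in range(len(weights) + 1):
        for subset in combinations(items, k):
            if sum(weights[i] for i in subset) <= capacity:
                best = max(best, sum(values[i] for i in subset))
    return best


def subset_sum_via_knapsack(numbers, target):
    """Decide X (Subset Sum) by calling a solver for Y (Knapsack)."""
    # i) Input transformation: each number becomes an item whose weight
    #    and value both equal that number; the capacity is the target.
    weights, values, capacity = list(numbers), list(numbers), target
    # Call the solver for Y as a black box.
    best = knapsack_best_value(weights, values, capacity)
    # ii) Output transformation: value equals weight for every subset, so
    #     the best value reaches the capacity exactly when some subset
    #     sums to the target.
    return best == target


print(subset_sum_via_knapsack([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 = 9
print(subset_sum_via_knapsack([3, 34, 4, 12, 5, 2], 30))  # no subset sums to 30
```

Both transformations take linear work, so any polynomial-work Knapsack solver plugged in for `knapsack_best_value` would give a polynomial-work Subset Sum solver.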


Richard Karp used reductions to show 21 different problems were all $\mathcal{NP}$-complete.

<a href="http://cgi.di.uoa.gr/~sgk/teaching/grad/handouts/karp.pdf"><img src ="karp_21_problems.jpg" width=70%></a>


### Parallelism? ###

Can we parallelize and solve $\mathcal{NP}$-complete problems? 

Since the definition of span places no limit on the number of processors, we can solve problems in $\mathcal{NP}$ by brute force with polynomial span: check all of the (exponentially many) candidate solutions in parallel. This works because the definition of $\mathcal{NP}$ ensures that each candidate solution can be checked efficiently.

However, if we were able to achieve polynomial work for an $\mathcal{NP}$-complete problem, we'd immediately have shown $\mathcal{P}=\mathcal{NP}$, since we could just do all the work on a single processor.
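
The brute-force argument can be sketched for the decision version of Knapsack: is there a set of items with total weight at most `capacity` and total value at least `goal`? The names below are illustrative assumptions. The code runs sequentially, but every call to `check` is independent of the others, so an idealized parallel machine could run all $2^n$ checks at once and then combine the answers with an OR-reduction of span $O(\log 2^n) = O(n)$. The work, however, remains exponential.

```python
from itertools import product


def check(weights, values, capacity, goal, mask):
    """Polynomial-work check of one candidate: `mask[i]` says whether item i is taken."""
    total_weight = sum(w for w, keep in zip(weights, mask) if keep)
    total_value = sum(v for v, keep in zip(values, mask) if keep)
    return total_weight <= capacity and total_value >= goal


def brute_force_knapsack(weights, values, capacity, goal):
    """Return (answer, number_of_checks) by testing every subset of items."""
    # All 2^n candidate solutions; no check depends on the result of another.
    candidates = list(product([False, True], repeat=len(weights)))
    # Sequential here; conceptually one processor per candidate.
    results = [check(weights, values, capacity, goal, mask) for mask in candidates]
    return any(results), len(candidates)


# Hypothetical instance: the best feasible value is 7 (items 0 and 1).
weights, values = [2, 3, 4, 5], [3, 4, 5, 6]
print(brute_force_knapsack(weights, values, capacity=5, goal=7))
print(brute_force_knapsack(weights, values, capacity=5, goal=8))
```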

A more interesting question is whether we can effectively parallelize problems in $\mathcal{P}$. That is, for any problem $\mathcal{X}$ that is solvable in polynomial work, does it also have low span?

Let $\mathcal{NC}$ ("Nick's Class") denote the set of all problems with span $O(\log^c n)$ for some constant $c$ using a polynomial number of processors.
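
Summing $n$ numbers is a familiar example of a problem with polylogarithmic span. The sketch below simulates a balanced tree reduction and counts the rounds; all additions within a round are independent, so the round count stands in for the span. The function name `tree_sum` is an illustrative assumption, and the code itself is sequential.

```python
def tree_sum(values):
    """Sum `values` by pairwise combination; return (total, number_of_rounds)."""
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # Every pair in this round could be added by a separate processor.
        level = [sum(level[i:i + 2]) for i in range(0, len(level), 2)]
        rounds += 1
    return (level[0] if level else 0), rounds


print(tree_sum(range(1, 17)))  # 16 values need 4 rounds
print(tree_sum(range(1000)))   # 1000 values need 10 rounds
```

The number of rounds grows as $\lceil \log_2 n \rceil$ while using at most $n/2$ processors per round, which fits the definition above with $c = 1$.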

We know that $\mathcal{NC}\subseteq \mathcal{P}$, but is $\mathcal{P}\subseteq\mathcal{NC}$? 

What would $\mathcal{P}\subseteq\mathcal{NC}$ imply?

It would imply that **every** problem in $\mathcal{P}$ is parallelizable.

As with $\mathcal{NP}$, it's possible to define $\mathcal{P}$-complete problems. 

The Circuit Value Problem ($\mathit{CVP}$) asks: given a circuit of AND, OR, and NOT gates along with values for its inputs, does the circuit produce an output of 1?

$\mathit{CVP}$ is $\mathcal{P}$-complete.
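
To see why $\mathit{CVP}$ is in $\mathcal{P}$, here is a minimal sketch of a sequential evaluator. The circuit encoding (a list of gates in topological order, each naming the wires it reads) and the function name `evaluate_circuit` are assumptions made for illustration. The work is linear in the number of gates, but each gate may depend on gates evaluated before it, which is what makes low span hard to obtain in general.

```python
def evaluate_circuit(inputs, gates):
    """Evaluate a Boolean circuit and return the value of its last gate.

    inputs: list of bools; input i lives on wire i.
    gates:  list of tuples (op, wire, ...) in topological order; gate j
            writes its result to wire len(inputs) + j.
    """
    wires = list(inputs)
    for op, *args in gates:
        if op == "AND":
            wires.append(wires[args[0]] and wires[args[1]])
        elif op == "OR":
            wires.append(wires[args[0]] or wires[args[1]])
        elif op == "NOT":
            wires.append(not wires[args[0]])
        else:
            raise ValueError(f"unknown gate {op!r}")
    return wires[-1]


# Hypothetical circuit computing (x0 AND x1) OR (NOT x2).
gates = [("AND", 0, 1), ("NOT", 2), ("OR", 3, 4)]
print(evaluate_circuit([True, False, False], gates))  # NOT x2 is true, so the output is 1
print(evaluate_circuit([True, False, True], gates))   # both branches are false
```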

As with $\mathcal{NP}$, we have [a long list of $\mathcal{P}$-complete problems](https://en.wikipedia.org/wiki/P-complete#P-complete_problems). No polylogarithmic-span algorithm is known for any of them, nor has anyone proved that no such algorithm exists.

We have looked at only a few complexity classes, but this <a href="https://complexityzoo.uwaterloo.ca/Complexity_Zoo">area</a> is quite deep.


<a href = "https://jeremykun.com/2012/02/29/other-complexity-classes/"><img src="complexity_venn_diagram.jpg" width=60%></a>