# NP-completeness

We say that a problem $x$ is NP-complete if it satisfies the following criteria:

1. $x \in \text{NP}$
2. $x$ is NP-hard

What does it mean for a problem to belong to the complexity class NP? It means that, given an instance of the problem and a proposed solution to it, we can *verify* whether that solution is correct in time *polynomial* in the size of the instance.

What does it mean for a problem to be NP-hard? It means that *every* problem in the complexity class NP has a *polynomial*-time reduction to $x$: any instance of any problem in NP can be transformed in polynomial time into an instance of $x$ whose solution yields a solution to the original instance.

## Practice problems

### 8.1 Traveling salesperson - Optimization vs search variant

Suppose we have a black box that solves TSP in polynomial time $O(\text{poly}(n))$, where $n$ is the number of cities in the input: given the distance matrix $D$ and a budget $b$, it returns a tour of total length at most $b$, or reports that no such tour exists. Let $D_{max}$ be the largest distance in $D$. Any tour uses exactly $n$ edges, so its length is at most $nD_{max}$. To find the shortest tour that passes through all the cities (TSP-OPT), we can binary search over the budget $b$ in the range $[0,nD_{max}]$, assuming the distances are non-negative integers so that the search terminates.

Start by asking the black box whether there is a tour with budget $b=\lfloor nD_{max}/2 \rfloor$. If there is, the optimum is at most $b$ and we continue on $[0,b]$; if not, every tour is longer than $b$ and we continue on $[b+1,nD_{max}]$. We repeat until the range contains a single value $b^*$, which is the length of the shortest tour, and one final call with budget $b^*$ returns an optimal tour. The binary search makes $O(\log{(nD_{max})})$ calls, each costing $O(\text{poly}(n))$, so the overall complexity is $O(\log{(nD_{max})}\times \text{poly}(n))$. This is polynomial in the input size because writing down $D_{max}$ already takes about $\log{D_{max}}$ bits.
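The binary search can be sketched as follows. This is a minimal illustration, assuming non-negative integer distances given as a square matrix `dist`. The function `tsp_within_budget` is a hypothetical brute-force stand-in for the black box (it runs in exponential time and is only there so the sketch executes), and `tsp_opt` is an illustrative name, not something from the exercise.

```python
from itertools import permutations


def tsp_within_budget(dist, b):
    """Hypothetical stand-in for the black box.

    Returns a tour (list of cities) of total length <= b, or None.
    Brute force, exponential time: it only plays the role of the
    assumed polynomial-time solver.
    """
    n = len(dist)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length <= b:
            return list(tour)
    return None


def tsp_opt(dist, oracle=tsp_within_budget):
    """Find a shortest tour using O(log(n * d_max)) oracle calls.

    Assumes non-negative integer distances.
    """
    n = len(dist)
    d_max = max(max(row) for row in dist)
    lo, hi = 0, n * d_max  # the optimal length lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(dist, mid) is not None:
            hi = mid  # some tour has length <= mid
        else:
            lo = mid + 1  # every tour is longer than mid
    # lo is now the optimal length; one more call recovers a tour.
    return oracle(dist, lo), lo
```

Calling `tsp_opt(dist)` returns a pair `(tour, length)`. Passing a different `oracle` with the same signature swaps in another solver without touching the search loop.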

### 8.2 Hamiltonian (Rudrata) path - Decision vs search variant

Suppose we have a black box that, given a graph $G=(V,E)$, runs in polynomial time $O(\text{poly}(m+n))$ and returns True if there is a Hamiltonian path in $G$ and False otherwise, where $n$ is the number of nodes in $G$ and $m$ is the number of edges in $G$.

To find a Hamiltonian path we can do the following. First run the black box on $G$. If it returns False, we report that there is no Hamiltonian path. If it returns True, we do the following:

1. Iterate over the edges of $G$.
2. For each edge $e$, remove it from $G$ and run the black box on the resulting graph.
3. If the black box returns False, every Hamiltonian path in the current graph uses $e$, so we put $e$ back. If it returns True, a Hamiltonian path still exists without $e$, so we leave $e$ removed.
4. After all edges have been processed, the graph still contains a Hamiltonian path and every remaining edge is required by it, so the edges that remain in $G$ form exactly one Hamiltonian path.

This algorithm runs the black box once on the original graph and once for each edge, so the time complexity is $O(m\times\text{poly}(m+n))$.
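The edge-removal procedure can be sketched as follows. This is a minimal illustration, assuming an undirected graph on nodes `0..n-1` given as a list of edge tuples. The function `has_hamiltonian_path` is a hypothetical brute-force stand-in for the black box (exponential time, present only so the sketch runs), and the sketch returns `None` when no path exists.

```python
from itertools import permutations


def has_hamiltonian_path(n, edges):
    """Hypothetical stand-in for the decision black box (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset((p[i], p[i + 1])) in edge_set for i in range(n - 1))
        for p in permutations(range(n))
    )


def find_hamiltonian_path(n, edges, oracle=has_hamiltonian_path):
    """Return the edges of one Hamiltonian path, or None if none exists.

    Makes one oracle call on the full graph plus one per edge.
    """
    if not oracle(n, edges):
        return None
    kept = list(edges)
    for e in edges:
        trial = [f for f in kept if f != e]
        if oracle(n, trial):
            kept = trial  # a path survives without e, so drop e for good
        # otherwise every remaining path uses e, so e stays in kept
    return kept
```

The returned edges are unordered; walking them from one of the two degree-1 endpoints recovers the node sequence of the path.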

### Stingy-SAT

First we show Stingy-SAT is in NP. Suppose we are given a proposed solution to an instance of Stingy-SAT, that is, a truth assignment for a CNF formula together with the bound $k$. Count the number of variables that are set to True in the assignment and denote this as $r$; this takes $O(n)$ time, where $n$ is the number of variables in the formula. Checking that $r\leq k$, the stingy constraint, takes $O(1)$ time. Finally, we evaluate the CNF formula under the assignment and check that it evaluates to True, exactly as we would for the regular SAT problem, which takes time linear in the length of the formula. The whole verification therefore takes polynomial time, so Stingy-SAT is in NP.
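A verifier along these lines might look like the sketch below. The encoding is an assumption made for illustration: each clause is a list of non-zero integers, where `v` stands for variable `v` and `-v` for its negation, and the assignment is a dictionary from variable to `bool`. Variables missing from the assignment are treated as False, which is also an assumption of this sketch.

```python
def verify_stingy_sat(clauses, k, assignment):
    """Check a claimed Stingy-SAT solution in time linear in the input.

    clauses:    list of clauses, each a list of non-zero ints
                (v means variable v, -v means NOT v)
    k:          maximum number of variables allowed to be True
    assignment: dict mapping variable -> bool (missing means False)
    """
    # Stingy constraint: at most k variables are set to True.
    if sum(1 for value in assignment.values() if value) > k:
        return False
    # Every clause needs at least one satisfied literal.
    for clause in clauses:
        if not any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause):
            return False
    return True
```

Both checks make a single pass over their input, which matches the polynomial-time bound argued above.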

Second, we need to show Stingy-SAT is NP-hard, which means every problem in NP can be reduced to it. Since SAT is known to be NP-hard and reductions compose, it suffices to reduce SAT to Stingy-SAT. Given a CNF formula for the SAT problem, compute its number of variables $n$ and set $k=n$, which takes polynomial time. The result is an instance of Stingy-SAT. No assignment can set more than $n$ variables to True, so the stingy constraint is always met: every solution to the Stingy-SAT instance is a solution to the original SAT instance with no need of modification, and if the Stingy-SAT instance has no solution then the SAT formula is unsatisfiable.
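The reduction itself is only a few lines. The sketch below assumes the same illustrative clause encoding as a list of lists of non-zero integers (`v` for a variable, `-v` for its negation); the function name `sat_to_stingy_sat` is hypothetical.

```python
def sat_to_stingy_sat(clauses):
    """Map a SAT instance to a Stingy-SAT instance (clauses, k) with k = n.

    The formula is passed through unchanged; only the bound k is added.
    """
    # n is the number of distinct variables appearing in the formula.
    variables = {abs(lit) for clause in clauses for lit in clause}
    return clauses, len(variables)
```

Because $k$ equals the number of variables, the bound never excludes an assignment, so the two instances have exactly the same set of solutions.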