# NP2 3SAT

## 512 NP Completeness

Our plan now is to prove that the 3SAT problem is NP-complete. What we just saw is that if we have a problem such as SAT, which is known to be NP-complete, then this makes our task much easier. So, we're going to use the fact that SAT is NP-complete. This is known as the Cook-Levin theorem. It was proved independently by Stephen Cook, in 1971, and by Leonid Levin. They proved that SAT is NP-complete. Now, we're going to take this theorem for granted, and later I'll give you some high-level idea of the proof that the SAT problem is NP-complete. After Stephen Cook published his paper, the importance of NP-completeness was highlighted by a paper by Dick Karp in 1972. He showed 21 other problems which were NP-complete. We're going to look at a few of these 21 in the next few lectures. We'll start with 3SAT, which was one of the 21.

## 513 3SAT

Let me give you a quick reminder of the formulation of the 3SAT problem. The input to the 3SAT problem is a Boolean formula f in conjunctive normal form, and we use n for the number of variables and m for the number of clauses in the formula f. Now the extra restriction in 3SAT, as opposed to SAT, is that we assume this formula f has the following constraint: each clause has at most three literals. The output for the 3SAT problem is a satisfying assignment if one exists. This is an assignment of true and false to the n variables so that the formula f evaluates to true. If there's no satisfying assignment, we simply output NO.

## 514 Proof Outline

We're going to show now that the 3SAT problem is NP-complete. Before we dive into the proof, let's outline what we need to show in order to establish that the 3SAT problem is NP-complete. The first thing we need to show is that the 3SAT problem lies in the class NP. This will be straightforward to prove. Now our main task is to take a known NP-complete problem. In this case, all we know is that SAT is NP-complete. So we just have to show a reduction from the SAT problem to the 3SAT problem. Now once we've shown this reduction from SAT to 3SAT, what does that establish? Since SAT is NP-complete, for every problem A in NP we have a reduction from A to SAT. And then we have this reduction from SAT to 3SAT; therefore, we have a reduction from A to 3SAT. The implication of this is that if we have a polynomial time algorithm for 3SAT, then we have a polynomial time algorithm for every problem in NP, because we can reduce every problem in NP to 3SAT. So let's start with the easy task. Let's prove that 3SAT is in the class NP.
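The chaining argument in the outline can be written compactly. This is only a notational sketch: the symbol $\le_p$ for "reduces in polynomial time to" is an assumption here, since the lecture states the reductions in words.

```latex
A \;\le_p\; \text{SAT} \;\le_p\; \text{3SAT}
\quad\Longrightarrow\quad
A \;\le_p\; \text{3SAT}
\qquad \text{for every problem } A \in \text{NP}.
```

The first reduction comes from the Cook-Levin theorem, and the second is the one constructed in the rest of this lecture.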

## 515 3SAT in NP

Now to prove that 3SAT is in the class NP, we have to show that we can verify solutions efficiently. So, let's take a particular input for the 3SAT problem. So f is our input for the 3SAT problem. And let's take a proposed solution. This is a true/false assignment for the n variables. Now we need to check that this assignment is a satisfying assignment for this formula. How are we going to verify that? Well, we're going to go through the clauses. So let's take each clause. For a particular clause C, it will take us order one time to check that at least one of the literals in C is satisfied. Why is it order one time? Because each clause has at most three literals. Now if every clause C is satisfied, then the formula f is satisfied. It takes order one time per clause, and there are m clauses, so it takes order m total time to verify that this assignment satisfies the formula. So this proves that the 3SAT problem is in the class NP.
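The verification procedure just described can be sketched in a few lines of Python. The encoding is an assumption made for illustration, not something fixed by the lecture: a literal is a nonzero integer (`v` for x_v, `-v` for its negation), a clause is a list of literals, and the names `literal_is_true` and `verify_3sat` are hypothetical.

```python
from typing import Dict, List

Clause = List[int]      # e.g. [-2, 3, 1] means (not x2) or x3 or x1
Formula = List[Clause]


def literal_is_true(lit: int, assignment: Dict[int, bool]) -> bool:
    """A literal v is true when x_v is True; -v is true when x_v is False."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value


def verify_3sat(formula: Formula, assignment: Dict[int, bool]) -> bool:
    """Check a proposed assignment with one pass over the m clauses."""
    for clause in formula:
        if len(clause) > 3:
            raise ValueError("not a 3SAT instance: clause has more than 3 literals")
        # At most three literals per clause, so this check is O(1) work.
        if not any(literal_is_true(lit, assignment) for lit in clause):
            return False
    return True


f = [[1, -2, 3], [-1, 2], [-3]]
print(verify_3sat(f, {1: True, 2: True, 3: False}))   # True
print(verify_3sat(f, {1: True, 2: False, 3: False}))  # False
```

The loop does constant work per clause, which matches the order m bound from the argument above.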

## 516 SAT 3SAT

Now let's look at the task of reducing SAT to 3SAT. Let's outline first what we need to prove. We're assuming we have a polynomial time algorithm for the 3SAT problem, and we're going to use this as a black box. We're going to construct an algorithm for the SAT problem, using this 3SAT algorithm as a subroutine. So what do we need to do? We need to take an input for the SAT problem. So we have f, which is an input formula for the SAT problem, and we have to transform this input f into an input for the 3SAT problem. We'll use f prime to denote the input to the 3SAT problem. Now this is a bit tricky to do. Why? Because f might have some big clauses. It might have some clauses which contain maybe n literals. But our input for 3SAT has to have clauses of size at most three. So somehow we have to transform these big clauses into a series of small clauses.

And we need to do it in such a way that if we have a satisfying assignment sigma prime for f prime, our 3SAT formula, then we can transform this satisfying assignment for the 3SAT input into a satisfying assignment sigma for the original SAT input. Moreover, if our 3SAT formula f prime has no satisfying assignment, so there's no sigma prime and the algorithm is going to output NO, we want that our original SAT formula f also has no satisfying assignment. So we want that f prime has no satisfying assignment if and only if f has no satisfying assignment. That way we can simply output NO in both cases.

So once again, what do we need to do? We need to take an input f for the SAT problem, and we need to create an input f prime for the 3SAT problem. Then, given a satisfying assignment sigma prime for the 3SAT input, we need to transform it into a satisfying assignment sigma for the SAT formula. And we need that sigma prime is a satisfying assignment for f prime, the 3SAT input, if and only if this transformed output sigma is a satisfying assignment for our original SAT formula. So we have a satisfying assignment for the 3SAT input if and only if we have a satisfying assignment for the SAT formula. Why do we need this equivalence? Because we need that NO instances for the 3SAT formula correspond to NO instances for the SAT formula. So if we get a NO output, we can output NO for the SAT input. Now this is the main task, so let's dive into it. We want to take an input f for the SAT problem and transform it into a valid input for the 3SAT problem.

## 517 Example

Let's try to get some idea for the reduction. So let's take a sample input for the SAT problem. Here's a formula with four variables and three clauses. Let's label the clauses as C_1, C_2, and C_3. Now, let's define our input for the 3SAT problem; let's denote this as f prime. Clause 1 and clause 3 we can keep the same. Those are valid clauses for 3SAT. Now, the challenge is what to do with the second clause. This is a clause of size four, but valid clauses for 3SAT have size at most three. So what are we going to do for this second clause? Well, we're going to create a new variable y, and we're going to replace the second clause by the following pair of clauses. The first clause is x_2 bar or x_3 or y. These are the first two literals in C_2, plus y. The second clause is y bar or x_1 bar or x_4 bar. These are the last two literals in the clause C_2, plus y bar. Now, the key claim is that this original clause C_2 of size four is satisfiable if and only if this new pair of clauses is satisfiable. So C_2 is satisfiable if and only if C'_2 is satisfiable. Therefore, replacing C_2 by this pair of clauses makes an equivalent formula. And in this new formula, all the clauses have size at most three.
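Since the formula is only described verbally here, it may help to write the second clause and its replacement out. The following is reconstructed from the literals named in the lecture; the clauses C_1 and C_3 are not specified in the transcript, so they are omitted.

```latex
C_2 \;=\; (\bar{x}_2 \lor x_3 \lor \bar{x}_1 \lor \bar{x}_4)
\qquad\longrightarrow\qquad
C'_2 \;=\; (\bar{x}_2 \lor x_3 \lor y) \;\land\; (\bar{y} \lor \bar{x}_1 \lor \bar{x}_4)
```

The claim is that an assignment to x_1, ..., x_4 satisfies C_2 exactly when it can be extended by some value of y to satisfy C'_2.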

## 518 Claim Forward

Now, let's go ahead and prove this claim, so that we have some idea how to generalize this construction. Here's our original clause of size 4, and here's our new pair of clauses, each of size at most three. We want to prove that this original clause C_2 is satisfiable if and only if this new pair of clauses C'_2 is satisfiable. Let's do the forward direction first. Let's take a satisfying assignment for C_2. This is an assignment for x_2, x_3, x_1 and x_4 which satisfies this clause, and we want to show that there is a satisfying assignment for C'_2. Now in order for this assignment to satisfy C_2, one of four cases must hold: either x_2 equals false, x_3 equals true, x_1 equals false, or x_4 equals false. Maybe some combination of those holds, but at least one of them has to hold. We'll break up these four cases into the following pair of cases. The first case is x_2 equals false or x_3 equals true. This is the pair of literals which appears in the first clause of C'_2. The other case is x_1 equals false or x_4 equals false. This is the pair of literals which appears in the second clause of C'_2. Now if x_2 equals false or x_3 equals true, then we're going to set y to be false. Notice in this case we have x_2 is false or x_3 is true, so the first clause is satisfied. And we're setting y to be false, so this satisfies the second clause. So C'_2 is satisfied in this case. Similarly, in the second case, if x_1 is false or x_4 is false, then we set y to be true. In this case, since y is true, this satisfies the first clause. And since x_1 is false or x_4 is false, this satisfies the second clause. So given a satisfying assignment for C_2, we've constructed a satisfying assignment for C'_2.

## 519 Claim Reverse

Now let's look at the reverse implication. In this case we're going to take a satisfying assignment for C'_2 and we're going to construct a satisfying assignment for C_2. What we're going to show is that if we just ignore y, whatever the assignment was for x_1, x_2, x_3, and x_4 which satisfied C'_2, that's going to satisfy C_2. Now there will be two cases: either y is true or y is false. Suppose y is true. We know that C'_2 is satisfied, so each of these two clauses is satisfied. y is true, so that satisfies the first clause. How about the second clause? Well, y bar is not satisfied, so either x_1 must be false or x_4 must be false. If x_1 is false or x_4 is false, either of these satisfies C_2. So in either of these scenarios, we have an assignment which satisfies C_2. Now suppose y is false. This assignment for y satisfies the second clause. How do we satisfy the first clause? Well, we're assuming that this is a satisfying assignment for C'_2, so it must satisfy the first clause. y is false, so that's not helping. So it must be the case that either x_2 is false or x_3 is true. Now if x_2 is false, that means this is an assignment which satisfies C_2, and if x_3 is true, then also this is an assignment which satisfies C_2. So in any of these four cases, we have an assignment which satisfies C_2. So every satisfying assignment for C'_2, if we ignore y, is also a satisfying assignment for C_2. And this establishes the reverse implication. We've done both directions, so we've shown the if and only if.
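Because the example involves only four variables plus y, the two directions proved above can also be checked exhaustively. The sketch below is a sanity check rather than part of the proof; the function names `c2`, `c2_prime`, and `claim_holds` are hypothetical.

```python
from itertools import product


def c2(x1: bool, x2: bool, x3: bool, x4: bool) -> bool:
    # Original clause: (not x2) or x3 or (not x1) or (not x4)
    return (not x2) or x3 or (not x1) or (not x4)


def c2_prime(x1: bool, x2: bool, x3: bool, x4: bool, y: bool) -> bool:
    # Replacement pair: ((not x2) or x3 or y) and ((not y) or (not x1) or (not x4))
    return ((not x2) or x3 or y) and ((not y) or (not x1) or (not x4))


def claim_holds() -> bool:
    """For every assignment to x1..x4: C2 holds iff some y makes C2' hold."""
    for x1, x2, x3, x4 in product([False, True], repeat=4):
        extendable = any(c2_prime(x1, x2, x3, x4, y) for y in (False, True))
        if c2(x1, x2, x3, x4) != extendable:
            return False
    return True


print(claim_holds())  # True
```

The only assignment that falsifies C_2 is x_1 = x_2 = x_4 = true, x_3 = false, and for that assignment neither value of y satisfies both new clauses.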

## 520 Quiz 5 SAT 3 SAT Question

Now, we just saw how to transform a clause of size 4 into a pair of clauses of size 3. Let's try to generalize this idea, and let's do it one step at a time. So let's take a clause of size 5 now, and let's try to transform this into a triple of clauses of size 3. Previously, to transform a clause of size 4 into a pair of clauses of size 3, we added one new variable y. Now, to transform a clause of size 5 into a triple of clauses of size 3, we're going to create two new variables, y and z. We're going to make a formula, C prime, which is three clauses, and each of these clauses is going to be of size at most 3. Actually, they're going to be of size exactly 3. And we want to do this in such a way that the original clause C is satisfiable if and only if this triple of clauses, C prime, is satisfiable. Therefore, in our original input f, if we have a clause such as this, we can replace it by this triple of clauses, and we'll get an equivalent formula. Now, the quiz is to define C prime. Define this triple of clauses, each of size exactly three literals, and use this pair of new variables, y and z. For simplicity, for entering your solution, let me enter the or symbols for you. And also to simplify your input, let's relabel these literals. So let's relabel these five literals as a, b, c, d, e. So a is equal to x_2 bar, b is equal to x_3, c is equal to x_1 bar, d is equal to x_4 bar, e is equal to x_5. So use a, b, c, d, e, and y and z, and create a formula C prime such that C prime is satisfiable if and only if C is satisfiable.

## 521 Quiz 5 SAT 3 SAT Solution

The solution is the following. I take the first two literals, x_2 bar or x_3, or y, the new variable. That's the first clause. The second clause is y bar or x_1 bar, the third literal, or z. The last clause is z bar or x_4 bar, the fourth literal, or x_5, the fifth literal. Instead of formally proving this now, let me give you a quick idea of the proof, and then we'll do the general construction and prove that the general construction is correct. Let's suppose that we have a satisfying assignment for C, and let's see how we construct a satisfying assignment for C prime. One of these five literals must be satisfied. Let's suppose the middle one, x_1 bar, is satisfied, so x_1 is set to false. That's going to satisfy the second clause. How do we satisfy the first and third clauses? Well, here we use these auxiliary variables: setting y to true satisfies the first clause, and setting z to false satisfies the last clause. In general, one of these five literals is going to be satisfied, and that's going to satisfy one of these three clauses. Then we can use the two auxiliary variables to satisfy the other two clauses.

How about the reverse implication? Suppose we have a satisfying assignment for C prime. How do we get a satisfying assignment for C? Well, take the case where y is false, so the literal y in the first clause is not satisfied, and one of the other two literals in that clause must be satisfied. If one of these two literals is satisfied, it's satisfied in C as well, so C is satisfied. Now, suppose that z is set to true. Then the literal z bar is not satisfied, so one of the other two literals in the last clause must be satisfied: either x_4 is set to false or x_5 is set to true, in which case one of the last two literals of C is satisfied. The last case is the complement of these two: y is set to true and z is set to false. Well, since we're taking a satisfying assignment for C prime, look at how the second clause can be satisfied. y is set to true and z is set to false, so the literals y bar and z are not satisfied. So the only way to satisfy this clause is to set x_1 to false. And if x_1 is set to false, then we satisfy the third literal in C.

Now, let's do the general construction. Let's say we have a clause of size k. We're going to create a series of clauses which is equivalent to this original clause. When we had a clause of size four, we added one new variable and we had a pair of clauses. When we had a clause of size five, we created two new variables and we had a triple of clauses. Now, if we have a clause of size k, we're going to create k minus three new variables and k minus two clauses. So, let's go ahead and do the general construction.
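Written with the relabeled literals a, b, c, d, e from the quiz, the solution described above reads as follows. This is just a transcription of the spoken answer into formula form.

```latex
C \;=\; (a \lor b \lor c \lor d \lor e)
\qquad\longrightarrow\qquad
C' \;=\; (a \lor b \lor y) \;\land\; (\bar{y} \lor c \lor z) \;\land\; (\bar{z} \lor d \lor e)
```

With k = 5 this uses k - 3 = 2 new variables and k - 2 = 3 clauses, which is the pattern the general construction follows.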

## 522 Big Clauses

Let's consider now a general clause of size k, and let's label the literals in this clause by a_1, a_2, up to a_k. Now, for this clause, we're going to create k-3 new variables. Recall that when k was 4, we created one new variable, and when k was 5, we created two new variables. In general, we're going to create k-3 new variables. Let's label these new variables as y_1 through y_{k-3}. Now it's important to note that every clause of size greater than 3 creates new variables, and these new variables are distinct for each clause. So, for each clause, we might create order n new variables. There are m clauses, so we might have order n times m new variables in total.

Now we're going to replace this original clause C by the following k-2 clauses. We take the first two literals, a_1 and a_2, and the first new variable y_1, and the first clause is a_1 or a_2 or y_1. In the second clause, we use the negative of the first new variable, y_1 bar, then the third literal a_3, and then the positive of the next new variable, y_2. And this gives us our pattern. So, in the third clause we use y_2 bar, the negative of the second new variable, then the next literal, a_4, and then the next new variable in positive form, y_3. We continue this pattern, and the last two clauses look as follows. The penultimate clause is y_{k-4} bar, or a_{k-2}, which is the third to last literal, or y_{k-3}, which is the last new variable in positive form. This penultimate clause follows the same pattern. The last clause is going to be slightly different. It has y_{k-3} bar, same pattern as before, and then we use the last two literals of C. So we have y_{k-3} bar or a_{k-1} or a_k. This defines a formula C prime.

Now our claim is that the original clause C is satisfiable if and only if this new sequence of clauses C prime is satisfiable. So, for our original input to the SAT problem, every clause which is of size bigger than three we can replace using this construction. We take the clause of size bigger than three, we replace it by this sequence of clauses which are all of size exactly three, and then we get a valid input to the 3SAT problem. The key is that this new formula is satisfiable if and only if the original formula is satisfiable, so in that sense we get an equivalent formula. So let's go ahead and prove this claim, that C is satisfiable if and only if C prime is satisfiable, and then we'll be pretty much done with the reduction.
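The general construction can be sketched as a small Python function. This assumes, for illustration only, that literals are nonzero integers (`v` for a variable, `-v` for its negation) and that fresh variables are numbered upward from `next_var`; the name `split_clause` is hypothetical.

```python
from typing import List


def split_clause(clause: List[int], next_var: int) -> List[List[int]]:
    """Split a clause a_1..a_k with k > 3 into k-2 clauses of size 3.

    The fresh variables next_var, ..., next_var + k - 4 play the roles
    of y_1, ..., y_{k-3}. Clauses of size at most 3 are returned as is.
    """
    k = len(clause)
    if k <= 3:
        return [list(clause)]
    ys = list(range(next_var, next_var + k - 3))
    out = [[clause[0], clause[1], ys[0]]]                 # (a_1 or a_2 or y_1)
    for j in range(1, k - 3):
        # (not y_j) or a_{j+2} or y_{j+1}
        out.append([-ys[j - 1], clause[j + 1], ys[j]])
    out.append([-ys[-1], clause[k - 2], clause[k - 1]])   # (not y_{k-3}) or a_{k-1} or a_k
    return out


# The size-4 clause from the example, with y numbered 5:
print(split_clause([-2, 3, -1, -4], 5))     # [[-2, 3, 5], [-5, -1, -4]]
# The size-5 clause from the quiz, with y, z numbered 6, 7:
print(split_clause([-2, 3, -1, -4, 5], 6))  # [[-2, 3, 6], [-6, -1, 7], [-7, -4, 5]]
```

Passing `next_var` explicitly is one way to keep the new variables distinct for each clause, which the construction requires.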

## 523 General Claim Forward

This was our construction: we took this clause of size k. So we took C, which was a_1 or a_2 up to a_k, and we defined this formula C prime, which consisted of k-2 clauses. And our claim is that C is satisfiable if and only if C prime is satisfiable. Now, let's prove this claim. Let's start with the forward implication. So, let's take an assignment to these literals which satisfies this clause C, and let's prove that there is a satisfying assignment to C prime. In order for this assignment to satisfy C, one of these literals must be satisfied. Let a_i be the first satisfied literal. Now, if a_i is satisfied, that's going to satisfy one of these clauses. If i = 1, then a_1 is in the first clause, and if i is at least two, then a_i appears in the (i-1)st clause. (The one exception is the last literal a_k, which shares the last clause with a_{k-1}; the argument below still works in that case, with every auxiliary variable set to true.) So this literal a_i being set to true satisfies the (i-1)st clause of C prime.

Let's suppose i was four. So we have a_4 set to true, and this third clause is satisfied, so we can set it aside. Now, what about the i-2 earlier clauses? Well, we can use the positive forms of these auxiliary variables to satisfy these earlier clauses. So we set y_1 and y_2 to be true in this case, and in general we set y_1 through y_{i-2} to true, and this satisfies the first i-2 clauses of C prime. So this setting of the first i-2 auxiliary variables satisfies the first i-2 clauses, and a_i set to true satisfies the (i-1)st clause. What do we do about the later clauses? Well, here we're going to use the negative form of these auxiliary variables. We set the remaining auxiliary variables to false. In this example that includes y_{k-4} and y_{k-3}, which appear negated in the last clauses, and in general we set y_{i-1} through y_{k-3} to false. This satisfies the remaining clauses which appear after the (i-1)st clause.

The punchline is that this literal a_i satisfies the (i-1)st clause of C prime. We use the auxiliary variables set to true to satisfy the earlier clauses, and we use the auxiliary variables set to false to satisfy the later clauses. So we only need one literal of C to be satisfied, and then we can use the auxiliary variables to satisfy all the other clauses of C prime. Now, let's do the reverse implication.
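The rule for setting the auxiliary variables in the forward direction can be tested mechanically for small k. In this sketch the input is just the list of truth values of the literals a_1..a_k, and the names `aux_values`, `c_prime_satisfied`, and `forward_direction_holds` are hypothetical helpers, not part of the lecture.

```python
from itertools import product
from typing import List


def aux_values(a: List[bool]) -> List[bool]:
    """Values for y_1..y_{k-3}, given truth values of a_1..a_k (k > 3).

    If a_i is the first true literal, y_j is True for j <= i-2 and False
    otherwise. Requires at least one true literal.
    """
    k = len(a)
    i = a.index(True) + 1                          # 1-based index of first true literal
    return [j <= i - 2 for j in range(1, k - 2)]   # j runs over 1..k-3


def c_prime_satisfied(a: List[bool], y: List[bool]) -> bool:
    """Evaluate the k-2 clauses of C' directly from the truth values."""
    k = len(a)
    clauses = [a[0] or a[1] or y[0]]                         # a_1 or a_2 or y_1
    for j in range(1, k - 3):
        clauses.append((not y[j - 1]) or a[j + 1] or y[j])   # not y_j or a_{j+2} or y_{j+1}
    clauses.append((not y[-1]) or a[k - 2] or a[k - 1])      # not y_{k-3} or a_{k-1} or a_k
    return all(clauses)


def forward_direction_holds(k: int) -> bool:
    """Check the rule on every assignment with at least one true literal."""
    return all(
        c_prime_satisfied(list(a), aux_values(list(a)))
        for a in product([False, True], repeat=k)
        if any(a)
    )


print(aux_values([False, False, False, True, False, False]))  # [True, True, False]
print(all(forward_direction_holds(k) for k in range(4, 9)))   # True
```

The first printed line is the i = 4 case from the lecture with k = 6: y_1 and y_2 are true and y_3 is false.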

## 524 General Claim Reverse

Now let's prove the reverse implication. So let's take an assignment to these original k literals and these k-3 auxiliary variables which satisfies C prime. Let's prove that there is a satisfying assignment for C. What we'll do is we'll just ignore these auxiliary variables, and we'll prove that the setting for these original k literals satisfies the original clause. Now in order to satisfy this clause C, we just need to show that at least one of these literals is set to true. Let's suppose that's not the case. Suppose all of these k literals are set to false. Under this assumption, is it possible to satisfy C prime? We'll show it's not possible. Now we're supposing we have an assignment which satisfies C prime, so it satisfies all of these clauses. Let's look at the first clause. We're supposing that a_1 and a_2 are set to false, so the first two literals are not satisfied. So the third literal must be satisfied. That means y_1 must be set to true. That's the only way to satisfy this clause under this assumption. Similarly, let's look at the second clause. We just saw that y_1 is true, so the literal y_1 bar is not satisfied. Also a_3, we're assuming, is set to false, so this literal is not satisfied. So we had better satisfy the third literal, y_2. So y_2 has to be set to true. Continuing on, look at the penultimate clause. In order to satisfy this penultimate clause, we have to satisfy its last literal. So we have to set y_{k-3} to be true. Now look at the last clause. y_{k-3} is set to true, so the literal y_{k-3} bar is not satisfied. Similarly, the last two literals are not satisfied because they're set to false. So this clause is not satisfied. That means that C prime is not satisfied. That's a contradiction: we were assuming that this was an assignment which satisfied C prime. So this assumption that all of these literals are set to false is not true. So at least one of them must be set to true, and therefore that literal satisfies this clause C. So if we just ignore the assignment to these auxiliary variables, then the setting for the original literals satisfies this original clause.
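The chain of forced values in this contradiction argument can be laid out step by step. This is a restatement of the reasoning above in symbols, with T and F standing for true and false.

```latex
\begin{align*}
a_1 = a_2 = \mathrm{F} &\;\Rightarrow\; y_1 = \mathrm{T}
  && \text{from } (a_1 \lor a_2 \lor y_1) \\
y_1 = \mathrm{T},\ a_3 = \mathrm{F} &\;\Rightarrow\; y_2 = \mathrm{T}
  && \text{from } (\bar{y}_1 \lor a_3 \lor y_2) \\
&\;\;\vdots \\
y_{k-4} = \mathrm{T},\ a_{k-2} = \mathrm{F} &\;\Rightarrow\; y_{k-3} = \mathrm{T}
  && \text{from } (\bar{y}_{k-4} \lor a_{k-2} \lor y_{k-3}) \\
y_{k-3} = \mathrm{T},\ a_{k-1} = a_k = \mathrm{F} &\;\Rightarrow\; \text{last clause is false}
  && \text{in } (\bar{y}_{k-3} \lor a_{k-1} \lor a_k)
\end{align*}
```

Each step uses the value forced by the previous clause, so if every a_i is false the last clause cannot be satisfied.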

## 525 SAT 3SAT

Now we can formalize our reduction from SAT to 3SAT. So let's consider our input formula f for the satisfiability problem, and let's create an input formula f prime for the 3SAT problem by the following procedure. Let's consider the clauses of f one by one. For a clause C in f, we'll have two cases, depending on whether the clause is of size at most three or strictly bigger than three. If C contains at most three literals, then we can add this clause as is to the new formula. Now what if C contains more than three literals? Then we have to use our previous construction. In this case, if C has k literals, we create k-3 new variables as we said before, and we replace C by this new formula C prime, this sequence of k-2 clauses, as we defined before. So small clauses stay the same, and big clauses are replaced by k-2 new clauses. Now that we've defined the input to the 3SAT problem, we have to prove that the original input to the SAT problem, f, is satisfiable if and only if the input to the 3SAT problem, f prime, is satisfiable. So f is satisfiable if and only if f prime is satisfiable. Now this statement is quite straightforward to prove, given that we've already proven that C is satisfiable if and only if C prime is satisfiable. Then afterwards, the remaining task is to show that given a satisfying assignment to f prime, we can construct a satisfying assignment to f. That again will be straightforward, because we just ignore these auxiliary variables, and the setting for the original variables will give us a satisfying assignment to the original formula.
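The procedure just described can be sketched as a single function. As an illustrative assumption, literals are nonzero integers (`v` for x_v, `-v` for its negation), the original variables are numbered 1..n, and auxiliary variables are numbered from n+1 upward; the name `sat_to_3sat` is hypothetical.

```python
from typing import List, Tuple


def sat_to_3sat(formula: List[List[int]], n: int) -> Tuple[List[List[int]], int]:
    """Return (f_prime, total_vars) for a CNF formula over variables 1..n.

    Variables n+1..total_vars are the auxiliary variables; each big clause
    gets its own fresh block of them.
    """
    f_prime: List[List[int]] = []
    next_var = n + 1
    for clause in formula:
        k = len(clause)
        if k <= 3:
            f_prime.append(list(clause))      # small clauses stay the same
            continue
        ys = list(range(next_var, next_var + k - 3))
        next_var += k - 3                     # keep new variables distinct per clause
        f_prime.append([clause[0], clause[1], ys[0]])
        for j in range(1, k - 3):
            f_prime.append([-ys[j - 1], clause[j + 1], ys[j]])
        f_prime.append([-ys[-1], clause[k - 2], clause[k - 1]])
    return f_prime, next_var - 1


# C2 is the size-4 clause from the lecture; the first and third clauses
# here are made up for illustration, since the transcript does not give them.
f = [[1, -3], [-2, 3, -1, -4], [2, 4, -1]]
print(sat_to_3sat(f, 4))
# ([[1, -3], [-2, 3, 5], [-5, -1, -4], [2, 4, -1]], 5)
```

Returning the total variable count alongside f prime makes it easy to tell later which variables are auxiliary and can be ignored.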

## 526 Correctness

Let's prove this equivalence, and hopefully it seems obvious given what we've shown so far. Let's start with the forward implication. So let's take an assignment to the original n variables which satisfies the original formula f. We want to show that, keeping this assignment the same for these original n variables, there is an assignment to the new variables so that the new formula f prime is satisfied. Now since we're keeping the assignment for these n variables the same, and the new variables are distinct for each clause, we can look at it clause by clause. So let's look at a clause C in the original formula f. If C has at most three literals, then it's obvious, since C appears unchanged in f prime. If C is of size bigger than three, then we replaced C by this sequence of k-2 clauses called C prime. Now the key is that C prime uses k-3 new variables. These variables only appear in C prime for C. They don't appear in any other sequence of clauses. So the new variables for C are distinct from the new variables for any other clause. So we can set these variables however we want with respect to C, and this won't affect any other clauses. And what we saw before is that there is an assignment to these k-3 new variables so that this new formula C prime is satisfied. In particular, we took the first literal which was satisfied in C, call it a_i, and then we set the first i-2 new variables to true and the remaining new variables to false. And we showed that this satisfies C prime.

Now let's look at the reverse implication. Let's take a satisfying assignment for f prime. We want to ignore the assignment for the new variables and show that this assignment for the original n variables satisfies f. Consider the sequence of clauses corresponding to some C prime in f prime. What we saw before is that at least one of the literals in the original clause C must be satisfied. If all of the literals in the original clause were set to false, then there would be no way to satisfy this sequence of clauses C prime. So we can ignore the setting of the new variables and just look at the setting of the original n variables. That's going to satisfy each of the clauses in the original formula f.

## 527 Satisfying Assignment

Now going back to our earlier reduction, we showed how to take an input formula for SAT and transform it into an input formula for 3SAT. This was the reduction for creating f prime. And then we just proved that the original formula f is satisfiable if and only if this new formula f prime is satisfiable. Now, what remains? Well, we run our 3SAT algorithm, our black box algorithm, on f prime. If it outputs NO, that there is no satisfying assignment, then we say NO, there is no satisfying assignment for f. What if it produces a satisfying assignment? So suppose it gives us a satisfying assignment sigma prime which satisfies f prime. Well, we have to produce a satisfying assignment for f. How do we transform this satisfying assignment sigma prime into a satisfying assignment for f? What we just saw from this proof is that if we ignore the assignment for the new variables and keep the assignment for the original variables the same, then we get a satisfying assignment for f. So we take this satisfying assignment for f prime, we ignore the assignment for all of the new variables, and the assignment for the original variables gives us a satisfying assignment for f. And that completes our reduction. One last thing to note is the size of f prime. Our original SAT input f has n variables and m clauses. How many variables does f prime have in the worst case? Well, we might create order n new variables for each clause, so it has order nm variables in the worst case. And we're also replacing every clause by order n clauses in the worst case, so we have order nm clauses. But this is okay, because the size of f prime is polynomial in the size of f. So if we have an algorithm whose running time is polynomial in the size of f prime, it's still polynomial in the size of f as well. So this completes our first NP-completeness proof.
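Putting the pieces together, the whole SAT algorithm can be sketched end to end. This is a minimal sketch under stated assumptions: literals are nonzero integers, `reduce_to_3sat` repeats the construction from this lecture, and `brute_force_3sat` is only a hypothetical exponential-time stand-in for the polynomial time black box, which the lecture assumes but does not provide.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

Formula = List[List[int]]
Assignment = Dict[int, bool]


def reduce_to_3sat(f: Formula, n: int) -> Tuple[Formula, int]:
    """Build f' from f; auxiliary variables are numbered from n+1 upward."""
    f_prime: Formula = []
    next_var = n + 1
    for clause in f:
        k = len(clause)
        if k <= 3:
            f_prime.append(list(clause))
            continue
        ys = list(range(next_var, next_var + k - 3))
        next_var += k - 3
        f_prime.append([clause[0], clause[1], ys[0]])
        for j in range(1, k - 3):
            f_prime.append([-ys[j - 1], clause[j + 1], ys[j]])
        f_prime.append([-ys[-1], clause[k - 2], clause[k - 1]])
    return f_prime, next_var - 1


def brute_force_3sat(f_prime: Formula, num_vars: int) -> Optional[Assignment]:
    """Stand-in for the black box: tries all assignments (exponential time)."""
    for bits in product([False, True], repeat=num_vars):
        sigma = {v: bits[v - 1] for v in range(1, num_vars + 1)}
        if all(any(sigma[abs(l)] == (l > 0) for l in c) for c in f_prime):
            return sigma
    return None


def solve_sat(
    f: Formula,
    n: int,
    solve_3sat: Callable[[Formula, int], Optional[Assignment]] = brute_force_3sat,
) -> Optional[Assignment]:
    """Return a satisfying assignment for f, or None for the NO answer."""
    f_prime, total_vars = reduce_to_3sat(f, n)
    sigma_prime = solve_3sat(f_prime, total_vars)
    if sigma_prime is None:
        return None                                       # f' has no solution, so neither does f
    return {v: sigma_prime[v] for v in range(1, n + 1)}   # ignore the auxiliary variables


f = [[1, 2, 3, 4, 5], [-1], [-2], [-3], [-4]]
print(solve_sat(f, 5))            # {1: False, 2: False, 3: False, 4: False, 5: True}
print(solve_sat(f + [[-5]], 5))   # None
```

A clause with k literals contributes k-3 variables and k-2 clauses, so f prime has at most n plus the total number of literals in f as its variable count, consistent with the order nm bound above.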

## 528 Practice Problems

Now that you've seen your first NP-completeness proof, the fact that 3SAT is NP-complete, there are a few relevant practice problems from the textbook that you can try. In problem 8.3, they consider a variant of SAT called Stingy SAT, and you want to prove that Stingy SAT is NP-complete. In problem 8.8, you consider Exact 4-SAT. This means that every clause has exactly four literals, not at most four literals but exactly four literals in every clause. And you want to prove that Exact 4-SAT is NP-complete.