>Suppose that there is a polynomial-time randomized algorithm that, for any input graph $G$ and natural number $k$, with probability at least 2/3, correctly reports whether $G$ has a clique of size $k$. Assume that, on a graph with $n$ vertices, the algorithm makes at most $\log_2 n$ randomized steps, each time choosing uniformly at random between two options; all other steps are deterministic.
>
>Prove that, in this case, P = NP.

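Since _the algorithm makes at most $\log_2 n$ randomized steps, each time choosing uniformly at random between two options_, on an input graph with $n$ vertices the randomized algorithm $A$ has at most $2^{\log_2 n} = n$ computation branches, one for each sequence of random choices.

Define a deterministic algorithm $B$ that simulates $A$ on each of these at most $n$ sequences of random choices and outputs the majority answer. (A branch that uses fewer than $\log_2 n$ random steps is padded to length $\log_2 n$, so that all sequences are equally likely.) Since $A$ is correct with probability at least 2/3, at least 2/3 of the sequences lead to the correct answer, so the majority answer is always correct. Each simulation takes polynomial time and there are at most $n$ of them, hence $B$ takes polynomial time, which means the clique decision problem is solvable in deterministic polynomial time.

The clique decision problem is NP-complete, thus P = NP.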

> Let $\phi$ be a formula in conjunctive normal form with an arbitrary non-zero number of literals in every clause. Assume that all literals within one clause are different and, for every variable $x$, formula $\phi$ contains at most one of the clauses $(x)$ and $(\neg x)$.
>
>Describe an algorithm that, given such a formula $\phi$ with $m$ clauses, computes an assignment that satisfies at least $0.6m$ clauses of $\phi$ on average.

We know that a uniformly random assignment satisfies $\frac{7}{8}m$ clauses of a 3-CNF formula in expectation. We use the same method here: choose between $x_i = 1$ and $x_i = 0$ with equal probability 1/2, then derandomize by computing, at each step, the conditional expected number of satisfied clauses for both $x_i=1$ and $x_i = 0$, and fixing $x_i$ to the value with the bigger expected value. It is easy to see that the expected value is $\frac{3}{4}m$ for a 2-CNF formula and $\frac{1}{2}m$ for a 1-CNF formula. Let this approach be $A_1$. It generalizes to $k$-CNF, where it satisfies at least $(1-\frac{1}{2^k})m$ clauses if every clause has exactly $k$ literals.

For a formula with mixed clause lengths, let $m_k$ be the number of clauses with exactly $k$ literals. Then

$E(\phi) \ge \sum_{k\ge 1} (1-\frac{1}{2^k}) m_k$
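The derandomization step of $A_1$ can be sketched as below. The clause encoding (a list of signed integers, `3` for $x_3$ and `-3` for $\neg x_3$) and the function names are assumptions made for this example, and the code assumes that no clause contains both a variable and its negation.

```python
from fractions import Fraction

def expected_satisfied(clauses, assignment):
    """Expected number of satisfied clauses when unassigned variables are fair coin flips."""
    total = Fraction(0)
    for clause in clauses:
        free = 0
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                free += 1
        # An unsatisfied clause with `free` open literals fails with probability 2^-free.
        total += 1 if satisfied else 1 - Fraction(1, 2 ** free)
    return total

def conditional_expectations(clauses, num_vars):
    """Fix the variables one by one, never letting the conditional expectation drop."""
    assignment = {}
    for var in range(1, num_vars + 1):
        assignment[var] = True
        e_true = expected_satisfied(clauses, assignment)
        assignment[var] = False
        e_false = expected_satisfied(clauses, assignment)
        assignment[var] = e_true >= e_false
    return assignment

clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
result = conditional_expectations(clauses, 3)
print(result, expected_satisfied(clauses, result))
```

Once every variable is fixed, `expected_satisfied` simply counts the satisfied clauses, so the final count is at least the initial expectation.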
  

We can see that in $A_1$ the guarantee gets weaker as $k$ decreases. Instead of choosing $x \in \{0,1\}$ uniformly at random, define an algorithm $A_2$ that relaxes the restriction to $[0,1]$:  

Let $x_1, \dots, x_n$ denote the $n$ variables of $\phi$.  
Let $c_1, \dots, c_m$ denote the $m$ clauses of $\phi$.  
Let $y_i \in [0,1]$ denote the probability that $x_i=1$.  
Let $z_j \in [0,1]$, for each clause $c_j$, be the relaxation of the indicator with $z_j = 1$ if the clause is satisfied and $z_j = 0$ otherwise.  
Let $l_j$ denote the number of literals in clause $c_j$.  
Let $P_j$ denote the set of indices of the variables that occur without negation in $c_j$.  
Let $N_j$ denote the set of indices of the variables that occur with negation in $c_j$.  

We have a linear program:

Maximize $\sum^m_{j=1} z_j$  
subject to $\sum_{i \in P_j}y_i + \sum_{i \in N_j}(1-y_i) \ge z_j$ for every clause $c_j = \bigvee_{i \in P_j} x_i \lor \bigvee_{i \in N_j} \overline{x_i}$,  
$y_i\in[0,1]$ for every $i$, and $z_j\in[0,1]$ for every $j$.

Let $(y^*, z^*)$ be an optimal solution of the linear programming relaxation. We set $x_i=1$ with probability $y^*_i$, independently for each $x_i$.

Hence $Pr[c_j \text{ is not satisfied}] = \prod_{i \in P_j} (1-y^*_i)\prod_{i \in N_j} y^*_i$

__The inequality of arithmetic and geometric means__

$(\prod_{i \in P_j} (1-y^*_i)\prod_{i \in N_j} y^*_i)^{1/l_j} \le \frac{1}{l_j} (\sum_{i \in P_j}(1-y^*_i) + \sum_{i \in N_j} y^*_i)$

$\implies Pr[c_j \text{ is not satisfied}] \le [\frac{1}{l_j} (\sum_{i \in P_j}(1-y^*_i) + \sum_{i \in N_j} y^*_i)]^{l_j} = [1 - \frac{1}{l_j}(\sum_{i \in P_j}y^*_i + \sum_{i \in N_j} (1-y^*_i))]^{l_j} \le (1-\frac{z^*_j}{l_j})^{l_j}$

__Fact: if a function $f(x)$ is concave on $[0,1]$ ($f^{''} \le 0$), and $f(0)=a, f(1)=b+a$, then $f(x) \ge bx+a$ for $x\in[0,1]$__

The function $f(z) = 1-(1-\frac{z}{l_j})^{l_j}$ is concave on $[0,1]$, with $f(0) = 0$ and $f(1) = 1-(1-\frac{1}{l_j})^{l_j}$, so

$Pr[c_j \text{ is satisfied}] \ge 1-(1-\frac{z^*_j}{l_j})^{l_j} \ge [1-(1-\frac{1}{l_j})^{l_j}]z^*_j$

$E(\phi) = \sum^m_{j=1} Pr[c_j \text{ is satisfied}] \ge \sum^m_{j=1}[1-(1-\frac{1}{l_j})^{l_j}] z^*_j$

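Finally, put $A_1$ and $A_2$ together: run both, $A_1$ setting $x_i=1$ with probability 1/2 and $A_2$ setting $x_i=1$ with probability $y^*_i$, and keep the better of the two assignments. Since $z^*_j \le 1$, clause $c_j$ contributes at least $(1-\frac{1}{2^{l_j}}) z^*_j$ to the expectation of $A_1$, then

$\max(A_1, A_2) \ge \frac{A_1 + A_2}{2} \ge \frac{1}{2}\sum^m_{j=1} [(1-\frac{1}{2^{l_j}}) + 1-(1-\frac{1}{l_j})^{l_j}] z^*_j$

When $k=1,2$, $(1-\frac{1}{2^k}) + [1-(1-\frac{1}{k})^{k}] = \frac{3}{2}$, and for $k \ge 3$, $(1-\frac{1}{2^k}) + [1-(1-\frac{1}{k})^{k}] \ge \frac{7}{8} + 1-\frac{1}{e} \ge \frac{3}{2}$.

Thus $\max(A_1, A_2) \ge \frac{3}{4} \sum^m_{j=1}z^*_j$, where $\sum^m_{j=1}z^*_j$ is at least the maximum number of simultaneously satisfiable clauses, because every assignment gives a feasible solution of the linear program.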
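The per-clause inequality behind the factor $\frac{3}{4}$ is easy to check numerically. The snippet below is only a sanity check over a finite range of $k$ (the range is an arbitrary choice), not a proof; `combined_bound` is a name introduced here.

```python
def combined_bound(k):
    """Sum of the two per-clause guarantees for a clause with k literals."""
    uniform = 1 - 2 ** (-k)            # fair coin flips (A_1)
    rounding = 1 - (1 - 1 / k) ** k    # LP rounding (A_2), for z_j = 1
    return uniform + rounding

for k in range(1, 8):
    print(k, round(combined_bound(k), 4))

# The sum never drops below 3/2 on the tested range.
print(min(combined_bound(k) for k in range(1, 200)) >= 1.5 - 1e-12)
```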

---

To speak frankly, I got the above from the internet. It looks like it can satisfy at least $0.6m$ clauses on average, but it is so complicated.

For every variable $x$ there is at most one of the clauses $(x)$ and $(\neg x)$, and the fewer literals a clause has, the lower its expected value. So a simpler approach is to handle the unit clauses first.

Suppose there is a clause $(x)$, so that there is no clause $(\neg x)$. Let $C_n$ be the set of clauses that contain $\neg x$ and $C_p$ the set of clauses that contain $x$. We only count clauses with at most 2 literals, because for a clause with $k\ge 3$ literals, after setting one of its literals to 0, the expected value of the clause is still $1-\frac{1}{2^{k-1}} \ge 3/4$.

Set $x=0$ if $|C_n| > |C_p|$, set $x=1$ if $|C_n| < |C_p|$, and choose randomly when $|C_n| = |C_p|$.

The worst case would be something like $(\neg x_1 \lor \neg x_2) \land x_1 \land x_2$. For $x_1$, $|C_n| = |C_p|=1$. If we set $x_1=1$, the clause $(x_1)$ is satisfied and the rest of the formula becomes $\neg x_2 \land x_2$, so we can satisfy $\frac{2}{3}m$ clauses. If we set $x_1=0$, the first clause is satisfied and the rest of the formula becomes $() \land x_2$, so we still satisfy $\frac{2}{3}m$ clauses.

When there is no unit clause, we choose between $x_i=0$ and $x_i=1$ with probability 1/2 and fix $x_i$ to the value with the bigger expected value, so the expected value never decreases and the expected number of satisfied clauses of $\phi$ is $\ge \frac{3}{4}m$.

Since both $\frac{2}{3} \ge 0.6$ and $\frac{3}{4} \ge 0.6$, this computes an assignment that satisfies at least $0.6m$ clauses of $\phi$ on average.
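For comparison, here is a sketch of a different, standard technique (biased coins, not the case analysis above) that reaches $0.6m$ directly: a variable with a unit clause is set so that the unit clause is satisfied with probability $p = 0.6$, and every other variable is set by a fair coin. Every literal is then false with probability at most $0.6$, so a unit clause is satisfied with probability $0.6$ and a clause with at least two literals with probability at least $1 - 0.6^2 = 0.64$. The signed-integer clause encoding and the function names are assumptions of this example, which also assumes that no clause contains both a variable and its negation.

```python
def biased_probabilities(clauses, num_vars, p=0.6):
    """Probability of setting each variable to True: biased toward its unit clause."""
    prob_true = {var: 0.5 for var in range(1, num_vars + 1)}
    for clause in clauses:
        if len(clause) == 1:
            lit = clause[0]
            prob_true[abs(lit)] = p if lit > 0 else 1 - p
    return prob_true

def expected_satisfied_biased(clauses, prob_true):
    """Exact expected number of satisfied clauses under independent biased coins."""
    total = 0.0
    for clause in clauses:
        prob_all_false = 1.0
        for lit in clause:
            q = prob_true[abs(lit)]
            prob_all_false *= (1 - q) if lit > 0 else q
        total += 1 - prob_all_false
    return total

# The example formula (not x1 or not x2) and x1 and x2 from the text.
example = [[-1, -2], [1], [2]]
probs = biased_probabilities(example, 2)
print(expected_satisfied_biased(example, probs))
```

The hypothesis that $\phi$ contains at most one of $(x)$ and $(\neg x)$ is what makes the bias well defined for every variable.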

> Recall that ZPP is the class of problems decidable by probabilistic algorithms that always return a correct answer and whose expected running time is polynomial in the size of the input.
>
> Prove that $ZPP = RP \cap coRP$.


1. Prove $ZPP \subseteq RP \cap coRP$

Markov's inequality states that $P(X \ge a) \le \frac{E(X)}{a}$ for a nonnegative random variable $X$. Let $a = 2E(X) \implies P(X \ge 2E(X)) \le 1/2$

Assume we have a Las Vegas algorithm $A$ for a language $L$, and an input $w$. Let $X$ be the running time of $A$ on $w$. Run $A$ for twice its expected running time (more precisely, for $2p(|w|)$ steps, where the polynomial $p$ bounds the expected running time). If it returns an answer before the time is up, return that answer, which is always correct. By Markov's inequality, the time runs out with probability at most 1/2.

If it doesn't return an answer, return "No". The resulting algorithm $A'$ can never go wrong for $w \notin L$, and we have:    
If $w \in L$, $Pr[A' \;\text{rejects}\; w] \le 1/2 \implies Pr[A' \;\text{accepts}\; w] \ge 1/2$.  
If $w \notin L$, $Pr[A' \;\text{rejects}\; w] = 1$.  
Hence $L \in RP$.

If we instead return "Yes" after the time is up, the resulting algorithm $A''$ can never go wrong for $w \in L$, and we have:  
If $w \in L$, $Pr[A'' \;\text{accepts}\; w] = 1$.  
If $w \notin L$, $Pr[A'' \;\text{accepts}\; w] \le 1/2 \implies Pr[A'' \;\text{rejects}\; w] \ge 1/2$.   
Hence $L \in coRP$.

Thus $ZPP \subseteq RP \cap coRP$
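The truncation can be sketched as follows, under the assumption that a Las Vegas computation is modelled as a Python generator that yields `None` for each unfinished step and finally yields its answer. `toy_las_vegas` is a hypothetical stand-in with expected running time 2 steps, so the budget is 4.

```python
import random

def truncate(las_vegas_steps, budget, default):
    """Run a step-wise Las Vegas computation for at most `budget` steps.

    Returns its (always correct) answer if it finishes in time, else `default`.
    """
    for steps_used, result in enumerate(las_vegas_steps, start=1):
        if result is not None:
            return result
        if steps_used >= budget:
            break
    return default

def toy_las_vegas(truth, rng):
    # Hypothetical zero-error algorithm: each step finishes with probability 1/2.
    while True:
        if rng.random() < 0.5:
            yield truth
            return
        yield None

rng = random.Random(0)
# default=False gives the RP-style algorithm: non-members are never accepted.
never_accepts = all(truncate(toy_las_vegas(False, rng), 4, False) is False
                    for _ in range(1000))
accepts = sum(truncate(toy_las_vegas(True, rng), 4, False) for _ in range(1000))
print(never_accepts, accepts)
```

Passing `default=True` instead gives the coRP-style variant, which never rejects a member.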

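2. Prove $RP \cap coRP \subseteq ZPP$

Assume a language $L$ has an RP algorithm $A$ and a coRP algorithm $B$.

Given an input $w$, run $A$ on $w$; if it accepts, accept $w$, because $A$ never accepts an input outside $L$. Otherwise run $B$ on $w$; if it rejects, reject $w$, because $B$ never rejects an input in $L$. If $A$ returns "No" and $B$ returns "Yes", we repeat. The answer is always correct. Each round ends the algorithm with probability at least 1/2 (if $w \in L$, $A$ accepts with probability at least 1/2; if $w \notin L$, $B$ rejects with probability at least 1/2), so the expected number of rounds is at most 2 and the expected running time is polynomial.  
Hence $RP \cap coRP \subseteq ZPP$

Thus $ZPP = RP \cap coRP$.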
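The loop from part 2 can be sketched as below. The two deciders are hypothetical toys for the language of even integers, chosen only so that the one-sided error guarantees hold; `zero_error_decide` is a name introduced here.

```python
import random

def zero_error_decide(rp_alg, corp_alg, w, rng):
    """Repeat until one of the two one-sided algorithms gives a reliable answer."""
    rounds = 0
    while True:
        rounds += 1
        if rp_alg(w, rng):        # an RP algorithm never accepts a non-member
            return True, rounds
        if not corp_alg(w, rng):  # a coRP algorithm never rejects a member
            return False, rounds

def toy_rp(w, rng):
    # Accepts even numbers with probability 1/2, never accepts odd numbers.
    return w % 2 == 0 and rng.random() < 0.5

def toy_corp(w, rng):
    # Always accepts even numbers, rejects odd numbers with probability 1/2.
    return w % 2 == 0 or rng.random() < 0.5

rng = random.Random(1)
answers = [zero_error_decide(toy_rp, toy_corp, w, rng)[0] for w in range(20)]
print(answers)
```

Each round stops with probability at least 1/2, so the number of rounds is geometrically distributed with mean at most 2.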

> Suppose that $NP \subseteq BPP$. Show that, in this case, there is an algorithm that runs in expected polynomial time and, given a satisfiable Boolean formula $\phi$, finds a satisfying assignment for $\phi$ with probability at least 2/3.

Since $NP \subseteq BPP$, there is a polynomial-time randomized algorithm that decides $SAT$ correctly with probability at least $2/3$. We can amplify it by running it independently $k$ times and taking the majority answer. Let the amplified algorithm be $A$; for $k$ polynomial in $n$, the error probability of $A$ is $2^{-poly(n)}$, where $n$ is the input size. Let algorithm $B$ run as follows:

For a formula $\phi$, set $x_1=0$ and let $A$ decide whether $\phi[x_1=0]$ is satisfiable. If it accepts, we proceed to the next variable. If it rejects, set $x_1=1$ and let $A$ decide whether $\phi[x_1=1]$ is satisfiable. If it accepts, we proceed to the next variable. If it rejects, $B$ rejects. If $A$ accepts every time, we have an assignment; if this assignment satisfies $\phi$, $B$ outputs it and accepts, otherwise $B$ rejects. Since $A$ runs in polynomial time and is called at most $2n$ times, $B$ runs in polynomial time.

If $\phi$ is unsatisfiable, $B$ always rejects. If $\phi$ is satisfiable, $B$ can fail only if one of the at most $2n$ calls to $A$ gives a wrong answer, which by the union bound happens with probability at most $2n \cdot 2^{-poly(n)}$. We can adjust $k$ to make this error probability at most 1/3.
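The search-to-decision procedure $B$ can be sketched as follows. The decision procedure is passed in as `decide_sat`; here it is replaced by an exponential brute-force check, which is only a stand-in for the amplified BPP algorithm $A$. The signed-integer clause encoding and all function names are assumptions of this example.

```python
from itertools import product

def substitute(clauses, var, value):
    """Simplify a CNF (list of lists of signed ints) under var := value."""
    true_lit = var if value else -var
    result = []
    for clause in clauses:
        if true_lit in clause:
            continue  # clause already satisfied
        result.append([lit for lit in clause if abs(lit) != var])
    return result

def brute_force_sat(clauses):
    # Stand-in decision oracle; exponential time, for illustration only.
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product((False, True), repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(any(a[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False

def find_assignment(clauses, num_vars, decide_sat):
    """Fix the variables one by one, asking the oracle after each tentative choice."""
    assignment = {}
    current = clauses
    for var in range(1, num_vars + 1):
        for value in (False, True):
            candidate = substitute(current, var, value)
            if decide_sat(candidate):
                assignment[var] = value
                current = candidate
                break
        else:
            return None  # the oracle rejected both values
    # Final check, so a wrong oracle answer can never produce a bad assignment.
    ok = all(any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in clauses)
    return assignment if ok else None

print(find_assignment([[1, 2], [-1, 3], [-2, -3]], 3, brute_force_sat))
```

The final verification is what guarantees that the procedure never outputs a non-satisfying assignment, even if the oracle errs.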

> Let us slightly change the rules of the Generalized Geography game: it is no longer forbidden to visit the same node again and again, but every edge may be used at most once throughout the game.
>
> In other words, two players alternatingly move a token from one node to another node over an edge that was previously not used (while nodes may be visited several times). The first player that arrives at a node without unused edges loses. If this happens, the other player wins.
>
>Consider the following problem:
>
>Input: A graph and a start node.  
>Question: Does the first player have a winning strategy in the version of the Generalized Geography game described above?
>
>Prove that this problem is PSPACE-hard.

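Let $\phi$ be a TQBF (true quantified Boolean formula) instance with $n$ variables and $m$ clauses. We reduce TQBF to the Generalized Geography game described in the question by constructing a directed graph in polynomial time, as follows. We assume that $\phi$ is in prenex form with a CNF matrix, that the quantifiers alternate starting with $\exists x_1$, and that the last variable $x_n$ is universally quantified and occurs in no clause; all of this can be ensured by adding dummy variables. The assumption on $x_n$ makes the second player the one who moves from the last variable gadget to a clause node, and it ensures that no clause edge leads back into the last gadget.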

1. __Construct directed graph in polynomial time__

- For each variable $x_i$, construct a gadget with 4 nodes: a start node $s_i$ with outdegree 2, with one edge to the node $x_i$ and the other to the node $\neg x_i$; the nodes $x_i$ and $\neg x_i$, each with outdegree 1, whose edges go to an end node $e_i$; and the end node $e_i$ with outdegree 1, whose edge goes to the start node $s_{i+1}$ of the next variable.  
  
  
- The edge from an end node $e_i$ to the start node $s_{i+1}$ represents switching players: the player who chooses at $s_i$ moves to a literal node, the opponent is forced to move on to $e_i$, the first of them is forced to move to $s_{i+1}$, and so the opponent chooses at $s_{i+1}$.  
  
  
- The end node $e_n$ of the last variable has outdegree $m$, with one edge to each clause node.  
  
  
- For each clause $c_j$, let $l_j$ denote the number of literals in this clause. The clause node $c_j$ has outdegree $l_j$, with one edge to the node of the negation of each of its literals (e.g. the node of the clause $(x \lor \neg y \lor \neg z)$ has outdegree 3 and is connected to $\neg x$, $y$ and $z$).

The number of variable gadgets (each with a constant number of nodes) is equal to the number of variables in the formula, which is bounded by the input size.

The number of clause nodes is equal to the number of clauses in the formula, which is also bounded by the input size.

The number of edges is linear in the number of variables plus the total number of literals in the clauses. Thus constructing the graph takes polynomial time.
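The construction can be sketched as a function that returns the start node and the list of directed edges. The node labels (`("s", i)`, `("e", i)`, `("lit", ±i)`, `("c", j)`) and the signed-integer clause encoding are assumptions of this example; the quantifier prefix plays no role in building the graph, only in who moves at each start node.

```python
def build_edge_geography(num_vars, clauses):
    """Build the directed graph of the reduction as (start_node, edge_list)."""
    edges = []
    for i in range(1, num_vars + 1):
        start, end = ("s", i), ("e", i)
        pos, neg = ("lit", i), ("lit", -i)
        # Variable gadget: s_i -> x_i -> e_i and s_i -> not x_i -> e_i.
        edges += [(start, pos), (start, neg), (pos, end), (neg, end)]
        if i < num_vars:
            edges.append((end, ("s", i + 1)))  # hand over to the next gadget
    last_end = ("e", num_vars)
    for j, clause in enumerate(clauses):
        clause_node = ("c", j)
        edges.append((last_end, clause_node))
        for lit in clause:
            # A clause node points at the negation of each of its literals.
            edges.append((clause_node, ("lit", -lit)))
    return ("s", 1), edges

start, edges = build_edge_geography(2, [[1, -2]])
print(start, len(edges))
```

The edge count is $4n + (n-1) + m + \sum_j l_j$, in line with the polynomial bound claimed above.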

2. __Claim: If $\phi$ is True, then the first player in the modified generalized geography game has a winning strategy.__

The first player (the $\exists$ player in TQBF) starts at the start node of the first variable. If he chooses the node $x_1$, the edge into $x_1$ and the edge out of $x_1$ are both marked as used, because the second player is forced to continue to $e_1$; we read this as $x_1 = 1$. The same rule applies to $\neg x_1$, which we read as $x_1 = 0$.  

The first player is then forced to move from $e_1$ to $s_2$, which marks the edge connecting the two variable gadgets as used, and the choice switches to the second player.

If the second player chooses $\neg x_2$, the two edges incident to it are marked as used. The same rule applies to $x_2$. 

The players keep switching until all variables have been visited. Then the second player has to move to one of the clause nodes. __If a clause is False, all edges out of its clause node go to literal nodes without unused edges__ (e.g. $(x \lor \neg y \lor \neg z)$ is connected to $\neg x$, $y$ and $z$; if $x=0, y=1, z=1$, then the outgoing edges of $\neg x$, $y$ and $z$ are all marked as used). If the second player can choose a clause that is False, he forces the first player to go to such a node without unused edges, and the second player wins the game.

__If a clause is True, at least one edge out of its clause node goes to a literal node that still has an unused edge__. If $\phi$ is True, the first player can choose the values of his variables, depending on the earlier choices of the second player, so that all clauses are True. Then the second player cannot find a clause that forces the first player to a node without unused edges. Whichever clause node the second player moves to, the first player can go to the negation of one of the true literals of this clause and force the second player to go to the end node of this variable, which has no unused edges. The first player wins the game.

Thus if $\phi$ is True, the first player in the modified generalized geography game has a winning strategy.

3. __Claim: If the first player in the modified generalized geography game has a winning strategy, then $\phi$ is True.__

Suppose the first player plays a winning strategy, and read his choices at the start nodes of the $\exists$ variables as truth values. Whichever clause node the second player arrives at, the first player must be able to choose a literal node with an unused edge (the negation of a true literal of this clause) and so force the second player to move to the end node of this variable, which has no unused edges; otherwise the first player would arrive at a node without unused edges and lose. This means that all clauses are True for every choice of the second player at the $\forall$ variables, so $\phi$ is True.

So we have a polynomial-time reduction from TQBF to this new version of the Generalized Geography game. Since TQBF is PSPACE-complete, this game is PSPACE-hard.