# Task 2 

## Problem Statement

_Implement a circuit that returns $\lvert01\rangle$ and $\lvert10\rangle$ with equal probability._

**Requirements:**
- The circuit should consist only of CNOTs, RXs and RYs. 
- Start from all parameters in parametric gates being equal to 0 or randomly chosen. 
- You should find the right set of parameters using gradient descent (you might use more advanced optimization methods if you like). 
- Simulations must be done with sampling - i.e. a limited number of measurements per iteration and noise. 

Compare the results for different numbers of measurements: 1, 10, 100, 1000. 

**Bonus question:
How to make sure you produce state $\lvert01\rangle$ + $\lvert10\rangle$ and not $\lvert01\rangle$ - $\lvert10\rangle$?**

---

I've decided to use **Qiskit 0.20.0** for this task - it's the one I'm most comfortable with. Version details follow...

In [None]:
import qiskit
qiskit.__qiskit_version__

## 1. Creating a reference circuit

For starters, I decided to make a reference circuit that creates the state $\lvert\psi\rangle = \lvert01\rangle + \lvert10\rangle$ (up to normalization and a global phase) using only RX, RY, and CX gates. The intention behind this was to have something against which I could compare the circuits that my program _learnt_. This was not strictly necessary - I could have skipped the circuit altogether and just initialized the `Statevector()` object, but this way I don't have to worry about normalizing the state manually. It also lets me experiment with different reference circuits easily and run the circuits using the `QasmSimulator()`.

In [None]:
# imports and general configuration
import random
import numpy as np

from qiskit import QuantumCircuit, execute, Aer
from qiskit.visualization import plot_histogram, plot_state_qsphere as plot_q
from qiskit.quantum_info import Statevector
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

config = {
    'output': 'mpl',
    'qsphere_fig_size': (6.5, 6.5)
}
statevector_sim = Aer.get_backend('statevector_simulator')
qasm_sim = Aer.get_backend('qasm_simulator')

In [None]:
t2_ref_ckt = QuantumCircuit(2)

# X gate on qubit 1 (RX(pi) equals X up to a global phase)
t2_ref_ckt.rx(np.pi, 1)
# Hadamard on qubit 0 (RY(pi/2) followed by RX(pi) equals H up to a global phase)
t2_ref_ckt.ry(np.pi/2, 0) 
t2_ref_ckt.rx(np.pi, 0)
# CNOT with qubit 0 as control and qubit 1 as target
t2_ref_ckt.cx(0,1)

# evolving the Statevector() using the QuantumCircuit() to reach the final state
# this way there's no need for a simulator 
ref_sv = Statevector.from_label('00').evolve(t2_ref_ckt)

# plot the state on the q-sphere for visualization
plot_q(ref_sv, figsize=config['qsphere_fig_size'], show_state_phases=True)

## 2. Approach 1

This first approach is straightforward. The order of the gates is fixed and I'm attempting to learn the parameters of the RX and RY gates using Gradient Descent...  
The easiest way to generate $\lvert\psi\rangle$ is to apply the Bell circuit (H + CX) to the initial state $\lvert01\rangle$ (kets in this section are written with qubit 0 first). Since the CX gate is unparametrized and is the last gate we apply, we only need to generate the state that CX maps to $\lvert\psi\rangle$, which (up to normalization) turns out to be $\lvert\psi^\prime\rangle = \lvert01\rangle + \lvert11\rangle = \lvert+\rangle\otimes\lvert1\rangle$.

This state is completely separable, which means I can treat each qubit individually and optimize one at a time.
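
As a quick sanity check of this reasoning, here is a minimal NumPy sketch (not part of the Qiskit workflow) that applies a CX matrix to $\lvert+\rangle\otimes\lvert1\rangle$. It assumes kets are written with qubit 0 first, so the basis order is $\lvert00\rangle, \lvert01\rangle, \lvert10\rangle, \lvert11\rangle$ with qubit 0 as the control; the variable names are purely illustrative.

```python
import numpy as np

# Single-qubit basis states and |+>
zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)
plus = (zero + one) / np.sqrt(2)

# CX with qubit 0 (first ket position) as control and qubit 1 as target,
# in the assumed basis order |00>, |01>, |10>, |11>
cx = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=complex)

# |psi'> = |+> on qubit 0, |1> on qubit 1
psi_prime = np.kron(plus, one)
psi = cx @ psi_prime

# (|01> + |10>) / sqrt(2)
target = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)
print(np.allclose(psi, target))
```

Because the target state is symmetric under swapping the two qubits, the same check works if the kets are read in Qiskit's reversed order.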

The structure of the circuit, before the final CX gate, is as shown below...

In [None]:
# Initialize random circuit parameters

# q0_thetas is a list of 2 angles (between 0 and 2*pi) which parametrize the RY and RX gates on qubit 0 respectively
q0_thetas = 2 * np.pi * np.random.rand(2)
# same as q0_thetas but for qubit 1
q1_thetas = 2 * np.pi * np.random.rand(2)

t2_ckt = QuantumCircuit(2)

t2_ckt.ry(q0_thetas[0], 0)
t2_ckt.rx(q0_thetas[1], 0)

t2_ckt.ry(q1_thetas[0], 1)
t2_ckt.rx(q1_thetas[1], 1)

t2_ckt.measure_all()
print("Circuit created with random initialization of parameters...")
t2_ckt.draw('mpl')

## 2.1 Calculation

Gradient Descent is a fairly simple algorithm to implement from scratch. Since we're working with individual qubits and optimizing over 2 gates only, it is easy to construct the unitary matrix representing the effect of the circuit on each qubit.  
  
  
Let $\theta_1$ be the rotation angle of the RY gate, and $\theta_2$ be the rotation angle of the RX gate.
  
The unitary matrices corresponding to the RY and RX gates are given by:
$$
\begin{aligned}
    RY(\theta_1) & = 
    \begin{bmatrix} 
        \cos\frac{\theta_1}{2} & -\sin\frac{\theta_1}{2} \\ 
        \sin\frac{\theta_1}{2} & \cos\frac{\theta_1}{2}
    \end{bmatrix}\\
    RX(\theta_2) & = 
    \begin{bmatrix} 
        \cos\frac{\theta_2}{2} & -i\sin\frac{\theta_2}{2} \\ 
        -i\sin\frac{\theta_2}{2} & \cos\frac{\theta_2}{2}
    \end{bmatrix}\\
\end{aligned}
$$  

### Arriving at the final state in terms of parameters $\theta_1$ and $\theta_2$:
  
We start in the state $\lvert0\rangle = \begin{bmatrix} 1\\0 \end{bmatrix}$, and then apply $RY(\theta_1)$ followed by $RX(\theta_2)$:

$$
\begin{bmatrix} 
    1 \\
    0 
\end{bmatrix}
\xrightarrow{RY(\theta_1)}
\begin{bmatrix} 
    \cos\frac{\theta_1}{2} \\
    \sin\frac{\theta_1}{2}
\end{bmatrix}
\xrightarrow{RX(\theta_2)}
\begin{bmatrix} 
    \cos\frac{\theta_1}{2}\cos\frac{\theta_2}{2} - i\sin\frac{\theta_1}{2}\sin\frac{\theta_2}{2} \\
    \sin\frac{\theta_1}{2}\cos\frac{\theta_2}{2} - i\cos\frac{\theta_1}{2}\sin\frac{\theta_2}{2}
\end{bmatrix}
$$

### Probabilities of measuring $\lvert0\rangle$ and $\lvert1\rangle$:
$$
\begin{aligned}
P_0 & = \cos^2\frac{\theta_1}{2}\cos^2\frac{\theta_2}{2} + \sin^2\frac{\theta_1}{2}\sin^2\frac{\theta_2}{2} \\
P_1 & = \cos^2\frac{\theta_1}{2}\sin^2\frac{\theta_2}{2} + \sin^2\frac{\theta_1}{2}\cos^2\frac{\theta_2}{2}
\end{aligned}
$$
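
To double-check these expressions numerically, here is a small NumPy sketch. It builds the RY and RX matrices in the standard convention, applies them to $\lvert0\rangle$, and compares the resulting probabilities with the formulas for $P_0$ and $P_1$. The helper names `ry`, `rx` and `measurement_probs` are my own for illustration and are not Qiskit functions.

```python
import numpy as np

def ry(theta):
    """Matrix of RY(theta) in the standard convention."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rx(theta):
    """Matrix of RX(theta) in the standard convention."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)

def measurement_probs(theta_1, theta_2):
    """Return (P0, P1) after RY(theta_1) then RX(theta_2) acting on |0>."""
    state = rx(theta_2) @ ry(theta_1) @ np.array([1, 0], dtype=complex)
    probs = np.abs(state) ** 2
    return probs[0], probs[1]

theta_1, theta_2 = 0.7, 1.9  # arbitrary test angles
p0, p1 = measurement_probs(theta_1, theta_2)

# Closed-form expressions from the text
c1, s1 = np.cos(theta_1 / 2), np.sin(theta_1 / 2)
c2, s2 = np.cos(theta_2 / 2), np.sin(theta_2 / 2)
p0_formula = c1**2 * c2**2 + s1**2 * s2**2
p1_formula = c1**2 * s2**2 + s1**2 * c2**2

print(np.isclose(p0, p0_formula), np.isclose(p1, p1_formula))

# The same P0 in a more compact form: (1 + cos(theta_1) * cos(theta_2)) / 2
p0_compact = (1 + np.cos(theta_1) * np.cos(theta_2)) / 2
print(np.isclose(p0, p0_compact))
```

The compact form $P_0 = \frac{1}{2}(1 + \cos\theta_1\cos\theta_2)$ follows from the half-angle identities and is handy when differentiating.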

### Cost Function:

The cost function must be one that penalizes a circuit which yields measurement probabilities that we don't want.

#### For qubit 0:
We want qubit 0 to end up in a state with equal probabilities of measuring $\lvert0\rangle$ and $\lvert1\rangle$, like  $\lvert+\rangle = \frac{\lvert0\rangle + \lvert1\rangle}{\sqrt{2}}$, so a reasonable cost function is:
$$
J = \frac{1}{2}(P_0 - 0.5)^2 + \frac{1}{2}(P_1 - 0.5)^2
$$

#### For qubit 1:
We want qubit 1 to end up in the state $\lvert1\rangle$, so we would like $P_0$ to be $0$ and $P_1$ to be $1$: 
$$
J = \frac{1}{2}(P_0)^2 + \frac{1}{2}(P_1 - 1)^2
$$  
where the factors of $\frac{1}{2}$ are taken for convenience with derivatives...  
<font color=green>Because we're starting out in a state along the Z-axis of the Bloch sphere, and applying rotations along the X and Y axes, the 2 rotations effectively act independently, and so we can optimize them independently without having to worry about backpropagation of errors.</font>
  
  
### Partial Derivatives

At any step during the optimization, gradient descent changes the value of each parameter by an amount proportional to the partial derivative of the cost function with respect to that parameter at that point. Since $\frac{d}{d\theta}\cos^2\frac{\theta}{2} = -\frac{1}{2}\sin\theta$ and $\frac{d}{d\theta}\sin^2\frac{\theta}{2} = \frac{1}{2}\sin\theta$:
$$
\begin{aligned}
    \frac{\partial P_0}{\partial \theta_1} & = -\frac{1}{2}\sin\theta_1\cos^2\frac{\theta_2}{2} + \frac{1}{2}\sin\theta_1\sin^2\frac{\theta_2}{2} \\
    & = -\frac{1}{2}\sin\theta_1\cos\theta_2 \\
    \frac{\partial P_0}{\partial \theta_2} & = -\frac{1}{2}\cos^2\frac{\theta_1}{2}\sin\theta_2 + \frac{1}{2}\sin^2\frac{\theta_1}{2}\sin\theta_2 \\
    & = -\frac{1}{2}\cos\theta_1\sin\theta_2 \\
    \frac{\partial P_1}{\partial \theta_1} & = \frac{1}{2}\cos^2\frac{\theta_2}{2}\sin\theta_1 - \frac{1}{2}\sin^2\frac{\theta_2}{2}\sin\theta_1 \\
    & = \frac{1}{2}\sin\theta_1\cos\theta_2 \\
    \frac{\partial P_1}{\partial \theta_2} & = -\frac{1}{2}\sin^2\frac{\theta_1}{2}\sin\theta_2 + \frac{1}{2}\cos^2\frac{\theta_1}{2}\sin\theta_2 \\
    & = \frac{1}{2}\cos\theta_1\sin\theta_2 \\
\end{aligned}
$$
  
  
Substituting these values into the cost function for qubit 0:
$$
\begin{aligned}
    \frac{\partial J}{\partial \theta_1} & = (P_0-0.5)\left(-\frac{1}{2}\sin\theta_1\cos\theta_2\right) + (P_1-0.5)\left(\frac{1}{2}\sin\theta_1\cos\theta_2\right)\\
    & = \frac{1}{2}\sin\theta_1\cos\theta_2(P_1-P_0)\\
    \textrm{and}\\
    \frac{\partial J}{\partial \theta_2} & = (P_0-0.5)\left(-\frac{1}{2}\sin\theta_2\cos\theta_1\right) + (P_1-0.5)\left(\frac{1}{2}\sin\theta_2\cos\theta_1\right)\\
    & = \frac{1}{2}\sin\theta_2\cos\theta_1(P_1-P_0)\\
\end{aligned}
$$

The constant factor of $\frac{1}{2}$ only rescales the step, so it can be absorbed into the learning rate $\alpha$; the code below drops it.
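
A finite-difference comparison is a cheap way to catch sign or factor mistakes in a hand-derived gradient. The sketch below does this for the qubit 0 cost, using exact probabilities (no sampling) and the compact form $P_0 = \frac{1}{2}(1 + \cos\theta_1\cos\theta_2)$. With the factors of $\frac{1}{2}$ in $J$ kept, the analytic gradient is $\frac{1}{2}\sin\theta_1\cos\theta_2(P_1 - P_0)$ for $\theta_1$, and the same with the indices swapped for $\theta_2$. All function names are illustrative.

```python
import numpy as np

def probs(theta_1, theta_2):
    """Exact (P0, P1) for RY(theta_1) then RX(theta_2) on |0>."""
    p0 = (1 + np.cos(theta_1) * np.cos(theta_2)) / 2
    return p0, 1 - p0

def cost_q0(theta_1, theta_2):
    """Qubit 0 cost: J = 1/2 (P0 - 0.5)^2 + 1/2 (P1 - 0.5)^2."""
    p0, p1 = probs(theta_1, theta_2)
    return 0.5 * (p0 - 0.5) ** 2 + 0.5 * (p1 - 0.5) ** 2

def analytic_grad_q0(theta_1, theta_2):
    """Hand-derived gradient of cost_q0."""
    p0, p1 = probs(theta_1, theta_2)
    d1 = 0.5 * np.sin(theta_1) * np.cos(theta_2) * (p1 - p0)
    d2 = 0.5 * np.sin(theta_2) * np.cos(theta_1) * (p1 - p0)
    return np.array([d1, d2])

def numeric_grad(f, theta_1, theta_2, eps=1e-6):
    """Central finite-difference gradient of f."""
    d1 = (f(theta_1 + eps, theta_2) - f(theta_1 - eps, theta_2)) / (2 * eps)
    d2 = (f(theta_1, theta_2 + eps) - f(theta_1, theta_2 - eps)) / (2 * eps)
    return np.array([d1, d2])

theta_1, theta_2 = 0.7, 1.9  # arbitrary test angles
grad_a = analytic_grad_q0(theta_1, theta_2)
grad_n = numeric_grad(cost_q0, theta_1, theta_2)
print(grad_a, grad_n, np.allclose(grad_a, grad_n, atol=1e-8))
```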

## Qubit 0

Optimizing the RY and RX angles for qubit 0, which should give us $\lvert0\rangle$ and $\lvert1\rangle$ with equal probability... 

In [None]:
q0_ckt = QuantumCircuit(1)

q0_ckt.ry(q0_thetas[0], 0)
q0_ckt.rx(q0_thetas[1], 0)
q0_ckt.measure_all()

q0_ckt.draw(config['output'])

In [None]:
q0_initial_state = Statevector.from_label('0')
# learning rate
alpha = 0.01

In [None]:
shots = 1000
for _ in range(1000):
    
    # TODO: show with statevector simulator also
    result = execute(q0_ckt, backend=qasm_sim, shots=shots).result()
    counts = result.get_counts()
    print(counts)
    
    # an outcome that was never observed is missing from counts, so default to 0
    p0 = counts.get('0', 0) / shots
    p1 = counts.get('1', 0) / shots
    cost = (p0 - 0.5) ** 2 + (p1 - 0.5) ** 2
    print("cost: ", cost)
    
    # partial derivatives of the cost above with respect to the RY and RX angles
    partial_1 = np.sin(q0_thetas[0]) * np.cos(q0_thetas[1]) * (p1 - p0)
    partial_2 = np.sin(q0_thetas[1]) * np.cos(q0_thetas[0]) * (p1 - p0)

    # gradient descent step
    q0_thetas = [q0_thetas[0] - alpha * partial_1, q0_thetas[1] - alpha * partial_2]
    
    # TODO: instead of recreating, explore adding a new gate with just the delta angle
    q0_ckt = QuantumCircuit(1)
    q0_ckt.ry(q0_thetas[0], 0)
    q0_ckt.rx(q0_thetas[1], 0)
    q0_ckt.measure_all()
    
print(q0_thetas)

In [None]:
q0_ckt.draw(config['output'])
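
The problem statement asks for a comparison across 1, 10, 100 and 1000 measurements per iteration. A minimal sketch of how that comparison could be set up for qubit 0 is below. To keep it self-contained it does not call Qiskit: a binomial draw from the exact $P_1$ stands in for the `qasm_simulator`, so the only noise modelled is shot noise. The function names, the fixed seed and the error metric (distance of the final exact $P_1$ from 0.5) are assumptions of mine.

```python
import numpy as np

def p1_exact(theta_1, theta_2):
    """Exact probability of measuring |1> after RY(theta_1), RX(theta_2) on |0>."""
    p1 = (1 - np.cos(theta_1) * np.cos(theta_2)) / 2
    return float(np.clip(p1, 0.0, 1.0))

def run_descent(shots, steps=1000, alpha=0.01, seed=0):
    """Gradient descent on the qubit 0 cost with `shots` samples per step.

    Returns |P1 - 0.5| evaluated exactly at the final angles.
    """
    rng = np.random.default_rng(seed)
    thetas = 2 * np.pi * rng.random(2)
    for _ in range(steps):
        # sampled estimate of P1 (stand-in for running the circuit)
        n1 = rng.binomial(shots, p1_exact(*thetas))
        p1 = n1 / shots
        p0 = 1 - p1
        grad = np.array([
            np.sin(thetas[0]) * np.cos(thetas[1]) * (p1 - p0),
            np.sin(thetas[1]) * np.cos(thetas[0]) * (p1 - p0),
        ])
        thetas = thetas - alpha * grad
    return abs(p1_exact(*thetas) - 0.5)

errors = {shots: run_descent(shots) for shots in (1, 10, 100, 1000)}
for shots, err in errors.items():
    print(f"shots={shots:5d}  |P1 - 0.5| = {err:.4f}")
```

Using the same seed for every shot count means all runs start from the same random angles, so differences in the final error come from the sampling noise rather than from the initialization.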

## Qubit 1

Optimizing the RY and RX angles for qubit 1, which should give us $\lvert1\rangle$ with probability $1$... 
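
For this qubit the same chain rule applies to the cost $J = \frac{1}{2}P_0^2 + \frac{1}{2}(P_1 - 1)^2$. A short derivation, using $P_0 = \frac{1}{2}(1 + \cos\theta_1\cos\theta_2)$ and $P_1 = 1 - P_0$:

$$
\begin{aligned}
    \frac{\partial J}{\partial \theta_1} & = P_0\frac{\partial P_0}{\partial \theta_1} + (P_1 - 1)\frac{\partial P_1}{\partial \theta_1} \\
    & = P_0\left(-\frac{1}{2}\sin\theta_1\cos\theta_2\right) + (P_1 - 1)\left(\frac{1}{2}\sin\theta_1\cos\theta_2\right) \\
    & = \frac{1}{2}\sin\theta_1\cos\theta_2(P_1 - P_0 - 1) \\
    \frac{\partial J}{\partial \theta_2} & = \frac{1}{2}\sin\theta_2\cos\theta_1(P_1 - P_0 - 1)
\end{aligned}
$$

The loop below uses the same expressions without the constant $\frac{1}{2}$, which only rescales the learning rate.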

In [None]:
q1_ckt = QuantumCircuit(1)

q1_ckt.ry(q1_thetas[0], 0)
q1_ckt.rx(q1_thetas[1], 0)
q1_ckt.measure_all()

q1_ckt.draw(config['output'])

In [None]:
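alpha = 0.01
shots = 1000
for _ in range(1000):
    result = execute(q1_ckt, backend=qasm_sim, shots=shots).result()
    counts = result.get_counts()
    print(counts)
    
    # once the qubit is close to |1>, '0' may never be observed and is then
    # missing from counts, so default to 0 instead of raising a KeyError
    p0 = counts.get('0', 0) / shots
    p1 = counts.get('1', 0) / shots
    cost = (p0) ** 2 + (p1 - 1) ** 2
    print("cost: ", cost)
    
    # partial derivatives of the cost above with respect to the RY and RX angles
    partial_1 = np.sin(q1_thetas[0]) * np.cos(q1_thetas[1]) * (p1 - p0 - 1)
    partial_2 = np.sin(q1_thetas[1]) * np.cos(q1_thetas[0]) * (p1 - p0 - 1)

    # gradient descent step, then rebuild the circuit with the updated angles
    q1_thetas = [q1_thetas[0] - alpha * partial_1, q1_thetas[1] - alpha * partial_2]
    q1_ckt = QuantumCircuit(1)
    q1_ckt.ry(q1_thetas[0], 0)
    q1_ckt.rx(q1_thetas[1], 0)
    q1_ckt.measure_all()
    
print(q1_thetas)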

In [None]:
q1_ckt.draw(config['output'])

## Putting it all together...

In [None]:
final_ckt = QuantumCircuit(2)

final_ckt.ry(q0_thetas[0], 0)
final_ckt.rx(q0_thetas[1], 0)
final_ckt.ry(q1_thetas[0], 1)
final_ckt.rx(q1_thetas[1], 1)
final_ckt.cx(0,1)

final_ckt.measure_all()

final_ckt.draw('mpl')

In [None]:
res = execute(final_ckt, backend=qasm_sim, shots=4096).result()
plot_histogram(res.get_counts())

In [None]:
res2 = execute(final_ckt.remove_final_measurements(inplace=False), backend=statevector_sim).result()
sv = res2.get_statevector()
Statevector(sv).probabilities_dict()

#final_ckt.draw('mpl')
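
On the bonus question: counts in the computational basis cannot tell $\lvert01\rangle + \lvert10\rangle$ apart from $\lvert01\rangle - \lvert10\rangle$, since both give the same probabilities. One possible check, assuming the simulated statevector is available (as in the cell above), is to read off the relative phase between the two amplitudes. The sketch below uses plain NumPy arrays as stand-ins for the statevector. `relative_phase` is a hypothetical helper, and indices 1 and 2 are assumed to correspond to the labels `'01'` and `'10'`.

```python
import numpy as np

def relative_phase(statevector, idx_a=1, idx_b=2):
    """Phase (in radians) of amplitude idx_b relative to amplitude idx_a."""
    a, b = statevector[idx_a], statevector[idx_b]
    return float(np.angle(b / a))

# Stand-ins for the two candidate states
plus_state = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)
minus_state = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

# A global phase does not affect the relative phase
rotated_plus = np.exp(1j * 0.3) * plus_state

print(relative_phase(plus_state))
print(relative_phase(minus_state))
print(relative_phase(rotated_plus))
```

A relative phase near $0$ indicates the $+$ combination and one near $\pm\pi$ the $-$ combination. This only works in simulation; on hardware the phase would have to be inferred from measurements in another basis.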
