# P1: Learning a single-qubit wavefunction variationally.
We are given a single-qubit wave function $\psi$ and seek to approximate it. Any normalized $\psi$ can be written as the first column of some unitary $U$, that is, $\psi = U|0\rangle$. We build a parameterized unitary $\text{varU}(\alpha, \beta)$ whose first column $\psi'(\alpha, \beta) = \text{varU}(\alpha, \beta)|0\rangle$ approximates $\psi$ up to a global phase. The parameters $\alpha$ and $\beta$ are learned using gradient descent.

### Overview

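We use **2 qubits** (one prepared in $\psi$, the other in $\psi'$, so that the two states can be compared with a SWAP test) and learn $\psi$ using the variational wave function $\psi'(\alpha, \beta) = \text{varU}(\alpha, \beta)|0\rangle$, which has **2 variational parameters**.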
The loss function is taken to be $\mathcal{L}(\psi') = 1 - |\langle\psi|\psi'\rangle|^2$. Why? The key ingredient is the following identity (motivated by the fact that equality holds in Cauchy-Schwarz exactly when the inner product of two unit vectors has unit modulus):
$$\mathcal{L}(\psi') = 1 - |\langle\psi|\psi'\rangle|^2 = \Vert \psi - \langle\psi'|\psi\rangle \psi' \Vert^2$$
So our loss function is the usual $L^2$ (squared-error) loss, where the 'predicted value' $\psi'$ tries to estimate the 'label' $\psi$, but in a quantum setting where global phases are unimportant: $\langle\psi'|\psi\rangle \psi'$ is the projection of $\psi$ onto $\psi'$, so multiplying $\psi'$ by a phase leaves the loss unchanged.
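
For completeness, the identity can be checked by expanding the norm. This assumes $\psi$ and $\psi'$ are both normalized and uses the shorthand $c = \langle\psi'|\psi\rangle$ (introduced here only for this derivation), so that $\langle\psi|\psi'\rangle = \bar{c}$:
$$\begin{aligned}
\Vert \psi - c\,\psi' \Vert^2 &= \langle\psi|\psi\rangle - c\,\langle\psi|\psi'\rangle - \bar{c}\,\langle\psi'|\psi\rangle + |c|^2\,\langle\psi'|\psi'\rangle \\
&= 1 - |c|^2 - |c|^2 + |c|^2 \\
&= 1 - |\langle\psi|\psi'\rangle|^2
\end{aligned}$$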

The explicit (Frobenius) distance between $U$ and $\text{varU}$ is computed in the ``dist`` function. This value need not be small, because it is not what we minimize and it is not indicative of the loss. What we minimize is the loss function, i.e. the distance (up to a phase) between the first columns of the two unitaries.

In this example, we use the target unitary $U = R_z(0.3)R_y(1.5)R_z(0.8)$.
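
As a sanity check, the target state can be built with plain NumPy, independently of PennyLane. The sketch below assumes the standard conventions $R_z(\phi) = \mathrm{diag}(e^{-i\phi/2}, e^{i\phi/2})$ and a real-valued $R_y(\theta)$; the helper names `rz` and `ry` are introduced here for illustration. It also suggests where the optimizer should end up: $R_z(0.8)|0\rangle$ is just a phase times $|0\rangle$, so the two-parameter ansatz at $(\alpha, \beta) = (0.3, 1.5)$ reproduces $\psi$ up to a global phase.

```python
import numpy as np

def rz(phi):
    # R_z(phi) = diag(exp(-i phi/2), exp(+i phi/2))
    return np.array([[np.exp(-1j*phi/2), 0], [0, np.exp(1j*phi/2)]])

def ry(theta):
    # R_y(theta) is a real rotation by theta/2 in the (|0>, |1>) plane
    c, s = np.cos(theta/2), np.sin(theta/2)
    return np.array([[c, -s], [s, c]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)

# Gates applied first appear rightmost in the matrix product.
U = rz(0.3) @ ry(1.5) @ rz(0.8)
psi = U @ ket0

# Ansatz state at (alpha, beta) = (0.3, 1.5): RZ(alpha) RY(beta) |0>
psi_opt = rz(0.3) @ ry(1.5) @ ket0
fidelity = abs(np.vdot(psi, psi_opt))**2

print(np.round(psi, 4))   # agrees with the Ψ printed by the notebook to four decimals
print(fidelity)           # 1 up to floating-point rounding
```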

In [424]:
import pennylane as qml
from pennylane.devices.default_qubit import DefaultQubit
import pennylane.numpy as np
from prettytable import PrettyTable

In [425]:
dev = DefaultQubit(wires=2, shots=4096)
table = PrettyTable(field_names=("Iter", "params (α, β)", "grad(params)", "Ψ'(α,β)"), float_format='.4', alignment='r')
np.set_printoptions(floatmode='fixed', precision=4) # like cout << fixed; cout.precision(4), but this applies to numpy arrays only.

In [426]:
# the two 1-qubit operations, one given and one variational.
def U(wire):
    qml.RZ(0.8, wires=wire)
    qml.RY(1.5, wires=wire)
    qml.RZ(0.3, wires=wire)

def varU(params, wire):
    #qml.RZ(params[2], wires=wire)
    # 2 parameters suffice. A third parameter γ would only map |0> -> exp(-iγ/2)|0>, so varU|0> = Ψ' would become exp(-iγ/2)Ψ',
    # which is just Ψ' with a global phase. The phase of Ψ' is adjusted in post-processing anyway, so the third parameter is not needed.
    qml.RY(params[1], wires=wire)
    qml.RZ(params[0], wires=wire)

@qml.qnode(dev)
def circuit(params):
    # create the state \psi \otimes \psi'
    U(0)
    varU(params, 1)
    
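    # implement the (destructive) SWAP test: CNOT + Hadamard map the Bell basis to the computational basis, then we measure.
    # The singlet state is mapped to |11>, and P(11) = (1 - |<Ψ|Ψ'>|^2)/2.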
    qml.CNOT(wires=[0, 1])
    qml.Hadamard(0)

    counts = qml.sample(wires=range(2))
    return counts

def cost(params, n_repeats:int=3): # renamed from `iter`, which shadows the Python builtin
    prob_11 = np.zeros(n_repeats) # we average the estimated probability of [1, 1] over n_repeats runs of the circuit.
    for i in range(n_repeats):
        samples = list(circuit(params)) # one [bit0, bit1] measurement per shot
        count_11 = sum(np.all(measurement == [1, 1]) for measurement in samples)
        prob_11[i] = count_11/len(samples)
    return 2*np.average(prob_11) # loss = 1 - |<Ψ|Ψ'>|^2 = 2*P(11)
    ### Note that cost can return exactly 0 even if Ψ' is not exactly Ψ: if Ψ' is close enough to Ψ, then [1, 1] has a tiny probability and may never occur in a finite sample, giving a cost of exactly 0. ###

def dist(params):
    #print("U:", qml.matrix(U)(0), "varU:", qml.matrix(varU)(params, 0), sep='\n')
    return np.linalg.norm(qml.matrix(U)(0) - qml.matrix(varU)(params, 0)) # The Frobenius norm of the difference between the two unitaries.
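
The factor of 2 in `cost` comes from the SWAP-test identity $P(11) = \tfrac{1}{2}\left(1 - |\langle\psi|\psi'\rangle|^2\right)$. A minimal NumPy sketch that checks this on an exact statevector, with no sampling, is given below. It assumes qubit 0 is the most significant bit in the `np.kron` ordering; `random_state` and `exact_cost` are names introduced here and are not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state():
    # random normalized single-qubit state
    v = rng.normal(size=2) + 1j*rng.normal(size=2)
    return v/np.linalg.norm(v)

H = np.array([[1, 1], [1, -1]])/np.sqrt(2)
I2 = np.eye(2)
# CNOT with qubit 0 as control and qubit 1 as target, basis order |00>, |01>, |10>, |11>
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def exact_cost(psi, psi_prime):
    state = np.kron(psi, psi_prime)            # psi on qubit 0, psi' on qubit 1
    state = np.kron(H, I2) @ (CNOT @ state)    # CNOT first, then Hadamard on qubit 0
    p11 = abs(state[3])**2                     # exact probability of outcome 11
    return 2*p11

psi, psi_prime = random_state(), random_state()
lhs = exact_cost(psi, psi_prime)
rhs = 1 - abs(np.vdot(psi, psi_prime))**2
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

The sampled `cost` in the notebook estimates the same quantity, so it fluctuates around this exact value from call to call.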

In [427]:
# computing grad(params) numerically
def e_i(dim, i): 
    ans = np.zeros(dim); ans[i] = 1; return ans

def grad(params, epsilon:float=0.01):
    # for i in range(3):
    #     print(params + epsilon*e_i(len(params), i), params - epsilon*e_i(len(params), i))
    return np.array([(cost(params + epsilon*e_i(len(params), i)) - cost(params - epsilon*e_i(len(params), i)))/(2*epsilon) for i in range(len(params))])
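
Because `cost` is estimated from a finite number of shots, the central difference above is both quantized and noisy. The sketch below works out the smallest possible nonzero gradient entry for the notebook's settings (4096 shots, 3 repetitions, ε = 0.01), together with a rough standard deviation. The noise estimate assumes that shots are independent Bernoulli trials and that $P(11)$ is about the same at both evaluation points; `grad_std` is a name introduced here for illustration.

```python
import math

shots, n_repeats, epsilon = 4096, 3, 0.01

# cost = 2 * (number of 11 outcomes) / (n_repeats * shots), so it changes in steps of:
cost_step = 2/(n_repeats*shots)
# the central difference divides a difference of two costs by 2*epsilon:
grad_step = cost_step/(2*epsilon)

def grad_std(p11):
    """Rough standard deviation of one gradient component when P(11) is about p11."""
    cost_std = 2*math.sqrt(p11*(1 - p11)/(n_repeats*shots))  # binomial standard error of the cost
    return math.sqrt(2)*cost_std/(2*epsilon)                  # two independent cost evaluations

print(f"gradient resolution: {grad_step:.4f}")
print(f"gradient noise at P(11) = 0.01: {grad_std(0.01):.2f}")
```

The gradient entries in the output table further down are integer multiples of this resolution (for example, 0.0488 is 6 steps of about 0.0081). With `tol = 0.02`, the stopping test in `optimize` can therefore only pass when the two counts behind each component differ by at most 2. The noise estimate is on the order of 0.1 even close to the optimum, which is larger than `tol`, so more shots or a larger ε may give a steadier stopping criterion.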

In [428]:
# the optimizer. We use simple gradient descent, no backprop required.
def optimize(params_init=None, tol:float=2e-2, stepsize:float=0.8, max_iter:int=100):
    params = np.array([0., 0.], requires_grad=True) if params_init is None else params_init
    gradient = grad(params)
    # log the gradient that is actually used for the first step: cost is stochastic, so a second grad(params) call would give a different value.
    table.add_row([0, np.array(params), np.array(gradient), qml.matrix(varU)(params, 0)[:, 0]])
    for it in range(1, max_iter+1):
        #print(qml.matrix(varU)(params, 0)[:, 0], np.array(gradient))
        params = params - stepsize*gradient
        if it % 4 == 0: # log every 4th iteration: the updated params and the gradient that produced the update
            table.add_row([it, np.array(params), np.array(gradient), qml.matrix(varU)(params, 0)[:, 0]]) # the np.array cast is to print without the 'requires_grad=True' string
        
        # stop once the gradient is small. An all-zero gradient is not accepted as convergence, since it can be a sampling artifact (see the note in cost).
        if np.max(np.abs(gradient)) < tol and not np.all(gradient == np.zeros(len(gradient))):
            if it % 4: table.add_row([it, np.array(params), np.array(gradient), qml.matrix(varU)(params, 0)[:, 0]]) # it % 4 avoids printing the last iteration twice.
            break
        gradient = grad(params)
    return params
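
To see what the same update rule does without shot noise, it can be run on the exact loss. This is a minimal NumPy sketch, not the notebook's optimizer. It assumes the ansatz state $\psi'(\alpha, \beta) = (e^{-i\alpha/2}\cos(\beta/2),\ e^{i\alpha/2}\sin(\beta/2))$, which is $R_z(\alpha)R_y(\beta)|0\rangle$ in the standard convention; `ansatz`, `exact_loss` and `fd_grad` are names introduced here.

```python
import numpy as np

def ansatz(params):
    alpha, beta = params
    return np.array([np.exp(-1j*alpha/2)*np.cos(beta/2),
                     np.exp(1j*alpha/2)*np.sin(beta/2)])

# Target U|0> = RZ(0.3) RY(1.5) RZ(0.8)|0>: the ansatz at (0.3, 1.5) times the phase exp(-0.4i).
psi = np.exp(-0.4j)*ansatz([0.3, 1.5])

def exact_loss(params):
    return 1 - abs(np.vdot(psi, ansatz(params)))**2

def fd_grad(params, epsilon=0.01):
    # central differences, same epsilon as the notebook's grad()
    g = np.zeros(len(params))
    for i in range(len(params)):
        step = np.zeros(len(params))
        step[i] = epsilon
        g[i] = (exact_loss(params + step) - exact_loss(params - step))/(2*epsilon)
    return g

params = np.array([0.0, 0.0])
for _ in range(100):
    params = params - 0.8*fd_grad(params)   # same stepsize as optimize()

print(np.round(params, 4), exact_loss(params))  # params approach (0.3, 1.5), loss approaches 0
```

Comparing this with the sampled run gives a feel for how much of the trajectory in the table is driven by shot noise rather than by the loss landscape.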

In [429]:
print("Ψ = U|0⟩ =", qml.matrix(U)(0)[:, 0])
params = optimize()
print(table)
print()
print("U:", qml.matrix(U)(0), "Estimate for U:", qml.matrix(varU)(params, 0), sep='\n') # qml.matrix(U) returns a (parametric) matrix counterpart (with same args as the function U) of the quantum operation U.
print(f"Distance between U and estimate for U: d = {dist(params):.4f}")

Ψ = U|0⟩ = [0.6238-0.3824j 0.6604-0.1686j]
+------+-----------------+-------------------+---------------------------------+
| Iter |  params (α, β)  |    grad(params)   |             Ψ'(α,β)             |
+------+-----------------+-------------------+---------------------------------+
|  0   | [0.0000 0.0000] |  [0.0570 0.3174]  | [1.0000+0.0000j 0.0000+0.0000j] |
|  4   | [0.2344 1.2630] | [-0.1628 -0.3255] | [0.8016-0.0944j 0.5863+0.0690j] |
|  8   | [0.2995 1.4974] | [ 0.0000 -0.0488] | [0.7244-0.1093j 0.6731+0.1015j] |
|  9   | [0.3060 1.4974] | [-0.0081  0.0000] | [0.7240-0.1116j 0.6727+0.1037j] |
+------+-----------------+-------------------+---------------------------------+

U:
[[ 0.6238-0.3824j -0.6604-0.1686j]
 [ 0.6604-0.1686j  0.6238+0.3824j]]
Estimate for U:
[[ 0.7240-0.1116j -0.6727+0.1037j]
 [ 0.6727+0.1037j  0.7240+0.1116j]]
Distance between U and estimate for U: d = 0.5616


In [430]:
# finally, a bit of post-processing to get \psi' to match \psi (right now it does, but only up to a global phase)
# \psi' ≈ exp(i\theta)\psi, so exp(i\theta) ≈ psi'[0]/psi[0] -> set psi' = psi'/(psi'[0]/psi[0]). Now the first elements of psi and psi' match exactly, and we should expect the second elements to be very close to each other (Ψ' ≈ Ψ now).
# Note that psi'[0]/psi[0] has modulus only approximately 1, so this division also rescales psi' very slightly.
psi = qml.matrix(U)(0)[:, 0]
psi_ = qml.matrix(varU)(params, 0)[:, 0]
phase = psi_[0]/psi[0]
psi_ = psi_/phase
print(f"Finally, we have:\nΨ  = {psi}\nΨ' = {psi_}")

Finally, we have:
Ψ  = [0.6238-0.3824j 0.6604-0.1686j]
Ψ' = [0.6238-0.3824j 0.6597-0.1643j]
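
An alternative to dividing by `psi_[0]/psi[0]` is to remove only the phase of the overlap $\langle\psi|\psi'\rangle$. This keeps $\psi'$ normalized and does not rely on the first component being nonzero. The sketch below takes the rounded values printed by the notebook as inputs, so its results are accurate only to about four decimals; `psi_aligned` is a name introduced here.

```python
import numpy as np

# rounded values copied from the notebook output above
psi  = np.array([0.6238-0.3824j, 0.6604-0.1686j])   # target Ψ
psi_ = np.array([0.7240-0.1116j, 0.6727+0.1037j])   # learned Ψ' before phase alignment

overlap = np.vdot(psi, psi_)                         # <Ψ|Ψ'>; np.vdot conjugates its first argument
psi_aligned = psi_*np.conj(overlap)/abs(overlap)     # multiply by a pure phase so that <Ψ|Ψ'> becomes real and positive

fidelity = abs(overlap)**2                           # equals 1 - loss
print(f"Ψ  = {psi}\nΨ' = {psi_aligned}\nfidelity = {fidelity:.4f}")
```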
