# Tutorial Q1 - Qubit rotation

This tutorial demonstrates the basic working principles of openqml for qubit-based backends. We look at a single quantum function consisting of a single-qubit circuit. The task is to optimize the angles of two rotation gates so that the qubit flips from state $|0\rangle$ to state $|1\rangle$.

## Imports

First we need to import openqml, as well as openqml's version of numpy. This allows us to automatically compute gradients for functions that manipulate numpy arrays, including quantum functions. We call this numpy version `onp` in case we need it alongside the original version.

In [9]:
import openqml as qm
from openqml import numpy as onp
from openqml._optimize import GradientDescentOptimizer, AdagradOptimizer

Next, create a "device" to run the quantum node. We only need a single quantum wire. This example uses the default qubit simulator.


In [10]:
dev1 = qm.device('default.qubit', wires=1)

## Quantum function

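We define a quantum function called `circuit`. It is decorated with `qm.qfunc(dev1)` so that it runs on the device created above, and it takes the two rotation angles as a single `weights` array.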

In [11]:
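@qm.qfunc(dev1)
def circuit(weights):
    # Rotate the qubit on wire 0 around the x-axis, then around the y-axis
    qm.RX(weights[0], [0])
    qm.RY(weights[1], [0])

    # Return the expectation value of Pauli-Z on wire 0
    return qm.expectation.PauliZ(0)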

This function uses openqml to run the following quantum circuit:

<img src="figures/rotation_circuit.png">

Starting with a qubit in the ground state, 

$$ |0\rangle = \begin{pmatrix}1 \\ 0 \end{pmatrix}, $$

we first rotate the qubit around the x-axis by 
$$R_x(w_0) = e^{-iw_0 X /2} = 
\begin{pmatrix} \cos \frac{w_0}{2} &  -i \sin \frac{w_0}{2} \\  
                -i \sin \frac{w_0}{2} &  \cos \frac{w_0}{2} 
\end{pmatrix}, $$ 
               
and then around the y-axis by 
$$ R_y(w_1) = e^{-i w_1 Y/2} = 
\begin{pmatrix} \cos \frac{w_1}{2} &  - \sin \frac{w_1}{2} \\  
                \sin \frac{w_1}{2} &  \cos \frac{w_1}{2} 
\end{pmatrix}. $$ 

After these operations the qubit is in the state

$$ | \psi \rangle = R_y(w_1) R_x(w_0) | 0 \rangle. $$

Finally, we measure the expectation $ \langle \psi | Z | \psi \rangle $ of the Pauli-Z operator 
$$Z = 
\begin{pmatrix} 1 &  0 \\  
                0 & -1 
\end{pmatrix}. $$ 


Depending on the circuit parameters $w_0$ and $w_1$, the output expectation lies between $1$ (if $| \psi \rangle = | 0 \rangle$) and $-1$ (if $| \psi \rangle = | 1 \rangle$).
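As a cross-check of the matrices above, the following sketch reproduces the circuit with plain NumPy instead of openqml. The helper names (`rx`, `ry`, `expval_z`) are illustrative and not part of openqml. Multiplying out the matrices suggests that the expectation reduces to $\cos(w_0)\cos(w_1)$, and the sketch compares the matrix computation against that closed form.

```python
import numpy as np

def rx(theta):
    """Rotation around the x-axis, exp(-i * theta * X / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(theta):
    """Rotation around the y-axis, exp(-i * theta * Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

PAULI_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expval_z(w0, w1):
    """Expectation of Pauli-Z after applying RX(w0), then RY(w1), to |0>."""
    ket0 = np.array([1, 0], dtype=complex)
    # The gate applied first sits rightmost in the matrix product.
    psi = ry(w1) @ rx(w0) @ ket0
    return float(np.real(np.vdot(psi, PAULI_Z @ psi)))

value = expval_z(0.01, 0.01)
closed_form = np.cos(0.01) * np.cos(0.01)
print(value, closed_form)
```

Choosing `w0 = 0` and `w1 = np.pi` in this sketch gives an expectation of $-1$ up to rounding, which is the flipped state the tutorial aims for.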

## Objective

Next, we define a cost. Here the cost is simply the output of the circuit, the expectation of the Pauli-Z measurement. Minimizing it drives the expectation toward $-1$, that is, toward the state $|1\rangle$.

In [4]:
def objective(weights):
    return circuit(weights)

With this objective, the optimization procedure is supposed to find the weights that rotate the qubit from the ground state 

 <img src="figures/bloch_before.png" width="250"> 
 
 to the excited state
 
 <img src="figures/bloch_after.png" width="250">
 
 The rotation gates give the optimization landscape a trigonometric shape with four global minima and five global maxima.
 
 <img src="figures/optlandscape.png" width="450">
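The count of extrema can be checked with a short sketch. It assumes the closed form $\cos(w_0)\cos(w_1)$ for the expectation, which follows from the rotation matrices given earlier, and it assumes that the plot covers both angles on $[-\pi, \pi]$; that range is a guess based on the stated count, not something read off the figure.

```python
import itertools
import numpy as np

def cost(w0, w1):
    """Closed-form Pauli-Z expectation for RX(w0) followed by RY(w1) on |0>."""
    return np.cos(w0) * np.cos(w1)

# Candidate extrema: both angles are multiples of pi within the assumed range.
angles = [-np.pi, 0.0, np.pi]
grid = list(itertools.product(angles, repeat=2))

minima = [w for w in grid if np.isclose(cost(*w), -1.0)]
maxima = [w for w in grid if np.isclose(cost(*w), 1.0)]

print("minima:", len(minima), "maxima:", len(maxima))
```

Under these assumptions the minima sit where exactly one angle is $\pm\pi$, and the maxima where both or neither are.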

 
 

## Optimization

The initial values of the x- and y-rotation parameters $w_0, w_1$ are set close to zero. This corresponds to approximately identity gates; in other words, the circuit leaves the qubit almost exactly in the ground state. *Note that at exactly zero the gradient vanishes and the optimization algorithm will not descend from the maximum.*
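The vanishing gradient can be illustrated without openqml. The sketch below assumes the closed form $\cos(w_0)\cos(w_1)$ for the expectation, which follows from the rotation matrices given earlier; `cost`, `grad`, and `numerical_grad` are hypothetical helpers, not openqml functions.

```python
import numpy as np

def cost(w):
    """Closed-form Pauli-Z expectation for the two rotation angles."""
    return np.cos(w[0]) * np.cos(w[1])

def grad(w):
    """Analytic gradient of the closed-form cost."""
    return np.array([-np.sin(w[0]) * np.cos(w[1]),
                     -np.cos(w[0]) * np.sin(w[1])])

def numerical_grad(f, w, eps=1e-6):
    """Central finite differences, used only to double-check grad()."""
    w = np.asarray(w, dtype=float)
    g = np.zeros_like(w)
    for i in range(w.size):
        shift = np.zeros_like(w)
        shift[i] = eps
        g[i] = (f(w + shift) - f(w - shift)) / (2 * eps)
    return g

at_zero = grad(np.array([0.0, 0.0]))      # exactly at the maximum
near_zero = grad(np.array([0.01, 0.01]))  # the tutorial's starting point

print("gradient at zero:     ", at_zero)
print("gradient at 0.01, 0.01:", near_zero)
```

At the origin both components are zero, so a gradient step leaves the weights unchanged. At the small offset both components are slightly negative, so a descent step increases both angles and moves away from the maximum.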

In [15]:
weights0 = onp.array([0.01, 0.01])

weights0

array([0.01, 0.01])

The value of the objective at the initial point is close to $1$.

In [17]:
objective(weights0)

0.9999000033332889

We choose a simple gradient descent optimizer with step size $0.5$ and update the weights for 100 steps, printing the objective every 5 steps. The final parameters correspond to a $Z$ expectation of nearly $-1$, which means that the qubit is flipped.

In [14]:
o = GradientDescentOptimizer(0.5)

weights = weights0
for step in range(1, 101):
    weights = o.step(objective, weights)
    if step%5==0:
        print('Objective after step {:5d}: {: .7f}'.format(step, objective(weights)) )

print()
print('Optimized rotation angles:', weights)

Objective after step     5:  0.9942561
Objective after step    10:  0.7312843
Objective after step    15:  0.0081459
Objective after step    20:  0.0000081
Objective after step    25:  0.0000000
Objective after step    30:  0.0000000
Objective after step    35:  0.0000000
Objective after step    40:  0.0000000
Objective after step    45:  0.0000000
Objective after step    50: -0.0000000
Objective after step    55: -0.0000000
Objective after step    60: -0.0000000
Objective after step    65: -0.0000000
Objective after step    70: -0.0000000
Objective after step    75: -0.0000001
Objective after step    80: -0.0000056
Objective after step    85: -0.0003210
Objective after step    90: -0.0182799
Objective after step    95: -0.5894464
Objective after step   100: -0.9988743

Optimized rotation angles: [0.03355765 3.108035  ]


Starting from a different initial point, we repeat the optimization with another optimizer, Adagrad, which adapts the step size of each parameter based on the history of its gradients.
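To make the difference between the two update rules concrete, here is a minimal NumPy sketch of both. It is not openqml's implementation: the function names, the `eps` stabilizer, and the closed-form cost $\cos(w_0)\cos(w_1)$ (derived from the rotation matrices given earlier) are assumptions made for illustration. The starting point `[-0.01, 0.02]` is also hypothetical; it is deliberately asymmetric because, with this exact gradient, a start with equal magnitudes keeps both angles locked together and the iteration settles on a saddle.

```python
import numpy as np

def cost(w):
    """Closed-form Pauli-Z expectation for the two rotation angles."""
    return np.cos(w[0]) * np.cos(w[1])

def grad(w):
    """Analytic gradient of the closed-form cost."""
    return np.array([-np.sin(w[0]) * np.cos(w[1]),
                     -np.cos(w[0]) * np.sin(w[1])])

def gradient_descent_step(w, stepsize):
    """Plain gradient descent: the same step size for every parameter."""
    return w - stepsize * grad(w)

def make_adagrad_step(stepsize, eps=1e-8):
    """Adagrad: divide each step by the root of the summed squared gradients."""
    accum = None
    def step(w):
        nonlocal accum
        g = grad(w)
        accum = g ** 2 if accum is None else accum + g ** 2
        return w - stepsize * g / (np.sqrt(accum) + eps)
    return step

def run(step_fn, w_init, steps=100):
    """Apply step_fn repeatedly and return the final weights."""
    w = np.array(w_init, dtype=float)
    for _ in range(steps):
        w = step_fn(w)
    return w

w_start = [-0.01, 0.02]  # hypothetical, slightly asymmetric start
w_gd = run(lambda w: gradient_descent_step(w, 0.5), w_start)
w_ada = run(make_adagrad_step(0.5), w_start)

print("gradient descent:", cost(w_gd), w_gd)
print("Adagrad:         ", cost(w_ada), w_ada)
```

In this sketch the first Adagrad step has a length of almost exactly the step size in each coordinate, regardless of how small the gradient is, which is one way to see why it can leave a flat region faster than plain gradient descent.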

In [19]:
weights0 = onp.array([-0.01, 0.01])
print('Initial rotation angles:', weights0)

o = AdagradOptimizer(0.5)

weights = weights0
for step in range(1, 101):
    weights = o.step(objective, weights)
    if step%5==0:
        print('Objective after step {:5d}: {: .7f}'.format(step, objective(weights)) )

print()
print('Optimized rotation angles:', weights)

Initial rotation angles: [-0.01  0.01]
Objective after step     5:  0.0001331
Objective after step    10:  0.0000000
Objective after step    15:  0.0000000
Objective after step    20:  0.0000000
Objective after step    25:  0.0000000
Objective after step    30: -0.0000000
Objective after step    35: -0.0000000
Objective after step    40: -0.0000000
Objective after step    45: -0.0000000
Objective after step    50: -0.0000000
Objective after step    55: -0.0000019
Objective after step    60: -0.0005601
Objective after step    65: -0.1376308
Objective after step    70: -0.9702268
Objective after step    75: -0.9998837
Objective after step    80: -0.9999996
Objective after step    85: -1.0000000
Objective after step    90: -1.0000000
Objective after step    95: -1.0000000
Objective after step   100: -1.0000000

Optimized rotation angles: [-9.24310272e-09  3.14159264e+00]


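Adagrad and gradient descent find the same minimum, and, since neither has information on second-order derivatives, both take a detour through a saddle point, visible as the long plateau where the objective stays near $0$. However, Adagrad takes considerably fewer steps: it reaches an objective of about $-0.97$ by step 70, whereas gradient descent needs roughly 100 steps to get below $-0.99$.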
 
 <img src="figures/gd_vs_adag.png" width="450">
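The saddle point can be examined with a small sketch. It assumes the closed form $\cos(w_0)\cos(w_1)$ for the expectation, which follows from the rotation matrices given earlier, and it assumes that the plateau near $0$ corresponds to the stationary point at $(\pi/2, \pi/2)$ or its mirror image $(-\pi/2, \pi/2)$. The helpers `grad` and `hessian` are illustrative and not part of openqml.

```python
import numpy as np

def grad(w):
    """Analytic gradient of cos(w0) * cos(w1)."""
    return np.array([-np.sin(w[0]) * np.cos(w[1]),
                     -np.cos(w[0]) * np.sin(w[1])])

def hessian(w):
    """Analytic matrix of second derivatives of cos(w0) * cos(w1)."""
    c0, c1 = np.cos(w[0]), np.cos(w[1])
    s0, s1 = np.sin(w[0]), np.sin(w[1])
    return np.array([[-c0 * c1, s0 * s1],
                     [s0 * s1, -c0 * c1]])

saddle = np.array([np.pi / 2, np.pi / 2])
minimum = np.array([0.0, np.pi])

saddle_eigs = np.linalg.eigvalsh(hessian(saddle))
minimum_eigs = np.linalg.eigvalsh(hessian(minimum))

print("gradient at saddle:", grad(saddle))
print("Hessian eigenvalues at saddle: ", saddle_eigs)
print("Hessian eigenvalues at minimum:", minimum_eigs)
```

At the assumed saddle the gradient is zero up to rounding and the Hessian has one negative and one positive eigenvalue, so a first-order method slows down there and only escapes along the direction of negative curvature. At the minimum both eigenvalues are positive.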