# Fitting a Function with a Quantum Neural Network
***
Adapted from the original Xanadu Tutorial [here](https://pennylane.ai/qml/app/quantum_neural_net.html).

This example uses a variational quantum circuit to fit a one-dimensional function from noisy samples of that function. The variational circuit used is the continuous-variable quantum neural network model described [here](https://arxiv.org/abs/1806.06871).

### What is a Quantum Neural Network (QNN)?

Broadly, a QNN is any quantum circuit with **trainable** (continuous) **parameters**. 

<img src="./images/qnn.png" alt="QNN" style="width: 1000px;"/>


We use quantum nodes to physically represent the hidden layer nodes. In this case, the nodes are 'continuous-variable' photon states, which are usually called qumodes rather than qubits (each mode is on a different 'rail' or 'photonic waveguide'). We can apply standard neural network transformations to these (weights, biases, and nonlinearities) through a physical modification of the photon state (e.g. weights = mixing through beamsplitters, bias = boosting the power, nonlinearity = performing an optical measurement or applying a nonlinear optical interaction). But unlike in classical neural networks, our nodes can be **entangled** or in **superposition**. Note that N quantum nodes can encode 2^N classical nodes, in the sense that the joint state of N qubits is described by 2^N amplitudes.

### Imports

In [None]:
import pennylane as qml
from pennylane import numpy as np  # be sure to import numpy from pennylane itself!
from pennylane.optimize import AdamOptimizer

import matplotlib.pyplot as plt
%matplotlib inline

The device we use is the Strawberry Fields simulator, this time with only one quantum mode (or `wire`). You will need to have the Strawberry Fields plugin for PennyLane installed. Strawberry Fields is a library for _simulating_ a photonic quantum computer on classical hardware. This does not scale beyond a small number of modes, because the cost of the simulation grows very quickly with system size (that is the whole point of quantum computing: achieving compute capacity that classical computers cannot match).

Note: the main innovation of PennyLane is automatic differentiation (i.e. computing gradients) on the quantum circuit. This is not a trivial task. Xanadu figured out a way to compute these gradients directly on the quantum hardware itself.  
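
One technique for obtaining gradients from circuit evaluations alone is the parameter-shift rule: the same circuit is run at two shifted parameter values and the results are subtracted. The sketch below illustrates the idea on the simplest qubit case, where the expectation value is known to be `cos(theta)`. It is an illustration only: `expectation` is a hypothetical analytic stand-in for a circuit, and the continuous-variable gates used later in this notebook follow analogous but not identical rules.

```python
import math

def expectation(theta):
    # Hypothetical stand-in for running a circuit:
    # <Z> after an RX(theta) rotation of |0> equals cos(theta).
    return math.cos(theta)

def parameter_shift_gradient(f, theta, shift=math.pi / 2):
    # Two evaluations of the same "circuit" at shifted parameters.
    # For this kind of gate the result is exact, not a finite-difference estimate.
    return (f(theta + shift) - f(theta - shift)) / 2.0

theta = 0.3
grad = parameter_shift_gradient(expectation, theta)
exact = -math.sin(theta)  # analytic derivative of cos(theta)
print(grad, exact)
```

Because each gradient entry only needs ordinary circuit evaluations, the same procedure can in principle be carried out on hardware as well as on a simulator.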

In [None]:
# using the "Strawberry Fields" quantum computer simulator as our device...
device = qml.device("strawberryfields.fock", wires=1, cutoff_dim=10)

### Data

Our data will be a noisy sine curve. Let's generate one:

In [None]:
NUM_SAMPLES = 60
NOISE_SPREAD = 0.15

# seed the random function, for consistency across runs
np.random.seed(0)

# randomly sample across the x-axis
X = np.random.uniform(-1, 1, NUM_SAMPLES)

# compute corresponding y-values
Y = np.sin(np.pi * X)

# generate random noise for the y-axis
noise = np.random.normal(0, NOISE_SPREAD, NUM_SAMPLES)
Y = Y + noise

Plotting our data:


In [None]:
# plot the data
plt.figure()
plt.scatter(X, Y)
plt.xlabel("x", fontsize=18)
plt.ylabel("f(x)", fontsize=18)
plt.tick_params(axis="both", which="major", labelsize=16)
plt.tick_params(axis="both", which="minor", labelsize=16)
plt.show()

### Quantum Node

Our QNN will be composed of hidden layers that each contain only a single qubit. Remember, though, that a quantum neuron can encode the equivalent of two classical neurons (the encoding scales as $2^{N}$). For a single quantum node (qubit), the "hidden layers" of our QNN are defined as:

In [None]:
def layer(v):
    # Matrix multiplication of input layer (the weights)
    qml.Rotation(v[0], wires=0)
    qml.Squeezing(v[1], 0.0, wires=0)
    qml.Rotation(v[2], wires=0)

    # Bias
    qml.Displacement(v[3], 0.0, wires=0)

    # Element-wise nonlinear transformation
    qml.Kerr(v[4], wires=0)

where the vector `v` represents our tunable parameters. 

Rotation, squeezing, displacement, and Kerr non-linearity are all quantum transformations (like "gates"). You don't need to worry about the details, only that they are equivalent to multiplication by weights, adding a bias, and applying our nonlinear activation function.
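
To make the "weights and bias" analogy a little more concrete, here is a minimal sketch of how the first four gates of `layer` move the center of the state, i.e. the mean values of the x and p quadratures. It assumes the Strawberry Fields sign conventions and the ħ = 2 normalization, under which a real displacement `a` shifts the mean of x by `2a`. The helper names (`rotate`, `squeeze`, `displace`, `gaussian_part`) are hypothetical and not part of PennyLane. The Kerr gate is left out because it is not a Gaussian operation, so its effect cannot be described by the mean alone.

```python
import math

def rotate(mean, phi):
    # Rotation mixes the two quadratures, like a 2x2 rotation matrix.
    x, p = mean
    return (x * math.cos(phi) - p * math.sin(phi),
            x * math.sin(phi) + p * math.cos(phi))

def squeeze(mean, r):
    # Squeezing with zero phase rescales x and p in opposite directions.
    x, p = mean
    return (x * math.exp(-r), p * math.exp(r))

def displace(mean, a):
    # A real displacement adds a constant to x (assumed hbar = 2 convention).
    x, p = mean
    return (x + 2.0 * a, p)

def gaussian_part(mean, v):
    # Rotation -> squeezing -> rotation acts like a weight matrix,
    # and the displacement acts like a bias.
    mean = rotate(mean, v[0])
    mean = squeeze(mean, v[1])
    mean = rotate(mean, v[2])
    return displace(mean, v[3])

encoded = displace((0.0, 0.0), 0.5)  # encode the input x = 0.5 on the vacuum
out = gaussian_part(encoded, [0.0, 0.2, 0.0, 0.1])
print(encoded, out)
```

Taken together, the rotation, squeezing, and second rotation form a linear map on `(x, p)`, and the displacement adds an offset, which is the affine "weights plus bias" step of a classical layer.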

We then build up our quantum neural net, where `var` is an array of tunable parameter vectors, one for each layer. 


In [None]:
@qml.qnode(device)
def quantum_neural_net(var, x=None):
    # Encode input x into quantum state
    qml.Displacement(x, 0.0, wires=0)

    # "layer" subcircuits
    for v in var:
        layer(v)

    return qml.expval(qml.X(0))

(A technical detail: the output is the expectation value of the 'x-quadrature', which is just a real-valued scalar number that represents the center of our quantum state along one dimension. Although our quantum state is multi-dimensional, we cannot read out all of these dimensions at once. We have to choose only one dimension to measure, and this is it, because the act of measurement 'collapses' the wavefunction and destroys the quantum state. Wavefunction collapse is similar to Bayesian inference: when we learn something about the probability distribution, we change it, much like going from the prior to the posterior distribution.)
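
As a rough illustration of what this expectation value is, and of why the `cutoff_dim=10` argument of the device matters, the sketch below computes the mean of the x-quadrature for a coherent state (the state produced by displacing the vacuum) from its first few Fock-basis amplitudes. This is plain Python rather than PennyLane. The function names are hypothetical, the displacement is assumed to be real, and the ħ = 2 convention (x = a + a†) is assumed.

```python
import math

def coherent_amplitudes(alpha, cutoff_dim):
    # Fock amplitudes c_n = exp(-alpha^2 / 2) * alpha^n / sqrt(n!) for real alpha,
    # truncated to the first `cutoff_dim` photon-number states (not renormalized).
    return [math.exp(-alpha ** 2 / 2) * alpha ** n / math.sqrt(math.factorial(n))
            for n in range(cutoff_dim)]

def x_expectation(amps):
    # With x = a + a^dagger and real amplitudes:
    # <x> = 2 * sum_n sqrt(n + 1) * c_n * c_{n+1}
    return 2.0 * sum(math.sqrt(n + 1) * amps[n] * amps[n + 1]
                     for n in range(len(amps) - 1))

small = x_expectation(coherent_amplitudes(0.5, 10))  # exact answer would be 2 * 0.5
large = x_expectation(coherent_amplitudes(2.5, 10))  # exact answer would be 2 * 2.5
print(small, large)
```

For the small displacement the truncated result is very close to the exact value, while for the large one the ten retained amplitudes miss a sizeable part of the state and the result falls well short. Our inputs lie in [-1, 1], so the truncation error of the encoding step itself is small.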

### Cost Function

The cost function we'll minimize is the square loss between target values (labels) and model predictions, averaged over all samples. Function fitting is a regression problem, and we interpret the expectations from the quantum node as predictions (i.e. without applying postprocessing such as thresholding).



In [None]:
def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2

    loss = loss / len(labels)
    return loss

def cost(var, features, labels):
    preds = [quantum_neural_net(var, x=x) for x in features]
    return square_loss(labels, preds)
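
Written out, the loss above is $L = \frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, where $y_i$ are the labels and $\hat{y}_i$ the predictions. A small worked example with made-up numbers (no quantum circuit involved) shows the arithmetic. The function is restated compactly here so the snippet runs on its own.

```python
def square_loss(labels, predictions):
    # Mean of the squared differences, same result as the loop version above.
    return sum((l - p) ** 2 for l, p in zip(labels, predictions)) / len(labels)

labels = [1.0, 2.0]        # made-up targets
predictions = [0.0, 4.0]   # made-up model outputs
loss = square_loss(labels, predictions)  # ((1 - 0)^2 + (2 - 4)^2) / 2
print(loss)  # 2.5
```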

### Training

The network’s weights (called `var` here) are initialized with values sampled from a normal distribution. We use 4 layers; performance has been found to plateau at around 6 layers.

In [None]:
NUM_LAYERS = 4

np.random.seed(0)
var_init = 0.05 * np.random.randn(NUM_LAYERS, 5)
print(var_init)

Using the Adam optimizer, we'll train for 500 epochs (i.e. 500 passes over the training data), updating our weights once per pass. Grab some popcorn, this could take some time. Why? Because quantum systems are notoriously expensive to simulate on a classical computer.

In [None]:
# execute this cell to begin training

NUM_EPOCHS = 500

opt = AdamOptimizer(0.01, beta1=0.9, beta2=0.999)

cost_vals = []
var = var_init
for itr in range(NUM_EPOCHS):
    var = opt.step(lambda v: cost(v, X, Y), var)
    this_cost = cost(var, X, Y)
    print("Iter: {:5d} | Cost: {:0.7f} ".format(itr + 1, this_cost))
    cost_vals.append(this_cost)
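
For readers curious about what an Adam step does, here is a minimal sketch of the textbook update rule on a one-parameter toy problem, using the same hyperparameters as above. It is not PennyLane's implementation, whose internals may differ in detail. The function `adam_minimize`, the `eps` value, and the toy cost `(theta - 3)^2` are assumptions made for illustration.

```python
import math

def adam_minimize(grad_fn, theta, steps, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    m, v = 0.0, 0.0  # running averages of the gradient and the squared gradient
    history = []
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
        history.append(theta)
    return theta, history

# Toy cost (theta - 3)^2, whose gradient is 2 * (theta - 3).
toy_grad = lambda th: 2.0 * (th - 3.0)
theta_final, history = adam_minimize(toy_grad, 0.0, steps=2000)
print(history[0], theta_final)
```

The first update has a size close to the learning rate regardless of how large the gradient is, which is one reason Adam tends to be forgiving about the scale of the cost function.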

### Evaluation

Now let's take a look at the loss function versus number of epochs:

In [None]:
plt.figure()
plt.scatter(np.arange(1, NUM_EPOCHS + 1), cost_vals)  # x-axis: epoch number 1..NUM_EPOCHS
plt.xlabel("Epoch")
plt.ylabel("Cost")
plt.tick_params(axis="both", which="major")
plt.tick_params(axis="both", which="minor")
plt.show()

Now let's use our model to generate predictions on an evenly spaced grid of inputs and plot them alongside our original datapoints. The green line shows the shape of the function that the model has “learned” from the noisy data.

In [None]:
x_pred = np.linspace(-1, 1, 50)
predictions = [quantum_neural_net(var, x=x_) for x_ in x_pred]

plt.figure()
plt.scatter(X, Y)
plt.plot(x_pred, predictions, color="green")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.tick_params(axis="both", which="major")
plt.tick_params(axis="both", which="minor")
plt.show()

The model has learned to smooth the noisy data!