##### Copyright 2020 The TensorFlow Authors.

In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# MNIST classification

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/quantum/tutorials/mnist"><img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/quantumlib/TFQuantum/blob/master/docs/tutorials/mnist.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/quantumlib/TFQuantum/blob/master/docs/tutorials/mnist.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_quantum/docs/tutorials/mnist.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

This tutorial builds a quantum neural network (QNN) to classify a simplified version of MNIST, similar to the approach used in <a href="https://arxiv.org/pdf/1802.06002.pdf" class="external">Farhi et al.</a> The performance of the QNN on this classical data problem is then compared with that of a classical neural network.

## Setup

Download and install the required packages:

In [0]:
%%capture
!pip install --upgrade pip
!pip install cirq==0.7.0

In [0]:
%%capture
!pip install --upgrade tensorflow==2.0.0

Note: If the following code cell fails, run the preceding code cells first and then restart the Colab runtime (*Runtime > Restart Runtime*).

In [0]:
%%capture
h = "2dfcfceb9726fa73c40381c037dc01facd3d061e"
!cd ~/
!rm -r -f TFQuantum/
!git clone https://{h}:{h}@github.com/quantumlib/TFQuantum.git;cd TFQuantum/
!pip install --upgrade ./TFQuantum/wheels/tfquantum-0.2.0-cp36-cp36m-linux_x86_64.whl

Now import TensorFlow and the module dependencies:

In [0]:
import tensorflow as tf
import tensorflow_quantum as tfq

import cirq
import sympy
import numpy as np
import seaborn
import collections

# visualization tools
%matplotlib inline
import matplotlib.pyplot as plt
from cirq.contrib.svg import SVGCircuit

## 1. Load the data

Load the MNIST data distributed with Keras. Since this tutorial demonstrates a binary classification problem for the numbers 3 and 6, remove the other numbers. Then display the first training example:

In [0]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

print("Number of original training examples:", len(x_train))
print("Number of original test examples:", len(x_test))

# Keep the 3s and 6s and remove the other numbers.
x_train, y_train = zip(*((x, y) for x, y in zip(x_train, y_train) if y in [3, 6]))
x_test, y_test = zip(*((x, y) for x, y in zip(x_test, y_test) if y in [3, 6]))

print("Number of filtered training examples:", len(x_train))
print("Number of filtered test examples:", len(x_test))

print(y_train[0])
seaborn.heatmap(x_train[0])

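The filter above loops over the examples in Python. A minimal sketch of the same selection with a NumPy boolean mask follows. It uses small stand-in arrays so it runs without downloading MNIST; `toy_images` and `toy_labels` are hypothetical names, not part of the tutorial.

```python
import numpy as np

# Hypothetical stand-in data: six 2x2 "images" with digit labels.
toy_images = np.arange(6 * 2 * 2).reshape(6, 2, 2)
toy_labels = np.array([3, 5, 6, 3, 0, 6])

# True where the label is one of the two digits kept by the tutorial.
keep = np.isin(toy_labels, [3, 6])

filtered_images = toy_images[keep]
filtered_labels = toy_labels[keep]

print(filtered_labels)        # [3 6 3 6]
print(filtered_images.shape)  # (4, 2, 2)
```

Unlike the `zip`-based filter, which returns tuples, the mask keeps the results as NumPy arrays.
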
An image size of 28x28 is much too large for current quantum computers. Resize each image down to 4x4 and scale the pixel values to the range 0 to 1:

In [0]:
def reduce_image(x):
    x = tf.reshape(x, [1, 28, 28, 1])
    x = tf.image.resize(x, [4, 4])
    x = tf.reshape(x, [4, 4])
    x = x / 255
    return x.numpy()


x_train = [reduce_image(x) for x in x_train]
x_test = [reduce_image(x) for x in x_test]


# Remove examples where the same input has multiple labels.
def remove_contradicting(xs, ys):
    mapping = collections.defaultdict(set)
    for x, y in zip(xs, ys):
        mapping[str(x)].add(y)
    return zip(*((x, y) for x, y in zip(xs, ys) if len(mapping[str(x)]) == 1))


x_train, y_train = remove_contradicting(x_train, y_train)
x_test, y_test = remove_contradicting(x_test, y_test)

print("Number of non-contradicting training examples: ", len(x_train))
print("Number of non-contradicting test examples: ", len(x_test))

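To illustrate what counts as a contradiction, the sketch below runs the same helper on four tiny stand-in images (`a`, `b`, `toy_xs` and `toy_ys` are hypothetical). Image `a` appears with two different labels, so every copy of it is dropped.

```python
import collections
import numpy as np

def remove_contradicting(xs, ys):
    # Same logic as the tutorial's helper: collect the labels seen for each
    # image, keyed by the image's printed form.
    mapping = collections.defaultdict(set)
    for x, y in zip(xs, ys):
        mapping[str(x)].add(y)
    return zip(*((x, y) for x, y in zip(xs, ys) if len(mapping[str(x)]) == 1))

a = np.array([[0.0, 1.0], [1.0, 0.0]])
b = np.array([[1.0, 1.0], [0.0, 0.0]])

# `a` is labeled both 3 and 6; `b` is always labeled 6.
toy_xs = [a, b, a.copy(), b.copy()]
toy_ys = [3, 6, 6, 6]

kept_xs, kept_ys = remove_contradicting(toy_xs, toy_ys)
print(kept_ys)  # (6, 6)
```

Because the key is the printed form of the array, two images that print identically at NumPy's default precision are treated as the same input.
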
Display the first training example again, after resizing:

In [0]:
print(y_train[0])
seaborn.heatmap(x_train[0])

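To make the size reduction concrete, here is a sketch that shrinks a 28x28 array to 4x4 by averaging non-overlapping 7x7 blocks. This is a simpler stand-in technique, not the interpolation that `tf.image.resize` performs, so its pixel values will generally differ from the tutorial's. The helper `block_average` and the synthetic image are illustrative assumptions.

```python
import numpy as np

def block_average(image, out_size=4):
    """Downsample a square image by averaging non-overlapping blocks."""
    block = image.shape[0] // out_size  # 28 // 4 == 7
    trimmed = image[:block * out_size, :block * out_size]
    # Split rows and columns into (out_size, block) pairs, then average each block.
    blocks = trimmed.reshape(out_size, block, out_size, block)
    return blocks.mean(axis=(1, 3))

# Synthetic 28x28 image with 8-bit pixel values: bright left half, dark right half.
image = np.zeros((28, 28), dtype=np.uint8)
image[:, :14] = 255

small = block_average(image) / 255  # scale to the range 0 to 1
print(small.shape)  # (4, 4)
print(small[0])     # [1. 1. 0. 0.]
```
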
To process images using a quantum computer, <a href="https://arxiv.org/pdf/1802.06002.pdf" class="external">Farhi et al.</a> proposed representing each pixel with a qubit, with the state depending on the value of the pixel. In this tutorial the encoding is binary: a pixel value above 0.5 flips the corresponding qubit with an X gate.

To classify these images, <a href="https://arxiv.org/pdf/1802.06002.pdf" class="external">Farhi et al.</a> proposed taking the expectation of a readout qubit in a parameterized circuit. The expectation takes a value between -1 and 1, so -1 and 1 are natural choices for the targets. Here, the digit 3 maps to 1 and the digit 6 maps to -1.

In [0]:
def convert_to_circuit(image):
    """Encode truncated classical image into quantum datapoint."""
    values = np.ndarray.flatten(image)
    qubits = cirq.GridQubit.rect(4, 4)
    circuit = cirq.Circuit()
    for i, value in enumerate(values):
        if value > 0.5:
            circuit.append(cirq.X(qubits[i]))
    return circuit


x_train = [convert_to_circuit(x) for x in x_train]
x_test = [convert_to_circuit(x) for x in x_test]


def convert_label(y):
    if y == 3:
        return 1.0
    else:
        return -1.0


y_train = [convert_label(y) for y in y_train]
y_test = [convert_label(y) for y in y_test]

Display the label and the circuit diagram for the first example:

In [0]:
print(y_train[0])
print(x_train[0])

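The encoding can be checked without building a circuit. The sketch below applies the same 0.5 threshold to a made-up 4x4 image and lists the flattened pixel indices whose qubits would receive an X gate. `x_gate_indices` and `toy_image` are hypothetical; the sketch also assumes the flattened index `i` lines up with the `i`-th qubit returned by `cirq.GridQubit.rect(4, 4)`, as in `convert_to_circuit`.

```python
import numpy as np

def x_gate_indices(image, threshold=0.5):
    """Return flattened pixel indices whose value exceeds the threshold."""
    return [i for i, value in enumerate(image.flatten()) if value > threshold]

toy_image = np.array([
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.0, 0.6, 0.0],
    [0.0, 0.7, 0.5, 0.0],
    [0.0, 0.2, 0.0, 0.0],
])

indices = x_gate_indices(toy_image)
print(indices)  # [1, 2, 6, 9]
```

The comparison is strict, so the pixel that equals 0.5 exactly does not flip its qubit.
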
## 2. Quantum neural network

There is little guidance on which quantum circuit structure works well for classifying images. Since the classification is based on the expectation of the readout qubit, <a href="https://arxiv.org/pdf/1802.06002.pdf" class="external">Farhi et al.</a> propose using two-qubit gates, with the readout qubit always acted upon.

The following example shows this layered approach. Each layer uses *n* instances of the same gate, with each instance acting on one data qubit and the readout qubit:

In [0]:
def create_quantum_model():
    """Create a QNN model circuit and readout operation to go along with it."""
    data_qubits = cirq.GridQubit.rect(4, 4)
    readout = cirq.GridQubit(16, 0)

    symbols = []
    circuit = cirq.Circuit()

    # Generates a layer of the gate type.
    def layer(gate, prefix):
        for i, qubit in enumerate(data_qubits):
            symbol = sympy.Symbol(prefix + '-' + str(i))
            circuit.append(gate(qubit, readout)**symbol)
            symbols.append(symbol)

    # Prepare the readout qubit.
    circuit.append(cirq.X(readout))
    circuit.append(cirq.H(readout))

    # Then add layers (experiment by adding more).
    layer(cirq.XX, "xx-1")
    layer(cirq.ZZ, "zz-1")

    # Finally, apply a Hadamard to the readout qubit before it is measured.
    circuit.append(cirq.H(readout))

    return circuit, cirq.Z(readout)

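To see what the readout expectation looks like numerically, the following sketch simulates a scaled-down version of this circuit (two data qubits instead of sixteen) with plain NumPy state vectors rather than Cirq. It assumes that raising a two-qubit Pauli gate to the power `t` leaves its +1 eigenspace unchanged and multiplies its -1 eigenspace by exp(iπt), which is how Cirq's `XX**t` and `ZZ**t` are understood to behave. All function names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def embed(ops, n):
    """Tensor single-qubit operators {qubit index: matrix} into an n-qubit operator."""
    full = np.eye(1, dtype=complex)
    for q in range(n):
        full = np.kron(full, ops.get(q, I2))
    return full

def pauli_pair_power(pauli, a, b, exponent, n):
    """(P_a P_b)**exponent: phase 1 on the +1 eigenspace, exp(i*pi*exponent) on the -1 one."""
    pair = embed({a: pauli, b: pauli}, n)
    identity = np.eye(2 ** n, dtype=complex)
    proj_plus = (identity + pair) / 2
    proj_minus = (identity - pair) / 2
    return proj_plus + np.exp(1j * np.pi * exponent) * proj_minus

def readout_expectation(bits, xx_exponents, zz_exponents):
    """Expectation of Z on the readout qubit for binary data `bits`."""
    n = len(bits) + 1   # data qubits plus one readout qubit
    readout = n - 1     # readout is the last (least significant) qubit
    index = 0
    for bit in bits:    # qubit 0 is the most significant bit
        index = (index << 1) | bit
    index <<= 1         # readout starts in |0>
    state = np.zeros(2 ** n, dtype=complex)
    state[index] = 1.0
    # Prepare the readout qubit with X then H, as in the tutorial.
    state = embed({readout: X}, n) @ state
    state = embed({readout: H}, n) @ state
    for i, t in enumerate(xx_exponents):
        state = pauli_pair_power(X, i, readout, t, n) @ state
    for i, t in enumerate(zz_exponents):
        state = pauli_pair_power(Z, i, readout, t, n) @ state
    state = embed({readout: H}, n) @ state
    z_readout = embed({readout: Z}, n)
    return float(np.real(np.vdot(state, z_readout @ state)))

no_xx = [0.0, 0.0]
zz = [0.25, 0.25]
value_00 = readout_expectation([0, 0], no_xx, zz)
value_10 = readout_expectation([1, 0], no_xx, zz)
print(f"{value_00:.3f} {value_10:.3f}")
```

With these hand-picked exponents the two inputs give expectations of approximately 0 and -1, so the same parameters already separate the two bit patterns. In the tutorial the exponents are learned rather than chosen by hand.
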
Build the Keras model with the quantum components. This model is fed the "quantum data" that encodes the classical data.

In [0]:
# Get the quantum components.
model_circuit, model_readout = create_quantum_model()

# Build the Keras model.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(), dtype=tf.dtypes.string))
model.add(tfq.layers.PQC(model_circuit, model_readout))

# Define a custom accuracy that compares the sign of the output with the label.
@tf.function
def custom_accuracy(y_true, y_pred):
    y_true = tf.squeeze(y_true)
    y_pred = tf.map_fn(lambda x: 1.0 if x >= 0 else -1.0, y_pred)
    return tf.keras.backend.mean(tf.keras.backend.equal(y_true, y_pred))


print(model.summary())

model.compile(loss=tf.keras.losses.hinge,
              optimizer=tf.keras.optimizers.Adam(),
              metrics=[custom_accuracy])

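The hinge loss and the sign-based accuracy can be worked through by hand on a few values. The NumPy sketch below mirrors both definitions outside of Keras. The function names and the example predictions are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def hinge_loss(y_true, y_pred):
    """Mean of max(0, 1 - y_true * y_pred), for labels in {-1, +1}."""
    return np.mean(np.maximum(0.0, 1.0 - y_true * y_pred))

def sign_accuracy(y_true, y_pred):
    """Fraction of predictions whose sign matches the label (0 counts as +1)."""
    predicted = np.where(y_pred >= 0, 1.0, -1.0)
    return np.mean(predicted == y_true)

y_true = np.array([1.0, -1.0, 1.0, -1.0])
y_pred = np.array([0.5, -1.0, -0.5, 0.5])  # hypothetical readout expectations

print(hinge_loss(y_true, y_pred))     # 0.875
print(sign_accuracy(y_true, y_pred))  # 0.5
```

A correct prediction with a small magnitude (the first example) still incurs some loss, which pushes the expectation toward the ±1 targets.
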
Reduce the dataset size for faster training. For better results, skip this code cell:

In [0]:
# Comment out this cell to train on the full dataset for increased accuracy.
NUM_EXAMPLES = 500
x_train = x_train[:NUM_EXAMPLES]
y_train = y_train[:NUM_EXAMPLES]

Trained on the full dataset, this model should achieve >85% accuracy on the test set. Here, only `NUM_EXAMPLES` datapoints are used to save time.

In [0]:
# Wrap the training and test sets so Keras can handle them.
x_train = tfq.convert_to_tensor(x_train)
x_test = tfq.convert_to_tensor(x_test)
y_train = np.array(y_train)
y_test = np.array(y_test)

model.fit(x_train,
          y_train,
          batch_size=32,
          epochs=3,
          verbose=1,
          validation_data=(x_test, y_test))

qnn_results = model.evaluate(x_test, y_test)

## 3. Classical neural network

While the quantum neural network works for this simplified MNIST problem, a basic classical neural network can easily outperform a QNN on this task. After a single epoch, a classical neural network can achieve >98% accuracy on the holdout set.

In the following example, a classical neural network solves the full 10-class classification problem (instead of the 2-class problem given to the QNN) and uses the entire 28x28 image instead of a subsampled one.

In [0]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255


def create_classical_model():
    # A simple model based off LeNet from https://keras.io/examples/mnist_cnn/
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Reshape([28, 28, 1]))
    model.add(tf.keras.layers.Conv2D(32, [3, 3], activation='relu'))
    model.add(tf.keras.layers.Conv2D(64, [3, 3], activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Dropout(0.25))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(128, activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    return model


model = create_classical_model()
model.compile(loss=tf.keras.losses.categorical_crossentropy,
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

model.fit(x_train,
          y_train,
          batch_size=128,
          epochs=1,
          verbose=1,
          validation_data=(x_test, y_test))

cnn_results = model.evaluate(x_test, y_test)

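For a sense of scale, the sketch below counts the trainable parameters of the classical model by hand. It assumes the Keras defaults used above: no padding, stride 1, and a bias term in every layer. The helper names are illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a square 2D convolution."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

side = 28
side -= 2   # first 3x3 convolution without padding: 26
side -= 2   # second 3x3 convolution: 24
side //= 2  # 2x2 max pooling: 12
flat = side * side * 64  # flattened feature size

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 10))
print(flat, total)  # 9216 1199882
```

By comparison, the QNN built earlier has 32 trainable parameters: one symbol per data qubit in each of its two layers.
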
## 4. Comparison

Despite being given a more difficult problem, the classical neural network easily outperforms the quantum neural network. For classical data, it is difficult to beat a classical neural network.

In [0]:
qnn_accuracy = qnn_results[1]
cnn_accuracy = cnn_results[1]

# Pass the data by keyword; recent seaborn releases no longer accept it positionally.
seaborn.barplot(x=["quantum", "classical"], y=[qnn_accuracy, cnn_accuracy])