In [1]:
import numpy as np
import tensorflow as tf
import classifier, training
tf.enable_eager_execution()  # TF 1.x API; eager execution is on by default in TF 2.x

# Data preprocessing

First, let's load the MNIST dataset of handwritten digits through `tf.keras.datasets`.

In [2]:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data(path='mnist.npz')

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

(60000, 28, 28) (60000,)
(10000, 28, 28) (10000,)


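Next, let's encode each normalized pixel value $p \in [0, 1]$ with the feature map $\Phi (p) = (1-p, p)^T$, which is the component order used by `data_encoder` in the next cell, and transform the labels to one-hot format.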

In [3]:
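def data_encoder(data):
  # Map each pixel p to (1 - p, p); output shape is (n_samples, n_pixels, 2)
  return np.array([1 - data, data]).transpose([1, 2, 0])

def to_one_hot(labels, n_labels=10):
  # Row i is all zeros except for a 1 in column labels[i]
  one_hot = np.zeros((len(labels), n_labels))
  one_hot[np.arange(len(labels)), labels] = 1
  return one_hot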

n_labels = len(np.unique(y_train))

# Flatten and normalize
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:]))) / 255.0
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:]))) / 255.0
# Encode
x_train = data_encoder(x_train)
x_test = data_encoder(x_test)
y_train = to_one_hot(y_train, n_labels=n_labels)
y_test = to_one_hot(y_test, n_labels=n_labels)

print(n_labels)
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

10
(60000, 784, 2) (60000, 10)
(10000, 784, 2) (10000, 10)
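
To make the encoding concrete, here is a small self-contained check on a toy input. The two helper functions are copied from the cell above; `toy_pixels` and `toy_labels` are made-up values used only for illustration. Note that `data_encoder` places $1-p$ in the first component and $p$ in the second, so every encoded pixel sums to one.

```python
import numpy as np

def data_encoder(data):
    # Map each pixel p to the two-component vector (1 - p, p)
    return np.array([1 - data, data]).transpose([1, 2, 0])

def to_one_hot(labels, n_labels=10):
    # Row i has a single 1 in column labels[i]
    one_hot = np.zeros((len(labels), n_labels))
    one_hot[np.arange(len(labels)), labels] = 1
    return one_hot

# Toy data (illustrative only): one "image" with three normalized pixels
toy_pixels = np.array([[0.0, 0.25, 1.0]])
toy_labels = np.array([2])

encoded = data_encoder(toy_pixels)
one_hot = to_one_hot(toy_labels, n_labels=4)

print(encoded.shape)  # (1, 3, 2): (n_samples, n_pixels, d_phys)
print(encoded[0])     # rows are (1 - p, p) for each pixel
print(one_hot)        # [[0. 0. 1. 0.]]
```

A black pixel ($p = 0$) maps to $(1, 0)$ and a white pixel ($p = 1$) maps to $(0, 1)$, so the last axis of `x_train` plays the role of the physical index of the MPS.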


# Define MPS classifier

Note that our MPS has one more site than the data, because an extra site holds the label tensor. We also have to set the bond dimension, a hyperparameter that stays fixed during training. A more sophisticated implementation could adapt the bond dimension to the complexity of the training data by performing SVD steps. This is not currently implemented but could be added in a future version.
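
As a rough illustration of the SVD step mentioned above, the following NumPy sketch splits a merged two-site tensor back into two site tensors and picks the new bond dimension from the singular values. It is not part of the `classifier` module: the function `split_bond`, its parameters `max_bond` and `cutoff`, and the index layout `(left bond, physical, physical, right bond)` are all assumptions made for this example.

```python
import numpy as np

def split_bond(theta, max_bond, cutoff=1e-10):
    """Split a two-site tensor theta[l, p1, p2, r] with a truncated SVD.

    Hypothetical helper: keeps singular values larger than cutoff * s[0],
    but never more than max_bond of them.
    """
    d_left, d1, d2, d_right = theta.shape
    matrix = theta.reshape(d_left * d1, d2 * d_right)
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    # Adaptive bond dimension: count the significant singular values
    keep = min(max_bond, max(1, int(np.sum(s > cutoff * s[0]))))
    left = u[:, :keep].reshape(d_left, d1, keep)
    # Absorb the singular values into the right tensor
    right = (s[:keep, None] * vh[:keep, :]).reshape(keep, d2, d_right)
    return left, right

rng = np.random.default_rng(0)
# Build a two-site tensor whose true bond dimension is 3
a = rng.normal(size=(12, 2, 3))
b = rng.normal(size=(3, 2, 12))
theta = np.einsum('lpk,kqr->lpqr', a, b)

left, right = split_bond(theta, max_bond=12)
reconstructed = np.einsum('lpk,kqr->lpqr', left, right)

print(left.shape, right.shape)
print(np.allclose(theta, reconstructed))
```

In this toy case the allowed maximum is 12, but only three singular values are significant, so the bond shrinks to 3 without losing information. With real data the cutoff trades accuracy against cost.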

In [4]:
mps = classifier.MatrixProductState(n_sites=x_train.shape[1] + 1,
                                    n_labels=n_labels,
                                    d_phys=x_train.shape[2],
                                    d_bond=12)
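
To illustrate why the MPS has one more site than the data, here is a minimal NumPy sketch of how such a network could turn one encoded sample into `n_labels` scores. This is not the code in `classifier`: the function `mps_logits`, the tensor layout, and the choice of putting the label tensor at the right end are assumptions made only to keep the example short, and the actual module may arrange the sites differently.

```python
import numpy as np

def mps_logits(site_tensors, label_tensor, x):
    """Contract one encoded sample x[n_pixels, d_phys] with an MPS.

    Assumed (hypothetical) layout: site_tensors[i] has shape
    (bond_left, d_phys, bond_right) and label_tensor has shape
    (bond, n_labels), sitting at the right end of the chain.
    """
    left = np.ones(1)  # trivial left boundary
    for tensor, phi in zip(site_tensors, x):
        # Contract the physical index with the local feature vector
        matrix = np.einsum('lpr,p->lr', tensor, phi)
        left = left @ matrix
    # The extra site carries the label index
    return left @ label_tensor

rng = np.random.default_rng(0)
n_pixels, d_phys, d_bond, n_labels = 4, 2, 3, 5
bond_dims = [1] + [d_bond] * n_pixels
site_tensors = [rng.normal(size=(bond_dims[i], d_phys, bond_dims[i + 1]))
                for i in range(n_pixels)]
label_tensor = rng.normal(size=(d_bond, n_labels))

pixels = rng.uniform(size=n_pixels)
x = np.stack([1 - pixels, pixels], axis=-1)  # same encoding as data_encoder

logits = mps_logits(site_tensors, label_tensor, x)
print(len(site_tensors) + 1)  # number of sites: one per pixel plus the label site
print(logits.shape)           # one score per label
```

With the notebook's settings the same counting gives 784 pixel sites plus one label site, which matches `n_sites=x_train.shape[1] + 1`.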

# Example training

We can train the `mps` object we created using the `training.fit` function. Here we run a quick training on a small portion of the data (the first 1000 training samples), without validation.

In [5]:
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)

mps, history = training.fit(mps, optimizer, x_train[:1000], y_train[:1000],
                            n_epochs=10, batch_size=50, n_message=1)


Epoch: 0
Time: 2.2689077854156494
Loss: 2.28141188621521
Accuracy: 0.124

Epoch: 1
Time: 4.476986646652222
Loss: 1.630860686302185
Accuracy: 0.422

Epoch: 2
Time: 6.673081636428833
Loss: 0.9281184077262878
Accuracy: 0.723

Epoch: 3
Time: 8.884441137313843
Loss: 0.6122371554374695
Accuracy: 0.82

Epoch: 4
Time: 11.061261653900146
Loss: 0.4799060523509979
Accuracy: 0.847

Epoch: 5
Time: 13.23032832145691
Loss: 0.3774247467517853
Accuracy: 0.884

Epoch: 6
Time: 15.399298191070557
Loss: 0.29510781168937683
Accuracy: 0.917

Epoch: 7
Time: 17.57841396331787
Loss: 0.2591344714164734
Accuracy: 0.926

Epoch: 8
Time: 19.780704498291016
Loss: 0.2326657474040985
Accuracy: 0.934

Epoch: 9
Time: 21.967876195907593
Loss: 0.23971489071846008
Accuracy: 0.934
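
The log reports a loss and an accuracy per epoch, but how `training.fit` computes them is not shown here. The first loss value is close to $\ln 10 \approx 2.30$, which is what softmax cross-entropy gives for near-uniform predictions over ten classes, so the sketch below assumes that loss. The functions `softmax_cross_entropy` and `accuracy` are hypothetical helpers written for this illustration, not part of the `training` module.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(np.sum(one_hot_labels * log_probs, axis=1))

def accuracy(logits, one_hot_labels):
    # Fraction of samples whose highest score matches the true label
    return np.mean(logits.argmax(axis=1) == one_hot_labels.argmax(axis=1))

# An untrained model that scores all ten classes equally
uniform_logits = np.zeros((5, 10))
labels = np.eye(10)[[0, 1, 2, 3, 4]]

loss = softmax_cross_entropy(uniform_logits, labels)
acc = accuracy(uniform_logits, labels)
print(loss)  # equals ln(10)
print(acc)
```

Under this assumption, the drop from about 2.28 to about 0.24 over ten epochs means the model moved from near-uniform guesses to assigning most of the probability to the correct digit on the 1000 training samples.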
