"""
Generalization Performance of Quantum Metric Learning Classifiers (Breast Cancer Dataset)
==========================================================================================
.. meta::
:property="og:description": This demonstration illustrates the idea of training
a quantum embedding for metric learning. This technique is used to train
a hybrid quantum-classical data embedding to classify breast cancer data.
:property="og:image": https://github.com/Rlag1998/QML_Generalization/blob/main/embedding_metric_learning/figures/All_Figures/3.4.2.png?raw=true
*Adapted from work authored by Maria Schuld and Aroosa Ijaz*
*Authors: Jonathan Kim and Stefan Bekiranov*
This tutorial uses the idea of quantum embeddings for metric learning presented in
`Lloyd, Schuld, Ijaz, Izaac, Killoran (2020) <https://arxiv.org/abs/2001.03622>`_
by training a hybrid classical-quantum data embedding to classify breast cancer data.
Lloyd et al.'s approach was inspired by `Mari et al. (2019) <https://arxiv.org/abs/1912.08278>`_
(see also this `tutorial <https://pennylane.ai/qml/demos/tutorial_quantum_transfer_learning.html>`_).
This tutorial adapts the work of Lloyd et al. by changing the data pre-processing steps,
including the use of principal component analysis for feature reduction.
This tutorial aims to produce good generalization performance for test set data (something that
was not demonstrated in the original quantum metric learning code).
More details on this topic can be found in the research paper, `Generalization Performance of Quantum Metric Learning Classifiers <https://doi.org/10.3390/biom12111576>`_.
Illustrated below is the general circuit used.
|
.. figure:: ../embedding_metric_learning/classification.png
:align: center
:width: 90%
|
After all necessary data pre-processing steps, ``n`` input features are reduced via matrix multiplication
to two intermediate values, ``x1`` and ``x2``, which are then fed into a quantum feature map consisting of ZZ
entanglers and RX and RY rotation gates. This results in ``2n + 12`` total parameters
(``2n`` from the classical part, ``12`` from the quantum feature map), which are trained and updated over
a set number of iterations, resulting in a trained embedding. The trained embedding maps input
datapoints into Hilbert space such that the Hilbert-Schmidt distance between datapoints of different
classes is maximized. A linear decision boundary can then be drawn across the datapoints in Hilbert space,
which corresponds to a complex decision boundary in classical space. This form of embedding training is
known as Quantum Metric Learning.
Through explorations with the ImageNet Ants & Bees image dataset, we find that datasets with too many features
show poor generalization when using this method. In this demo, we instead use a breast cancer dataset with
just 30 features per sample.
Let us begin!
"""
######################################################################
# Setup & Preparation
# -------------------
#
# The tutorial requires the following imports:
# %matplotlib inline
from sklearn.datasets import load_breast_cancer
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.axes_grid1 import make_axes_locatable
import pennylane as qml
from pennylane import numpy as np
from pennylane import RX, RY, RZ, CNOT
######################################################################
# .. note:: PennyLane version 0.18.0 or earlier is required for this
# demo.
######################################################################
# Next, the following random seed is used:
np.random.seed(seed=22)
######################################################################
# In this example, we will be reducing each sample to 16 principal
# components:
pc = 16
######################################################################
# We now load in the breast cancer dataset as follows:
breast = load_breast_cancer()
breast_data = breast.data
breast_labels = breast.target
labels = np.reshape(breast_labels,(569,1))
final_breast_data = np.concatenate([breast_data,labels],axis=1)
######################################################################
# We can now convert the data into a pandas dataframe:
breast_dataset = pd.DataFrame(final_breast_data)
features = breast.feature_names
features_labels = np.append(features,'label')
breast_dataset.columns = features_labels
# in scikit-learn's encoding, target 0 is malignant and target 1 is benign
breast_dataset['label'].replace(0, 'Malignant', inplace=True)
breast_dataset['label'].replace(1, 'Benign', inplace=True)
print(breast_dataset.head(5))
######################################################################
# To apply PCA, we must first standardize the features:
x = breast_dataset.loc[:, features].values
x = StandardScaler().fit_transform(x)
feat_cols = ['feature'+str(i) for i in range(x.shape[1])]
normalised_breast = pd.DataFrame(x,columns=feat_cols)
print(normalised_breast.head(5))
######################################################################
# We now reduce the number of features by applying PCA:
pca_breast = PCA(n_components=pc)
principalComponents_breast = pca_breast.fit_transform(x)
principal_breast_Df = pd.DataFrame(data = principalComponents_breast)
principal_breast_Df.columns = ["PC"+str(i+1) for i in range(pc)]
copy = principal_breast_Df.copy()
extracted_column = breast_dataset['label']
copy = copy.join(extracted_column)
print(copy.head(5))
######################################################################
# To see how much of the total variation the principal components
# capture, we can inspect the explained variance ratios:
print('Explained variation per principal component: {}'.format(
pca_breast.explained_variance_ratio_
))
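######################################################################
# As an optional check (not part of the original code), the cumulative
# sum of these ratios shows how much of the total variance the 16
# principal components retain:
print('Cumulative explained variation: {}'.format(
    np.cumsum(pca_breast.explained_variance_ratio_)
))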
######################################################################
# We now separate the data based on its label, then split each
# separated set into training and test data in the ratio of 3:2.
# This gives us four final arrays:
b = copy[copy['label'] == 'Benign'].copy()
m = copy[copy['label'] == 'Malignant'].copy()
b['label'] = np.where(b['label'] == 'Benign', -1, 1)
m['label'] = np.where(m['label'] == 'Benign', -1, 1)
b = b.sample(frac=1).reset_index(drop=True)
m = m.sample(frac=1).reset_index(drop=True)
b_train_df = b.head(214)
b_test_df = b.tail(143)
m_train_df = m.head(127)
m_test_df = m.tail(85)
train_df = pd.concat([b_train_df, m_train_df])
test_df = pd.concat([b_test_df, m_test_df])
x_train_array = train_df.iloc[:,:-1].to_numpy()
x_test_array = test_df.iloc[:,:-1].to_numpy()
y_train_array = train_df[['label']].to_numpy()
y_test_array = test_df[['label']].to_numpy()
######################################################################
# We can save our generated train-test splits as txt files, allowing
# them to simply be loaded in the future:
np.savetxt('embedding_metric_learning/bc_x_array.txt', x_train_array)
np.savetxt('embedding_metric_learning/bc_x_test_array.txt', x_test_array)
np.savetxt('embedding_metric_learning/bc_y_array.txt', y_train_array)
np.savetxt('embedding_metric_learning/bc_y_test_array.txt', y_test_array)
######################################################################
# Embedding
# ---------
#
# Quantum metric learning is used to train a quantum embedding, which is
# then used to classify data. The embedding is learned by maximizing the
# Hilbert-Schmidt distance between datapoints of the two classes. After training,
# the datapoints of different classes become maximally separated in Hilbert
# space. This results in a simple linear decision boundary in Hilbert space
# which represents a complex decision boundary in the original feature space.
#
# A cost function is used to track the progress of the training; the lower
# the cost function, the greater the class separation in Hilbert space.
#
# The model is ultimately optimized with the ``RMSPropOptimizer`` and data are
# classified according to a KNN-style classifier.
#
# Below is the code that makes up the quantum feature map:
def feature_encoding_hamiltonian(features, wires):
for idx, w in enumerate(wires):
RX(features[idx], wires=w)
def ising_hamiltonian(weights, wires, l):
# ZZ coupling
CNOT(wires=[wires[1], wires[0]])
RZ(weights[l, 0], wires=wires[0])
CNOT(wires=[wires[1], wires[0]])
# local fields
for idx, w in enumerate(wires):
RY(weights[l, idx + 1], wires=w)
def QAOAEmbedding(features, weights, wires):
repeat = len(weights)
for l in range(repeat):
# apply alternating Hamiltonians
feature_encoding_hamiltonian(features, wires)
ising_hamiltonian(weights, wires, l)
# repeat the feature encoding once more at the end
feature_encoding_hamiltonian(features, wires)
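######################################################################
# As an optional visual check (an illustrative sketch, not part of the
# original code; ``dev_draw`` and ``embedding_circuit`` are new names, and
# ``qml.draw`` is assumed to be available, as it is in PennyLane 0.16-0.18),
# we can draw a single embedding circuit with placeholder weights:
dev_draw = qml.device("default.qubit", wires=2)

@qml.qnode(dev_draw)
def embedding_circuit(features, weights):
    QAOAEmbedding(features, weights, wires=[0, 1])
    return qml.expval(qml.PauliZ(0))

print(qml.draw(embedding_circuit)(np.array([0.1, 0.2]), np.zeros((4, 3))))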
######################################################################
# The model has ``2n + 12`` trainable parameters, where ``n`` is the
# number of classical input features. The dataset contains 30 features
# per sample by default, which would give 60 classical parameters and
# 12 quantum parameters (72 in total).
#
# Earlier, however, we reduced the number of classical features via PCA,
# resulting in fewer trainable parameters.
# With 16 principal components, we now have 32 linear parameters,
# meaning 44 parameters in total.
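#
# As a quick arithmetic check (illustrative; ``total_params`` is not part
# of the original code), we can confirm this count:
total_params = 2 * pc + 4 * 3  # 32 classical weights + 12 quantum weights
print("total trainable parameters:", total_params)  # prints 44
######################################################################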
#
# Here, we load the PCA features we generated:
X = np.loadtxt("embedding_metric_learning/bc_x_array.txt", ndmin=2) # pre-prepared training inputs
Y = np.loadtxt("embedding_metric_learning/bc_y_array.txt") # training labels
X_val = np.loadtxt(
"embedding_metric_learning/bc_x_test_array.txt", ndmin=2
) # pre-prepared validation inputs
Y_val = np.loadtxt("embedding_metric_learning/bc_y_test_array.txt") # validation labels
# split data into two classes
A = X[Y == -1] # benign
B = X[Y == 1] # malignant
A_val = X_val[Y_val == -1]
B_val = X_val[Y_val == 1]
print(A.shape)
print(B.shape)
print(A_val.shape)
print(B_val.shape)
######################################################################
# Next, we turn to quantum node initialization:
n_features = 2
n_qubits = 2 * n_features + 1
dev = qml.device("default.qubit", wires=n_qubits)
######################################################################
# Defined below is the SWAP test we will use for overlap measurement:
@qml.qnode(dev)
def swap_test(q_weights, x1, x2):
# load the two inputs into two different registers
QAOAEmbedding(features=x1, weights=q_weights, wires=[1, 2])
QAOAEmbedding(features=x2, weights=q_weights, wires=[3, 4])
# perform the SWAP test
qml.Hadamard(wires=0)
for k in range(n_features):
qml.CSWAP(wires=[0, k + 1, 2 + k + 1])
qml.Hadamard(wires=0)
return qml.expval(qml.PauliZ(0))
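######################################################################
# As a quick sanity check (illustrative, not part of the original code),
# the SWAP test should report an overlap of ~1 for two identical inputs:
print(swap_test(np.zeros((4, 3)), np.array([0.3, -0.2]), np.array([0.3, -0.2])))
######################################################################
# The ``overlaps`` function below averages these pairwise overlaps over
# two sets of linearly transformed inputs: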
def overlaps(weights, X1=None, X2=None):
linear_layer = weights[0]
q_weights = weights[1]
overlap = 0
for x1 in X1:
for x2 in X2:
# multiply the inputs with the linear layer weight matrix
w_x1 = linear_layer @ x1
w_x2 = linear_layer @ x2
# overlap of embedded intermediate features
overlap += swap_test(q_weights, w_x1, w_x2)
mean_overlap = overlap / (len(X1) * len(X2))
return mean_overlap
######################################################################
# Finally, below is the cost function, which takes both inter-cluster
# overlaps and intra-cluster overlaps into consideration:
def cost(weights, A=None, B=None):
aa = overlaps(weights, X1=A, X2=A)
bb = overlaps(weights, X1=B, X2=B)
ab = overlaps(weights, X1=A, X2=B)
d_hs = -2 * ab + (aa + bb)
return 1 - 0.5 * d_hs
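######################################################################
# Here, ``aa`` and ``bb`` estimate the purities :math:`\mathrm{tr}(\rho^2)`
# and :math:`\mathrm{tr}(\sigma^2)` of the two class ensembles, while ``ab``
# estimates :math:`\mathrm{tr}(\rho\sigma)`, so ``d_hs`` approximates the
# Hilbert-Schmidt distance
#
# .. math:: D_{\mathrm{hs}}(\rho, \sigma) = \mathrm{tr}\big[(\rho - \sigma)^2\big]
#           = \mathrm{tr}(\rho^2) + \mathrm{tr}(\sigma^2) - 2\,\mathrm{tr}(\rho\sigma).
#
# The cost :math:`1 - \frac{1}{2} D_{\mathrm{hs}}` therefore approaches 0 as
# the classes become maximally separated and 1 when they fully overlap.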
######################################################################
# Optimization
# ------------
#
# The initial classical and quantum parameters are generated at random.
#
# The last entry of the ``size`` argument used to create the
# ``init_pars_classical`` array must match the number of principal
# components used.
# generate initial parameters for the quantum component, such that
# the resulting number of trainable quantum parameters is the
# product of the entries of the 'size' argument (4 * 3 = 12)
init_pars_quantum = np.random.normal(loc=0, scale=0.1, size=(4, 3))
# generate initial parameters for the classical component, such that
# the resulting number of trainable classical parameters is the
# product of the entries of the 'size' argument (2 * 16 = 32)
init_pars_classical = np.random.normal(loc=0, scale=0.1, size=(2, pc))
init_pars = [init_pars_classical, init_pars_quantum]
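######################################################################
# As an optional sanity check (not part of the original code), the cost
# of the untrained embedding on a tiny batch from each class should be
# close to 1, indicating almost no class separation yet:
print("initial cost on a small batch:", cost(init_pars, A=A[:2], B=B[:2]))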
######################################################################
# The ``RMSPropOptimizer`` is used with a step size of 0.01 and batch size
# of 5 to optimize the model over 400 iterations. The ``pars`` variable
# is updated after every iteration.
#
# .. note:: Despite the code steps shown below, all figure results in
# this demo were generated with a batch size of 10 over 1500
# iterations.
optimizer = qml.RMSPropOptimizer(stepsize=0.01)
batch_size = 5
pars = init_pars
cost_list = []
for i in range(400):
# Sample a batch of training inputs from each class
selectA = np.random.choice(range(len(A)), size=(batch_size,), replace=True)
selectB = np.random.choice(range(len(B)), size=(batch_size,), replace=True)
A_batch = [A[s] for s in selectA]
B_batch = [B[s] for s in selectB]
# Take one optimization step
pars = optimizer.step(lambda w: cost(w, A=A_batch, B=B_batch), pars)
# print(pars)
# print("Step", i+1, "done.")
# Print the validation cost every 50 steps
# if i % 50 == 0 and i != 0:
# cst = cost(pars, A=A_val, B=B_val)
# print("Cost on validation set {:2f}".format(cst))
# cost_list.append(cst)
######################################################################
# The quantum and classical parameters are saved to txt files so
# they can be reused later without having to re-train the model.
print("quantum pars: ", pars[1])
with open(r"embedding_metric_learning/thetas.txt", "w") as file1:
for item in pars[1]:
file1.write("%s\n" % item)
print("classical pars: ", pars[0])
with open(r"embedding_metric_learning/x1x2.txt", "w") as file2:
for item in pars[0]:
file2.write("%s\n" % item)
######################################################################
# Analysis
# --------
#
# Gram matrices of mutual data overlaps in Hilbert space can be used to
# assess the separation of the embedded test set datapoints. Scatter plots
# depicting the pre-training and post-training positions of the
# ``x1``, ``x2`` intermediate points can also be drawn.
#
# For generating mutual data overlap gram matrices, a smaller subset of
# the test set data is used, as determined by the ``select`` variable.
select = 10
######################################################################
# Final cost values can optionally be computed here (left commented out to save runtime):
# cost_train = cost(pars, A=A[:select], B=B[:select])
# cost_val = cost(pars, A=A_val[:select], B=B_val[:select])
# cost_train = cost(pars, A=A, B=B)
# cost_val = cost(pars, A=A_val, B=B_val)
# print("Cost for pretrained parameters on training set:", cost_train)
# print("Cost for pretrained parameters on validation set:", cost_val)
######################################################################
# Next, we gather the subset of test set datapoints used for the gram matrices:
# A_B = np.r_[A[:select], B[:select]]
A_B = np.r_[A_val[:select], B_val[:select]]
######################################################################
# Before training, class separation is not observed within the gram matrices:
gram_before = [[overlaps(init_pars, X1=[x1], X2=[x2]) for x1 in A_B] for x2 in A_B]
ax = plt.subplot(111)
im = ax.matshow(gram_before, vmin=0, vmax=1)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
plt.colorbar(im, cax=cax)
# plt.show()
######################################################################
#
# |
#
# .. figure:: ../embedding_metric_learning/figures/All_Figures/3.4.1.png
# :align: center
# :width: 90%
#
# |
#
# After training, the goal is for there to be a clear separation between
# the two classes, such that there are four clearly defined squares of
# mutual overlap (two yellow, two purple). This desired level of
# separation has been achieved.
gram_after = [[overlaps(pars, X1=[x1], X2=[x2]) for x1 in A_B] for x2 in A_B]
ax = plt.subplot(111)
im = ax.matshow(gram_after, vmin=0, vmax=1)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
plt.colorbar(im, cax=cax)
# plt.show()
######################################################################
#
# |
#
# .. figure:: ../embedding_metric_learning/figures/All_Figures/3.4.2.png
# :align: center
# :width: 90%
#
# |
#
# The two-dimensional intermediate (``x1``, ``x2``) points can be graphed in the
# form of scatter plots to help visualize the separation progress from
# a different perspective.
#
# The code below results in the pre-training scatter plot:
blue_patch = mpatches.Patch(color="blue", label="Training: Benign")
red_patch = mpatches.Patch(color="red", label="Training: Malignant")
cornflowerblue_patch = mpatches.Patch(color="cornflowerblue", label="Test: Benign")
lightcoral_patch = mpatches.Patch(color="lightcoral", label="Test: Malignant")
plt.rcParams["figure.figsize"] = (8, 8)
plt.rc("xtick", labelsize=12)
plt.rc("ytick", labelsize=12)
for a in A:
    intermediate_a = init_pars[0] @ a
    plt.scatter(intermediate_a[0], intermediate_a[1], c="blue")
for b in B:
    intermediate_b = init_pars[0] @ b
    plt.scatter(intermediate_b[0], intermediate_b[1], c="red")
for a in A_val:
    intermediate_a = init_pars[0] @ a
    plt.scatter(intermediate_a[0], intermediate_a[1], c="cornflowerblue")
for b in B_val:
    intermediate_b = init_pars[0] @ b
    plt.scatter(intermediate_b[0], intermediate_b[1], c="lightcoral")
plt.xlabel(r"$x_1$", fontsize=20)
plt.ylabel(r"$x_2$", fontsize=20)
plt.legend(handles=[blue_patch, cornflowerblue_patch, red_patch, lightcoral_patch], fontsize=12)
# plt.show()
######################################################################
#
# |
#
# .. figure:: ../embedding_metric_learning/figures/All_Figures/3.3.3.png
# :align: center
# :width: 90%
#
# |
#
# The code below results in the post-training scatter plot.
# Both the training set and test set intermediate values are separated
# reasonably well in two dimensions, an indication of good generalization.
for a in A:
    intermediate_a = pars[0] @ a
    plt.scatter(intermediate_a[0], intermediate_a[1], c="blue")
for b in B:
    intermediate_b = pars[0] @ b
    plt.scatter(intermediate_b[0], intermediate_b[1], c="red")
for a in A_val:
    intermediate_a = pars[0] @ a
    plt.scatter(intermediate_a[0], intermediate_a[1], c="cornflowerblue")
for b in B_val:
    intermediate_b = pars[0] @ b
    plt.scatter(intermediate_b[0], intermediate_b[1], c="lightcoral")
plt.xlabel(r"$x_1$", fontsize=20)
plt.ylabel(r"$x_2$", fontsize=20)
plt.legend(handles=[blue_patch, cornflowerblue_patch, red_patch, lightcoral_patch], fontsize=12)
# plt.show()
######################################################################
#
# |
#
# .. figure:: ../embedding_metric_learning/figures/All_Figures/3.3.4.png
# :align: center
# :width: 90%
#
# |
#
######################################################################
# Classification
# --------------
#
# A KNN-style classifier can be used to determine the class for each new
# datapoint based on the datapoint's degree of overlap with each of the two
# separated classes of the training set data.
#
# Below, test set classification is evaluated by means of a ``predict``
# function to yield subsequent F1, precision, recall, accuracy and specificity
# scores. A confusion matrix of the form (TP, FN, FP, TN) is also returned.
def predict(n_samples, pred_low, pred_high, choice):
truepos = 0
falseneg = 0
falsepos = 0
trueneg = 0
for i in range(pred_low, pred_high):
pred = ""
if choice == 0:
x_new = A_val[i] # Benign
else:
x_new = B_val[i] # Malignant
prediction = 0
for s in range(n_samples):
# select a random sample from the training set
sample_index = np.random.choice(len(X))
x = X[sample_index]
y = Y[sample_index]
# compute the overlap between training sample and new input
overlap = overlaps(pars, X1=[x], X2=[x_new])
# add the label weighed by the overlap to the prediction
prediction += y * overlap
# normalize prediction
prediction = prediction / n_samples
# This component acts as the sign function of this KNN-style method.
# 'Negative' predictions correspond to benign cancers, while 'positive' predictions
# correspond to malignant cancers. The confusion matrix is also constructed here.
if prediction < 0:
pred = "Benign"
if choice == 0:
trueneg += 1
else:
falseneg += 1
else:
pred = "Malignant"
if choice == 0:
falsepos += 1
else:
truepos += 1
# print("prediction: "+str(pred)+", value is "+str(prediction))
# print(truepos, falseneg, falsepos, trueneg)
return truepos, falseneg, falsepos, trueneg
totals = [x + y for x, y in zip(predict(20, 0, len(A_val), 0), predict(20, 0, len(B_val), 1))]
print(totals)
precision = totals[0] / (totals[0] + totals[2])
recall = totals[0] / (totals[0] + totals[1])
accuracy = (totals[0] + totals[3]) / (totals[0] + totals[1] + totals[2] + totals[3])
specificity = totals[3] / (totals[3] + totals[2])
f1 = (2 * precision * recall) / (precision + recall)
print("Precision: ", precision)
print("Recall: ", recall)
print("Accuracy: ", accuracy)
print("Specificity: ", specificity)
print("F1 Score: ", f1)
######################################################################
# Below is an example table of results based on varying the number
# of principal components. In each row, training was performed for
# 1500 iterations with a batch size of 10. The features in row 1 did
# not undergo PCA, while the features from the rest of the rows did.
# The optimal value of each column is given in bold:
#
# .. _tbl-grid:
# +-----------------+---------------+-----------+-----------+----------+----------+
# | No. of Features | Training Cost | Test Cost | Precision | Recall | F1-Score |
# +=================+===============+===========+===========+==========+==========+
# | 30 | 0.2026 | 0.2791 | 0.9205 | 0.9720 | 0.9456 |
# +-----------------+---------------+-----------+-----------+----------+----------+
# | 30 | **0.1750** | 0.2899 | 0.9211 | 0.9790 | 0.9492 |
# +-----------------+---------------+-----------+-----------+----------+----------+
# | 16 | 0.2201 | 0.3101 | 0.9281 |**0.9930**| 0.9595 |
# +-----------------+---------------+-----------+-----------+----------+----------+
# | 8 | 0.2497 |**0.2646** |**0.9655** | 0.9790 |**0.9722**|
# +-----------------+---------------+-----------+-----------+----------+----------+
# | 4 | 0.2885 | 0.2913 | 0.9467 |**0.9930**| 0.9693 |
# +-----------------+---------------+-----------+-----------+----------+----------+
# | 2 | 0.3450 | 0.3306 | 0.9517 | 0.9650 | 0.9583 |
# +-----------------+---------------+-----------+-----------+----------+----------+
######################################################################
# References
# ----------
#
# Seth Lloyd, Maria Schuld, Aroosa Ijaz, Josh Izaac, Nathan Killoran: "Quantum embeddings for machine learning"
# arXiv preprint arXiv:2001.03622.
#
# Andrea Mari, Thomas R. Bromley, Josh Izaac, Maria Schuld, Nathan Killoran: "Transfer learning
# in hybrid classical-quantum neural networks" arXiv preprint arXiv:1912.08278.
#
# Jonathan Kim and Stefan Bekiranov: "Generalization Performance of Quantum Metric Learning
# Classifiers", Biomolecules 12(11), 1576 (2022). https://doi.org/10.3390/biom12111576.