# Creating Custom Feature Maps in Qiskit Aqua for <br>Quantum Support Vector Machines


Support vector machines (SVMs) address the problem of supervised learning through the construction of a classifier. Havlicek *et al*. proposed two strategies to design a quantum SVM, namely the Quantum Kernel Estimator and the Quantum Variational Classifier. Both strategies take data that is provided classically and encode it in the quantum state space through a quantum feature map.[1] The choice of feature map is important and may depend on the dataset we want to classify. In this tutorial, we show how to configure new feature maps in Aqua and explore their impact on the accuracy of the quantum classifier.

[1] Havlicek _et al_.  Nature **567**, 209-212 (2019). https://www.nature.com/articles/s41586-019-0980-2, https://arxiv.org/abs/1804.11326

Aqua provides several options for customizing the quantum feature map. In particular, there are four main parameters that can be used for model selection: the circuit depth, the data map function, the quantum gate set and the order of expansion. We will go through each of these parameters in this tutorial, but before getting started, let us review the main concepts of the quantum feature map discussed in [1].


### Review of the Quantum Feature Map


A quantum feature map nonlinearly maps a classical datum **x** to a quantum state $|\Phi(\mathbf{x})\rangle\langle\Phi(\mathbf{x})|$, a vector in the Hilbert space of density matrices. Support vector machine classifiers find a hyperplane that separates the vectors $|\Phi(\mathbf{x}_i)\rangle\langle\Phi(\mathbf{x}_i)|$ according to their labels and that is determined by a small number of them (the so-called support vectors). A key element of the feature map is not only the use of quantum state space as a feature space but also the way data are mapped into this high-dimensional space.

Constructing feature maps based on quantum circuits that are hard to simulate classically is an important step towards obtaining a quantum advantage over classical approaches. The authors of [1] proposed a family of feature maps that is conjectured to be hard to simulate classically and that can be implemented as short-depth circuits on near-term quantum devices.

$$ \mathcal{U}_{\Phi(\mathbf{x})}=\prod_d U_{\Phi(\mathbf{x})}H^{\otimes n},\ U_{\Phi(\mathbf{x})}=\exp\left(i\sum_{S\subseteq[1,n]}\phi_S(\mathbf{x})\prod_{k\in S} P_k\right) $$

The number of qubits $n$ in the quantum circuit is equal to the dimensionality of the classical data $\mathbf{x}$, which are encoded through the coefficients $\phi_S(\mathbf{x})$. The quantum circuit is composed of $d$ repeated layers of Hadamard gates interleaved with entangling blocks, which are expressed in terms of the Pauli gates $P_k \in \{\mathbb{1}_k, X_k, Y_k, Z_k \}$. The parameters $d$, $\phi_S$ and $P_k$ are mutable for both classification algorithms (Quantum Variational Classifier and Quantum Kernel Estimator) in Aqua. We note that the depth $d=1$ circuit considered in [1] can be efficiently simulated classically by uniform sampling, while the $d=2$ variant is conjectured to be hard to simulate classically.

<img src="images/uphi.PNG" width="400" />

The size of $S$ can be controlled as well. We call the feature map of this circuit family with $|S|\leq r$ the $r$-th order expansion. In Aqua, the default is the second order expansion $|S|\leq 2$ used in [1], but the order can be increased. The greater the upper bound, the more interactions are taken into account. The second order expansion gives $n$ singletons $S=\{i\}$ and, depending on the connectivity graph of the quantum device, up to $\frac{n(n-1)}{2}$ pairs $S=\{i,j\}$ to encode non-linear interactions.
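To make the counting concrete, here is a minimal sketch that enumerates the subsets $S$ of an $r$-th order expansion on a fully connected device. The helper name `expansion_subsets` is hypothetical and is not part of Aqua; qubits are indexed from 0.

```python
from itertools import combinations


def expansion_subsets(n, r):
    """List all subsets S of {0, ..., n-1} with 1 <= |S| <= r."""
    subsets = []
    for size in range(1, r + 1):
        subsets.extend(combinations(range(n), size))
    return subsets


# Second order expansion on three qubits: 3 singletons and 3 pairs
print(expansion_subsets(n=3, r=2))
# [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
```

With a restricted connectivity graph, only the pairs that correspond to connected qubits would be kept.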

Finally, we have a choice of the set of Pauli gates to use. Only contributions from $Z$ and $ZZ$ gates are considered in [1], because the corresponding $U_{\Phi(\mathbf{x})}$ can be implemented efficiently, which is important for applications on NISQ devices.
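The circuit family above can be checked numerically for small cases. The following is a rough NumPy sketch, not Aqua's implementation: it builds the state $\left(U_{\Phi(\mathbf{x})}H^{\otimes 2}\right)^d|00\rangle$ for two qubits with $Z$ and $ZZ$ terms, and evaluates the kernel entry $|\langle\Phi(\mathbf{x})|\Phi(\mathbf{y})\rangle|^2$. The function names are hypothetical, the data map assumed is $\phi_{\{i\}}(\mathbf{x})=x_i$, $\phi_{\{i,j\}}(\mathbf{x})=(\pi-x_i)(\pi-x_j)$, and gate-level conventions in Aqua (sign or factor of 2 in the rotation angles) may differ.

```python
import numpy as np

# Single-qubit operators
I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Z = np.diag([1.0, -1.0])


def phi(x, subset):
    """Assumed data map: x_i for singletons, (pi - x_i)(pi - x_j) for pairs."""
    if len(subset) == 1:
        return x[subset[0]]
    i, j = subset
    return (np.pi - x[i]) * (np.pi - x[j])


def feature_state(x, depth=2):
    """Return (U_phi(x) H^{otimes 2})^depth |00> as a length-4 state vector."""
    x = np.asarray(x, dtype=float)
    # Diagonals of Z_0, Z_1; their elementwise product is the diagonal of Z_0 Z_1
    z0 = np.diag(np.kron(Z, I2))
    z1 = np.diag(np.kron(I2, Z))
    generator = (phi(x, [0]) * z0
                 + phi(x, [1]) * z1
                 + phi(x, [0, 1]) * z0 * z1)
    u_phi = np.diag(np.exp(1j * generator))
    hadamards = np.kron(H, H)
    state = np.zeros(4, dtype=complex)
    state[0] = 1.0
    for _ in range(depth):
        state = u_phi @ (hadamards @ state)
    return state


def kernel_entry(x, y, depth=2):
    """Squared overlap between the feature states of x and y."""
    return abs(np.vdot(feature_state(x, depth), feature_state(y, depth))) ** 2


print(kernel_entry([0.3, 1.2], [0.3, 1.2]))  # overlap of a state with itself
print(kernel_entry([0.3, 1.2], [2.0, 0.7]))
```

The Quantum Kernel Estimator estimates such overlaps from measurement counts on a quantum device or simulator rather than from the exact state vector.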

### Programming the Quantum Feature Map

We will now see how to modify these four parameters (circuit depth, data map function, quantum gate set and expansion order) in Aqua. There are two default settings, both easily configurable, which allow modifications of the depth and data map, but not the gate set: `FirstOrderExpansion` and `SecondOrderExpansion`. To test them and see how they work, we need to have data.

<span style="color:green">[Note: We want to build up in "complexity" of the feature maps here. So perhaps a better order would be to discuss `FirstOrderExpansion` first, then follow with `SecondOrderExpansion` and the customized data map function.]</span>

<span style="color:green">[To more clearly see the effects of the different choices of feature maps, perhaps it would be interesting to use the `ad_hoc` dataset. This dataset was generated by the default `SecondOrderExpansion` and so if we use that same feature map to classify the data, we can expect nearly perfect test accuracy. So we could show that (1) default `FirstOrderExpansion` performs worse than default `SecondOrderExpansion` (we already know this to be the case) and (2) `SecondOrderExpansion` with a different data map performs worse/better than the default `SecondOrderExpansion` (this has not been tested yet) and (3) higher-order `PauliExpansion` with custom data map performs worse/better than default `SecondOrderExpansion` (this has not been tested yet).]
</span>

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from qiskit import BasicAer
from qiskit_aqua import run_algorithm, QuantumInstance
from qiskit_aqua.components.feature_maps import SecondOrderExpansion, FirstOrderExpansion, PauliExpansion, self_product
from qsvm_datasets import *
from qiskit_aqua.algorithms import QSVMKernel
import functools

<span style="color:green">
[If we are going to use the Iris dataset, let's pull it from the file where it's already set up: qiskit-tutorials/qiskit/aqua/artificial_intelligence/qsvm_datasets.py
    
Three things will need to be fixed in that file first (open an issue in github?): <br>
(1) In 'train_test_split', 'test_size' is set to '1'. We should change this to, e.g., 0.3 <br>
(2) Add option here to do PCA to reduce dimensionality instead of simply removing some features
(3) Create the test dictionary from the test array, not the train array

Using $X$ and $Y$ to describe the training and test input could be confusing since the labels are usually represented by the latter.]
</span>

In [2]:
#We use ad_hoc data generated by a quantum feature map

feature_dim = 2

sample_Total, training_input, test_input, class_labels = ad_hoc_data(training_size=15, 
                                                                     test_size=20, 
                                                                     n=feature_dim, 
                                                                     gap=0.3, 
                                                                     PLOT_DATA=False)

With the data in hand, we will use the Quantum Kernel Estimator (`QSVMKernel`) to test different feature maps. We will start with the first order expansion of depth $d=2$, using only $Z$ gates and the default data map (discussed below). We will then move on to the second order expansion with a full connectivity graph, which has the same parameters as the feature map described in [1].

In [2]:
shots = 1024
random_seed = 10598

# We use the simulator
backend = BasicAer.get_backend('qasm_simulator')
quantum_instance = QuantumInstance(backend, shots=shots, seed=random_seed, seed_mapper=random_seed)

#### 1. First Order Diagonal Expansion



In [4]:
# Generate the feature map
feature_map = FirstOrderExpansion(num_qubits=feature_dim, depth=2)

# Run the Quantum Kernel Estimator (QSVMKernel), and test it

qsvm = QSVMKernel(feature_map, training_input, test_input)
result = qsvm.run(quantum_instance)
print("testing success ratio: ", result['testing_accuracy'])

testing success ratio:  0.85


Here we have used the `FirstOrderExpansion` feature map, which forces $|S|=1$: no interaction between features of the data is encoded in the circuit, and there is no entanglement. The feature map takes the following inputs: the number of qubits `num_qubits` (equal to the dimensionality of the data); the circuit depth $d$ (`depth`); an entangler map `entangler_map` encoding the connectivity of the qubits (the default `entangler_map=None` means that a pre-computed connectivity graph is used, chosen by the next parameter); a string `entanglement`, either `'full'` or `'linear'`, used to generate the connectivity when `entangler_map` is not provided (the default `'full'` treats the connectivity graph as complete and considers all $\frac{n(n-1)}{2}$ interactions); and the data map $\phi_S(\mathbf{x})$, `data_map_func`, which introduces non-linear relations between features of the data. The default is `data_map_func=self_product`, where `self_product` represents

$$\phi_S:\mathbf{x}\mapsto \left\{\begin{array}{ll}
    x_i & \mbox{if}\ S=\{i\} \\
    (\pi-x_i)(\pi-x_j) & \mbox{if}\ S=\{i,j\}
    \end{array}\right.$$
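As an illustration of the formula above, here is a stand-alone sketch of such a data map. It is written from the formula, not copied from Aqua, and the name `self_product_sketch` is hypothetical. It assumes the function receives only the features selected by $S$, so a one-element array corresponds to $S=\{i\}$ and a two-element array to $S=\{i,j\}$.

```python
import functools

import numpy as np


def self_product_sketch(x):
    """Return x_i for one feature, or the product of (pi - x_k) over all features."""
    x = np.asarray(x, dtype=float)
    if len(x) == 1:
        return x[0]
    return functools.reduce(lambda m, n: m * n, np.pi - x)


print(self_product_sketch([0.5]))       # singleton: the feature itself
print(self_product_sketch([0.5, 1.0]))  # pair: (pi - 0.5) * (pi - 1.0)
```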


Because we used `FirstOrderExpansion`, the connectivity did not matter, but it becomes important for higher orders of expansion, such as the `SecondOrderExpansion`.
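To picture the difference between the `'full'` and `'linear'` options, the sketch below generates both connectivity graphs in the source-to-targets dictionary layout that this tutorial uses for `entangler_map`. The helper `make_entangler_map` is hypothetical; the maps Aqua builds internally may be laid out differently.

```python
def make_entangler_map(num_qubits, entanglement="full"):
    """Build a {source: [targets]} connectivity dictionary (illustrative only)."""
    if entanglement == "full":
        # Every pair (i, j) with i < j
        return {i: list(range(i + 1, num_qubits)) for i in range(num_qubits - 1)}
    if entanglement == "linear":
        # Only nearest neighbours (i, i + 1)
        return {i: [i + 1] for i in range(num_qubits - 1)}
    raise ValueError("entanglement must be 'full' or 'linear'")


print(make_entangler_map(4, "full"))    # {0: [1, 2, 3], 1: [2, 3], 2: [3]}
print(make_entangler_map(4, "linear"))  # {0: [1], 1: [2], 2: [3]}
```

For four qubits, the full graph has $\frac{4\cdot 3}{2}=6$ pairs while the linear one has only 3.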

#### 2. Second Order Diagonal Expansion

The `SecondOrderExpansion` feature map allows $|S|\leq2$, so this time interactions in the data will be encoded in the feature map, according to the connectivity graph and the data map function. This is, with the set of parameters used here, the same feature map as described in [1].

In [5]:
feature_map = SecondOrderExpansion(num_qubits=feature_dim, depth=2)

qsvm = QSVMKernel(feature_map, training_input, test_input)
result = qsvm.run(quantum_instance)
print("testing success ratio: ", result['testing_accuracy'])

testing success ratio:  0.975


This feature map gives better results because the dataset was generated from this exact feature map, so we can expect near-perfect accuracy even with only a few data points.

#### 3. Second Order Diagonal Expansion with Custom Data Map

We can also construct a new data map $\phi_S(\mathbf{x})$ and provide a custom entangler map via a dictionary describing the qubit connectivity.

<span style="color:green">
    
I had issues with the way QSVMKernel works: it creates a certain type of threads, incompatible with python console. However, the way python notebooks work is close to this. The only quick workaround I found was to put the custom data map in an other file and import it, I will try to find a better looking solution (see ProcessPoolExecutor documentation). 

</span>

In [6]:
from custom_data_map import custom_data_map_func

# The entangler map is a dictionary,
# keys are source qubit index (int),
# values are arrays of target qubit index(es) (int)

entangler_map = {0:[1]} # qubit 0 linked to qubit 1
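The contents of `custom_data_map.py` are not shown in this tutorial. Purely as an illustration, a file of that name could contain something like the sketch below. This is a hypothetical stand-in, not the function that produced the accuracy reported here; it assumes the same calling convention as the default data map (an array holding the features selected by $S$) and replaces the product $(\pi-x_i)(\pi-x_j)$ with a product of sines.

```python
import numpy as np


def custom_data_map_func(x):
    """Hypothetical data map: x_i for one feature, prod sin(pi - x_k) otherwise."""
    x = np.asarray(x, dtype=float)
    if len(x) == 1:
        return x[0]
    return np.prod(np.sin(np.pi - x))


print(custom_data_map_func([0.3]))
print(custom_data_map_func([0.3, 1.2]))
```

Keeping the function in its own importable module follows the workaround used in this tutorial for running `QSVMKernel` from a notebook.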

In [7]:
# Here we use all available parameters for the simple feature maps
# (entanglement='full' won't be used because we provide entangler_map)

feature_map = SecondOrderExpansion(num_qubits=feature_dim,
                                   depth=2,
                                   data_map_func=custom_data_map_func,
                                   entangler_map=entangler_map)

qsvm = QSVMKernel(feature_map, training_input, test_input)
result = qsvm.run(quantum_instance)
print("testing success ratio: ", result['testing_accuracy'])

testing success ratio:  0.65


We see that changing the data map function reduced the accuracy of the model, so it must be chosen carefully.

#### 4. Second Order Pauli Expansion

For some applications, we may want more flexibility in the set of Pauli gates instead of using $Z$ gates only. To do that, we can use the `PauliExpansion` feature map. It has the same parameters as `FirstOrderExpansion` and `SecondOrderExpansion` (`depth`, `entangler_map`, `data_map_func`), plus a `paulis` parameter to change the gate set.

This parameter is a list of strings, each representing a Pauli term to use. The default value for this parameter is `['Z', 'ZZ']`, which is equivalent to `SecondOrderExpansion`.

Now we will build a feature map that adds a layer of Pauli $Y$ gates to the default gate set, `paulis=['Z', 'Y', 'ZZ']`.

In [8]:
feature_map = PauliExpansion(num_qubits=feature_dim, depth=2, paulis = ['Z', 'Y', 'ZZ'])

qsvm = QSVMKernel(feature_map, training_input, test_input)
result = qsvm.run(quantum_instance)
print("testing success ratio: ", result['testing_accuracy'])

testing success ratio:  0.65


Each string in `paulis` is implemented one at a time. Note that for a single character, for example `'Z'`, a layer of single-qubit gates is added to the circuit, while terms such as `'ZZ'` add a layer of the corresponding two-qubit entangling gates, one for each available qubit pair.

For example, the choice `paulis = ['Z', 'Y', 'ZZ']` generates a quantum feature map of the form 

$$\mathcal{U}_{\Phi(\mathbf{x})} = \left( \exp\left(i\sum_{jk} \phi_{\{j,k\}}(\mathbf{x}) Z_j \otimes Z_k\right) \, \exp\left(i\sum_{j} \phi_{\{j\}}(\mathbf{x}) Y_j\right) \, \exp\left(i\sum_j \phi_{\{j\}}(\mathbf{x}) Z_j\right) \, H^{\otimes n} \right)^d.$$ 

The depth $d=1$ version of the quantum circuit is shown below  <span style="color:green">[perhaps we should use the actual gates in the diagram to make clear the connection to the specific example: I think it will be more confusing because A gates are U1 gates, but B gates are A gates with a basis change, so the complete diagram would be longer and less clear]</span>


<img src="images/depth1.PNG" width="400"/>

The circuit begins with a layer of Hadamard gates $H^{\otimes n}$, followed by a layer of $A$ gates and a layer of $B$ gates. The $A$ and $B$ gates are single-qubit rotations by the same set of angles $\phi_{\{i\}}(\mathbf{x})$ but around different axes: $B = e^{i\phi_{\{i\}}(\mathbf{x})Y_i}$ and $A = e^{i\phi_{\{i\}}(\mathbf{x})Z_i}$. The entangling $ZZ$ gate $e^{i \phi_{\{0,1\}}(\mathbf{x}) Z_0 Z_1}$ is parametrized by an angle $\phi_{\{0,1\}}(\mathbf{x})$ and can be implemented using two controlled-NOT gates and one $A'=e^{i\phi_{\{0,1\}}(\mathbf{x})Z_1}$ gate as shown in the figure.
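The decomposition of the $ZZ$ gate into two controlled-NOT gates and one single-qubit rotation can be verified numerically. The sketch below uses plain NumPy matrices rather than Aqua, assumes qubit 0 is the control and qubit 1 the target, and orders basis states as $|q_0 q_1\rangle$; the function names are illustrative.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
# CNOT with qubit 0 as control and qubit 1 as target, basis order |q0 q1>
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)


def zz_gate(angle):
    """exp(i * angle * Z_0 Z_1) built directly from its diagonal."""
    return np.diag(np.exp(1j * angle * np.diag(np.kron(Z, Z))))


def zz_from_cnots(angle):
    """CNOT, then A' = exp(i * angle * Z_1) on the target, then CNOT again."""
    a_prime = np.kron(I2, np.diag(np.exp(1j * angle * np.diag(Z))))
    return CNOT @ a_prime @ CNOT


print(np.allclose(zz_gate(0.7), zz_from_cnots(0.7)))  # True
```

The first CNOT writes the parity of the two qubits onto the target, the rotation applies a phase that depends on that parity, and the second CNOT undoes the parity computation.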

To compare, `paulis = ['Z', 'ZZ']` creates the same circuit as above but without the $B$ gates, while `paulis = ['Z', 'YY']` creates a circuit with a layer of $A$ gates followed by a layer of entangling $YY$ gates.


#### 5. Third Order Pauli Expansion with Custom Data Map

Note that `PauliExpansion` also allows expansions of third order or higher, for example `paulis = ['Y', 'Z', 'ZZ', 'ZZZ']`. Assuming the data has dimensionality of at least three and we have access to three qubits, this generates a feature map according to the previously mentioned rule, with $|S|\leq 3$.

For example, suppose we want to classify three-dimensional data using a third order expansion, a circuit depth of $d=2$, and a $Y$ gate for non-interactive encoding of the data in addition to the $Z$ gates. We can do this with the following code in Aqua. The cell below keeps the default data map; a custom one can be passed through `data_map_func`, as in Section 3.

In [3]:
feature_dim = 3
sample_Total, training_input, test_input, class_labels = ad_hoc_data(training_size=10, 
                                                                     test_size=10, 
                                                                     n=feature_dim, 
                                                                     gap=0.3, 
                                                                     PLOT_DATA=False)

In [10]:
feature_map = PauliExpansion(num_qubits=feature_dim, depth=2, paulis = ['Y', 'Z', 'ZZ', 'ZZZ'])

qsvm = QSVMKernel(feature_map, training_input, test_input)
result = qsvm.run(quantum_instance)
print("testing success ratio: ", result['testing_accuracy'])

testing success ratio:  0.55


Because the connectivity is `'full'` by default, this circuit will contain a layer of $B$ gates parametrised by $\phi_{\{i\}}(\mathbf x)$, a layer of $A$ gates parametrised by $\phi_{\{i\}}(\mathbf x)$, three $ZZ$ gates, one for each pair of qubits $(0,1),\ (1,2),\ (0,2)$, and finally a $ZZZ$ gate $e^{i\phi_{\{0,1,2 \}}(\mathbf{x})Z_0Z_1Z_2}$.

### Conclusion

<span style="color:green">[Include short description on how to create a completely new, pluggable feature map in Aqua.]</span>

We saw how to generate feature maps from the family described in [1]. This family yields powerful feature maps and already offers many options to fit a wide range of problems. But we may want to use a totally new feature map, built from a different algorithm and circuit.

To do that, we only need to create a new class that extends the class `FeatureMap` and implements its method `construct_circuit`. As an example, here is a general custom feature map class that takes a circuit construction function (the core of the feature map, which determines how the circuit is generated) and a list of the arguments that function needs.

In [20]:
"""
This module contains the definition of a custom feature map
whose circuit is built by a user-supplied constructor function.
"""


import logging

import numpy as np
from qiskit import QuantumCircuit, QuantumRegister

from qiskit_aqua.components.feature_maps import FeatureMap
from inspect import signature

logger = logging.getLogger(__name__)


class CustomExpansion(FeatureMap):
    """
    Mapping data the way you want
    """

    CONFIGURATION = {
        'name': 'CustomExpansion',
        'description': 'Custom expansion for feature map (any order)',
        'input_schema': {
            '$schema': 'http://json-schema.org/schema#',
            'id': 'Custom_Expansion_schema',
            'type': 'object',
            'properties': {
                'feature_param': {
                    'type': ['array']
                }
            },
            'additionalProperties': False
        }
    }

    def __init__(self, num_qubits, constructor_function, feature_param):
        """Constructor.

        Args:
            num_qubits (int): number of qubits
            constructor_function (fun): a function that takes as parameters
            a datum x, a QuantumRegister qr, a boolean inverse and
            all other parameters needed from feature_param
            feature_param (list): the list of parameters needed to generate
            the circuit, that won't change depending on the data given
            (such as the data map function or other).
        """
        self.validate(locals())
        super().__init__()
        self._num_qubits = num_qubits
        sig = signature(constructor_function)
        if len(sig.parameters) != len(feature_param)+3:
            raise ValueError("The constructor_function given does not match the parameters given.\n" +
                             "Make sure it takes, in this order, the datum x, the QuantumRegister qr, the Boolean\n" +
                             " inverse and all the parameters provided in feature_param")
        self._constructor_function = constructor_function
        self._feature_param = feature_param
    
    # The only method mandatory to implement
    def construct_circuit(self, x, qr=None, inverse=False):
        """
        Construct the circuit based on given data and according to the function provided at instantiation.

        Args:
            x (numpy.ndarray): 1-D to-be-transformed data.
            qr (QuantumRegister): the QuantumRegister object for the circuit, if None,
                                  generate new registers with name q.
            inverse (bool): whether or not to invert the circuit

        Returns:
            QuantumCircuit: a quantum circuit that transforms data x.
        """
        if not isinstance(x, np.ndarray):
            raise TypeError("x must be numpy array.")
        if x.ndim != 1:
            raise ValueError("x must be 1-D array.")
        if x.shape[0] != self._num_qubits:
            raise ValueError("number of qubits and data dimension must be the same.")
        if qr is None:
            qr = QuantumRegister(self._num_qubits, name='q')
        qc = self._constructor_function(x, qr, inverse, *self._feature_param)
        return qc


In [27]:
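def constructor_function(x, qr, inverse=False, depth=2):
    """A mock constructor function to test the class,
    it only places H and u1 gates.
    Note: this mock ignores `inverse`; a constructor meant for real use
    should return the inverse circuit when inverse=True."""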
    qc = QuantumCircuit(qr)
    for _ in range(depth):
        qc.h(qr)
        for i in range(len(x)):
            qc.u1(x[i], qr[i])
    return qc
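The arity check performed in `CustomExpansion.__init__` can be tried on its own, without Aqua installed. The sketch below reproduces only that check with a hypothetical helper `matches_feature_param` and a dummy constructor that returns `None` instead of a circuit.

```python
from inspect import signature


def matches_feature_param(constructor_function, feature_param):
    """True if the function takes x, qr, inverse plus one argument per feature_param entry."""
    return len(signature(constructor_function).parameters) == len(feature_param) + 3


def dummy_constructor(x, qr, inverse=False, depth=2):
    """Stand-in with the same signature as the mock constructor; builds nothing."""
    return None


print(matches_feature_param(dummy_constructor, [2]))     # True
print(matches_feature_param(dummy_constructor, []))      # False
print(matches_feature_param(dummy_constructor, [2, 5]))  # False
```

Because the check only counts parameters, it accepts any function of the right arity, so the order of the arguments remains the caller's responsibility.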
  

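We put the class and the constructor function in separate files (`custom_feature_map.py` and `mock_constructor.py`, matching the imports below) to show the general process.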

In [5]:
from custom_feature_map import CustomExpansion
from mock_constructor import constructor_function

feature_map = CustomExpansion(num_qubits=feature_dim, constructor_function=constructor_function, feature_param=[2])

qsvm = QSVMKernel(feature_map, training_input, test_input)
result = qsvm.run(quantum_instance)
print("testing success ratio: ", result['testing_accuracy'])

testing success ratio:  0.45
