# Variational Quantum Classifier Feature Map Comparison

Both the first-order and second-order expansion feature maps provided by Aqua use $n$ qubits to encode $n$-dimensional data points. Raw feature vectors, however, can also be used directly in `VQC` circuit construction, which requires only $\log_2(n)$ qubits (rounded up when $n$ is not a power of two) to encode an $n$-dimensional data point.
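
To make the qubit counts concrete, here is a minimal NumPy sketch of the idea behind raw-feature (amplitude) encoding: a data point is zero-padded to the next power-of-two length and normalized so that it can serve as a statevector. The helper name `amplitude_encode` is hypothetical, and this is only an illustration under those assumptions, not Aqua's implementation.

```python
import numpy as np


def amplitude_encode(x):
    """Hypothetical helper: pad x to a power-of-two length and normalize it.

    Returns the amplitude vector and the number of qubits needed to hold it.
    """
    x = np.asarray(x, dtype=float)
    # Smallest k with 2**k >= len(x), using at least one qubit
    num_qubits = max(1, (len(x) - 1).bit_length())
    padded = np.zeros(2 ** num_qubits)
    padded[: len(x)] = x
    norm = np.linalg.norm(padded)
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return padded / norm, num_qubits


amplitudes, num_qubits = amplitude_encode([0.5, -0.5, 0.5, 0.5])
print(num_qubits)               # qubits needed for a 4-dimensional point
print(np.sum(amplitudes ** 2))  # squared amplitudes of a valid state sum to 1
```

A 4-dimensional point therefore fits into 2 qubits this way, whereas the expansion feature maps would use 4.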

### Experiment
Below we compare the classification performance of `VQC` on the [Wine dataset](https://scikit-learn.org/stable/datasets/index.html#wine-dataset) using the `RawFeatureVector` and `SecondOrderExpansion` feature maps. As you'll see, the former reaches about $87\%$ test accuracy using only $2$ qubits, whereas the latter reaches only about $53\%$ while using $4$ qubits and taking $3\times$ as long.

We first prepare the Wine dataset:

In [1]:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA


def Wine(training_size, test_size, n):
    class_labels = [r'A', r'B', r'C']

    # Load features and labels as a (data, target) tuple
    data, target = datasets.load_wine(return_X_y=True)
    sample_train, sample_test, label_train, label_test = train_test_split(
        data, target, test_size=test_size, random_state=7
    )

    # Standardize each feature to zero mean and unit variance
    std_scale = StandardScaler().fit(sample_train)
    sample_train = std_scale.transform(sample_train)
    sample_test = std_scale.transform(sample_test)

    # Now reduce number of features to number of qubits
    pca = PCA(n_components=n).fit(sample_train)
    sample_train = pca.transform(sample_train)
    sample_test = pca.transform(sample_test)

    # Scale to the range (-1,+1)
    samples = np.append(sample_train, sample_test, axis=0)
    minmax_scale = MinMaxScaler((-1, 1)).fit(samples)
    sample_train = minmax_scale.transform(sample_train)
    sample_test = minmax_scale.transform(sample_test)
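    # For each class, take the first `training_size` samples for training ...
    training_input = {
        key: (sample_train[label_train == k, :])[:training_size]
        for k, key in enumerate(class_labels)
    }
    # ... and the next `test_size` samples of the same split for testing, so the
    # two sets are disjoint (the held-out `sample_test` is not used here)
    test_input = {
        key: (sample_train[label_train == k, :])[training_size:(training_size + test_size)]
        for k, key in enumerate(class_labels)
    }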
    return sample_train, training_input, test_input, class_labels
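
The per-class slicing at the end of `Wine` can be seen in isolation with a toy example. The sketch below uses made-up stand-in arrays rather than the Wine data, and only assumes the same dictionary-comprehension pattern as the function above.

```python
import numpy as np

# Toy stand-ins for the preprocessed samples and their labels (made-up values)
samples = np.arange(24, dtype=float).reshape(12, 2)
labels = np.array([0, 1, 2] * 4)
class_labels = ['A', 'B', 'C']
training_size, test_size = 2, 1

# The first `training_size` rows of each class go to training ...
training_input = {
    key: samples[labels == k][:training_size]
    for k, key in enumerate(class_labels)
}
# ... and the next `test_size` rows of the same class go to testing
test_input = {
    key: samples[labels == k][training_size:training_size + test_size]
    for k, key in enumerate(class_labels)
}

for key in class_labels:
    print(key, training_input[key].shape, test_input[key].shape)
```

Each dictionary maps a class label to a 2-D array of samples, and the two slices never overlap because the test slice starts where the training slice ends.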

We can then set up the experiment as follows:

In [2]:
import numpy as np
import scipy

from qiskit import BasicAer
from qiskit.aqua.input import ClassificationInput
from qiskit.aqua import run_algorithm, QuantumInstance, aqua_globals
from qiskit.aqua.components.optimizers import SPSA, COBYLA

feature_dim = 4  # dimension of each data point
training_dataset_size = 20
testing_dataset_size = 10
random_seed = 10598
np.random.seed(random_seed)

sample_Total, training_input, test_input, class_labels = Wine(
    training_size=training_dataset_size,
    test_size=testing_dataset_size,
    n=feature_dim
)

classification_input = ClassificationInput(training_input, test_input)
params = {
    'problem': {'name': 'classification', 'random_seed': random_seed},
    'algorithm': {'name': 'VQC'},
    'backend': {'provider': 'qiskit.BasicAer', 'name': 'statevector_simulator'},
    'optimizer': {'name': 'COBYLA', 'maxiter': 200},
    'variational_form': {'name': 'RYRZ', 'depth': 3},
    'feature_map': {'name': None},
}

Let's try `RawFeatureVector` first:

In [3]:
params['feature_map']['name'] = 'RawFeatureVector'
result = run_algorithm(params, classification_input)
print("VQC accuracy with RawFeatureVector: ", result['testing_accuracy'])

VQC accuracy with RawFeatureVector:  0.8666666666666667


Now let's try `SecondOrderExpansion`:

In [4]:
params['feature_map']['name'] = 'SecondOrderExpansion'
result = run_algorithm(params, classification_input)
print("Test accuracy with SecondOrderExpansion: ", result['testing_accuracy'])

Test accuracy with SecondOrderExpansion:  0.5333333333333333
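
As a sanity check on the reported figures, the accuracies can be converted back into counts of correctly classified test points. The sketch below assumes that each of the three classes contributes the full 10 test points, giving 30 in total; that assumption is not verified in the cells above.

```python
testing_dataset_size = 10  # test points per class, as configured above
num_classes = 3            # classes 'A', 'B' and 'C'
# Assumption: every class actually supplies `testing_dataset_size` points
num_test_points = testing_dataset_size * num_classes

# Accuracies copied from the two outputs above
reported = {
    'RawFeatureVector': 0.8666666666666667,
    'SecondOrderExpansion': 0.5333333333333333,
}
correct = {name: round(acc * num_test_points) for name, acc in reported.items()}
for name, count in correct.items():
    print(f"{name}: {count} of {num_test_points} correct")
```

Under that assumption, the two runs correspond to 26 and 16 correct predictions out of 30, respectively.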
