##QSVM on LumA v LumB
We start off the process by installing qiskit to this Jupyter Notebook. The process consists of preparing normalized data to pass it into qiskit's QSVM function and then waiting for the results to be processed. 
The QSVM process is almost exactly the same as a normal SVM. It differs only in the feature map which as you probably guessed, is a Quantum Circuit in the QSVM process.

In [0]:
!pip install qiskit

Collecting qiskit
  Downloading https://files.pythonhosted.org/packages/db/a4/587cbe0186990b0f07b6161458461ae8d9ddbccdb00f827a23384eeb77fa/qiskit-0.19.3.tar.gz
Collecting qiskit-terra==0.14.1
[?25l  Downloading https://files.pythonhosted.org/packages/55/c3/6c3561cbcf69a791307ada32d61b4eeb7ab6a38ac968e0f1e36f0e40b249/qiskit_terra-0.14.1-cp36-cp36m-manylinux2010_x86_64.whl (6.7MB)
[K     |████████████████████████████████| 6.7MB 2.6MB/s 
[?25hCollecting qiskit-aer==0.5.2
[?25l  Downloading https://files.pythonhosted.org/packages/45/6f/2d269684891b634cce6ddb5684fd004c7b6bf986cec8544f4b6f495c8b99/qiskit_aer-0.5.2-cp36-cp36m-manylinux2010_x86_64.whl (23.3MB)
[K     |████████████████████████████████| 23.3MB 13.0MB/s 
[?25hCollecting qiskit-ibmq-provider==0.7.2
[?25l  Downloading https://files.pythonhosted.org/packages/92/1f/0c6b290064a471a8a9c1e3366367b46d320efdad6b730eadefbd1f3c4eb0/qiskit_ibmq_provider-0.7.2-py3-none-any.whl (155kB)
[K     |████████████████████████████████| 163kB 38

In [0]:
import numpy as np
from qiskit import BasicAer
from qiskit.providers.aer.noise import *
from qiskit.aqua import QuantumInstance, aqua_globals
from qiskit.aqua.components.feature_maps import SecondOrderExpansion
from qiskit.aqua.components.multiclass_extensions import (ErrorCorrectingCode,AllPairs,OneAgainstRest)
from qiskit.aqua.algorithms import QSVM
from qiskit.aqua.utils import get_feature_dimension
from qiskit import IBMQ
import qiskit
from qiskit.ml.datasets import *
from qiskit.circuit.library import ZZFeatureMap
from qiskit.aqua.utils import split_dataset_to_data_and_labels, map_label_to_class_name
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, roc_curve, confusion_matrix
from sklearn.model_selection import cross_val_predict
import feather


##Data Prep and loading my IBMQ Account
The data that we're importing is from the 'Quantum Annealing versus Classical Machine Learning Algorithms' paper. The data is of a feather type, so we use pandas to read it in as a DataFrame.

In [0]:
IBMQ.save_account('75ed5c790fc03e4cc6937b8bcd170d2fc3f72cb9f931ebbed2e11c9124b20d2b65c99a0a8cbb69884306828181f5051ebc01a2f4e4f4ba54fce8481b1c12dcfa')

In [0]:
data_train = pd.read_feather('drive/My Drive/Public_Folder/data5_lumAB_train_normalized.feather')
data_test = pd.read_feather('drive/My Drive/Public_Folder/data5_lumAB_test_normalized.feather')

In [0]:
x_data = pd.DataFrame(data=data_train, columns = data_train.columns)
y_data = pd.DataFrame(data=data_train, columns = ['cancer'])

x_test = pd.DataFrame(data = data_test, columns = data_test.columns)
y_test = pd.DataFrame(data = data_test, columns = ['cancer'])

We apply PCA to the dataset because ibmq can't handle more than 32 features (Each Qubit is mapped onto one feature and the maximum qubits available to us on IBM's OpenQasmSimulator are 32).

In [0]:
pca = PCA(n_components = 10)
label_encoder = LabelEncoder()
y_data_labeless = label_encoder.fit_transform(y_data)
y_test_labeless = label_encoder.fit_transform(y_test)

  y = column_or_1d(y, warn=True)


In [0]:
x_data_labeless = x_data.drop(columns = ['cancer'], axis=1)
x_test_labeless = x_data.drop(columns = ['cancer'], axis=1)

In [0]:
x_data_pca = pca.fit_transform(x_data_labeless)
x_test_pca = pca.transform(x_test_labeless)

In [0]:
A = np.empty(5)
B = np.empty(5)
A_test = np.empty(5)
B_test = np.empty(5)
for x in range(len(y_data_labeless)):
  if y_data_labeless[x] == 0:
    A = np.vstack((A, x_data_pca[x]))
  else:
    B = np.vstack((B, x_data_pca[x]))
for x in range(len(y_test_labeless)):
  if y_test_labeless[x] == 0:
    A_test = np.vstack((A_test, x_test_pca[x]))
  else:
    B_test = np.vstack((B_test, x_test_pca[x]))
A = np.delete(A, 0, axis=0)
B = np.delete(B, 0, axis=0)
A_test = np.delete(A_test, 0, axis=0)
B_test = np.delete(B_test, 0, axis=0)
sample = {
      'A' : A,
      'B' : B
}
test = {
    'A' : A_test,
    'B' : B_test
}

In [0]:
qiskit.__version__

'0.14.1'

In [0]:
shots = 1024
seed = 10598
feature_dims = 5

provider = IBMQ.load_account()
#backend1 = provider.get_backend('ibmq_vigo')
noise_model = NoiseModel.from_backend(backend1)
feature_map = ZZFeatureMap(feature_dimension=feature_dims, reps=2, entanglement='linear')
qsvm = QSVM(feature_map, sample, test)

In [0]:
backend = BasicAer.get_backend('qasm_simulator')
quantum_instance = QuantumInstance(backend, shots=1024, seed_simulator=10598, seed_transpiler=10598)

In [0]:
result = qsvm.run(quantum_instance, noise_model=noise_model)

In [0]:
print("Accuracy = ", result['testing_accuracy'] * 100)

Accuracy =  55.73770491803278
