## Installing Libraries needed for Quantum Machine Learning

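This tutorial is based on the neural network classifier and regressor tutorial from the official Qiskit documentation: https://qiskit.org/documentation/machine-learning/tutorials/02_neural_network_classifier_and_regressor.html

The code uses the older Qiskit interfaces from that tutorial (`qiskit.opflow`, `QuantumInstance`, `CircuitQNN`). These were later deprecated and removed, so the notebook needs Qiskit and qiskit-machine-learning releases that still provide them.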

In [None]:
%pip install qiskit
%pip install qiskit_machine_learning
%pip install imbalanced-learn
%pip install pylatexenc

#### Note: Restart the kernel after installing the required libraries so that the installations take effect

### Importing the libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from qiskit import Aer, QuantumCircuit
from qiskit.opflow import Z, I, StateFn
from qiskit.utils import QuantumInstance, algorithm_globals
from qiskit.circuit import Parameter
from qiskit.circuit.library import RealAmplitudes, ZZFeatureMap
from qiskit.algorithms.optimizers import COBYLA, L_BFGS_B
from sklearn.metrics import classification_report
from qiskit_machine_learning.neural_networks import TwoLayerQNN, CircuitQNN
from qiskit_machine_learning.algorithms.classifiers import NeuralNetworkClassifier, VQC
from qiskit_machine_learning.algorithms.regressors import NeuralNetworkRegressor, VQR
from typing import Union
from sklearn.model_selection import train_test_split
from qiskit_machine_learning.exceptions import QiskitMachineLearningError
from imblearn.over_sampling import SMOTE
from IPython.display import clear_output

### Reading the dataset

The problem is to predict milk quality from the input features. The dataset was taken from Kaggle.

In [None]:
data=pd.read_csv("milknew.csv")

In [None]:
data.head(5)

#### Getting the description and information about the dataset

In [None]:
data.info()

In [None]:
data.describe()

The task is to predict the grade of the milk, so `Grade` is our target variable.

In [None]:
data['Grade'].value_counts()

#### Removing one category from target Variable

The target variable has three categories. Since this tutorial focuses on applying quantum computing to machine learning, we restrict the problem to binary classification and keep only two categories of the target variable: high and low.

To do this, we remove every row whose grade is medium and then balance the two remaining classes.

In [None]:
df=data[(data['Grade']=='low')| (data['Grade']=='high')]

In [None]:
df['Grade'].value_counts()

#### Balancing the Categories

Removing one category left the dataset imbalanced. To keep the classifier from being biased toward the majority class, let's balance the dataset using the SMOTE algorithm.

In [None]:
sm = SMOTE(random_state = 42)
X_oversampled, y_oversampled = sm.fit_resample(df.iloc[:,:-1], df.iloc[:,-1])
X = pd.DataFrame(X_oversampled, columns=df.iloc[:,:-1].columns)
Y = pd.DataFrame(y_oversampled, columns=[list(df.columns)[-1]])
df_balanced=pd.concat([X,Y],axis=1)
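
To make the balancing step less of a black box, the sketch below shows the core idea behind SMOTE in plain Python: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_like_samples`, the toy points and the choice `k=2` are illustrative assumptions; the `imblearn` implementation used above handles many more details.

```python
import math
import random


def smote_like_samples(minority, n_new, k=2, seed=42):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + lam * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy 2-D data (made up): 6 majority samples, 3 minority samples
majority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2)]
minority = [(1.0, 1.0), (1.2, 1.1), (1.1, 1.3)]

new_points = smote_like_samples(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies between two existing minority samples, the new rows stay inside the region the minority class already occupies.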

In [None]:
df_balanced['Grade'].value_counts()

In [None]:
df_balanced.head(5)

In [None]:
df_balanced.shape

###  Exploratory Data Analysis

#### Scatter plot of pH against Temprature, colored by Taste

In [None]:
sns.scatterplot(data=df_balanced,x='pH',y='Temprature',hue='Taste')

#### Scatter plot of pH against Temprature, colored by Odor

In [None]:
sns.scatterplot(data=df_balanced,x='pH',y='Temprature',hue='Odor')

In [None]:

sns.barplot(data=df_balanced,x='pH',y='Temprature',hue='Odor')
# rotate the tick labels after plotting; rotation must be a number, not a string
plt.xticks(rotation=70)

### Label encoding the Target column

In [None]:
df_balanced['Grade']=df_balanced['Grade'].map({"high":1,"low":0})

### Splitting the Dataframe into Train and test

We split the DataFrame into training and test sets for training and validation. The resulting splits are also converted to NumPy arrays, which are needed when training the model.

In [None]:
X_train,X_test,Y_train,Y_test=train_test_split(df_balanced.iloc[:,:-1],df_balanced.iloc[:,-1])
X_train=np.array(X_train)
X_test=np.array(X_test)
Y_train=np.array(Y_train)
Y_test=np.array(Y_test)
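
As a rough illustration of what the split does: `train_test_split` shuffles the rows and, by default, holds out 25% of them for testing. The plain-Python sketch below mimics that behaviour on toy data. `simple_split` and the seed value are hypothetical choices made for this example, not part of scikit-learn.

```python
import math
import random


def simple_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the row indices once, then cut them into a train and a test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_fraction * len(rows))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [rows[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Toy data (made up): 8 rows with 2 features each and alternating labels
rows = [[float(i), float(i) * 2] for i in range(8)]
labels = [i % 2 for i in range(8)]
x_train, x_test, y_train, y_test = simple_split(rows, labels)
print(len(x_train), len(x_test))
```

Fixing the seed makes the split repeatable; passing `random_state` to `train_test_split` serves the same purpose.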

### Classification with a CircuitQNN


Next we show how a CircuitQNN can be used for classification within a NeuralNetworkClassifier. In this context, the CircuitQNN is expected to return a $d$-dimensional probability vector as output, where $d$ denotes the number of classes. Sampling from a QuantumCircuit automatically results in a probability distribution, so we only need to define a mapping from the measured bitstrings to the different classes. For binary classification we use the parity mapping.

Since the dataset has 7 input features, we use 7 qubits, one per feature.

In [None]:
quantum_instance = QuantumInstance(Aer.get_backend("qasm_simulator"), shots=100)
# construct feature map
num_inputs=df.iloc[:,:-1].shape[1]
feature_map = ZZFeatureMap(num_inputs)

# construct ansatz
ansatz = RealAmplitudes(num_inputs, reps=2)

# construct quantum circuit
qc = QuantumCircuit(num_inputs)
qc.append(feature_map, range(num_inputs))
qc.append(ansatz, range(num_inputs))
qc.decompose().draw(output="mpl")
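
As a quick sanity check on the size of this circuit: the feature map takes one input parameter per qubit, and `RealAmplitudes` with `reps` repetitions has `reps + 1` layers of single-qubit rotations, one rotation per qubit in each layer. The small calculation below assumes the default settings (no skipped final rotation layer); `count_parameters` is a hypothetical helper, not a Qiskit function.

```python
def count_parameters(num_qubits, reps):
    """Return (number of inputs, number of trainable weights) for the circuit above."""
    num_inputs = num_qubits                 # one feature-map parameter per qubit
    num_weights = num_qubits * (reps + 1)   # reps + 1 layers of single-qubit rotations
    return num_inputs, num_weights


print(count_parameters(num_qubits=7, reps=2))
```

The same numbers should match `len(feature_map.parameters)` and `len(ansatz.parameters)` in the notebook.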

In [None]:
# parity maps bitstrings to 0 or 1
def parity(x):
    return "{:b}".format(x).count("1") % 2


output_shape = 2  # corresponds to the number of classes, possible outcomes of the (parity) mapping.
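
To illustrate how the parity mapping turns measurement results into class probabilities, the sketch below aggregates a table of measurement counts into a two-entry probability vector. The counts are invented for illustration and `class_probabilities` is a hypothetical helper; conceptually, the network performs a similar aggregation when it applies `interpret` to the sampled bitstrings.

```python
def parity(x):
    """Return 0 if the integer x has an even number of 1 bits, else 1."""
    return "{:b}".format(x).count("1") % 2


def class_probabilities(counts, num_classes=2):
    """Add the relative frequency of every measured bitstring to its class."""
    shots = sum(counts.values())
    probs = [0.0] * num_classes
    for bitstring, n in counts.items():
        probs[parity(int(bitstring, 2))] += n / shots
    return probs


# Hypothetical counts from 100 shots on a 3-qubit circuit
counts = {"000": 40, "011": 25, "001": 20, "111": 15}
probs = class_probabilities(counts)
print(probs)
```

Bitstrings with an even number of ones ("000", "011") contribute to class 0, and those with an odd number ("001", "111") contribute to class 1.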

#### callback function that draws a live plot when the .fit() method is called

In [None]:
def callback_graph(weights, obj_func_eval):
    clear_output(wait=True)
    objective_func_vals.append(obj_func_eval)
    plt.title("Objective function value against iteration")
    plt.xlabel("Iteration")
    plt.ylabel("Objective function value")
    plt.plot(range(len(objective_func_vals)), objective_func_vals)
    plt.show()
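
The callback above relies on a global list, `objective_func_vals`, that must exist before `.fit()` is called. As one possible alternative, a small factory function can bind the list explicitly. `make_callback` is a hypothetical helper, and plotting is left out so that the snippet runs without matplotlib; the returned function keeps the `(weights, obj_func_eval)` signature used above.

```python
def make_callback(history):
    """Return a callback that records each objective value in `history`."""
    def callback(weights, obj_func_eval):
        history.append(obj_func_eval)
    return callback


history = []
callback = make_callback(history)

# Simulate three optimizer iterations with made-up objective values
for value in [0.9, 0.7, 0.65]:
    callback(weights=[0.0, 0.0, 0.0], obj_func_eval=value)

print(history)
```

Creating a fresh list and callback per model keeps the loss curves of different runs from mixing.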


#### construct QNN

In [None]:
circuit_qnn = CircuitQNN(
    circuit=qc,
    input_params=feature_map.parameters,
    weight_params=ansatz.parameters,
    interpret=parity,
    output_shape=output_shape,
    quantum_instance=quantum_instance,
)

In [None]:
# construct classifier
circuit_classifier = NeuralNetworkClassifier(
    neural_network=circuit_qnn, optimizer=COBYLA(), callback=callback_graph
)

Training on a simulator is slow, so we fit the classifier on only the first 100 rows of the training data. If you have access to a real QPU, you can train on the whole dataset.

In [None]:
# create empty array for callback to store evaluations of the objective function
objective_func_vals = []
plt.rcParams["figure.figsize"] = (12, 6)

# fit classifier to data
circuit_classifier.fit(X_train[:100], Y_train[:100])

# return to default figsize
plt.rcParams["figure.figsize"] = (6, 4)

# score classifier
circuit_classifier.score(X_train, Y_train)


In [None]:
# evaluate data points
y_predict = circuit_classifier.predict(X_test)

In [None]:
# classification_report expects the true labels first, then the predictions
print(classification_report(Y_test, y_predict))

We achieved an accuracy of 66%. Let's build a VQC and see whether the accuracy improves.
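
For reference, the figures in a classification report come from simple counts of correct and incorrect predictions. scikit-learn's `classification_report` takes the true labels as its first argument and the predictions as its second, and swapping them swaps precision and recall. The sketch below shows this with made-up labels; `binary_metrics` is a hypothetical helper, not a scikit-learn function.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Made-up labels for illustration
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0]

print(binary_metrics(y_true, y_pred))   # true labels first
print(binary_metrics(y_pred, y_true))   # swapped: precision and recall trade places
```

Accuracy is unaffected by the argument order, which is why a swapped call can go unnoticed.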

### Classification with Variational Quantum Classifier (VQC)


The VQC is a special variant of the NeuralNetworkClassifier with a CircuitQNN. It applies a parity mapping (or extensions to multiple classes) to map from the bitstring to the classification, which results in a probability vector that is interpreted as a one-hot encoded result. By default, it applies the CrossEntropyLoss function, which expects labels in one-hot encoded format and returns predictions in that format too.
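
For intuition, the cross-entropy loss of a single sample with one-hot label $y$ and predicted probabilities $p$ is $-\sum_k y_k \log p_k$, so only the probability assigned to the true class contributes. The sketch below is a minimal plain-Python version that uses the natural logarithm and a small clipping constant; both are assumptions made here, and the library's own loss may differ in log base and clipping.

```python
import math


def cross_entropy(one_hot_label, probabilities, eps=1e-10):
    """Cross-entropy of one sample; eps avoids taking log(0)."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot_label, probabilities))


# The true class is 1 in all three cases; the probabilities are made up
confident_right = cross_entropy([0, 1], [0.1, 0.9])
unsure = cross_entropy([0, 1], [0.5, 0.5])
confident_wrong = cross_entropy([0, 1], [0.9, 0.1])
print(confident_right, unsure, confident_wrong)
```

The loss grows as the probability assigned to the true class shrinks, which is what the optimizer tries to push down during training.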

In [None]:
vqc = VQC(
    feature_map=feature_map,
    ansatz=ansatz,
    loss="cross_entropy",
    optimizer=COBYLA(),
    quantum_instance=quantum_instance,
    callback=callback_graph,
)

Since VQC expects the target variable in one-hot encoded form, let's convert our target variable accordingly.

In [None]:
Y_one_hot=[]
for i in Y_train:
    if i==1:
        Y_one_hot.append([0,1])
    else:
        Y_one_hot.append([1,0])
Y_one_hot=np.array(Y_one_hot)
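
The loop above can also be written as a comprehension, and the reverse mapping (from a one-hot row back to a class label) is simply the index of the largest entry. The sketch below shows the round trip in plain Python with made-up labels; `to_one_hot` and `from_one_hot` are hypothetical helper names.

```python
def to_one_hot(labels, num_classes=2):
    """Encode integer class labels as one-hot rows, e.g. 1 -> [0, 1]."""
    return [[1 if k == label else 0 for k in range(num_classes)] for label in labels]


def from_one_hot(rows):
    """Decode one-hot (or probability) rows back to class labels."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


labels = [1, 0, 0, 1]
encoded = to_one_hot(labels)
decoded = from_one_hot(encoded)
print(encoded)
print(decoded)
```

With NumPy arrays, `np.argmax(rows, axis=1)` performs the same decoding in one call.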

In [None]:
objective_func_vals = []
plt.rcParams["figure.figsize"] = (12, 6)

# fit classifier to data
vqc.fit(X_train[0:50], Y_one_hot[0:50])

# return to default figsize
plt.rcParams["figure.figsize"] = (6, 4)

# score classifier
vqc.score(X_train[0:50], Y_one_hot[0:50])

In [None]:
# evaluate data points
y_predict = vqc.predict(X_test)

In [None]:
# convert one-hot predictions back to class labels: [1,0] -> 0, [0,1] -> 1
y_pred = np.argmax(y_predict, axis=1)

In [None]:
# classification_report expects the true labels first, then the predictions
print(classification_report(Y_test, y_pred))