# **Ansatz Circuit Configuration Testbench - Quantum Machine Learning Capstone 2022**
#### **Capstone Members ⸻** Carson Darling, Brandon Downs, Christopher Haddox, Brightan Hsu, Matthew Jurenka
#### **Sponsor ⸻** Dr. Gennaro De Luca

<br>

### Packages and Non-Standard Python Package Installation
The non-standard Python packages used by this testbench are scikit-learn, PennyLane, and pandas.
Uncomment and run the **clean_install()** call below to install them with pip. The environment must have Python 3.6+ and pip installed.


In [1]:
import subprocess
import sys
    
def pip_install(package):
    subprocess.run([sys.executable, "-m", "pip", "install", package])

def clean_install():
    # The PyPI distribution is named 'scikit-learn'; 'sklearn' is only the import name.
    for package in ['scikit-learn', 'pennylane', 'pandas']:
        pip_install(package)

#clean_install()

In [2]:
from sklearn.model_selection import train_test_split
from pennylane import numpy as np
from datetime import datetime
import pennylane as qml
import random as rand
import pandas as pd
import os

## <br> <br> <br> **Introduction**


The purpose of this Jupyter Notebook is to serve as a testbench for the quantum machine learning capstone group. The testbench evaluates a quantum variational classifier with different ansatz configurations on three datasets. Each dataset poses a binary classification problem over 4 numeric features. All circuits use rotational encoding over 4 qubits, one qubit per feature. The datasets are as follows:


&emsp;&emsp;[Iris Dataset](https://archive.ics.uci.edu/ml/datasets/iris) ⸻ 150 instances of plant measurements in 3 classes, where each class is a type of iris plant. This dataset is truncated to 2 classes.

&emsp;&emsp;[Banknote Dataset](https://archive.ics.uci.edu/ml/datasets/banknote+authentication) ⸻ 1372 instances of banknote-like specimens in 2 classes, where the class indicates whether a specimen is forged or authentic.

&emsp;&emsp;[Transfusion Dataset](https://archive.ics.uci.edu/ml/datasets/Blood+Transfusion+Service+Center) ⸻ 748 donors from the donor database in 2 classes, where the class indicates whether the donor gave blood in March 2007.
<br>

## **Methodology**

### **Independent Variables** - Variational Classifier & Circuit Constants

In [3]:
NUM_QUBITS = 4
NUM_LAYERS = 6
BATCH_SIZE = 5
STEP_SIZE = .1
WEIGHTS_INIT = 0.01 * np.random.randn(NUM_LAYERS, NUM_QUBITS, 3, requires_grad=True)
BIAS_INIT = np.array(0.0, requires_grad=True)
OPTIMIZER = qml.optimize.AdamOptimizer(STEP_SIZE)
DEV = qml.device("default.qubit", wires=NUM_QUBITS)
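
The shape of the weight tensor determines how many trainable angles the classifier has. The sketch below uses plain NumPy rather than `pennylane.numpy`, and the seeded generator `rng` is an assumption added only to make the example repeatable; it shows how the `(NUM_LAYERS, NUM_QUBITS, 3)` layout breaks down into one `(phi, theta, omega)` triple per qubit per layer.

```python
import numpy as np

NUM_QUBITS, NUM_LAYERS = 4, 6

# Seeded generator is an assumption for repeatability; the testbench uses np.random.randn.
rng = np.random.default_rng(0)
weights = 0.01 * rng.standard_normal((NUM_LAYERS, NUM_QUBITS, 3))

num_parameters = weights.size        # layers * qubits * 3 rotation angles
first_layer = weights[0]             # one row of three angles per qubit
phi, theta, omega = weights[0, 0]    # angles for qubit 0 in layer 0
```

Changing `NUM_LAYERS` scales the parameter count linearly, which is worth keeping in mind when comparing ansatz configurations of different depth.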

### **Dependent Variable** - Function Reference to Ansatz Circuit Configurations

This global variable holds a reference to the ansatz circuit function being tested in the current test iteration. Using a function reference keeps the classifier code readable, because each circuit configuration encapsulates its own encoding technique.

In [4]:
CURRENT_TEST_CIRCUIT = None
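
The function-reference pattern described above can be sketched in plain Python. The stand-in circuits `circuit_a` and `circuit_b` and the helper `run_test` are hypothetical and return simple numbers rather than expectation values; only the dispatch mechanism mirrors the testbench.

```python
# Hypothetical stand-ins for ansatz circuits; real circuits would be PennyLane QNodes.
def circuit_a(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def circuit_b(weights, features):
    return -sum(w * f for w, f in zip(weights, features))

CURRENT_TEST_CIRCUIT = None  # module-level reference to the circuit under test

def variational_classifier(weights, bias, features):
    # Calls whichever circuit the global currently references.
    return CURRENT_TEST_CIRCUIT(weights, features) + bias

def run_test(circuit, weights, bias, features):
    # Point the global at the circuit under test, then evaluate the classifier.
    global CURRENT_TEST_CIRCUIT
    CURRENT_TEST_CIRCUIT = circuit
    return variational_classifier(weights, bias, features)

result_a = run_test(circuit_a, [1.0, 2.0], 0.5, [3.0, 4.0])
result_b = run_test(circuit_b, [1.0, 2.0], 0.5, [3.0, 4.0])
```

Because the classifier never names a specific circuit, adding a new configuration only requires defining a function with the same `(weights, features)` signature.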

### **Ansatz Circuit Configuration Library**

In [5]:
# https://medium.com/predict/classification-using-vqc-with-custom-variational-ansatz-c7c45fb699a1
# This is for 3 qubits, we need 4: the CZ pattern below does not yet entangle qubit 3.
# TODO Fix. Could be garbage, can throw out if not worthwhile.
def layered_gate_circuit(params, x):
    # Scale each feature by pi and use it as an RX rotation angle.
    x_embedded = [i * np.pi for i in x]
    for i in range(NUM_QUBITS):
        qml.RX(x_embedded[i], wires=i)
        qml.Rot(*params[0, i], wires=i)

    qml.CZ(wires=[1, 0])
    qml.CZ(wires=[1, 2])
    qml.CZ(wires=[0, 2])
    for i in range(NUM_QUBITS):
        qml.Rot(*params[1, i], wires=i)

def alternating_operator_circuit():
    
    pass

def tensor_network_circuit():
    
    pass
        
# https://discuss.pennylane.ai/t/qaoa-embedding-layer/1724/2
# https://docs.pennylane.ai/en/latest/code/api/pennylane.QAOAEmbedding.html
# TODO Implement + Optimize
@qml.qnode(DEV)
def qaoa_circuit(features):
    # TODO
    
    return qml.expval(qml.PauliZ(0))


# Pennylane Circuit from Quantum Variational Classifier
@qml.qnode(DEV)
def pennylane_circuit(weights, features):
    rotational_encoding(features)

    for W in weights:
        qml.Rot(W[0, 0], W[0, 1], W[0, 2], wires=0)
        qml.Rot(W[1, 0], W[1, 1], W[1, 2], wires=1)
        qml.Rot(W[2, 0], W[2, 1], W[2, 2], wires=2)
        qml.Rot(W[3, 0], W[3, 1], W[3, 2], wires=3)    
        qml.CNOT(wires=[0, 1])
        qml.CNOT(wires=[1, 2])
        qml.CNOT(wires=[2, 3])
        qml.CNOT(wires=[3, 0])

    return qml.expval(qml.PauliZ(0))
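
To make the effect of a single `qml.Rot` gate concrete, the NumPy sketch below builds the 2×2 matrix of the general rotation `Rot(phi, theta, omega)` as RZ(omega)·RY(theta)·RZ(phi) and applies it to |0⟩. The helper names `rz`, `ry`, `rot_matrix`, and `expval_z` are illustrative and not part of the testbench, and the sketch deliberately leaves out the CNOT ring.

```python
import numpy as np

def rz(angle):
    # Single-qubit Z rotation: diag(e^{-i*angle/2}, e^{+i*angle/2}).
    return np.array([[np.exp(-1j * angle / 2), 0], [0, np.exp(1j * angle / 2)]])

def ry(angle):
    # Single-qubit Y rotation (real-valued entries).
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rot_matrix(phi, theta, omega):
    # RZ(phi) acts first, then RY(theta), then RZ(omega).
    return rz(omega) @ ry(theta) @ rz(phi)

def expval_z(state):
    # <Z> = |amplitude of |0>|^2 - |amplitude of |1>|^2
    return float(abs(state[0]) ** 2 - abs(state[1]) ** 2)

zero = np.array([1.0, 0.0], dtype=complex)
theta = 0.7
unitary = rot_matrix(0.3, theta, 1.1)
z_value = expval_z(unitary @ zero)
```

For a single qubit starting in |0⟩, the two Z rotations only contribute phases, so ⟨Z⟩ equals cos(theta). In the full circuit the rotations act on states that are no longer |0⟩ and the CNOTs couple the qubits, so the other two angles can affect the measured value as well.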
    

### **Quantum Variational Classifier**

Below is the variational classifier and its supporting functions. The model is adapted from the [PennyLane variational classifier demo](https://pennylane.ai/qml/demos/tutorial_variational_classifier.html).

In [6]:
def variational_classifier(weights, bias, angles):
    return CURRENT_TEST_CIRCUIT(weights, angles) + bias
    
def cost(weights, bias, features, labels):
    predictions = [variational_classifier(weights, bias, f) for f in features]
    return square_loss(labels, predictions)

In [7]:
def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2

    loss = loss / len(labels)
    return loss

def accuracy(labels, predictions):
    # Fraction of predictions that match their label (within a small tolerance).
    correct = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-5:
            correct = correct + 1

    return correct / len(labels)
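
For checking the loop-based `square_loss` and `accuracy` by hand, a vectorized NumPy equivalent can be handy. The names `square_loss_vec` and `accuracy_vec` and the sample values below are made up for illustration; this is a sketch for evaluation only and uses plain NumPy, not the autograd-aware `pennylane.numpy`.

```python
import numpy as np

def square_loss_vec(labels, predictions):
    # Mean squared error over all samples.
    labels = np.asarray(labels, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return float(np.mean((labels - predictions) ** 2))

def accuracy_vec(labels, predictions, tol=1e-5):
    # Fraction of predictions that match their label within the tolerance.
    labels = np.asarray(labels, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return float(np.mean(np.abs(labels - predictions) < tol))

labels = [1, -1, 1, -1]
raw_outputs = [0.8, -0.4, -0.2, -1.0]   # made-up classifier outputs

loss = square_loss_vec(labels, raw_outputs)
acc = accuracy_vec(labels, np.sign(raw_outputs))
```

The loss is computed on the raw circuit outputs, while accuracy is computed after taking the sign, which matches how the training loop uses the two functions.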

In [8]:
def time_elapsed(start_time):
    return f'{datetime.now() - start_time}'
    
def format_time():
    return datetime.now().strftime('%m/%d/%Y, %H:%M:%S')
    

In [9]:
headers = ['Epoch', 'Cost', 'Train_Accuracy', 'Test_Accuracy']
template = '\t\t\t{:<7}   {:<7}   {:<16}   {:<15}'

def train_classifier(dataframes, circuit, total_iterations):
    
    start_time_test = datetime.now() 
    print(f"\nCircuit: {circuit.__name__} | Start: {format_time()}")
    
    # Point the global at the circuit function under test - Dependent Variable
    global CURRENT_TEST_CIRCUIT
    CURRENT_TEST_CIRCUIT = circuit
    
    for dataset in dataframes:
        
        start_time_dataset = datetime.now() 
        
        print(f"\n\tCircuit: {circuit.__name__} | Dataset: {dataset[0]} | Start: {format_time()}")
        
        for iteration in range(total_iterations):
        
            # Preprocess the data and separate it into train and test sets. Initialize the weights and bias.
            features, labels = preprocess(dataset[1])
            X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.25, random_state=rand.randint(0, 100))
            weights, bias = WEIGHTS_INIT, BIAS_INIT
            # Clear Adam's accumulated moment estimates so earlier runs do not influence this one.
            OPTIMIZER.reset()
            max_cost_iteration = max_train_iteration = max_test_iteration = 0

            # Print the header for the current Iteration
            start_time_iteration = datetime.now()
            print(f"\n\t\tIteration {iteration+1} of {total_iterations} | Start: {format_time()}\n")
            print (template.replace(':', ':-').format('', '', '', ''))
            print(template.format(*headers))
            print (template.replace(':', ':-').format('', '', '', ''))

            for epoch_index in range(100):

                # Update the weights by one optimizer step
                batch_index = np.random.randint(0, len(X_train), (BATCH_SIZE,))

                X_train_batch = X_train[batch_index]
                y_train_batch = y_train[batch_index]
                weights, bias, _, _ = OPTIMIZER.step(cost, weights, bias, X_train_batch, y_train_batch)

                # Compute predictions on the train and test sets
                predictions_train = [np.sign(variational_classifier(weights, bias, value)) for value in X_train]
                predictions_test = [np.sign(variational_classifier(weights, bias, value)) for value in X_test]

                # Compute accuracy on the train and test sets, and the cost over the full dataset
                accuracy_train = accuracy(y_train, predictions_train)
                accuracy_test = accuracy(y_test, predictions_test)
                epoch_cost = cost(weights, bias, features, labels)

                # Tabulate a summary of the current epoch
                print(template.format(*[f'{epoch_index:4d}', f'{epoch_cost:0.3f}', f'{accuracy_train:0.7f}', f'{accuracy_test:0.7f}']))
                max_cost_iteration = epoch_cost if epoch_cost > max_cost_iteration else max_cost_iteration
                max_train_iteration = accuracy_train if accuracy_train > max_train_iteration else max_train_iteration
                max_test_iteration = accuracy_test if accuracy_test > max_test_iteration else max_test_iteration

                # Stop early once both train and test accuracy reach 100%.
                if accuracy_test == accuracy_train == 1:
                    break

            # Summarize the findings for the Circuit, Dataset, Iteration
            print (template.replace(':', ':-').format('', '', '', ''))
            print(template.format(*['Maxima', f'{max_cost_iteration:0.3f}', f'{max_train_iteration:0.7f}', f'{max_test_iteration:0.7f}']))
            print (template.replace(':', ':-').format('', '', '', ''))
            print(f"\t\t\tFT: {format_time()} | Elapsed: {time_elapsed(start_time_iteration)}")
        print(f"\t\tFT: {format_time()} | Elapsed: {time_elapsed(start_time_dataset)}")
    print(f"\tFT: {format_time()} | Elapsed: {time_elapsed(start_time_test)}")
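
`train_classifier` currently only prints each epoch's metrics. In line with the notebook's TODO about sending results to CSV, one possible approach is to collect a row per epoch and write the rows out at the end. The sketch below uses the standard-library `csv` module; the helpers `record_epoch` and `write_rows` are hypothetical, and the numbers are made up for illustration only.

```python
import csv
import io

HEADERS = ['Epoch', 'Cost', 'Train_Accuracy', 'Test_Accuracy']

def record_epoch(rows, epoch, cost, train_acc, test_acc):
    # Append one epoch's metrics as a dict keyed by the table headers.
    rows.append({'Epoch': epoch, 'Cost': cost,
                 'Train_Accuracy': train_acc, 'Test_Accuracy': test_acc})

def write_rows(rows, handle):
    # Write all collected rows to any file-like object (an open file or StringIO).
    writer = csv.DictWriter(handle, fieldnames=HEADERS)
    writer.writeheader()
    writer.writerows(rows)

rows = []
record_epoch(rows, 0, 1.25, 0.5, 0.25)   # made-up values, not real results
record_epoch(rows, 1, 0.75, 0.75, 0.5)

buffer = io.StringIO()                   # stands in for open('results.csv', 'w', newline='')
write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

Writing to a file-like object keeps the helper easy to test; in the notebook, the same call would take a real file handle, and circuit and dataset names could be added as extra columns.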


### **Data Preparation, Preprocessing, and Encoding**

In [10]:
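def preprocess(df):
    # Work on a copy so repeated calls do not overwrite the labels in the loaded dataframe.
    df = df.copy()
    # Map the first two class labels to -1 and +1; any further class becomes NaN and is excluded below.
    df.target = df.target.map({df.target.unique()[0]: -1, df.target.unique()[1]: 1})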
   
    if df.target.value_counts()[-1] >= 100 and df.target.value_counts()[1] >= 100: 
        df = pd.concat([
            df[(df.target == -1)].sample(n=100, replace=False, random_state=rand.randint(0, 100)),
            df[(df.target == 1)].sample(n=100, replace=False, random_state=rand.randint(0, 100))
        ])
    else:
        df = df[(df.target == -1) | (df.target == 1)]
    
    X = np.array(df)[:,0:4]
    features = 2 * np.pi * (X - np.min(X)) / (np.max(X) - np.min(X))
    labels = np.array(df)[:,-1]
    
    return features, labels

def rotational_encoding(x):
    qml.Rot(x[0], x[0], x[0], wires=0)
    qml.Rot(x[1], x[1], x[1], wires=1)
    qml.Rot(x[2], x[2], x[2], wires=2)
    qml.Rot(x[3], x[3], x[3], wires=3)
    qml.CNOT(wires=[0, 1])
    qml.CNOT(wires=[1, 2])
    qml.CNOT(wires=[2, 3])
    qml.CNOT(wires=[3, 0])
    return np.array(x)


paths = ['~/Documents/QML/iris.data', '~/Documents/QML/banknote.data','~/Documents/QML/transfusion.data']
dataframes = [(os.path.splitext(os.path.basename(path))[0], pd.read_csv(path, names=['a0','a1','a2','a3', 'target'])) for path in paths]
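
The scaling step in `preprocess` maps raw feature values to rotation angles in [0, 2π] using a single minimum and maximum for the whole feature matrix. The sketch below reproduces that formula on a tiny made-up array and, as a clearly labeled alternative that the testbench does not use, shows per-feature scaling for comparison. Both function names are hypothetical.

```python
import numpy as np

def scale_to_angles(X):
    # Global min-max scaling, as in preprocess: one min and one max for the whole matrix.
    X = np.asarray(X, dtype=float)
    return 2 * np.pi * (X - X.min()) / (X.max() - X.min())

def scale_per_feature(X):
    # Alternative (not used by the testbench): scale each column independently.
    # Assumes no column is constant, otherwise the division is by zero.
    X = np.asarray(X, dtype=float)
    low, high = X.min(axis=0), X.max(axis=0)
    return 2 * np.pi * (X - low) / (high - low)

X = np.array([[0.0, 5.0], [10.0, 2.5]])   # made-up values, two samples and two features
angles_global = scale_to_angles(X)
angles_per_feature = scale_per_feature(X)
```

Two points may be worth weighing when interpreting results. With global scaling, a feature whose raw range is much smaller than another's occupies only a narrow band of angles. Also, angles of 0 and 2π describe the same rotation up to a global phase, so the smallest and largest raw values receive equivalent encodings under either variant.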

### **Test Execution**

In [11]:
train_classifier(dataframes, pennylane_circuit, 3)

# train_classifier(dataframes, circuit1, 3)
# train_classifier(dataframes, circuit2, 3)
# train_classifier(dataframes, circuit3, 3)
# train_classifier(dataframes, circuit4, 3)
# train_classifier(dataframes, circuit5, 3)
# train_classifier(dataframes, circuit6, 3)
# train_classifier(dataframes, circuit7, 3)
# train_classifier(dataframes, circuit8, 3)
# train_classifier(dataframes, circuit9, 3)
# train_classifier(dataframes, circuit10, 3)
# train_classifier ( dataframes = dataframes, circuit = THE_CIRCUIT_FUNCTION_NAME, total_iterations=INTEGER)

# TODO:
# 1. Collect Data and send to CSV
# 2. Select the best run from optimization steps
#     Maybe Double the step size and run 3 iterations of each step size? Cast out outliers during data processing
#     Maybe Run different optimizers as well?
# 3. Graph it on completion


# Q for group
# 1.  Increase Iterations
# AdamOptimizer <-
# MomentumOptimizer <-
# 2. Undersampling
# 3. Iris -> flower1 flower2 flower3 linearity




Circuit: pennylane_circuit | Start: 10/14/2022, 19:36:49

	Circuit: pennylane_circuit | Dataset: iris | Start: 10/14/2022, 19:36:49

		Iteration 1 of 3 | Start: 10/14/2022, 19:36:49

			-------   -------   ----------------   ---------------
			Epoch     Cost      Train_Accuracy     Test_Accuracy  
			-------   -------   ----------------   ---------------
			   0      1.989     0.0266667          0.0400000      
			   1      1.266     0.2933333          0.2800000      
			   2      0.669     0.8533333          0.9200000      
			   3      0.419     0.9466667          1.0000000      
			   4      0.403     0.9466667          0.9600000      
			   5      0.539     0.8800000          0.8000000      
			   6      0.456     0.9066667          0.8000000      
			   7      0.320     0.9600000          1.0000000      
			   8      0.325     1.0000000          1.0000000      
			-------   -------   ----------------   ---------------
			Maxima    1.989     1.0000000          1.0000000      
			-------   -------   ----------------   ---------------

## **Results**
Tables and figures summarizing the test runs go here (e.g., Figure 1).


## **Discussion**
Discussion of Figure 1 goes here.

## **Conclusion**
Summary of the discussion and future applications go here.