<img src="https://s3-ap-southeast-1.amazonaws.com/he-public-data/wordmark_black65ee464.png" width="700">

# Week 2 : Final Challenge

**Welcome to the final challenge!**  

In the previous notebook we saw how to use the `VQC` class in Aqua to classify the digits `0` and `1`. That task is relatively simple because `0` and `1` are easily distinguishable. The digits `4` and `9`, however, are notoriously similar: both have a _loop_ on top and a _line_ at the bottom. The 2-D t-SNE plot from the previous notebook (Fig.2) corroborates this: `0` and `1` form clusters that lie relatively far apart, whereas `4` and `9` overlap. In this challenge we provide a dataset in which the digits have been reduced to **dimension 3**. For example, Fig.1 shows the 784-dimensional vector for the digit `4` reduced to a 3-dimensional feature vector. 

**Fig.1 : Features of the digit `4` after reducing dimension to 3:** 
<img src="https://s3-ap-southeast-1.amazonaws.com/he-public-data/four2a7701f.png" width="700">

**Fig.2 : MNIST dataset after dimension reduction to 2 as given in the previous notebook:**
<img src="https://s3-ap-southeast-1.amazonaws.com/he-public-data/mnist_plot53adb39.png" width="400">

## Challenge Question   
Use the VQC method from Aqua to classify the digits `4` and `9` as given in the dataset **challenge_dataset_4_9_dim3_training.csv** provided to you. 

## Rules and Guidelines

* Your `QuantumCircuit` can have a **maximum of 6 qubits**.
* **Cost of the circuit should be less than 2000**.  
* You should not change names of the functions `feature_map()` , `variational_circuit()`  and `return_optimal_params()`.
* All the functions must return the value types as mentioned. 
* All circuits must be Qiskit generated.
* Best of all submissions is considered for grading.

## Judging criteria 

* The primary criterion is the **accuracy of the model**; higher is better. **Accuracies which differ by less than 0.005 will be considered equal**. For example, accuracies 0.7783 and 0.7741 will be considered equal.
* If the accuracies are tied, the tie will be broken using the **cost of the circuit**; lower is better. 
* If both the accuracy of the model and the cost of the circuit are equal, the **time of submission** is taken into account; earlier is better. 
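
The three criteria can be read as a lexicographic comparison. The sketch below only illustrates that reading and is not the organisers' ranking code: the `Submission` tuple and the `ranks_higher` function are hypothetical names, and the 0.005 tolerance is applied pairwise, exactly as the rule above states it.

```python
from collections import namedtuple

# Hypothetical container for one leaderboard entry (not part of the challenge code).
Submission = namedtuple("Submission", ["accuracy", "cost", "submitted_at"])

ACCURACY_TOLERANCE = 0.005  # accuracies closer than this count as equal


def ranks_higher(a, b):
    """Return True if submission `a` should be placed above submission `b`."""
    # 1. Accuracy decides unless the two values are within the tolerance.
    if abs(a.accuracy - b.accuracy) >= ACCURACY_TOLERANCE:
        return a.accuracy > b.accuracy
    # 2. Tied on accuracy: the cheaper circuit wins.
    if a.cost != b.cost:
        return a.cost < b.cost
    # 3. Tied on both: the earlier submission wins.
    return a.submitted_at < b.submitted_at


# Illustrative entries; the costs and submission times are made up.
first = Submission(accuracy=0.7741, cost=900, submitted_at=2)
second = Submission(accuracy=0.7783, cost=1500, submitted_at=1)
print(ranks_higher(first, second))  # the accuracies tie, so the lower cost decides
```

In this example the slightly less accurate model is placed first, because the two accuracies fall within the tolerance and its circuit is cheaper.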

_**Important Note:**_ The **leaderboard shown while the competition is in progress** will only display the accuracy of the model and is **not the final leaderboard**. Ties in accuracy will be broken by the lower **cost of circuit** only after the competition ends. **The final leaderboard will be announced after the event** and will take into consideration the cost of the circuit and the time of submission. 

## Certificate Eligibility

Everyone who scores an **accuracy greater than 0.70 (i.e., 70%) will be eligible for a certificate**. 


An explanation of how the accuracy of the model and the cost of the circuit are calculated is given at the end, inside the `grade()` function. Before you submit, make sure the grading function runs on your device. To save time, you can also use the grading function provided to calculate the accuracy and circuit cost without having to submit your solution to HackerEarth. Remember, your final score will be determined using the same grading method as given in this notebook, but it will be evaluated on unseen datapoints.

In [49]:
# installing a few dependencies
# !pip install --upgrade seaborn==0.10.1
# !pip install --upgrade scikit-learn==0.23.1
# !pip install --upgrade matplotlib==3.2.0
# !pip install --upgrade pandas==1.0.4
# !pip install --upgrade qiskit==0.19.6 
# !pip install --upgrade plotly==4.9.0

# the output will be cleared after installation
from IPython.display import clear_output
clear_output()

In [50]:
# we have imported a few libraries we think might be useful 
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister
from qiskit import Aer

from qiskit import *
import numpy as np
from qiskit.visualization import plot_bloch_multivector, plot_histogram
%matplotlib inline
import matplotlib.pyplot as plt

import time
from qiskit.circuit.library import ZZFeatureMap, ZFeatureMap, PauliFeatureMap, RealAmplitudes, EfficientSU2
from qiskit.aqua.utils import split_dataset_to_data_and_labels, map_label_to_class_name
from qiskit.aqua import QuantumInstance
from qiskit.aqua.algorithms import VQC
from qiskit.aqua.components.optimizers import COBYLA


# The write_and_run() magic function creates a file with the content of the cell in which it is run. 
# You have used this in previous exercises for creating your submission files. 
# It will be used for the same purpose here.

from IPython.core.magic import register_cell_magic
@register_cell_magic
def write_and_run(line, cell):
    argz = line.split()
    file = argz[-1]
    mode = 'w'
    with open(file, mode) as f:
        f.write(cell)
    get_ipython().run_cell(cell)

# Solution

## Data loading 

This notebook has helper functions and code snippets to save your time and help you concentrate on what's important: increasing the accuracy of your model. Running the cell below imports the challenge dataset and makes it available to you as `data`. Before running the cell below, store the dataset in this file structure (or change the `data_path` accordingly):  

- `challenge_notebook.ipynb`
- `dataset`
    - `challenge_dataset_4_9_dim3_training.csv`


In [51]:
#data_path='./dataset/'
data_path=''
data = np.loadtxt(data_path + "challenge_dataset_4_9_dim3_training.csv", delimiter=",")

ind = np.random.permutation(data.shape[0])

# extracting the first column which contains the labels
data_labels = data[ind, :1].reshape(data.shape[0],)
# extracting all the columns but the first which are our features
data_features = data[ind, 1:]

## Visualizing the dataset

Before we dive into solving the question, it is always helpful to look at the dataset visually. This will help us spot patterns that we could leverage when designing, for example, our feature maps and variational circuits.

In [52]:
import plotly.express as px
import pandas as pd

# creating a dataframe using pandas only for the purpose of plotting
df = pd.DataFrame({'Component 0':data_features[:,0], 'Component 1':data_features[:,1], 
                   'Component 2':data_features[:,2], 'label':data_labels})

fig = px.scatter_3d(df, x='Component 0', y='Component 1', z='Component 2', color='label')
fig.show()

## Extracting the training dataset

The given dataset has already been reduced in dimension and normalized, so further pre-processing isn't technically required. You may pre-process further if you want to, but the testing dataset will have the same dimension and normalization as the training dataset provided. Training on all 6,000 datapoints will take multiple hours, so you'll need to extract a subset to use as your training dataset. The accuracy of the model may vary with the size of the training dataset and with the datapoints you choose, so experimenting with both will be necessary. For example, increasing the training dataset size may increase the accuracy of the model, but it will also increase the training time.

Use the space below to extract your training dataset from `data`. For your convenience `data` has been segregated into `data_labels` and `data_features`.

* `data_labels` : 6,000 $\times$ 1 column vector with each entry either `4` or `9` 
* `data_features` : 6,000 $\times$ 3 matrix with each row having the feature corresponding to the label in `data_labels`

**Note:** This process was done in the previous [VQC notebook](https://github.com/Qiskit-Challenge-India/2020/blob/master/Day%206%2C%207%2C8/VQC_notebook.ipynb) with `0` and `1` labels and can be modified and used here as well. 
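
One way to extract a subset is to draw the same number of points from each class and keep the training and testing points disjoint. The sketch below illustrates this under stated assumptions: `balanced_split` is a hypothetical helper, not part of the challenge code, and the synthetic `demo_features` / `demo_labels` arrays merely stand in for `data_features` / `data_labels`.

```python
import numpy as np


def balanced_split(features, labels, train_num, test_num, seed=0):
    """Return (training_input, test_input) dicts keyed by class label."""
    rng = np.random.default_rng(seed)
    training_input, test_input = {}, {}
    for label in (4, 9):
        # boolean mask selects the rows that belong to this digit
        points = features[labels == label]
        if len(points) < train_num + test_num:
            raise ValueError("not enough datapoints for label {}".format(label))
        # shuffle, then take non-overlapping slices for training and testing
        points = points[rng.permutation(len(points))]
        training_input[label] = points[:train_num]
        test_input[label] = points[train_num:train_num + test_num]
    return training_input, test_input


# Synthetic stand-in data, used only for this demonstration.
demo_rng = np.random.default_rng(1)
demo_features = demo_rng.normal(size=(40, 3))
demo_labels = np.repeat([4.0, 9.0], 20)

train, test = balanced_split(demo_features, demo_labels, train_num=5, test_num=3)
print(train[4].shape, test[9].shape)
```

Fixing the seed makes a given subset reproducible, which helps when comparing feature maps or variational circuits on the same datapoints.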

In [54]:
### WRITE YOUR CODE BETWEEN THESE LINES - START

# do your classical pre-processing here

# store your training and testing datasets to be input in the VQC optimizer in the "training_input" and 
# "test_input" variables respectively. These variables will be accessed while creating a VQC instance later. 
# select the rows belonging to each digit with boolean masks
four_datapoints = data_features[data_labels == 4]
nine_datapoints = data_features[data_labels == 9]

# shuffle each class in place (boolean indexing returns copies, so data_features is untouched)
np.random.shuffle(four_datapoints)
np.random.shuffle(nine_datapoints)

train_num = 100
test_num = 100

training_input = {4:four_datapoints[:train_num], 9:nine_datapoints[:train_num]}
test_input = {4:four_datapoints[-test_num:], 9:nine_datapoints[-test_num:]}

### WRITE YOUR CODE BETWEEN THESE LINES - END

## Building a Quantum Feature Map

Given below is the `feature_map()` function. It takes no input and has to return a feature map which is either a `FeatureMap` or `QuantumCircuit` object. In the previous notebook you've learnt how feature maps work and the process of using existing feature maps in Qiskit or creating your own. In the space given **inside the function** you have to create a feature map and return it.   


**IMPORTANT:** 
* If you require Qiskit import statements other than the ones provided in the cell below, please include them inside the appropriate space provided. **All additional import statements must be Qiskit imports.** 
* The first line of the cell below must be `%%write_and_run feature_map.py`. This magic stores the content of the cell below in the file `feature_map.py`.

In [146]:
%%write_and_run feature_map.py
# the write_and_run function writes the content in this cell into the file "feature_map.py"

### WRITE YOUR CODE BETWEEN THESE LINES - START
    
# import libraries that are used in the function below.
from qiskit import QuantumCircuit
from qiskit.circuit import ParameterVector
from qiskit.circuit.library import ZZFeatureMap, ZFeatureMap, PauliFeatureMap
    
### WRITE YOUR CODE BETWEEN THESE LINES - END

def feature_map(): 
    # BUILD FEATURE MAP HERE - START
    
    # import required qiskit libraries if additional libraries are required
    
    # build the feature map
    #feature_map = PauliFeatureMap(feature_dimension=3, reps=2, paulis = ['Z','X','ZY'])
    #feature_map = ZZFeatureMap(feature_dimension=3, reps=5, entanglement='full')
    
    ckt9_depth=3
    num_qubits=6
    x = ParameterVector('x', length=3)
    feature_map=QuantumCircuit(num_qubits)
    
    for i in range(ckt9_depth):
        for j in range(num_qubits):
            feature_map.h(j)
        for j in range(num_qubits-1):
#             feature_map.crz((np.pi-x[j])*(np.pi-x[j+1]),j,j+1)
            feature_map.cz(j,j+1)
        for j in range(num_qubits//2):
            feature_map.rx(x[j],j)
        for j in range(num_qubits//2):
            feature_map.rx(x[j]*x[(j+1)%3],3+j)
    
    # BUILD FEATURE MAP HERE - END
    
    #return the feature map which is either a FeatureMap or QuantumCircuit object
    return feature_map
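
To see how the feature map above encodes a datapoint, it can help to compute the rotation angles classically. In each repetition, qubits 0 to 2 receive `RX(x[j])` and qubits 3 to 5 receive `RX(x[j] * x[(j+1) % 3])`. The sketch below reproduces only those angles; `rx_angles` is a hypothetical helper for illustration and does not build a circuit.

```python
import numpy as np


def rx_angles(x):
    """RX angle applied to each of the 6 qubits for one 3-feature datapoint."""
    x = np.asarray(x, dtype=float)
    single = [x[j] for j in range(3)]                     # qubits 0-2: x[j]
    pairwise = [x[j] * x[(j + 1) % 3] for j in range(3)]  # qubits 3-5: x[j] * x[(j+1) % 3]
    return np.array(single + pairwise)


# Made-up datapoint, chosen only to make the products easy to follow.
print(rx_angles([0.5, -1.0, 2.0]))
```

The last three qubits therefore carry products of neighbouring features, which is how this feature map introduces non-linear terms in the data.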

## Building a Variational Circuit

Given below is the `variational_circuit()` function. It takes no input and has to return a variational circuit which is either a `VariationalForm` or `QuantumCircuit` object. In the previous notebook you've learnt how variational circuits work and the process of using existing variational circuits in Qiskit or creating your own. You have to create a variational circuit in the space given **inside the function** and return it. You can find various variational circuits in the [Qiskit Circuit Library](https://qiskit.org/documentation/apidoc/circuit_library.html) under N-local circuits.

**IMPORTANT:** 
* If you require Qiskit import statements other than the ones provided in the cell below, please include them inside the appropriate space provided. **All additional import statements must be Qiskit imports.** 
* The first line of the cell below must be `%%write_and_run variational_circuit.py`. This magic stores the content of the cell below in the file `variational_circuit.py`.

In [147]:
%%write_and_run variational_circuit.py
# the write_and_run function writes the content in this cell into the file "variational_circuit.py"

### WRITE YOUR CODE BETWEEN THESE LINES - START
    
# import libraries that are used in the function below.
from qiskit import QuantumCircuit
from qiskit.circuit import ParameterVector
from qiskit.circuit.library import  RealAmplitudes, EfficientSU2
    
### WRITE YOUR CODE BETWEEN THESE LINES - END

def variational_circuit():
    # BUILD VARIATIONAL CIRCUIT HERE - START
    
    # import required qiskit libraries if additional libraries are required
    
    # build the variational circuit
#     num_qubits = 3            
#     reps = 2
    
#     x = ParameterVector('x', length=num_qubits)  # creating a list of Parameters
#     var_circuit = QuantumCircuit(num_qubits)

#     # defining our parametric form
#     for _ in range(reps):
#         for i in range(num_qubits):
#             var_circuit.rx(x[i], i)
#         for i in range(num_qubits):
#             for j in range(i + 1, num_qubits):
#                 var_circuit.cx(i, j)
#                 var_circuit.u1(x[i] * x[j], j)
#                 var_circuit.cx(i, j)

    # ckt 14
    ckt14_depth=2
    num_qubits = 6
    var_circuit = QuantumCircuit(num_qubits)
    y = ParameterVector('y', length=4*num_qubits)

    for j in range(ckt14_depth):
        for i in range(num_qubits):
            var_circuit.ry(y[i],i)
        for i in range(num_qubits):
            t=num_qubits-1-i
            var_circuit.crx(y[i+num_qubits],t,(t+1)%num_qubits)

        for i in range(num_qubits):
            var_circuit.ry(y[i+2*num_qubits],i)

        for i in range(num_qubits):
            var_circuit.crx(y[i+3*num_qubits],(i-1)%num_qubits,(i-2)%num_qubits)

        
    #var_circuit = EfficientSU2(3, reps=5)
    
#     num_qubits = 3
#     var_circuit = RealAmplitudes(num_qubits, entanglement='full', reps=3)

    # BUILD VARIATIONAL CIRCUIT HERE - END
    
    # return the variational circuit which is either a VariationalForm or QuantumCircuit object
    return var_circuit
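
The circuit above declares `4*num_qubits` parameters, and the index expressions do not depend on the outer loop variable `j`, so both repetitions reuse the same parameters. The sketch below checks this by replaying the index arithmetic in plain Python; `parameter_usage` and `crx_pairs` are hypothetical helpers written for this illustration.

```python
from collections import Counter

NUM_QUBITS = 6
DEPTH = 2  # mirrors ckt14_depth in the cell above


def parameter_usage(num_qubits=NUM_QUBITS, depth=DEPTH):
    """Count how often each index of the parameter vector `y` is used."""
    usage = Counter()
    for _ in range(depth):
        for block in range(4):  # the ry, crx, ry, crx sub-layers
            for i in range(num_qubits):
                usage[i + block * num_qubits] += 1
    return usage


def crx_pairs(num_qubits=NUM_QUBITS):
    """(control, target) pairs of the two CRX rings, in the order they are applied."""
    first_ring = [(num_qubits - 1 - i, (num_qubits - i) % num_qubits)
                  for i in range(num_qubits)]
    second_ring = [((i - 1) % num_qubits, (i - 2) % num_qubits)
                   for i in range(num_qubits)]
    return first_ring, second_ring


usage = parameter_usage()
first_ring, second_ring = crx_pairs()
print(len(usage), set(usage.values()))
print(first_ring)
print(second_ring)
```

The count of 24 distinct parameters matches the 24 values that appear in `var_params` in the training output.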

## Choosing a Classical Optimizer

In the `classical_optimizer()` function given below you will have to import the optimizer of your choice from [`qiskit.aqua.components.optimizers`](https://qiskit.org/documentation/apidoc/qiskit.aqua.components.optimizers.html) and return it. This function will not be called by the grading function `grade()`, so the name `classical_optimizer()` can be changed if needed. 

In [148]:
def classical_optimizer():
    # CHOOSE AND RETURN CLASSICAL OPTIMIZER OBJECT - START
    
    # import the required classical optimizer from qiskit.aqua.components.optimizers
    from qiskit.aqua.components.optimizers import COBYLA
    
    # create an optimizer object
    cls_opt = COBYLA(maxiter=500, tol=0.001)
    
    # CHOOSE AND RETURN CLASSICAL OPTIMIZER OBJECT - END
    return cls_opt

### Callback Function

The `VQC` class can take in a callback function to which the following parameters will be passed after every optimization cycle of the algorithm:

* `eval_count` : the evaluation counter
* `var_params` : value of parameters of the variational circuit
* `eval_val`  : current cross entropy cost 
* `index` : the batch index
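
Printing every parameter vector produces a lot of output. As an alternative, a callback with the same four arguments can store the cost values for later inspection. The sketch below is an illustration only: `record_loss` and `loss_history` are hypothetical names, and the two calls simulate the optimizer with made-up values.

```python
loss_history = []


def record_loss(eval_count, var_params, eval_val, index):
    """Callback with the same four arguments, storing values instead of printing them."""
    loss_history.append((eval_count, float(eval_val)))


# Simulated optimizer calls with made-up parameter and cost values.
record_loss(0, [0.1, 0.2], 0.70, 0)
record_loss(1, [0.1, 0.3], 0.68, 1)

# Pick the evaluation with the lowest recorded cost.
best_step, best_loss = min(loss_history, key=lambda entry: entry[1])
print(best_step, best_loss)
```

A function like this could be passed as `callback=` in place of the printing version, and the stored history can then be plotted to check whether training has converged.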

In [149]:
def call_back_vqc(eval_count, var_params, eval_val, index):
    print("eval_count: {}".format(eval_count))
    print("var_params: {}".format(var_params))
    print("eval_val: {}".format(eval_val))
    print("index: {}".format(index))

## Optimization Step

This is where the whole VQC algorithm will come together. First we create an instance of the `VQC` class. 

In [150]:
# a fixed seed so that we get the same answer when the same input is given. 
seed = 10598

# setting our backend to qasm_simulator with the "statevector" method on. This particular setup is given as it was 
# found to perform better than most. Feel free to play around with different backend options.
backend = Aer.get_backend('qasm_simulator')
backend_options = {"method": "statevector"}

# creating a quantum instance using the backend and backend options taken before
quantum_instance = QuantumInstance(backend, shots=1024, seed_simulator=seed, seed_transpiler=seed, 
                                   backend_options=backend_options)

# creating a VQC instance which will be used for training. Make sure you input the correct training_dataset and 
# test_dataset as defined in your program.
vqc = VQC(optimizer=classical_optimizer(), 
          feature_map=feature_map(), 
          var_form=variational_circuit(), 
          callback=call_back_vqc, 
          training_dataset=training_input,     # training_input must be initialized with your training dataset
          test_dataset=test_input)             # test_input must be initialized with your testing dataset

Now, let's run the VQC classification routine.

In [151]:
start = time.process_time()

result = vqc.run(quantum_instance)

print("time taken: ")
print(time.process_time() - start)

print("testing success ratio: {}".format(result['testing_accuracy']))

eval_count: 0
var_params: [-1.37181496e+00 -3.16659024e-02 -5.67083245e-01  2.70310919e+00
  9.92438854e-01 -7.51133657e-02 -8.37274085e-01 -5.29300956e-01
  2.21773182e-02  2.71611419e+00  5.13735596e-01 -2.49460372e-01
 -1.94255631e-03 -1.12622214e+00  9.36014394e-01 -3.97271230e-01
 -9.87206520e-01  1.27655253e+00 -3.68985060e-01  6.93120237e-01
  1.14837295e+00 -1.40777587e-01  4.91645112e-01 -1.51000420e+00]
eval_val: 0.6961764280155817
index: 0
eval_count: 1
var_params: [-3.71814956e-01 -3.16659024e-02 -5.67083245e-01  2.70310919e+00
  9.92438854e-01 -7.51133657e-02 -8.37274085e-01 -5.29300956e-01
  2.21773182e-02  2.71611419e+00  5.13735596e-01 -2.49460372e-01
 -1.94255631e-03 -1.12622214e+00  9.36014394e-01 -3.97271230e-01
 -9.87206520e-01  1.27655253e+00 -3.68985060e-01  6.93120237e-01
  1.14837295e+00 -1.40777587e-01  4.91645112e-01 -1.51000420e+00]
eval_val: 0.678622749545663
index: 1
eval_count: 2
var_params: [-3.71814956e-01  9.68334098e-01 -5.67083245e-01  2.70310919e+00


eval_count: 19
var_params: [-3.71814956e-01 -3.16659024e-02 -5.67083245e-01  2.70310919e+00
  9.92438854e-01 -7.51133657e-02 -8.37274085e-01 -5.29300956e-01
  1.02217732e+00  2.71611419e+00  5.13735596e-01 -2.49460372e-01
 -1.94255631e-03 -1.12622214e+00  9.36014394e-01 -3.97271230e-01
 -9.87206520e-01  1.27655253e+00  6.31014940e-01  6.93120237e-01
  1.14837295e+00 -1.40777587e-01  4.91645112e-01 -1.51000420e+00]
eval_val: 0.6791509578594208
index: 19
eval_count: 20
var_params: [-3.71814956e-01 -3.16659024e-02 -5.67083245e-01  2.70310919e+00
  9.92438854e-01 -7.51133657e-02 -8.37274085e-01 -5.29300956e-01
  1.02217732e+00  2.71611419e+00  5.13735596e-01 -2.49460372e-01
 -1.94255631e-03 -1.12622214e+00  9.36014394e-01 -3.97271230e-01
 -9.87206520e-01  1.27655253e+00 -3.68985060e-01  1.69312024e+00
  1.14837295e+00 -1.40777587e-01  4.91645112e-01 -1.51000420e+00]
eval_val: 0.6969985776245128
index: 20
eval_count: 21
var_params: [-3.71814956e-01 -3.16659024e-02 -5.67083245e-01  2.7031091

eval_count: 40
var_params: [-0.31929161 -0.01569066 -0.62314458  2.61445586  0.99129625  0.04065178
 -0.83291617 -0.5641392   1.03285805  2.9708849   1.01099478  0.12837665
 -0.03165725 -1.18021647  0.64682069 -0.42764044 -1.04499533  1.23933389
 -0.38177891  0.8316878   1.13843574  0.8620544   0.45761619 -0.50852639]
eval_val: 0.6713690025067044
index: 40
eval_count: 41
var_params: [-0.35327373 -0.02602649 -0.58687342  2.67181374  0.9920355  -0.01332098
 -0.83858679 -0.54163971  1.02594772  2.70245441  1.00409098 -0.27001673
 -0.01243212 -1.14908462  0.72235951 -0.60169119 -1.01041882  1.42146641
 -0.37350141  0.82885303  1.14486502  0.86022213  0.47963258 -0.50948252]
eval_val: 0.6681951836130243
index: 41
eval_count: 42
var_params: [-0.32892114 -0.01861953 -0.6128664   2.6307094   0.99150573  0.02535748
 -0.83452306 -0.5577635   1.03089986  2.69984989  1.00903844 -0.27792274
 -0.02620941 -1.17139462  0.66822618 -0.09755192 -1.03519736  1.64312343
 -0.37943331  0.83088451  1.14025761

eval_count: 63
var_params: [-0.34268749 -0.05818567 -0.57711639  2.62025297  0.91116144  0.01867296
 -0.81041042 -0.30534362  0.99444343  2.69255218  1.01348072 -0.28701342
 -0.0310875  -1.16713261  0.66656568 -0.435835   -1.02986084  1.22889873
 -0.3709255   0.79894515  1.18169342  0.85072861  0.46290006 -0.53857824]
eval_val: 0.668367802841117
index: 63
eval_count: 64
var_params: [-0.35493663 -0.04150272 -0.5787505   2.65405296  0.9876903   0.00893193
 -0.81095045 -0.29717998  1.03070757  2.70290496  1.01638074 -0.27046207
  0.02296871 -1.15332217  0.70944811 -0.41611758 -1.01532062  1.25324519
 -0.35257116  0.8320794   1.14446733  0.85837503  0.43168745 -0.52319077]
eval_val: 0.6694940991604885
index: 64
eval_count: 65
var_params: [-0.35696422 -0.04870665 -0.57413345  2.64758687  0.98975774  0.01911911
 -0.81024118 -0.29767735  1.02981998  2.70412415  1.02279576 -0.26915309
  0.053058   -1.15291691  0.69575675 -0.42548872 -1.01518971  1.24166201
 -0.33695379  0.8249257   1.14717967 

eval_count: 86
var_params: [-0.35317127 -0.04465503 -0.57859018  2.63658506  0.98430288  0.0260035
 -0.81331478 -0.29643607  1.02690358  2.73079983  1.02116902 -0.22343887
 -0.02686926 -1.15392678  0.69654374 -0.38198077 -1.01468221  1.29946053
 -0.3698669   0.82663861  1.14827835  0.85149997  0.4795459  -0.529861  ]
eval_val: 0.6672061876766046
index: 86
eval_count: 87
var_params: [-0.34837242 -0.0317026  -0.58974424  2.63960262  0.98503272  0.01549754
 -0.81448294 -0.32813539  1.0295147   2.73413207  1.01048163 -0.22330396
 -0.02272437 -1.15404981  0.70933794 -0.41400258 -1.01597136  1.2558681
 -0.37308001  0.8293328   1.14498518  0.85598149  0.47074918 -0.52119611]
eval_val: 0.6663672956268257
index: 87
eval_count: 88
var_params: [-0.34550202 -0.02982784 -0.59257411  2.63554263  0.98070392  0.01741869
 -0.81135112 -0.28624599  1.03641318  2.73926224  1.00939924 -0.21339842
 -0.04823805 -1.1599145   0.71497452 -0.41298118 -1.02130462  1.25882908
 -0.38388661  0.83480677  1.13767991  

eval_count: 109
var_params: [-0.29943431 -0.0454667  -0.58800839  2.61820314  1.01212881  0.04102671
 -0.82011445 -0.30127779  1.03196798  2.73043838  1.01527881 -0.21371308
 -0.01481743 -1.17473206  0.69570804 -0.42875715 -1.01202616  1.25036683
 -0.36421704  0.8269788   1.13916421  0.84802387  0.47652421 -0.53029192]
eval_val: 0.6652669176983267
index: 109
eval_count: 110
var_params: [-0.30334477 -0.04670955 -0.59031249  2.61327874  1.01784629  0.04392716
 -0.8197492  -0.3024638   1.03140204  2.72915169  1.01380265 -0.21159974
 -0.0145186  -1.15265886  0.69270586 -0.43120438 -1.03560517  1.24989537
 -0.3621226   0.82464234  1.13848965  0.83832432  0.47595352 -0.53099325]
eval_val: 0.6654932614510561
index: 110
eval_count: 111
var_params: [-0.29683826 -0.03809657 -0.58451138  2.62103629  1.01643731  0.03663761
 -0.82015927 -0.30074291  1.035695    2.732769    1.01830485 -0.2101283
 -0.01132376 -1.16474236  0.70028    -0.42685557 -1.01924947  1.2527118
 -0.36310508  0.83043249  1.13541

eval_count: 132
var_params: [-0.29721224 -0.04863271 -0.58546028  2.62439282  1.0123853   0.03883389
 -0.81918197 -0.30012725  1.03166349  2.73161967  1.0194044  -0.21118322
 -0.01349049 -1.16615198  0.6946712  -0.43052719 -1.02099067  1.25126001
 -0.37022678  0.8264187   1.13557437  0.85509184  0.47756817 -0.53006646]
eval_val: 0.6647037610429879
index: 132
eval_count: 133
var_params: [-0.29756907 -0.04592796 -0.58475278  2.62105106  1.01147142  0.03813868
 -0.8192905  -0.30028904  1.03066413  2.73056288  1.01756139 -0.21359175
 -0.01526375 -1.16444723  0.69549177 -0.4282628  -1.01951448  1.25059096
 -0.36624128  0.82709371  1.13911336  0.85632273  0.47771951 -0.52879968]
eval_val: 0.6644444260453605
index: 133
eval_count: 134
var_params: [-0.29631275 -0.04603772 -0.58493897  2.62112296  1.01098833  0.03857033
 -0.81989687 -0.30098918  1.03159036  2.73147608  1.0179062  -0.21238202
 -0.01385629 -1.16549838  0.69646583 -0.42825064 -1.02021102  1.25068365
 -0.36587804  0.82802791  1.138

eval_count: 155
var_params: [-0.29672506 -0.04780626 -0.58490246  2.62097575  1.01081042  0.03903771
 -0.81924972 -0.30084079  1.03033953  2.73382135  1.01779012 -0.21407666
 -0.01337057 -1.16578549  0.69513806 -0.42957879 -1.02009981  1.24997924
 -0.36749369  0.8268296   1.13740897  0.85547269  0.47801054 -0.53132304]
eval_val: 0.6640590587148917
index: 155
eval_count: 156
var_params: [-0.29690118 -0.0480828  -0.5849047   2.62141219  1.0103954   0.03897971
 -0.81889294 -0.30059126  1.03108197  2.73345515  1.0179601  -0.21376817
 -0.01363458 -1.16598508  0.69554869 -0.42914998 -1.01993215  1.25110012
 -0.36730863  0.82739541  1.13726951  0.85606691  0.47733393 -0.53071921]
eval_val: 0.6638625631931022
index: 156
eval_count: 157
var_params: [-0.29539578 -0.04844091 -0.58465299  2.62111542  1.01135058  0.03906823
 -0.81856413 -0.30135262  1.03077909  2.73291452  1.01774554 -0.21404048
 -0.0135802  -1.16636516  0.69546    -0.4292887  -1.01960505  1.25116327
 -0.36725364  0.82718369  1.137

eval_count: 178
var_params: [-0.29662886 -0.04800171 -0.58508557  2.62135499  1.01100523  0.03872615
 -0.81917621 -0.30110495  1.03158268  2.73334134  1.01839892 -0.21338471
 -0.01415012 -1.16618072  0.69522428 -0.42998944 -1.01989051  1.25110628
 -0.36694311  0.82696378  1.13685859  0.85543     0.47741467 -0.53066956]
eval_val: 0.6637374016710512
index: 178
eval_count: 179
var_params: [-0.29635666 -0.04812744 -0.58504476  2.62158494  1.01088529  0.03850828
 -0.81932148 -0.30104459  1.03130344  2.73338037  1.01841164 -0.21364155
 -0.01428663 -1.16565001  0.69524682 -0.42965172 -1.02004111  1.25095546
 -0.36727939  0.82715462  1.13711716  0.85585915  0.47744116 -0.53077825]
eval_val: 0.6637461990812477
index: 179
eval_count: 180
var_params: [-0.29626751 -0.04798448 -0.58467937  2.62224939  1.01072269  0.03902567
 -0.81913404 -0.30112199  1.03130362  2.73327857  1.01804431 -0.2134501
 -0.01428789 -1.16554613  0.6952429  -0.42980335 -1.02011114  1.25093313
 -0.36711968  0.82716133  1.1368

In [152]:
print(result)

{'num_optimizer_evals': 191, 'min_val': 0.6637557524047645, 'opt_params': array([-0.29675114, -0.04846702, -0.5850075 ,  2.62177256,  1.01119949,
        0.03876633, -0.81944976, -0.30106969,  1.03178752,  2.73369302,
        1.01829486, -0.21391532, -0.01420495, -1.16551459,  0.69541124,
       -0.42986013, -1.01958552,  1.25126709, -0.36722038,  0.82736309,
        1.13701116,  0.85590683,  0.47739484, -0.53082141]), 'eval_time': 1427.6125736236572, 'eval_count': 191, 'training_loss': 0.6637557524047645, 'testing_accuracy': 0.64, 'test_success_ratio': 0.64, 'testing_loss': 0.6690988778159763}


## Storing the optimal parameters for grading

Once the training step of the VQC algorithm is done, we obtain the optimal parameters for our specific variational form. For the grading function to be able to access these optimal parameters, you will need to follow the steps below. 

* **Step 1**: Run the cell below with `print(repr(vqc.optimal_params))`. 
* **Step 2**: Copy the array of optimal parameters and store it in the variable `optimal_parameters` inside the function `return_optimal_params()` in the next cell. This will enable us to extract it while calculating the accuracy of your model during grading. Given below is an illustration of the same:  

<img src="https://s3-ap-southeast-1.amazonaws.com/he-public-data/opt_params456b075.png" width="800">

In [153]:
print(repr(vqc.optimal_params))

array([-0.29675114, -0.04846702, -0.5850075 ,  2.62177256,  1.01119949,
        0.03876633, -0.81944976, -0.30106969,  1.03178752,  2.73369302,
        1.01829486, -0.21391532, -0.01420495, -1.16551459,  0.69541124,
       -0.42986013, -1.01958552,  1.25126709, -0.36722038,  0.82736309,
        1.13701116,  0.85590683,  0.47739484, -0.53082141])


In [154]:
%%write_and_run optimal_params.py
# the write_and_run function writes the content in this cell into the file "optimal_params.py"

### WRITE YOUR CODE BETWEEN THESE LINES - START
    
# import libraries that are used in the function below.
import numpy as np
    
### WRITE YOUR CODE BETWEEN THESE LINES - END

def return_optimal_params():
    # STORE THE OPTIMAL PARAMETERS AS AN ARRAY IN THE VARIABLE optimal_parameters 
    
    optimal_parameters = [-0.29675114, -0.04846702, -0.5850075 ,  2.62177256,  1.01119949,
        0.03876633, -0.81944976, -0.30106969,  1.03178752,  2.73369302,
        1.01829486, -0.21391532, -0.01420495, -1.16551459,  0.69541124,
       -0.42986013, -1.01958552,  1.25126709, -0.36722038,  0.82736309,
        1.13701116,  0.85590683,  0.47739484, -0.53082141]
    
    # STORE THE OPTIMAL PARAMETERS AS AN ARRAY IN THE VARIABLE optimal_parameters 
    return np.array(optimal_parameters)

## Submission

Before we go any further, check that you have the three files `feature_map.py`, `variational_circuit.py` and `optimal_params.py` in the **same working directory as this notebook**. If you do not, go back to the start and run the notebook, making sure you have filled in the code where it's required. When you run the cell below, the three files `feature_map.py`, `variational_circuit.py` and `optimal_params.py` are combined into one file named **"answer.py"**. Your working directory will then have four Python (.py) files, of which **"answer.py"** is the submission file: 
* `answer.py` <- upload this file onto HackerEarth and click on "Submit and Evaluate"
* `feature_map.py`
* `variational_circuit.py`
* `optimal_params.py`

In [155]:
solution = ['feature_map.py', 'variational_circuit.py', 'optimal_params.py']

# concatenate the three files, in order, into a freshly created "answer.py"
with open("answer.py", "w") as answer_file:
    for file_name in solution:
        with open(file_name) as f:
            for line in f:
                answer_file.write(line)

## Grading Function

Given below is the grading function that we shall use to grade your submission with a test dataset that is of the same format as `challenge_dataset_4_9_dim3_training.csv`. You can use it to grade your submission yourself by extracting a few points from `challenge_dataset_4_9_dim3_training.csv` to get a basic idea of how your model is performing. 

In [156]:
#imports required for the grading function 
from qiskit import *
from qiskit.aqua import QuantumInstance
from qiskit.aqua.algorithms import VQC
from qiskit.aqua.components.feature_maps import FeatureMap
from qiskit.aqua.components.variational_forms import VariationalForm
import numpy as np

### Working of the grading function

The grading function `grade()` takes as **input**: 

* `test_data`: (`np.ndarray`) -- **no. of datapoints $\times$ dimension of data** : the datapoints against which we want to test our model. 


* `test_labels`: (`np.ndarray`) -- **no. of datapoints $\times$ 1** : A column vector with each entry either `4` or `9`. Inside `grade()` these are converted to 0 and 1 respectively before checking.


* `feature_map`: (`QuantumCircuit` or `FeatureMap`) -- A quantum feature map which is the output of `feature_map()` defined earlier.


* `variational_form`: (`QuantumCircuit` or `VariationalForm`) -- A variational form which is the output of `variational_circuit()` defined earlier.


* `optimal_params`: (`numpy.ndarray`) -- the optimal parameters obtained after running the VQC algorithm above. These are the values obtained when the function `return_optimal_params()` is run. 


* `find_circuit_cost` : (`bool`) -- Calculates the circuit cost if set to `True`. Circuit cost is calculated by converting the circuit to the basis gate set `['u3', 'cx']` and then applying the formula **cost = 1$\times$(no. of u3 gates) + 10$\times$(no. of cx gates)**.


* `verbose` : (`bool`) -- prints the result message if set to `True`.

And gives as **output**: 

* `model_accuracy` : (`numpy.float64`) -- percent accuracy of the model. 


* `circuit_cost`: (`int`) -- circuit cost as explained above.


* `ans`: (`tuple`) -- Output of the `VQC.predict()` method. 


* `result_msg`: (`str`) -- Result message which also outputs the error message in case of one.


* `unrolled_circuit`: (`QuantumCircuit` or `None`) -- the circuit obtained after substituting the optimal parameters into the full VQC circuit and unrolling it to the basis gate set `['u3', 'cx']`.

**Note:** If you look inside the `grade()` function in Section 2, you'll see that we have initialized a COBYLA optimizer even though the prediction step does not require one. Similarly, we have passed a dataset to `training_dataset`. Both of these are dummy variables: they are needed only because they are not optional arguments when instantiating the `VQC` class.
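
The cost formula described above can be sketched in plain Python. This is only an illustration: the helper name `circuit_cost_from_counts` and the example gate counts are hypothetical, and the input is assumed to be a dictionary of gate counts like the one returned by `count_ops()` on the unrolled circuit.

```python
def circuit_cost_from_counts(gate_counts):
    """Cost = 1 * (number of u3 gates) + 10 * (number of cx gates)."""
    # .get(..., 0) keeps the formula valid when a gate type is absent
    return gate_counts.get('u3', 0) + 10 * gate_counts.get('cx', 0)


# Hypothetical gate counts, chosen only for illustration
example_counts = {'u3': 12, 'cx': 4}
print(circuit_cost_from_counts(example_counts))  # 12 + 10*4 = 52
```

Since each `cx` gate costs ten times as much as a `u3` gate, reducing the number of entangling gates is usually the quickest way to bring the cost down.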

In [157]:
def grade(test_data, test_labels, feature_map, variational_form, optimal_params, find_circuit_cost=True, verbose=True):
    seed = 10598
    model_accuracy = None 
    circuit_cost=None 
    ans = None
    unrolled_circuit = None
    result_msg=''
    data_dim = np.array(test_data).shape[1]
    dataset_size = np.array(test_data).shape[0]
    dummy_training_dataset = {'A': np.ones((2, data_dim)), 'B': np.ones((2, data_dim))}
    
    # converting 4's to 0's and 9's to 1's for checking 
    test_labels_transformed = np.where(test_labels==4, 0., 1.)
    max_qubit_count = 6
    max_circuit_cost = 2000
    
    # Section 1
    if feature_map is None:
        result_msg += 'feature_map variable is None. Please submit a valid entry' if verbose else ''
    elif variational_form is None: 
        result_msg += 'variational_form variable is None. Please submit a valid entry' if verbose else ''
    elif optimal_params is None: 
        result_msg += 'optimal_params variable is None. Please submit a valid entry' if verbose else ''
    elif test_data is None: 
        result_msg += 'test_data variable is None. Please submit a valid entry' if verbose else ''
    elif test_labels is None: 
        result_msg += 'test_labels variable is None. Please submit a valid entry' if verbose else ''
    elif not isinstance(feature_map, (QuantumCircuit, FeatureMap)):
        result_msg += 'feature_map variable should be a QuantumCircuit or a FeatureMap not (%s)' % \
                      type(feature_map) if verbose else ''
    elif not isinstance(variational_form, (QuantumCircuit, VariationalForm)):
        result_msg += 'variational_form variable should be a QuantumCircuit or a VariationalForm not (%s)' % \
                      type(variational_form) if verbose else ''
    elif not isinstance(test_data, np.ndarray):
        result_msg += 'test_data variable should be a numpy.ndarray not (%s)' % \
                      type(test_data) if verbose else ''
    elif not isinstance(test_labels, np.ndarray):
        result_msg += 'test_labels variable should be a numpy.ndarray not (%s)' % \
                      type(test_labels) if verbose else ''
    elif not isinstance(optimal_params, np.ndarray):
        result_msg += 'optimal_params variable should be a numpy.ndarray not (%s)' % \
                      type(optimal_params) if verbose else ''
    elif not dataset_size == test_labels_transformed.shape[0]:
        result_msg += 'Dataset size and label array size must be equal'
    # Section 2
    else:
        
        # setting up COBYLA optimizer as a dummy optimizer
        from qiskit.aqua.components.optimizers import COBYLA
        dummy_optimizer = COBYLA()

        # setting up the backend and creating a quantum instance
        backend = Aer.get_backend('qasm_simulator')
        backend_options = {"method": "statevector"}
        quantum_instance = QuantumInstance(backend, 
                                           shots=2000, 
                                           seed_simulator=seed, 
                                           seed_transpiler=seed, 
                                           backend_options=backend_options)

        # creating a VQC instance and running the VQC.predict method to get the accuracy of the model 
        vqc = VQC(optimizer=dummy_optimizer, 
                  feature_map=feature_map, 
                  var_form=variational_form, 
                  training_dataset=dummy_training_dataset)
        
        from qiskit.transpiler import PassManager
        from qiskit.transpiler.passes import Unroller
        pass_ = Unroller(['u3', 'cx'])
        pm = PassManager(pass_)
        # construct circuit with first datapoint
        circuit = vqc.construct_circuit(test_data[0], optimal_params)
        unrolled_circuit = pm.run(circuit)
        gates = unrolled_circuit.count_ops()
        # cost = 1 * (no. of u3 gates) + 10 * (no. of cx gates); absent gates count as 0
        circuit_cost = gates.get('u3', 0) + 10 * gates.get('cx', 0)
        
        if circuit.num_qubits > max_qubit_count:
            result_msg += 'Your quantum circuit is using more than 6 qubits. Reduce the number of qubits used and try again.'
        elif circuit_cost > max_circuit_cost:
            result_msg += 'The cost of your circuit is exceeding the maximum acceptable cost of 2000. Reduce the circuit cost and try again.'
        else: 
            
            ans = vqc.predict(test_data, quantum_instance=quantum_instance, params=np.array(optimal_params))
            model_accuracy = np.sum(np.equal(test_labels_transformed, ans[1]))/len(ans[1])

            result_msg += 'Accuracy of the model is {}'.format(model_accuracy) if verbose else ''
            result_msg += ' and circuit cost is {}'.format(circuit_cost) if verbose else ''
            
    return model_accuracy, circuit_cost, ans, result_msg, unrolled_circuit
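
The accuracy step at the end of `grade()` can be checked in isolation with NumPy alone. The sketch below uses hypothetical labels and predictions (not taken from the dataset) and assumes, as in `grade()`, that the predicted classes are `0` for digit `4` and `1` for digit `9`.

```python
import numpy as np

# Hypothetical true labels and predictions, for illustration only
true_labels = np.array([4, 9, 9, 4])
predicted = np.array([0, 1, 0, 0])   # predicted classes in {0, 1}

# Same mapping as in grade(): 4 -> 0, anything else (here 9) -> 1
true_transformed = np.where(true_labels == 4, 0., 1.)

# Fraction of predictions that match the transformed labels
accuracy = np.sum(np.equal(true_transformed, predicted)) / len(predicted)
print(accuracy)  # 0.75
```

Note that `np.where(labels == 4, 0., 1.)` maps every label other than `4` to `1`, so labels that were already converted to `0`/`1` would all be treated as class `1`.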

## Process of grading using a dummy grading dataset

Let us create a dummy grading dataset, with features `grading_features` and labels `grading_labels`, from the last 2000 datapoints of `data_features` and `data_labels`, so that we can get a rough estimate of our accuracy. Note that this may not be a balanced dataset, i.e. it may not contain equal numbers of `4`'s and `9`'s, which is not best practice. It is used here only to demonstrate the `grade()` function. In the final scoring done on HackerEarth, the testing dataset will have a balanced number of class labels `4` and `9`.

In [158]:
grading_dataset_size=2000    # this value is not per digit but in total
grading_features = data_features[-grading_dataset_size:]
grading_labels = data_labels[-grading_dataset_size:]
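
If you would rather test on a balanced subset, one possible approach is sketched below. It is not part of the official grading procedure: the helper `balanced_subset` and the toy arrays are hypothetical, and the labels are assumed to be stored as the digits `4` and `9`.

```python
import numpy as np


def balanced_subset(features, labels, per_class, seed=0):
    """Return `per_class` randomly chosen datapoints for each of the digits 4 and 9."""
    rng = np.random.default_rng(seed)
    chosen = []
    for digit in (4, 9):
        # row indices of all datapoints with this label
        idx = np.flatnonzero(np.ravel(labels) == digit)
        # sample without replacement (raises ValueError if too few datapoints)
        chosen.append(rng.choice(idx, size=per_class, replace=False))
    chosen = np.concatenate(chosen)
    rng.shuffle(chosen)  # mix the two classes
    return features[chosen], labels[chosen]


# Toy data standing in for data_features / data_labels (illustration only)
toy_features = np.arange(30, dtype=float).reshape(10, 3)
toy_labels = np.array([4, 4, 4, 9, 4, 9, 9, 4, 4, 9])

sub_features, sub_labels = balanced_subset(toy_features, toy_labels, per_class=3)
values, counts = np.unique(sub_labels, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))  # {4: 3, 9: 3}
```

The same `np.unique(..., return_counts=True)` call can be applied to `grading_labels` to see how unbalanced the last 2000 datapoints are.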

In [None]:
start = time.process_time()

accuracy, circuit_cost, ans, result_msg, full_circuit  =  grade(test_data=grading_features, 
                                                                test_labels=grading_labels, 
                                                                feature_map=feature_map(), 
                                                                variational_form=variational_circuit(), 
                                                                optimal_params=return_optimal_params())

print("time taken: {} seconds".format(time.process_time() - start))
print(result_msg)

You can also check your **accuracy**, **circuit_cost** and **full_circuit**, which is the result of combining the feature map and variational circuit and unrolling into the basis `['u3', 'cx']`.

In [84]:
print("Accuracy of the model: {}".format(accuracy))
print("Circuit Cost: {}".format(circuit_cost))
print("The complete unrolled circuit: ")
full_circuit.draw()

Accuracy of the model: 0.7735
Circuit Cost: 378
The complete unrolled circuit: 
