# Fully Managed Quantum Machine Learning Example

## Introduction
In this example we will _sagemakerize_ the QBoost algorithm, a quantum binary classifier published in 2008 by D-Wave & Google.
Since you are reading this notebook, I assume you have already followed the steps in the README.md files; in any case, here they are again:
1. Run the script create_wisc_datasets.py: this creates the folder _data_ on your local machine and generates two files inside it, one for training and one for testing
2. Get your AWS credentials, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and export them as environment variables
3. Run the following command to build the container and push the image to Amazon ECR:
    chmod +wrx build_and_push.sh && ./build_and_push.sh qboost-sagemaker-example
4. Once that is done, open AWS SageMaker and start a Jupyter Notebook instance
5. Follow the instructions in the next cells
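
Before running step 3, it can help to confirm that both credential variables from step 2 are actually visible in your environment. This is only a minimal sketch: `missing_aws_credentials` is a hypothetical helper name of mine, not part of any AWS tooling, and it only checks that the variables are set, not that they are valid.

```python
import os

# The two variables named in step 2
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")

def missing_aws_credentials(environ=None):
    """Return the names of the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]

missing = missing_aws_credentials()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Both AWS credential variables are set.")
```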


In [None]:
# S3 prefix where we will store everything we will produce following this tutorial
prefix = 'DEMO-qboost-breast-cancer'

# Define IAM role
import boto3
import re

import os
import numpy as np
import pandas as pd
from sagemaker import get_execution_role

role = get_execution_role() # Getting the role means getting the permissions associated with your AWS credentials

In [None]:
import sagemaker as sage
from time import gmtime, strftime

# The Session is essential in order to access the S3 Bucket without writing the whole URI.
# It will also handle the creation of appropriate names for the S3 Buckets, EndPoint name, etc.
sess = sage.Session() 

In [None]:
# Now create a folder in your Jupyter Lab environment and call it 'data'.
# Open it from the Jupyter Lab interface, then drag & drop into it, from your computer,
# only the training csv previously created by the script 'create_wisc_datasets.py'

WORK_DIRECTORY = 'data' # The name of the folder with the training data just drag&dropped

# The following line uploads the folder to an S3 bucket using the Session.
# The Session takes care of choosing the correct bucket.
data_location = sess.upload_data(WORK_DIRECTORY, key_prefix=prefix) 
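
The value stored in `data_location` is an S3 URI. As far as I know from the SageMaker SDK documentation, for a folder it has the form s3://bucket/key_prefix. If you want to inspect the upload yourself, a small sketch like the one below splits such a URI into bucket and prefix. The helper name `split_s3_uri` and the bucket name in the example are placeholders of mine, not values produced by this notebook.

```python
from urllib.parse import urlparse

def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Not an S3 URI: {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")

# Placeholder URI; in the notebook you would pass data_location instead
bucket, key_prefix = split_s3_uri("s3://my-example-bucket/DEMO-qboost-breast-cancer")
print(bucket, key_prefix)
```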

In [None]:
# Now retrieve the account information linked to this SageMaker instance.
# In the SageMaker Jupyter Lab environment you can easily access it through the Session
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name

# This is the name of the image that you previously uploaded using the script:
# './build_and_push.sh qboost-sagemaker-example'
image = '{}.dkr.ecr.{}.amazonaws.com/qboost-sagemaker-example:latest'.format(account, region)

# These are the variables that you need to train your QBoost on the D-Wave machine
env = {
    'DW_ENDPOINT': 'https://cloud.dwavesys.com/sapi',
    'DW_TOKEN': 'Your DWave DEV Token',
    'DW_SOLVER': 'DW_2000Q_2_1' , # Name of an available solver
    'num_reads' : 1000, # Number of shots for the QPU
    'tree_depth' : 2,
}

# Creating a SageMaker Estimator for your QBoost image
qboost = sage.estimator.Estimator(image, # Docker container image previously pushed
                                  role, # Your account's role; it needs full SageMaker privileges
                                  1, # Number of training instances. One is enough, since D-Wave does all the work.
                                  'ml.c4.2xlarge', # Instance type.
                                  output_path="s3://{}/output".format(sess.default_bucket()), # Where to put the trained model
                                  sagemaker_session=sess, # Session tool, required
                                  hyperparameters=env) # The parameters specified above to connect SageMaker to the D-Wave machine

qboost.fit(data_location) # Unleash the power of D-Wave.
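
For context on where the `env` dictionary ends up: SageMaker makes the hyperparameters available inside the training container as a JSON file at /opt/ml/input/config/hyperparameters.json, with the values arriving as strings. How the QBoost container actually reads them is not shown in this notebook, so the following is only a sketch under that assumption. The function names `parse_hyperparameters` and `load_hyperparameters` are hypothetical.

```python
import json

# Standard SageMaker location for hyperparameters inside a training container
HYPERPARAMETERS_PATH = "/opt/ml/input/config/hyperparameters.json"

def parse_hyperparameters(raw):
    """Convert the string-valued hyperparameters into typed settings."""
    return {
        "endpoint": raw["DW_ENDPOINT"],
        "token": raw["DW_TOKEN"],
        "solver": raw["DW_SOLVER"],
        "num_reads": int(raw["num_reads"]),    # arrives as a string, e.g. "1000"
        "tree_depth": int(raw["tree_depth"]),  # arrives as a string, e.g. "2"
    }

def load_hyperparameters(path=HYPERPARAMETERS_PATH):
    """Read the JSON file written by SageMaker and parse it."""
    with open(path) as handle:
        return parse_hyperparameters(json.load(handle))
```

Casting explicitly matters because a value such as `num_reads` would otherwise reach the sampler as the string "1000" rather than the integer 1000.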



In [None]:
# Time to deploy your model so that it is reachable and fully managed by SageMaker

# Note that inference no longer uses QPU time: the result of the quantum optimisation
# has been brought back to the classical world, ready to be deployed.

from sagemaker.predictor import csv_serializer
predictor = qboost.deploy(1, 'ml.m4.xlarge', serializer=csv_serializer)

In [None]:
# In this cell we will load the testing dataset, to use with the freshly created API

# As before, create a folder in your Jupyter Lab environment and call it 'data_testing'.
# Open it and drag & drop into it the file 'test_wisc_binary.csv' that was created earlier;
# it should be in the 'data' folder on your local system (not the 'data' folder inside SageMaker Jupyter Lab)

shape=pd.read_csv("data_testing/test_wisc_binary.csv", header=None) # Read the file

import itertools

# Select a fixed subset of rows (not actually random): rows 40-49, 90-99 and 140-149
a = [50*i for i in range(3)] 
b = [40+i for i in range(10)]
indices = [i+j for i,j in itertools.product(a,b)]

# Create the test_data subset (indices[:-1] drops the last index, leaving 29 rows)
test_data=shape.iloc[indices[:-1]]
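
To make the selection concrete, the same index construction can be run on its own, without the dataset. This sketch only reproduces the arithmetic from the cell above; `selected` is a name I introduce for the final list.

```python
import itertools

a = [50 * i for i in range(3)]   # block offsets: [0, 50, 100]
b = [40 + i for i in range(10)]  # positions inside each block: 40 to 49

# Every offset combined with every position, in block order
indices = [i + j for i, j in itertools.product(a, b)]

# Dropping the last index (149) leaves 29 rows
selected = indices[:-1]

print(indices[:3], indices[-3:], len(indices), len(selected))
```

The test file therefore needs at least 149 rows for `iloc` to succeed.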

In [None]:
# Call the API with the test_data and get the predictions
predictions = predictor.predict(test_data.values).decode('utf-8').split('\n')[:-1]
predictions = [float(x) for x in predictions] # Cast to float the predictions
real_values = [float(x) for x in test_data.values[:,0]] # Cast to float the ground truth values
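
The parsing above suggests that the endpoint answers with bytes containing one prediction per line. A small sketch of that decoding step, run on a made-up payload, is shown below. The sample bytes and the helper name `parse_predictions` are my own illustration, not real output from the endpoint.

```python
def parse_predictions(payload):
    """Decode a newline-separated response body into a list of floats."""
    lines = payload.decode("utf-8").split("\n")
    # Skipping empty lines also handles the trailing newline
    return [float(line) for line in lines if line.strip()]

sample_payload = b"1.0\n-1.0\n1.0\n"  # made-up response for illustration
print(parse_predictions(sample_payload))
```

Filtering empty lines is slightly more forgiving than slicing with `[:-1]`, since it also works when the response has no trailing newline.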

In [None]:
# Evaluate the accuracy of the trained model
from sklearn.metrics import accuracy_score

print(predictions, len(predictions))
print(real_values, len(real_values))

print('Accuracy', accuracy_score(real_values, predictions)) # accuracy_score expects (y_true, y_pred)
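
If you want to see what the accuracy number means without relying on scikit-learn, it is simply the fraction of predictions that match the ground truth. A minimal sketch with made-up labels follows; the function name `accuracy` and the sample values are mine.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Made-up labels: three of the four predictions are correct
print(accuracy([1.0, -1.0, 1.0, 1.0], [1.0, -1.0, -1.0, 1.0]))
```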

In [None]:
# Delete the endpoint in order to save money! $$ (to spend on QPU time ;) )
sess.delete_endpoint(predictor.endpoint)

## Conclusion
We just saw how to create a SageMaker quantum model, train it on a D-Wave quantum machine directly from SageMaker, and serve it for production use in a classical environment.
QPU time was used only to _train_ the model; the cooked recipe was then brought back to the classical machine, which performs inference through a normal API call.

### Future Work
To serve our quantum model in a proper production setup you would need to:
1. Create a Lambda function (which is serverless) on AWS that calls the SageMaker model.
2. Create an API Gateway in front of the Lambda function that handles error codes and authentication.
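
Purely as a rough illustration of step 1, a Lambda handler could forward a CSV payload to the endpoint through the boto3 `sagemaker-runtime` client. This is a sketch under several assumptions of mine: the endpoint name is a placeholder, the request body is assumed to already be CSV text, and the `runtime_client` argument exists only so the function can be tried without AWS access. Error handling and authentication are left out.

```python
import json

ENDPOINT_NAME = "qboost-endpoint"  # placeholder: use your own endpoint name

def lambda_handler(event, context, runtime_client=None):
    """Forward the CSV request body to the SageMaker endpoint."""
    if runtime_client is None:
        import boto3  # imported lazily so the sketch can run without AWS
        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",
        Body=event["body"],  # assumed to be CSV rows, one sample per line
    )
    raw = response["Body"].read().decode("utf-8")
    predictions = [float(x) for x in raw.split("\n") if x.strip()]
    return {"statusCode": 200, "body": json.dumps({"predictions": predictions})}
```

Lambda itself calls the handler with only `event` and `context`, so the extra argument keeps its default in a real deployment.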

I do not cover these steps in detail here since I will come back to them later in a more complete example. However, feel free to contact me at calogero.zarbo@docebo.com / calogero.zarbo@deeploans.ai / the D-Wave Forum if you have any questions, advice or feedback in general.

### Why this work is relevant (IMHO)
The main goal of this tutorial is not limited to implementing the QBoost algorithm, since it is a relatively old model (2008). My main goal was to provide the community with a complete example that anyone can modify as much as they want and reuse in their own projects. In fact, you can replace the QBoost implementation with any other model and have it run in a production-ready way with very few steps.
This example shows that the power of D-Wave quantum computers can be embedded in existing frameworks, allowing this cutting-edge technology to spread across multiple business cases.

**_Once again, please feel free to contact me for anything_**