# Quantum classifier with custom containers

Amazon Braket has pre-built containers for executing Amazon Braket Hybrid Jobs, which are sufficient for many use cases involving the Braket SDK and PennyLane. However, if we want to use custom packages outside the scope of pre-built containers, we will need to supply a custom-built container. In this tutorial, we show how to use Braket Job to train a quantum machine learning model using BYOC (Bring Your Own Container).

## 1 Prepare algorithm script
This section shows how to prepare the algorithm script for a text sentiment classifier. The complete script for this notebook is [here](algorithm_script.py). We will not go through the script line by line. Instead, we highlight the parts that are important for understanding the Braket Job.

### Problem setup
The quantum machine learning problem we are targeting is to classify the positive and negative sentiment of a sentence. We use four sentences in this example:

"I eat a banana every day." <br>
"Bananas are not for her." <br>
"Banana shakes are delicious." <br>
"How can you like bananas?" <br>

The first and third sentences express positive sentiment toward bananas and are labeled +1. The second and fourth sentences express negative sentiment and are labeled -1. To feed the sentences to a quantum machine learning model, we use the spaCy package to embed each sentence as a 1D vector.

In [1]:
import spacy_sentence_bert

nlp = spacy_sentence_bert.load_model("xx_distiluse_base_multilingual_cased_v2")
banana_string = ["I eat a banana every day.",
                "Bananas are not for her.",
                "Banana shakes are delicious.",
                "How can you like bananas?"
]
banana_embedding = [nlp(d) for d in banana_string]
data = [d.vector for d in banana_embedding]
label = [1, -1, 1, -1]

With the "xx_distiluse_base_multilingual_cased_v2" language model, each data point is now a vector with length 512. See the [spaCy page](https://spacy.io/universe/project/spacy-sentence-bert) for details. Note that the size of the embedding vectors depends on the language model. When choosing a different model, we would expect a different shape of embedding. 

In [2]:
for d in data:
    print("data size: {}".format(d.shape))    

data size: (512,)
data size: (512,)
data size: (512,)
data size: (512,)


### Quantum machine learning model

We choose [Circuit-centric quantum classifiers (CCQC)](https://arxiv.org/abs/1804.00633) as our quantum model. The figure below shows an example CCQC circuit with 7 qubits. The data (the classical embedding from the language model) is loaded into the circuit as the initial state via [amplitude encoding](https://arxiv.org/abs/1803.07128). The initial state is followed by two entanglement layers. The first layer entangles each qubit with its nearest neighbor, and the second layer entangles each qubit with its third-nearest neighbor. A rotation gate is then applied to the first qubit. Finally, only the first qubit is measured, and the classification is based solely on this measurement. If the measurement is positive, the model predicts a positive sentiment (+1); otherwise, it predicts a negative one (-1).


<img src="ccqc_circuit.png" alt="Image of quantum circuit" width="600" /> 
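The classical pre- and post-processing around such a circuit can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in the algorithm script: the helper names `amplitude_encode`, `entangling_pairs`, and `classify` are hypothetical, the cyclic wrap-around of the wire pairs is an assumption, and the actual gates are left out.

```python
import math


def amplitude_encode(vector):
    """Pad a real vector to the next power of two and scale it to unit norm."""
    # Number of qubits needed so that 2**n_qubits >= len(vector) (at least 1).
    n_qubits = max(1, (len(vector) - 1).bit_length())
    padded = list(vector) + [0.0] * (2 ** n_qubits - len(vector))
    norm = math.sqrt(sum(x * x for x in padded))
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return [x / norm for x in padded], n_qubits


def entangling_pairs(n_qubits, r):
    """Wire pairs (i, i + r mod n) for an entangling layer of range r.

    Assumption: the layer wraps around cyclically.
    """
    return [(i, (i + r) % n_qubits) for i in range(n_qubits)]


def classify(expectation_value):
    """Decision rule from the text: positive measurement -> +1, otherwise -1."""
    return 1 if expectation_value > 0 else -1


# A length-512 embedding fits exactly into 9 qubits (2**9 == 512).
amplitudes, n_qubits = amplitude_encode([0.5] * 512)
print(n_qubits)

# Wire pairs for the two layers of the 7-qubit example circuit.
print(entangling_pairs(7, 1))
print(entangling_pairs(7, 3))
```

Because the embedding length of 512 is already a power of two, no padding is needed for the sentences in this notebook.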

We use PennyLane as our machine learning framework. To use Braket managed simulators or QPUs, we set the device name to "braket.aws.qubit". The device ARN is passed to the algorithm script as the environment variable <code>os.environ["AMZN_BRAKET_DEVICE_ARN"]</code>; this variable is set when the job is created in the orchestration script. For details about the options for Braket devices in <code>qml.device</code>, see the [Amazon Braket PennyLane Plugin](https://github.com/aws/amazon-braket-pennylane-plugin-python). The quantum model is packaged in a CCQC class in the [algorithm script](algorithm_script.py).
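As a minimal sketch of how the device might be created inside the algorithm script, assuming PennyLane and the Braket plugin are installed in the container: the helper names `get_device_arn` and `make_device` are hypothetical, and the CCQC class in the algorithm script may organize this differently.

```python
import os


def get_device_arn():
    """Read the device ARN that the job exposes to the algorithm script."""
    return os.environ["AMZN_BRAKET_DEVICE_ARN"]


def make_device(n_wires):
    """Create a PennyLane device backed by the Braket device of the job."""
    # Imported lazily so this module can still be imported on a machine
    # that does not have PennyLane and the Braket plugin installed.
    import pennylane as qml

    return qml.device("braket.aws.qubit", device_arn=get_device_arn(), wires=n_wires)
```

Reading the ARN from the environment keeps the script independent of any particular simulator or QPU; the device is chosen when the job is submitted.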

### Monitor metrics and record results
We can monitor the progress of the hybrid job in the Amazon Braket console. <code>log_metric</code> records the metrics so that we can view the training progress in the "Monitor" console tab.  <code>save_job_result</code> allows us to view the result in the console and in the orchestration script. 
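A hedged sketch of how these two calls might appear in a training loop is shown here. The square loss and the helper names `square_loss`, `report_progress`, and `report_result` are assumptions for illustration; the algorithm script may use a different cost function and structure.

```python
def square_loss(labels, predictions):
    """Mean squared difference between labels and predictions (assumed cost)."""
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


def report_progress(iteration, loss):
    """Record the loss so it shows up in the job's metrics."""
    # Imported lazily: the Braket SDK is only needed when running as a job.
    from braket.jobs.metrics import log_metric

    log_metric(metric_name="loss", value=loss, iteration_number=iteration)


def report_result(weights):
    """Store the trained weights so they can be read back with job.result()."""
    from braket.jobs import save_job_result

    # Assumes `weights` is already serializable, e.g. a plain list of floats.
    save_job_result({"weights": weights})
```

The `"weights"` key matches the key used later in this notebook when reading <code>job.result()</code>.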

## 2 Prepare custom container
When we submit a quantum job, Amazon Braket starts an instance and spins up a container to run our script. The environment is defined by the provided container instead of the local machine where the job is submitted. If no container image is specified when submitting a job, the default container is the base Braket container. See the [developer guide](https://docs.aws.amazon.com/braket/index.html) for the configuration of the base container.

Amazon Braket Jobs provides three pre-built containers for different use cases. See the [developer guide](https://docs.aws.amazon.com/braket/index.html) for the configuration of the pre-built containers. In this exercise, the spaCy package is not supported in any of the three containers. One option is to install the package with <code>pip</code> at the beginning of the algorithm script.

In [1]:
# from pip._internal import main as pipmain
# pipmain(["install", "spacy"])
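As a possible alternative to calling pip's internal module, the install can be delegated to a subprocess that runs pip with the current interpreter. This is a sketch, not part of the original script; the helper names `pip_install_command` and `pip_install` are hypothetical.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build the pip command for the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", *packages]


def pip_install(packages):
    """Install packages at runtime; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(packages))


# Example call at the top of an algorithm script (left commented out here):
# pip_install(["spacy", "spacy_sentence_bert"])
```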

When the problem size is small, we can manage to use <code>pip</code> to configure the environment. However, for large-scale applications, we expect that this method would quickly become infeasible. Braket Job provides the "bring your own container (BYOC)" option to help you manage the environment of your hybrid job. BYOC not only allows us to define which Python packages are available, but also lets us configure settings that are hard to set with <code>pip</code> or Python alone. In the following, we go through the steps of building our own container.

### Preparation 1: Docker 
To build and upload our custom container, we must have [Docker](https://docs.docker.com/get-docker/) installed. Amazon Braket Notebook Instances have Docker pre-installed, so this step can be skipped if you are using the terminal of a Braket Notebook Instance.

### Preparation 2: Dockerfile
A Dockerfile defines the environment and the software in the containers. We can start with the base Dockerfile of Braket Jobs as a template and add packages according to our needs. For our quantum text classifier, we use the Dockerfile below. The Dockerfile can also be found [here](dockerfile).

<pre><code>
FROM 292282985366.dkr.ecr.us-west-2.amazonaws.com/base-jobs:1.0-cpu-py37-ubuntu18.04

RUN python3 -m pip install --upgrade pip

RUN python3 -m pip install amazon-braket-pennylane-plugin \
                           pennylane \
                           spacy \
                           spacy_sentence_bert

RUN python3 -m pip install --no-cache --upgrade sagemaker-training

COPY braket_container.py /opt/ml/code/braket_container.py

ENV SAGEMAKER_PROGRAM braket_container.py
</code></pre>

The first line in the Dockerfile specifies the container template: we build our container upon the base Braket container. The rest of the file installs the required packages (PennyLane, spaCy, etc.) and configures the environment.

### Preparation 3: Initial script
We prepare an initial script that will be executed when our container starts. For this exercise, we can directly use braket_container.py from the example code without modification. The initial script for this exercise is [here](qml_source/braket_container.py). The script configures the paths for the image and for the user code. It first sets up the container and then downloads the algorithm script to run in the container. It also handles errors and logs error messages.

### Preparation 4: Create ECR
Amazon Elastic Container Registry (ECR) is a fully managed Docker container registry. Follow the [instructions](https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-create.html) to create a "private" repository. For this exercise, we name our repository "my-qtc" (my quantum text classifier).

Now that we have finished the prerequisites, it's time to build our container!

### Action 1: Login to AWS CLI and Docker
If you haven't already, follow the [instructions](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html) to configure your AWS credentials using the AWS CLI. Then, run the following snippet to log in to Docker. Replace all &lt;XXX&gt; below with your own credentials. We will see "Login Succeeded" when it's done.

<code>aws ecr get-login-password --region &lt;YOUR_AWS_REGION&gt; | docker login --username AWS --password-stdin &lt;YOUR_ACCOUNT_ID&gt;.dkr.ecr.&lt;YOUR_AWS_REGION&gt;.amazonaws.com</code>

If the terminal does not support interactive login, we can also run the following line to log in.

<code>docker login -u AWS -p $(aws ecr get-login-password --region &lt;YOUR_AWS_REGION&gt;) &lt;YOUR_ACCOUNT_ID&gt;.dkr.ecr.&lt;YOUR_AWS_REGION&gt;.amazonaws.com</code>

### Action 2: Build and push the image
Go to the folder containing your Dockerfile and the initial script (braket_container.py in this case). Run the lines below to build the image and push it to our ECR repository. Remember to replace all &lt;XXX&gt; in the code with your own values. When the push completes, the terminal shows that all layers have been pushed, and a new image appears in the ECR console under the "my-qtc" repository. If the build runs into a memory error due to the size of the language model, we can increase the memory limit in Docker.<br>
<code>docker build -t dockerfile .
docker tag dockerfile:latest &lt;YOUR_ACCOUNT_ID&gt;.dkr.ecr.&lt;YOUR_AWS_REGION&gt;.amazonaws.com/my-qtc:latest
docker push &lt;YOUR_ACCOUNT_ID&gt;.dkr.ecr.&lt;YOUR_AWS_REGION&gt;.amazonaws.com/my-qtc:latest
</code>
Once the container image is uploaded, it is ready to be used in a Braket Job!

### All-in-one shell script
The above procedure walks you through the steps to create a container image. This procedure can be automated in a shell script. The [example script](build_and_push.sh) is provided in the same folder as this notebook. The script automatically builds the container image and pushes it to the ECR repository you specify. If the repository does not exist, the script creates it. All you need to do is prepare the Dockerfile and the initial script, and then run the following command in a terminal

In [None]:
# ./build_and_push.sh <name-of-ECR-repo>

or the following cell in a notebook

In [None]:
# import subprocess
# print(subprocess.check_output(['./build_and_push.sh','name-of-ECR-repo']))
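Since `subprocess.check_output` returns bytes, printing its result shows the raw bytes representation. A small sketch of a variant that returns decoded text follows; the helper name `run_and_capture` is hypothetical, and the repository name in the commented call is the "my-qtc" example used earlier.

```python
import subprocess


def run_and_capture(command):
    """Run a command and return its standard output as decoded text."""
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Example call from a notebook cell (left commented out here):
# print(run_and_capture(["./build_and_push.sh", "my-qtc"]))
```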

## 3 Submit the Braket Job to AWS
Now that we have prepared an algorithm script and the container for the job, we can submit the hybrid job to AWS using <code>AwsQuantumJob</code>. Remember to provide the container we just created via the <code>image_uri</code> keyword.
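The image URI follows the same pattern as the tag used when pushing the image. A small helper for composing it might look like this; the function name `ecr_image_uri` is hypothetical, and the account ID is a placeholder, not a real account.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Compose the URI of an image stored in a private ECR repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder account ID and region; replace them with your own values.
image_uri = ecr_image_uri("123456789012", "us-west-2", "my-qtc")
print(image_uri)  # 123456789012.dkr.ecr.us-west-2.amazonaws.com/my-qtc:latest
```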

In [4]:
from braket.aws import AwsQuantumJob

image_uri = "[aws_account_id].dkr.ecr.[your_region].amazonaws.com/my-qtc:latest"

job = AwsQuantumJob.create(
    "arn:aws:braket:::device/quantum-simulator/amazon/sv1",
    source_module="algorithm_script.py",
    entry_point="algorithm_script:main",
    wait_until_complete=True,
    job_name = "my-aws-job",
    image_uri = image_uri,
)

## 4 Evaluate results
Once training is completed, we can evaluate how our quantum model performs. First, we initialize the CCQC model. 

In [5]:
from algorithm_script import CCQC

device_arn="arn:aws:braket:::device/quantum-simulator/amazon/sv1"
qml_model = CCQC(nwires = 9, device_arn=device_arn)

We then retrieve the trained weights from <code>job.result()</code>. If we lose the <code>job</code> variable, we can always recreate it from the job ARN, which can be found in the console.

In [7]:
from braket.aws import AwsQuantumJob

job_arn = "your-job-arn"
job = AwsQuantumJob(job_arn)
weights = job.result()['weights']

Using the trained weights, we can make a prediction for each sentence with the <code>predict</code> method of our model. See the [algorithm script](algorithm_script.py) for definitions.

In [8]:
for i in range(4):
    print(banana_string[i])
    pred = qml_model.predict(*weights, data=data[i])
    print("label: {}  predict:{}".format(label[i], pred))
    print()

I eat a banana every day.
label: 1  predict:1

Bananas are not for her.
label: -1  predict:-1

Banana shakes are delicious.
label: 1  predict:1

How can you like bananas?
label: -1  predict:-1



We can also test our quantum model on a sentence it has not seen.

In [9]:
test_string = "A banana a day keeps the doctor away."
test_data = nlp(test_string).vector
test_label = 1
print(test_string)
pred = qml_model.predict(*weights, data=test_data)
print("label: {}  predict:{}".format(test_label, pred))

A banana a day keeps the doctor away.
label: 1  predict:1


## Summary
In this notebook, we have gone through a use case that requires Python packages not supported by any of the pre-built containers provided by Braket Job. We have learned the steps to prepare a Dockerfile and build our own container that supports our use case. Using Braket Job with BYOC provides the flexibility of defining custom environments and the convenience of switching between environments for different applications through the <code>image_uri</code> argument.