# Quantum Classifier with Custom Containers

In this tutorial, we show how to use Amazon Braket Hybrid Jobs to train a quantum machine learning model. To run a hybrid job, we need to prepare an algorithm script and an orchestration script. The algorithm script defines the problem, the machine learning model, and the metrics. The orchestration script defines the environment and the backend for the algorithm script.

## 1 Prepare Algorithm Script
This section shows how to prepare the algorithm script for a text sentiment classifier. The complete script for this notebook is [here](algorithm_script.py). We will not go through the script line by line. Instead, we highlight the parts that are important for understanding the hybrid job.

### Problem Setup
The quantum machine learning problem we are targeting is to classify a sentence as having positive or negative sentiment. We use four sentences in this example:

"I eat a banana everyday." <br>
"Banana is not for her." <br>
"Banana shake is delicious." <br>
"How can you like to eat bananas?" <br>

The first and third sentences express positive sentiment about bananas and are labeled +1. The second and fourth sentences express negative sentiment and are labeled -1. To feed the sentences to a quantum machine learning model, we use the spaCy package to embed each sentence into a 1D vector.

In [2]:
import spacy_sentence_bert

nlp = spacy_sentence_bert.load_model("xx_distiluse_base_multilingual_cased_v2")
banana_string = ["I eat a banana everyday.",
                "Banana is not for her.",
                "Banana shake is delicious.",
                "How can you like to eat bananas?"
]
banana_embeding = [nlp(d) for d in banana_string]
data = [d.vector for d in banana_embeding]
label = [1, -1, 1, -1]

With the "xx_distiluse_base_multilingual_cased_v2" language model, each data point is now a vector of length 512. See the [spaCy page](https://spacy.io/universe/project/spacy-sentence-bert) for details. Note that the size of the embedding vectors depends on the language model. A different model will generally produce embeddings of a different shape.

In [3]:
for d in data:
    print("data size: {}".format(d.shape))

data size: (512,)
data size: (512,)
data size: (512,)
data size: (512,)


### Quantum machine learning model

We choose [Circuit-centric quantum classifiers (CCQC)](https://arxiv.org/abs/1804.00633) as our quantum model. The figure below shows an example CCQC circuit with 7 qubits. The data (the classical embedding from the language model) is loaded into the circuit as the initial state via [amplitude encoding](https://arxiv.org/abs/1803.07128). The initial state is followed by two entanglement layers. The first entangles each qubit with its nearest neighbor, and the second entangles each qubit with its third-nearest neighbor. A rotation gate is then applied to the first qubit. Finally, only the first qubit is measured, and the classification is based solely on this measurement: if the measured value is greater than 0, the model predicts positive sentiment (+1); otherwise it predicts negative sentiment (-1).


<img src="ccqc_circuit.png" alt="Image of quantum circuit" width="600" /> 
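To make the data-loading step and the decision rule concrete, here is a minimal sketch in plain Python. The helper names (`num_qubits_for`, `prepare_amplitudes`, `predict_label`) are hypothetical and do not come from the algorithm script. The sketch assumes that amplitude encoding needs a unit-norm vector whose length is a power of two, which is why a 512-dimensional embedding fits on 9 qubits.

```python
import math


def num_qubits_for(length):
    """Smallest number of qubits whose 2**n amplitudes can hold `length` values."""
    return max(1, (length - 1).bit_length())


def prepare_amplitudes(vector):
    """Zero-pad to a power-of-two length and rescale to unit Euclidean norm."""
    n = num_qubits_for(len(vector))
    padded = list(vector) + [0.0] * (2**n - len(vector))
    norm = math.sqrt(sum(x * x for x in padded))
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return [x / norm for x in padded]


def predict_label(expectation_value):
    """Decision rule from the text: a value > 0 means +1, otherwise -1."""
    return 1 if expectation_value > 0 else -1


# Stand-in for a 512-dimensional sentence embedding (not real data).
embedding = [0.5] * 512
state = prepare_amplitudes(embedding)
print("qubits needed:", num_qubits_for(len(embedding)))
print("squared norm:", sum(a * a for a in state))
```

In a PennyLane circuit the normalization is usually delegated to the state-preparation template, so this helper is only meant to show what the encoding requires of the input.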

We use PennyLane as the machine learning framework. To run on devices hosted by Amazon Braket, we set the device name to "braket.aws.qubit". In the algorithm script, we do not have to specify the device ARN explicitly. We can read it from <code>os.environ["AMZN_BRAKET_DEVICE_ARN"]</code>, an environment variable that is populated from the device given in the orchestration script. For details about the options of Braket devices in <code>qml.device</code>, see [Amazon Braket-PennyLane Plugin](https://github.com/aws/amazon-braket-pennylane-plugin-python).
<pre><code>
import os

import pennylane as qml
from pennylane import numpy as np

# Excerpt from the CCQC class: the device ARN is injected by the hybrid job.
dev = qml.device(
    "braket.aws.qubit",
    device_arn=os.environ["AMZN_BRAKET_DEVICE_ARN"],
    wires=self.nwires,
    s3_destination_folder=None,
)

@qml.qnode(dev)
def circuit(*weights, features=np.zeros(2**nwires)):
    # components of the CCQC circuit
</code></pre>
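When running the algorithm script outside a hybrid job, for example during a quick local experiment, the environment variable will not be set. The sketch below shows one way to fall back to a default device. The helper name `resolve_device_arn` and the choice of SV1 as the default are assumptions for illustration, not part of the algorithm script.

```python
import os

# Default used only when the variable is absent (an assumption for this sketch).
SV1_ARN = "arn:aws:braket:::device/quantum-simulator/amazon/sv1"


def resolve_device_arn(default=SV1_ARN):
    """Return the ARN injected by the hybrid job, or `default` if it is not set."""
    return os.environ.get("AMZN_BRAKET_DEVICE_ARN", default)


print(resolve_device_arn())
```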
The quantum model is packaged in a CCQC class in the [algorithm script](algorithm_script.py).
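As a rough illustration of the two entanglement layers described above, the qubit pairs can be enumerated in plain Python. This sketch assumes that the pattern wraps around cyclically and that qubit `i` is paired with qubit `i + reach`; the helper `entangling_pairs` is hypothetical, and the CCQC class in the algorithm script may order or orient the gates differently.

```python
def entangling_pairs(n_qubits, reach):
    """Pairs (i, (i + reach) mod n_qubits), one pair per qubit."""
    return [(i, (i + reach) % n_qubits) for i in range(n_qubits)]


# The 7-qubit example from the figure: nearest and third-nearest neighbors.
layer_1 = entangling_pairs(7, 1)
layer_2 = entangling_pairs(7, 3)
print(layer_1)  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 0)]
print(layer_2)  # [(0, 3), (1, 4), (2, 5), (3, 6), (4, 0), (5, 1), (6, 2)]
```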

### Monitor metrics and Record Results
We can monitor the progress of the hybrid job in the Amazon Braket console. <code>log_metric</code> records the metrics so that we can view the training progress in the "Monitor" console tab. <code>save_job_result</code> allows us to view the result in the console and in the orchestration script.
<pre><code>
from braket.jobs import save_job_result
from braket.jobs.metrics import log_metric

# Record the cost at iteration i so it appears in the "Monitor" tab.
log_metric(
    metric_name="Cost",
    value=cost,
    iteration_number=i,
)

# Convert the trained weights to plain Python lists before saving them.
weights = [weights.tolist()]
save_job_result({"weights": weights})
</code></pre>
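The sketch below shows where these calls would typically sit in a training loop. It is a toy example: the cost is a simple quadratic rather than the classifier's loss, the `train` function is hypothetical, and the fallback `log_metric` is only a stand-in so the snippet also runs where the Braket SDK is not installed. The final `json.dumps` call illustrates why the weights are converted to plain lists, under the assumption that job results are stored as JSON-like data.

```python
import json

try:
    from braket.jobs.metrics import log_metric
except ImportError:
    # Stand-in for environments without the Braket SDK (not the SDK's code).
    def log_metric(metric_name, value, iteration_number=None):
        print(f"{metric_name}={value} iteration={iteration_number}")


def train(weights, steps=5, lr=0.25):
    """Toy gradient descent on cost = sum(w**2), logging the cost each step."""
    history = []
    for i in range(steps):
        cost = sum(w * w for w in weights)
        log_metric(metric_name="Cost", value=cost, iteration_number=i)
        history.append(cost)
        # Gradient of sum(w**2) with respect to w is 2 * w.
        weights = [w - lr * 2 * w for w in weights]
    return weights, history


final_weights, history = train([1.0, -2.0])

# Plain lists of floats serialize cleanly; NumPy arrays would need .tolist().
payload = json.dumps({"weights": final_weights})
print(payload)
```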

## 2 Prepare Custom Container
Amazon Braket Hybrid Jobs provides three pre-built containers for different use cases. If your application is not supported by these pre-built containers, you have the option to build your own container. In this notebook, the spaCy package is not supported by any of the three containers, so we need to create our own!

### Preparation 1: Docker 
To build and upload your custom container, you must have Docker installed.

### Preparation 2: Dockerfile
The Dockerfile defines the environment and the software in the container. We can start with the base Dockerfile of Amazon Braket Hybrid Jobs as an example and add the packages we need. For our quantum text classifier, we only need to add the following lines to the example Dockerfile. The completed Dockerfile for this exercise is [here](dockerfile).

<pre><code>
RUN ${PIP} install PennyLane==0.16.0 \
                   spacy==3.1.3 \
                   spacy_sentence_bert==0.1.2
</code></pre>
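Once the image is built, it can be useful to confirm from inside the container that the pinned versions were actually installed. The following is a minimal sketch using only the standard library. The `check_pins` helper is hypothetical, and how distribution names are matched (underscores versus hyphens) may depend on the Python version.

```python
from importlib import metadata

# Names and versions as written in the Dockerfile lines above.
PINNED = {
    "PennyLane": "0.16.0",
    "spacy": "3.1.3",
    "spacy_sentence_bert": "0.1.2",
}


def check_pins(pins):
    """Map each package to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (found, found == wanted)
    return report


for name, (found, ok) in check_pins(PINNED).items():
    print(f"{name}: installed={found} matches_pin={ok}")
```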

### Preparation 3: Initial Script
The initial script contains the code that runs when your container starts. For this exercise, we can directly use braket_container.py from the example code without modification. The initial script for this exercise is [here](qml_source/braket_container.py).

### Preparation 4: Create ECR
Follow the [instructions](https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-create.html) to create a "private" container repository. For this exercise, we name our repository "my-qtc" (my quantum text classifier).

### Action 1: Login to AWS CLI and Docker
If you haven't already, follow the [instructions](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html) to configure your credentials in the AWS CLI. Then, run the following command to log in to Docker. Replace every [xxx] placeholder below with your own values. You will see "Login Succeeded" when it's done.
<pre><code>
aws ecr get-login-password --region [your_region] | docker login --username AWS --password-stdin [aws_account_id].dkr.ecr.[your_region].amazonaws.com
</code></pre>
If your terminal does not support interactive login, you can also run the following line to log in.
<pre><code>
docker login -u AWS -p $(aws ecr get-login-password --region [your_region]) [aws_account_id].dkr.ecr.[your_region].amazonaws.com
</code></pre>

### Action 2: Build and push the image
Go to the folder that contains your Dockerfile and your initial script (braket_container.py in this case). Run the commands below to build the image and push it to your ECR repository. Remember to replace every [xxx] placeholder in the commands with your own values. When the push completes, your terminal will show that all layers have been pushed, and a new image will appear in your ECR console under the "my-qtc" repository. If you run into a memory error when building the image because of the size of the language model, increase the memory limit in Docker.
<pre><code>
docker build -t dockerfile .
docker tag dockerfile:latest [aws_account_id].dkr.ecr.[your_region].amazonaws.com/my-qtc:latest
docker push [aws_account_id].dkr.ecr.[your_region].amazonaws.com/my-qtc:latest
</code></pre>
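The same image URI is needed again in the orchestration script, so it can help to assemble it in one place. Below is a small sketch that follows the URI pattern used in the commands above. The `ecr_image_uri` helper and the example account ID are made up for illustration.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build an ECR image URI of the form used in the docker commands above."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12-digit numbers")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder account ID and region; replace with your own values.
image_uri = ecr_image_uri("123456789012", "us-west-2", "my-qtc")
print(image_uri)  # 123456789012.dkr.ecr.us-west-2.amazonaws.com/my-qtc:latest
```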

## 3 Orchestration Script

### Local Job
Before submitting a job to AWS, you can test it locally with <code>LocalQuantumJob</code>. This helps with debugging and configuring the environment. Remember to provide the container we just created via the <code>image_uri</code> keyword.

In [4]:
from braket.jobs.local.local_job import LocalQuantumJob

image_uri = "[aws_account_id].dkr.ecr.[your_region].amazonaws.com/my-qtc:latest"

job = LocalQuantumJob.create(
    "arn:aws:braket:::device/quantum-simulator/amazon/sv1",
    source_module="qml_source",
    entry_point="qml_source.algorithm_script:main",
    wait_until_complete=True,
    image_uri = image_uri,
    job_name = "my-qtc-job"
)

### Running Hybrid Job in AWS
Now that we have prepared the algorithm script and the container for the job, we can submit the hybrid job to AWS using <code>AwsQuantumJob</code>. Remember to provide the container we just created via the <code>image_uri</code> keyword.

In [5]:
from braket.aws import AwsQuantumJob

image_uri = "[aws_account_id].dkr.ecr.[your_region].amazonaws.com/my-qtc:latest"

job = AwsQuantumJob.create(
    "arn:aws:braket:::device/quantum-simulator/amazon/sv1",
    source_module="qml_source",
    entry_point="qml_source.algorithm_script:main",
    wait_until_complete=True,
    image_uri = image_uri,
    job_name = "my-qtc-job"
)
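With <code>wait_until_complete=True</code>, the call above blocks until the job finishes. If a job is instead submitted without waiting, its status can be polled. The sketch below assumes the job object exposes a <code>state()</code> method returning strings such as "COMPLETED", "FAILED", or "CANCELLED", which is our understanding of the Braket SDK and should be checked against its documentation. `wait_for_job` and `FakeJob` are hypothetical names, and `FakeJob` exists only so the example runs without AWS access.

```python
import time

# States after which a job no longer changes (assumed names, see lead-in).
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(job, poll_seconds=30, max_polls=None, sleep=time.sleep):
    """Poll job.state() until a terminal state or until max_polls is reached."""
    polls = 0
    while True:
        state = job.state()
        if state in TERMINAL_STATES:
            return state
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return state
        sleep(poll_seconds)


class FakeJob:
    """Test double that replays a fixed sequence of states."""

    def __init__(self, states):
        self._states = iter(states)

    def state(self):
        return next(self._states)


# The no-op sleep keeps the demonstration instantaneous.
final_state = wait_for_job(
    FakeJob(["QUEUED", "RUNNING", "COMPLETED"]), sleep=lambda seconds: None
)
print(final_state)
```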

## 4 Evaluate Results
Now that the training is completed, we can evaluate how our quantum model performs. First, we initialize the CCQC model.

In [6]:
from qml_source.algorithm_script import CCQC

qml_model = CCQC(nwires = 9)

We then retrieve the trained weights from <code>job.result()</code>. If you lose the <code>job</code> variable, you can always recover it from its ARN, which can be found in your console.

In [7]:
from braket.aws import AwsQuantumJob

job_arn = "your-job-arn"
job = AwsQuantumJob(job_arn)
weights = job.result()['weights']

Using the trained weights, we can make a prediction for each sentence with the <code>predict_fun</code> method of our model. See the [algorithm script](algorithm_script.py) for its definition.

In [8]:
for i in range(4):
    print(banana_string[i])
    pred = qml_model.predict_fun(*weights, data=data[i])
    print("label: {}  predict:{}".format(label[i], pred))
    print()

I eat a banana everyday.
label: 1  predict:1

Banana is not for her.
label: -1  predict:-1

Banana shake is delicious.
label: 1  predict:1

How can you like to eat bananas?
label: -1  predict:-1
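The per-sentence comparison above can be condensed into a single accuracy number. Here is a minimal sketch with a hypothetical `accuracy` helper. The predictions are copied from the output shown above rather than recomputed, and with only four training sentences the figure says little about how the model generalizes.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match their labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


labels = [1, -1, 1, -1]
predictions = [1, -1, 1, -1]  # taken from the printed output above
print("training accuracy:", accuracy(labels, predictions))
```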



We can also test our quantum model on a sentence it has not seen.

In [9]:
test_string = "A banana a day keeps the doctor away."
test_data = nlp(test_string).vector
test_label = 1
print(test_string)
pred = qml_model.predict_fun(*weights, data=test_data)
print("label: {}  predict:{}".format(test_label, pred))

A banana a day keeps the doctor away.
label: 1  predict:1

