# Model Deployment

__What we have done so far__:
- We have trained our best deep learning model with TensorFlow and saved it in the h5 file format.
- Typically, we do not put this setup into production as is. The full Python TensorFlow package is optimized for training; it is too large and not optimized for serving predictions. Hence, we need something else for production.
- Instead, we can use TensorFlow Serving, TensorFlow Lite, AWS Lambda, etc. In this case, we have used TensorFlow Lite to convert our h5 model to the tflite format. This decreases the model size drastically while maintaining its predictive performance.
- In this notebook, we will deploy the model using Docker, AWS Lambda, and AWS API Gateway.

__AWS Lambda__:

By using AWS Lambda, we can run code without thinking about servers, that is, __serverless__. We don't need to rent an instance; we just define a function and specify its input and output. We only pay for the time the function actually runs. For example, if our function needs 2 seconds to run, we only pay for those 2 seconds.

The good news is that AWS Lambda supports __Docker__. Thus, before we deploy the model on AWS, we can run the model locally on our machine. 
Here are the steps you can follow:

## Lambda Function

First, we need to create `lambda_function.py`. Both deployment options, AWS Lambda and local Docker, need this file to run the deep learning model.

### Header

In [1]:
# in AWS Lambda, we need to use this import below
# import tflite_runtime.interpreter as tflite
import tensorflow.lite as tflite
import numpy as np
from urllib.request import urlopen
from PIL import Image

# Create an interpreter interface for any model in TFLite
interpreter = tflite.Interpreter(model_path='clothing_classifier.tflite')
interpreter.allocate_tensors()

# Get a list of input details from the model
input_details = interpreter.get_input_details()
input_index = input_details[0]['index']

# Get a list of output details from the model
output_details = interpreter.get_output_details()
output_index = output_details[0]['index']

### Func: `predict`

In [2]:
def predict(X):
    # set the value of the input tensor
    interpreter.set_tensor(input_index, X)
    interpreter.invoke()

    # Get the value of the output tensor
    preds = interpreter.get_tensor(output_index)
    
    return preds[0]

### Func: `decode_predictions`

In [3]:
labels = [
    'dress',
    'hat',
    'longsleeve',
    'outwear',
    'pants',
    'shirt',
    'shoes',
    'short',
    'skirt',
    't-shirt'
]

def decode_predictions(pred):
    result = {label: float(score) for label, score in zip(labels, pred)}
    return result

### Func: `preprocessor`

In [4]:
def preprocessor(img_url):
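    # load the image using PIL module and force 3 RGB channels
    # (grayscale or RGBA images would otherwise not match the model input shape)
    img = Image.open(urlopen(img_url)).convert('RGB')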
    
    # Specify the image target size
    img = img.resize((150, 150))
    
    # Turn the image into a 4D-array
    X = np.expand_dims(img, axis =0)
    
    # Normalize the image
    X = X/255.0
    
    # Turn the image into a Numpy array with float32 data type
    X = X.astype('float32')
    
    return X

### Func: `lambda_handler`

In [5]:
def lambda_handler(event, context):
    # Obtain the image location
    url = event['url']
    
    # Preprocess the image
    X = preprocessor(url)
    
    # Make prediction
    preds = predict(X)
    
    # Obtain the result
    results = decode_predictions(preds)
    return results

### Run the Function

In [6]:
# Simulate the event (trigger)
event = {'url': 'https://tinyurl.com/clothes-t-shirt'} 

# Call the lambda_handler
results = lambda_handler(event, context=None)

# See the prediction result
results

{'dress': 4.461161218216603e-09,
 'hat': 2.6467307959556798e-14,
 'longsleeve': 0.02351023070514202,
 'outwear': 3.530597038683969e-13,
 'pants': 9.088156659523006e-13,
 'shirt': 3.83664740866152e-07,
 'shoes': 1.397734902410014e-19,
 'short': 1.0869638522592595e-11,
 'skirt': 2.561472355836619e-13,
 't-shirt': 0.9764893651008606}
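
Most of the scores above are vanishingly small, so the raw dictionary is hard to scan. A minimal sketch of ranking the classes by score is shown below. `top_k` is a hypothetical helper that is not part of the original `lambda_function.py`, and the scores are a rounded subset of the output above.

```python
def top_k(scores, k=3):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Scores copied from the prediction output above (subset, rounded)
results = {
    'dress': 4.46e-09,
    'longsleeve': 0.0235,
    'shirt': 3.84e-07,
    't-shirt': 0.9765,
}

for label, score in top_k(results, k=2):
    print(f'{label}: {score:.4f}')
# t-shirt: 0.9765
# longsleeve: 0.0235
```

Sorting the `(label, score)` pairs keeps each label attached to its score, so the result does not depend on the order of the keys in the dictionary.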

### Put Everything Together: `lambda_function.py`

Finally, let's create a file that stores all the functions needed to run the app: defining the interpreter, receiving the input image, preprocessing the image, and using the saved model to make the prediction.

In [None]:
# in AWS Lambda, we need to use this import below
import tflite_runtime.interpreter as tflite
import numpy as np
from urllib.request import urlopen
from PIL import Image

# Create an interpreter interface for any model in TFLite
interpreter = tflite.Interpreter(model_path='clothing_classifier.tflite')
interpreter.allocate_tensors()

# Get a list of input details from the model
input_details = interpreter.get_input_details()
input_index = input_details[0]['index']

# Get a list of output details from the model
output_details = interpreter.get_output_details()
output_index = output_details[0]['index']

def predict(X):
    # set the value of the input tensor
    interpreter.set_tensor(input_index, X)
    interpreter.invoke()

    # Get the value of the output tensor
    preds = interpreter.get_tensor(output_index)
    
    return preds[0]

labels = [
    'dress',
    'hat',
    'longsleeve',
    'outwear',
    'pants',
    'shirt',
    'shoes',
    'short',
    'skirt',
    't-shirt'
]

def decode_predictions(pred):
    result = {label: float(score) for label, score in zip(labels, pred)}
    return result

def preprocessor(img_url):
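    # load the image using PIL module and force 3 RGB channels
    # (grayscale or RGBA images would otherwise not match the model input shape)
    img = Image.open(urlopen(img_url)).convert('RGB')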
    
    # Specify the image target size
    img = img.resize((150, 150))
    
    # Turn the image into a 4D-array
    X = np.expand_dims(img, axis =0)
    
    # Normalize the image
    X = X/255.0
    
    # Turn the image into a Numpy array with float32 data type
    X = X.astype('float32')
    
    return X

def lambda_handler(event, context):
    # Obtain the image location
    url = event['url']
    
    # Preprocess the image
    X = preprocessor(url)
    
    # Make prediction
    preds = predict(X)
    
    # Obtain the result
    results = decode_predictions(preds)
    return results
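
The handler reads `event['url']` directly, so a malformed event fails with a bare `KeyError`. The sketch below shows one possible way to validate the event first. `extract_url` is a hypothetical helper, not part of the original file, and the handling of a `body` string is an assumption that only matters if the function is later called through a proxy-style integration that wraps the payload as a JSON string.

```python
import json

def extract_url(event):
    """Return the image URL from a Lambda event, or raise ValueError."""
    # Assumption: a proxy-style integration passes the payload as a JSON string under 'body'
    if isinstance(event, dict) and isinstance(event.get('body'), str):
        try:
            event = json.loads(event['body'])
        except json.JSONDecodeError as exc:
            raise ValueError('request body is not valid JSON') from exc

    # Accept only a string that looks like an http(s) URL
    url = event.get('url') if isinstance(event, dict) else None
    if not isinstance(url, str) or not url.startswith(('http://', 'https://')):
        raise ValueError("event must contain an http(s) 'url' string")
    return url

print(extract_url({'url': 'https://tinyurl.com/clothes-t-shirt'}))
```

With such a helper, `lambda_handler` could call `extract_url(event)` instead of `event['url']` and return a clear error message when the input is wrong.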

## Deploy Locally with Docker

We just created `lambda_function.py`. Next, we want to deploy it using AWS Lambda. For that, we will use Docker: AWS Lambda supports Docker, so we can use a container image to deploy our function.

In this section, you will learn how to run the model locally using Docker within your machine. 

### `Dockerfile`

The next step is to create a Dockerfile. __What should we put in the Dockerfile?__

- __A Dockerfile__ is a recipe that puts all the dependencies you need for running the code into one single image that contains everything. 


- __A Docker image__ is a private file system just for your container. It provides all the files and code your container needs.


- This image is self-sufficient because the Dockerfile sets up everything we need:
    - upgrading pip, the Python package manager.
    - installing the Pillow library to handle image files.
    - installing the TensorFlow Lite `tflite_runtime` interpreter.
    - copying our tflite model into the Docker image.
    - copying `lambda_function.py` into the Docker image.
    
The Dockerfile below builds on the official AWS Lambda base image for Python from Amazon.

<hr>

```dockerfile
FROM public.ecr.aws/lambda/python:3.7

RUN pip3 install --upgrade pip

RUN pip3 install pillow --no-cache-dir
RUN pip3 install https://raw.githubusercontent.com/alexeygrigorev/serverless-deep-learning/master/tflite/tflite_runtime-2.2.0-cp37-cp37m-linux_x86_64.whl --no-cache-dir

COPY clothing_classifier.tflite clothing_classifier.tflite
COPY lambda_function.py lambda_function.py

CMD [ "lambda_function.lambda_handler" ]
```
<hr>

What we need to do now is build this Docker image and deploy it on AWS. Before that, we can run it locally to check that it works.

### Build the Docker Image

The following are the steps to run the application locally:

__Run the Docker daemon__. There are 2 ways to do this: 
- The first option is to open __cmd__ as __administrator__, then launch the following command: `"C:\Program Files\Docker\Docker\DockerCli.exe" -SwitchDaemon`
    
- The second option is to run __Docker Desktop__ from the Start menu and verify that Docker is in the __running__ state. 
    

__Build an image from a Dockerfile__. One important note: do not change the working directory in the Dockerfile, because the Lambda base image looks for the handler file in its default working directory.

```
$ docker build -t tf-lite-lambda .
```

- The command above will build the image from the content of the folder you are currently in, with the tag name `tf-lite-lambda`. 

### Run the Container Image

__Start a container based on the image you built in the previous step__. Running a container launches your application with private resources, securely isolated from the rest of your machine.

```
$ docker run --rm -p 8080:8080 --name clothes-classifier tf-lite-lambda
```

- The `-p` flag (short for _publish_) maps port 8080 on the host machine to port 8080 in the container. The Lambda base image serves invocations on port 8080 inside the container, and this mapping makes that port reachable from our computer.

- The `--rm` flag (short for _remove_) automatically removes the container when it exits.

- The `--name` gives a name to a new container, and `tf-lite-lambda` is the image name we use to create the container.

__Save and share your image on Docker Hub__ to enable other users to easily download and run the image on any destination machine.

```
$ docker tag tf-lite-lambda [userName]/tf-lite-lambda
$ docker push [userName]/tf-lite-lambda
```

Here are the screenshots of the results from the previous commands:

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/build%20n%20run%20the%20docker%20app.jpg'>

### `test.py`

After we run the model, we want to test it. We need to create a special file that we can call to see the results of what the model has predicted. 

The file contains:
- the complete list of categories the model can predict.
- a test image of PANTS obtained from this link: http://bit.ly/mlbookcamp-pants. We will send a request that has a key `url` and the URL of the image as its value.
- the URL of the function running on localhost inside the Docker container.
- a procedure that sends a POST request to the target URL to obtain the prediction result.
- code that parses the prediction result and shows it to the user.

In [None]:
import requests
import numpy as np

labels = [
    'dress',
    'hat',
    'longsleeve',
    'outwear',
    'pants',
    'shirt',
    'shoes',
    'short',
    'skirt',
    't-shirt'
]

data = {
    "url": "http://bit.ly/mlbookcamp-pants"
}

url ="http://localhost:8080/2015-03-31/functions/function/invocations"

results = requests.post(url, json=data).json()

print('[PREDICTION RESULT]')
print('+-------------------------------------------+')
score = []
for cat in results:
	print('+ {}: {}'.format(cat, results[cat]))
	score.append(results[cat])

best_cat = np.argmax(score)
print('+-------------------------------------------+')
print('Therefore, the model predicts the input image as {}'.format(labels[best_cat].upper()))
print()

Run the `test.py` in your CLI and see the result for yourself:


<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/app%20prediction%20result.jpg'>
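
`test.py` depends on the third-party `requests` package. If it is not installed, the same call can be sketched with the standard library only. `build_invocation` and `invoke` are hypothetical names introduced for this illustration; the endpoint is the one used in `test.py`.

```python
import json
from urllib.request import Request, urlopen

LOCAL_ENDPOINT = 'http://localhost:8080/2015-03-31/functions/function/invocations'

def build_invocation(image_url, endpoint=LOCAL_ENDPOINT):
    """Build the POST request that test.py sends, without sending it."""
    payload = json.dumps({'url': image_url}).encode('utf-8')
    return Request(
        endpoint,
        data=payload,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )

def invoke(image_url, endpoint=LOCAL_ENDPOINT, timeout=30):
    """Send the request and return the decoded JSON prediction."""
    with urlopen(build_invocation(image_url, endpoint), timeout=timeout) as response:
        return json.loads(response.read().decode('utf-8'))

request = build_invocation('http://bit.ly/mlbookcamp-pants')
print(request.get_method(), request.full_url)
# Call invoke('http://bit.ly/mlbookcamp-pants') once the container is running
```

Separating the request construction from the network call makes it possible to inspect the payload without a running container.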

## Deploy on AWS

We just deployed the model locally with Docker. Now, we can bring the same container and deploy it on AWS. AWS has everything you need to deploy your deep learning model online. For this case, we will use AWS CLI, AWS ECR, AWS Lambda, and AWS API Gateway.

### Install AWS CLI

Everything you do with AWS is an API call. Although you can do everything through the website, it is more convenient to do it from the command line. Hence, make sure you have installed AWS CLI on your local machine. https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-windows.html

### Configure Your AWS Account

To deploy the app on AWS, we need an account there. After you create an AWS IAM user, set up your Access Key ID, Secret Access Key, Default Region, and Default Output Format (commonly JSON). Once we have done this, we can make programmatic calls to AWS from the AWS CLI.

```
$ aws configure
```

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20configure.jpg'>

### Create a Repo in AWS ECR (Elastic Container Registry)

AWS ECR is a place for us to put Docker images. By running the following command, we will create a private repository to store the Docker image we have built previously.

```
$ aws ecr create-repository --repository-name lambda-images
```

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20create%20repo.jpg'>

### Publish the Image to the Repo

Now, we want to publish the image that we have built locally. The following steps are taken directly from AWS (`AWS ECR > Repositories > lambda-images > View Push Command`):

- Retrieve an authentication token and authenticate your Docker client to your registry.

```
$ aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin XXXXXXXXX474.dkr.ecr.us-east-1.amazonaws.com
```

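- Build your Docker image using the following command. You can skip this step if you already built `tf-lite-lambda` earlier, because the tag command below uses that image.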

```
$ docker build -t lambda-images .
```

- Tag your image so you can push the image to this repository.

```
$ docker tag tf-lite-lambda XXXXXXXXX474.dkr.ecr.us-east-1.amazonaws.com/lambda-images:tf-lite-lambda
```

- Run the following command to push this image to your newly created AWS repository.

```
$ docker push XXXXXXXXX474.dkr.ecr.us-east-1.amazonaws.com/lambda-images:tf-lite-lambda
```
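
The registry address used in the tag and push commands above follows a fixed pattern: account ID, region, repository, and tag. As a small illustration, the sketch below composes it from its parts. `ecr_image_uri` is a hypothetical helper, and `123456789012` is a placeholder account ID, not a real account.

```python
def ecr_image_uri(account_id, region, repository, tag='latest'):
    """Compose the registry address that `docker tag` and `docker push` expect."""
    registry = f'{account_id}.dkr.ecr.{region}.amazonaws.com'
    return f'{registry}/{repository}:{tag}'

# '123456789012' is a placeholder account ID
print(ecr_image_uri('123456789012', 'us-east-1', 'lambda-images', 'tf-lite-lambda'))
# 123456789012.dkr.ecr.us-east-1.amazonaws.com/lambda-images:tf-lite-lambda
```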

Check the pushed image on the AWS ECR web page. Make sure to copy the URL because we need it to create a Lambda Function.

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20push%20image.jpg' width="600">

### Create Lambda Function

Now, we are ready to create a Lambda Function. Go to AWS `Lambda` and click `Create function`. Choose `Container Image`.

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20create%20lambda%20func.jpg' width="800">

Give your function a unique name and fill in the Container Image URL with the image URL that you copied earlier. Leave everything else at its default and click `Create function`.

### Test the Lambda Function

You just created a Lambda function for a prediction task. However, the default configuration does not give us sufficient memory or timeout. We have a big model, and the first invocation takes some time to load everything into memory. Thus, we need to reconfigure it. Go to `Configuration` > `General Configuration`, click `Edit`, and set Memory to __512__ or __1024__ MB and Timeout to __30__ seconds. Save it.

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20RAM%20Timeout.jpg' width="600">


Next, create a test event with the following JSON:
```json
{
    "url": "https://tinyurl.com/clothes-t-shirt"
}
```
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20test%20event.jpg' width="800">

Give the new event a name, save it, and click `Test`. Then, you will see the following result:

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20test%20result.jpg' alt='taken from tfcertification.com' width="800">

One thing you need to be aware of is that with AWS Lambda, you will be charged based on the number of requests and the duration, that is, the time it takes for our code to be executed. Please refer to this [link](https://aws.amazon.com/lambda/pricing/) for more pricing info.
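
To get a feel for how requests and duration combine into a bill, here is a rough estimator. It is only a sketch: `estimate_lambda_cost` is a hypothetical helper, the default rates are illustrative assumptions that may differ by region, architecture, and date, and the free tier is ignored. Always check the pricing page for current values.

```python
def estimate_lambda_cost(requests, avg_duration_s, memory_mb,
                         price_per_gb_second=0.0000166667,
                         price_per_million_requests=0.20):
    """Rough cost in USD: compute charge (GB-seconds) plus request charge."""
    # Duration is billed in proportion to the configured memory
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = requests / 1_000_000 * price_per_million_requests
    return compute_cost + request_cost

# 100,000 predictions of 2 seconds each at 1024 MB
cost = estimate_lambda_cost(requests=100_000, avg_duration_s=2, memory_mb=1024)
print(f'Estimated cost: ${cost:.2f}')
```

Because the compute charge scales with configured memory, doubling the memory doubles the cost of each second, although a larger setting may also shorten the run time.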

### API Gateway Integration

You just tested the function and it seems to work well in making the prediction. What's left is to use it from outside (online). To do this, we need to create an API via AWS API Gateway.

__1. Create a New API__

- Visit AWS API Gateway, then choose REST API by clicking the `Build` button.
- Choose the protocol: `REST`. For __Create new API__, choose New API. Then, fill in the API Name and add a description.

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20rest%20api.jpg' width="300"><img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20api%20setting.jpg' width="600">

__2. Create a resource `predict` and a method `POST`__
- From `Actions`, choose Create Resource > fill in "predict".
- From `Actions`, choose Create Method > select `POST`
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20method%20resource.jpg' width="200">

__3. Select the Lambda Function and add some details__. 
- Click on `POST`, then make sure to write the correct name for the __Lambda Function__ and leave everything else at its default. 
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20predict%20post%20setup.jpg' width="500">

__4. Test the API__. 
- From the method execution flow chart, click `Test`.
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20test%20api.jpg' width="800">
- To test it, input the following code in the __Request Body__:
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20request%20body.jpg' width="400">
- You should see the following result in the __Response Body__:
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20response%20body.jpg' width="400">

__5. Deploy the API__
- Finally, we need to deploy the API so that it can be called from outside. From `Actions`, click `DEPLOY API`. 
<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20deploy%20api.jpg' width="400">


- Obtain the URL from the "Invoke URL" section. In this case, we have: https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test


- Open the __Postman App__ or go to [reqbin](https://reqbin.com/) to test the REST API we just created. Note that since we created `predict` as the resource for our `POST` method, we need to add `/predict` at the end of the URL. Hence, the complete URL to make an API call for making a prediction is https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test/predict. Copy and paste the link into the URL field in the app.


- Copy the following object in JSON as the body to make this POST request. Click `Send`.
```json
{
    "url": "https://tinyurl.com/clothes-t-shirt"
}
```


- You can see the prediction result as the content received after making the API call POST request.

<img src='https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20post%20request.jpg'>


- Alternatively, we can use `cURL` (short for client URL) to send the data (in this case, the URL of the t-shirt image) in a POST request to our service (in this case, the clothes image classifier) from a terminal (e.g. Git Bash).
```
$ curl -d '{"url": "https://tinyurl.com/clothes-t-shirt"}' -H "Content-Type: application/json" -X POST https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test/predict
```

- Running the command above produces this prediction result:
<img src="https://raw.githubusercontent.com/diardanoraihan/E2E_Deep_Learning/main/Visualization/aws%20post%20request%20cmd.jpg">

- Congrats, now your deep learning model is totally online and ready to help the world become a better place!
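
For reference, the endpoint used in the Postman and cURL calls above is made of four parts: the API ID, the region, the stage name, and the resource path. The sketch below composes it; `invoke_url` is a hypothetical helper introduced for illustration, and the values are the ones from this walkthrough.

```python
def invoke_url(api_id, region, stage, resource):
    """Compose the public endpoint for a resource of a deployed REST API."""
    base = f'https://{api_id}.execute-api.{region}.amazonaws.com/{stage}'
    # Strip slashes so both 'predict' and '/predict' give the same URL
    return f"{base}/{resource.strip('/')}"

print(invoke_url('xw2bv0y8mb', 'us-east-1', 'test', '/predict'))
# https://xw2bv0y8mb.execute-api.us-east-1.amazonaws.com/test/predict
```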