# Serverless Deep Learning

We'll deploy the clothes classification model we trained previously.


### AWS Lambda

* Intro to AWS Lambda
* Serverless vs. serverful

We only have to write the function; we don't need to think about the infrastructure.

If there are no requests, we don't pay anything: we can create the function, and if we never invoke it, it will not cost us money.

`lambda_function.py`

In [1]:
import json

def lambda_handler(event, context):
    print("parameters:", event)
    url = event['url']
    return { "prediction": "pants" }
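Because the handler is an ordinary Python function, it can be exercised locally by passing a plain dict as the event. A minimal sketch: the event below is illustrative test data (the URL is a placeholder), and `None` stands in for the Lambda context object.

```python
def lambda_handler(event, context):
    # Same stub as above: read the image URL and return a hard-coded prediction
    print("parameters:", event)
    url = event['url']
    return {"prediction": "pants"}


# Hypothetical test event; the stub never downloads this URL
event = {"url": "https://example.com/pants.jpg"}
result = lambda_handler(event, None)
print(result)
```

A missing `url` key raises `KeyError` here, which is roughly what a malformed request would trigger in the deployed function.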

### TensorFlow Lite

* Why not TensorFlow
* Converting the model
* Using the TF-Lite model for making predictions

Full TensorFlow is too large for this use case; TensorFlow Lite is a smaller version of it.

__Reasons__:

* AWS Lambda limits (the zip package used to be limited to 50 MB; this is no longer the main reason, because Docker-based functions can be up to 10 GB)
* Larger image = more $$$ for storage
* Larger image = more time to initialize the image
* TensorFlow is slow to import and has a bigger RAM footprint

TensorFlow Lite focuses only on inference:

* Inference is when we call `model.predict(X)`


In [2]:
import tensorflow as tf
from tensorflow import keras

In [None]:
# Just checking the version
tf.__version__

'2.18.0'

In [17]:
model = keras.models.load_model('clothing-model.keras')

In [18]:
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.applications.xception import preprocess_input

In [19]:
img = load_img('pants.jpg', target_size=(299, 299))

In [20]:
import numpy as np

In [21]:
x = np.array(img)
X = np.array([x])

X = preprocess_input(X)

In [22]:
X.shape

(1, 299, 299, 3)

In [23]:
preds = model.predict(X)

1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 502ms/step


In [24]:
preds

array([[-2.4649577, -5.283191 , -6.699933 , -4.505825 , 13.79345  ,
        -6.163595 , -4.2706175,  3.0697412, -3.403578 , -8.285092 ]],
      dtype=float32)

In [25]:
classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants', 'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']

In [26]:
dict(zip(classes, preds[0]))

{'dress': np.float32(-2.4649577),
 'hat': np.float32(-5.283191),
 'longsleeve': np.float32(-6.699933),
 'outwear': np.float32(-4.505825),
 'pants': np.float32(13.79345),
 'shirt': np.float32(-6.163595),
 'shoes': np.float32(-4.2706175),
 'shorts': np.float32(3.0697412),
 'skirt': np.float32(-3.403578),
 't-shirt': np.float32(-8.285092)}
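The values above are not probabilities: several are negative and they do not sum to 1, which suggests the model returns raw scores (logits). Assuming that is the case, a softmax turns them into probabilities. A minimal sketch in plain Python, with the scores copied from the output above:

```python
import math

classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants',
           'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']

# Scores copied from the model output above
scores = [-2.4649577, -5.283191, -6.699933, -4.505825, 13.79345,
          -6.163595, -4.2706175, 3.0697412, -3.403578, -8.285092]


def softmax(values):
    # Subtract the maximum first for numerical stability
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(scores)
best_class, best_prob = max(zip(classes, probs), key=lambda pair: pair[1])
print(best_class, round(best_prob, 4))
```

For picking the top class alone, the softmax is optional, since it does not change the ordering of the scores.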

#### Convert Keras to TF-Lite

In [27]:
converter = tf.lite.TFLiteConverter.from_keras_model(model)

tflite_model = converter.convert()

with open('clothing-model.tflite', 'wb') as f_out:
    f_out.write(tflite_model)

INFO:tensorflow:Assets written to: /var/folders/km/dql4gckx6ps4lrl0mdrhnsbh0000gn/T/tmplkr1552n/assets



Saved artifact at '/var/folders/km/dql4gckx6ps4lrl0mdrhnsbh0000gn/T/tmplkr1552n'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 299, 299, 3), dtype=tf.float32, name='input_layer_75')
Output Type:
  TensorSpec(shape=(None, 10), dtype=tf.float32, name=None)
Captures:
  5602721424: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604507920: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604508112: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5602719120: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604508688: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604509072: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604510032: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604508880: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604510416: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604509648: TensorSpec(shape=(), dtype=tf.resource, name=None)
  5604509840: Tenso

W0000 00:00:1732944468.879418 7355263 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1732944468.879472 7355263 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
2024-11-30 00:27:48.879954: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: /var/folders/km/dql4gckx6ps4lrl0mdrhnsbh0000gn/T/tmplkr1552n
2024-11-30 00:27:48.886960: I tensorflow/cc/saved_model/reader.cc:52] Reading meta graph with tags { serve }
2024-11-30 00:27:48.886976: I tensorflow/cc/saved_model/reader.cc:147] Reading SavedModel debug info (if present) from: /var/folders/km/dql4gckx6ps4lrl0mdrhnsbh0000gn/T/tmplkr1552n
I0000 00:00:1732944468.956068 7355263 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
2024-11-30 00:27:48.968971: I tensorflow/cc/saved_model/loader.cc:236] Restoring SavedModel bundle.
2024-11-30 00:27:49.505972: I tensorflow/cc/saved_model/loader.cc:220] Running initialization op on SavedModel bundle at path: /var/folder
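Since package size is one of the reasons for converting, it can be worth checking what the conversion produced on disk. A small sketch; `file_size_mb` is a hypothetical helper, and the loop skips any file that is not present in the working directory:

```python
import os


def file_size_mb(path):
    # Size on disk in megabytes (1 MB = 1024 * 1024 bytes here)
    return os.path.getsize(path) / (1024 * 1024)


# File names taken from the cells above; missing files are skipped
for path in ['clothing-model.keras', 'clothing-model.tflite']:
    if os.path.exists(path):
        print(f'{path}: {file_size_mb(path):.1f} MB')
```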

#### Manually setting input and output indexes

In [28]:
import tensorflow.lite as tflite

In [29]:
interpreter = tflite.Interpreter(model_path='clothing-model.tflite')
interpreter.allocate_tensors()

INFO: Created TensorFlow Lite XNNPACK delegate for CPU.


In [30]:
interpreter.get_input_details()

[{'name': 'serving_default_input_layer_75:0',
  'index': 0,
  'shape': array([  1, 299, 299,   3], dtype=int32),
  'shape_signature': array([ -1, 299, 299,   3], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}}]

In [31]:
input_index = interpreter.get_input_details()[0]['index']

In [32]:
input_index

0

In [35]:
output_index = interpreter.get_output_details()[0]['index']

In [36]:
output_index

229

#### Initializing input, invoking computations, fetching predictions

In [37]:
interpreter.set_tensor(input_index, X)

In [38]:
interpreter.invoke()

In [40]:
preds = interpreter.get_tensor(output_index)

In [41]:
dict(zip(classes, preds[0]))

{'dress': np.float32(-2.4649622),
 'hat': np.float32(-5.2831907),
 'longsleeve': np.float32(-6.6999326),
 'outwear': np.float32(-4.505824),
 'pants': np.float32(13.7934475),
 'shirt': np.float32(-6.163595),
 'shoes': np.float32(-4.2706175),
 'shorts': np.float32(3.0697498),
 'skirt': np.float32(-3.4035814),
 't-shirt': np.float32(-8.285092)}
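The TF-Lite scores differ from the Keras ones only in the last few digits. A quick way to confirm this is to compare the two outputs element by element. A sketch in plain Python, with both lists copied from the outputs in these notes:

```python
# Keras scores, copied from the earlier model.predict output
keras_preds = [-2.4649577, -5.283191, -6.699933, -4.505825, 13.79345,
               -6.163595, -4.2706175, 3.0697412, -3.403578, -8.285092]

# TF-Lite scores, copied from the interpreter output above
tflite_preds = [-2.4649622, -5.2831907, -6.6999326, -4.505824, 13.7934475,
                -6.163595, -4.2706175, 3.0697498, -3.4035814, -8.285092]

# Largest element-wise gap between the two models
max_diff = max(abs(a - b) for a, b in zip(keras_preds, tflite_preds))

# Both models should agree on the winning class index
same_top = (keras_preds.index(max(keras_preds))
            == tflite_preds.index(max(tflite_preds)))

print(f'max absolute difference: {max_diff:.2e}')
print('same top class:', same_top)
```

Differences of this size are expected float32 rounding noise rather than a sign of a broken conversion.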

#### Converting TensorFlow code to TensorFlow Lite

In [42]:
from PIL import Image

In [None]:
# Instead of using load_img
# img = load_img('pants.jpg', target_size=(299, 299))
with Image.open('pants.jpg') as img:
    img = img.resize((299, 299), Image.NEAREST)

In [None]:
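# This is what the Xception preprocess_input function does: scale pixels from [0, 255] to [-1, 1]
# The operations are in-place, so x must already be a float array
def preprocess_input(x):
    x /= 127.5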
    x -= 1.
    return x
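To see what this scaling does to individual pixel values, here is the same formula applied to single numbers. `scale_pixel` is a hypothetical scalar version written only for illustration; the function above works on whole arrays.

```python
def scale_pixel(value):
    # Same arithmetic as preprocess_input, for one pixel value
    return value / 127.5 - 1.0


for v in (0, 127.5, 255):
    print(v, '->', scale_pixel(v))
```

The darkest pixel (0) maps to -1, the brightest (255) maps to 1, and the midpoint maps to 0.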

In [46]:
x = np.array(img, dtype='float32')
X = np.array([x])
X = preprocess_input(X)

In [47]:
interpreter.set_tensor(input_index, X)
interpreter.invoke()
preds = interpreter.get_tensor(output_index)

In [48]:
classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants', 'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']
dict(zip(classes, preds[0]))

{'dress': np.float32(-2.4649622),
 'hat': np.float32(-5.2831907),
 'longsleeve': np.float32(-6.6999326),
 'outwear': np.float32(-4.505824),
 'pants': np.float32(13.7934475),
 'shirt': np.float32(-6.163595),
 'shoes': np.float32(-4.2706175),
 'shorts': np.float32(3.0697498),
 'skirt': np.float32(-3.4035814),
 't-shirt': np.float32(-8.285092)}

#### Simpler way of doing it

```bash
pip install keras-image-helper
```

In [49]:
from keras_image_helper import create_preprocessor

In [50]:
preprocessor = create_preprocessor('xception', target_size=(299, 299))

In [51]:
# preprocessor.from_url
# preprocessor.from_path
url = 'http://bit.ly/mlbookcamp-pants'
X = preprocessor.from_url(url)

In [52]:
interpreter.set_tensor(input_index, X)
interpreter.invoke()
preds = interpreter.get_tensor(output_index)

In [53]:
classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants', 'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']
dict(zip(classes, preds[0]))

{'dress': np.float32(-2.4649622),
 'hat': np.float32(-5.2831907),
 'longsleeve': np.float32(-6.6999326),
 'outwear': np.float32(-4.505824),
 'pants': np.float32(13.7934475),
 'shirt': np.float32(-6.163595),
 'shoes': np.float32(-4.2706175),
 'shorts': np.float32(3.0697498),
 'skirt': np.float32(-3.4035814),
 't-shirt': np.float32(-8.285092)}

#### Putting everything together

```bash
pip install ai-edge-litert
```

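This should be run with Python 3.11. With `ai-edge-litert` installed, the interpreter can also be imported as `from ai_edge_litert.interpreter import Interpreter`, which avoids depending on the full TensorFlow package.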

In [55]:
import tensorflow.lite as tflite

from keras_image_helper import create_preprocessor

interpreter = tflite.Interpreter(model_path='clothing-model.tflite')
interpreter.allocate_tensors()

input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

preprocessor = create_preprocessor('xception', target_size=(299, 299))

url = 'http://bit.ly/mlbookcamp-pants'
X = preprocessor.from_url(url)

interpreter.set_tensor(input_index, X)
interpreter.invoke()
preds = interpreter.get_tensor(output_index)

classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants', 'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']
dict(zip(classes, preds[0]))

{'dress': np.float32(-2.4649622),
 'hat': np.float32(-5.2831907),
 'longsleeve': np.float32(-6.6999326),
 'outwear': np.float32(-4.505824),
 'pants': np.float32(13.7934475),
 'shirt': np.float32(-6.163595),
 'shoes': np.float32(-4.2706175),
 'shorts': np.float32(3.0697498),
 'skirt': np.float32(-3.4035814),
 't-shirt': np.float32(-8.285092)}

### Preparing the Lambda code

* Moving the code from notebook to script

```bash
jupyter nbconvert --to script serverless.ipynb
mv serverless.py lambda_function.py
```

* Testing it locally

In [58]:
import lambda_function

event = { 'url': 'http://bit.ly/mlbookcamp-pants' }

lambda_function.lambda_handler(event, None)

{'dress': np.float32(-2.4649622),
 'hat': np.float32(-5.2831907),
 'longsleeve': np.float32(-6.6999326),
 'outwear': np.float32(-4.505824),
 'pants': np.float32(13.7934475),
 'shirt': np.float32(-6.163595),
 'shoes': np.float32(-4.2706175),
 'shorts': np.float32(3.0697498),
 'skirt': np.float32(-3.4035814),
 't-shirt': np.float32(-8.285092)}
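Note that the values returned above are `np.float32` objects, which the standard `json` module cannot serialize. Before returning the result from a Lambda handler, it is safer to convert them to built-in floats. A minimal sketch; `to_json_ready` is a hypothetical helper, and the three scores are a subset copied from the output above:

```python
import json


def to_json_ready(classes, scores):
    # float() turns NumPy scalars (and plain numbers) into built-in Python floats
    return {name: float(score) for name, score in zip(classes, scores)}


classes = ['dress', 'pants', 'shorts']
scores = [-2.4649622, 13.7934475, 3.0697498]

body = json.dumps(to_json_ready(classes, scores))
print(body)
```

With a NumPy array, calling `preds[0].tolist()` before zipping achieves the same conversion.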

### Preparing a Docker image

* Lambda base images

https://gallery.ecr.aws

* Preparing the Dockerfile

```Dockerfile
# https://gallery.ecr.aws/lambda/python
FROM public.ecr.aws/lambda/python:3.8

RUN pip install keras-image-helper
# RUN pip install --extra-index-url https://google-coral.github.io/py-repo/ tflite_runtime
RUN pip install https://github.com/alexeygrigorev/tflite-aws-lambda/blob/main/tflite/tflite_runtime-2.7.0-cp38-cp38-linux_x86_64.whl?raw=true

COPY clothing-model.tflite .
COPY lambda_function.py .

CMD [ "lambda_function.lambda_handler" ]

```

* Build the image

```bash
docker build --platform linux/amd64 -t clothing-model .
```

* Test the image

```bash
docker run -it --rm --platform linux/amd64 -p 8080:8080 clothing-model:latest
```

* Using the right TF-Lite wheel

https://github.com/alexeygrigorev/tflite-aws-lambda

https://github.com/alexeygrigorev/tflite-aws-lambda/blob/main/tflite/tflite_runtime-2.7.0-cp38-cp38-linux_x86_64.whl?raw=true


In [67]:
import requests

url = 'http://localhost:8080/2015-03-31/functions/function/invocations'
data = { 'url': 'http://bit.ly/mlbookcamp-pants' }

result = requests.post(url, json=data)

print(result)

<Response [404]>
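Printing the response object shows only the status code. When the status is not 200, the response body usually says more. A sketch using only the standard library; `build_request` and `invoke` are hypothetical helpers, and the actual call is left commented out because it needs the container to be running:

```python
import json
import urllib.error
import urllib.request


def build_request(endpoint, image_url):
    # JSON body in the same shape as the event the handler expects
    payload = json.dumps({'url': image_url}).encode('utf-8')
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def invoke(endpoint, image_url, timeout=30):
    # Returns (status_code, body); HTTP errors such as 404 are returned, not raised
    request = build_request(endpoint, image_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode('utf-8')
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode('utf-8')


endpoint = 'http://localhost:8080/2015-03-31/functions/function/invocations'
request = build_request(endpoint, 'http://bit.ly/mlbookcamp-pants')
print(request.get_method(), request.full_url)

# With the container started by `docker run` still running:
# status, body = invoke(endpoint, 'http://bit.ly/mlbookcamp-pants')
# print(status, body)
```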


### Creating the lambda function

* Publishing the image to AWS ECR

```bash
pip install awscli

aws configure

aws ecr create-repository --repository-name clothing-tflite-images

# aws ecr get-login --no-include-email | sed 's/[0-9a-zA-Z=]\{20,\}/PASSWORD/g'

# docker login -u AWS -p PASSWORD https://387546586014.dkr.ecr.eu-west-1.amazonaws.com

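# AWS CLI v1 only; `get-login` was removed in AWS CLI v2
$(aws ecr get-login --no-include-email)

# With AWS CLI v2, use instead:
# aws ecr get-login-password | docker login --username AWS --password-stdin 387546586014.dkr.ecr.eu-west-1.amazonaws.com

# Login Succeeded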

ACCOUNT=387546586014
REGION=eu-west-1
REGISTRY=clothing-tflite-images
PREFIX=${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/${REGISTRY}
TAG=clothing-model-xception-v4-001
REMOTE_URI=${PREFIX}:${TAG}

echo ${REMOTE_URI}

docker tag clothing-model:latest ${REMOTE_URI}

docker push ${REMOTE_URI}
```

* Creating the function

    - Lambda
    - Create function
    - Container image
    - Function name: clothing-classification
    - Container image URI: --Browse Images--
    - Click on Create Function button

* Configuring it

    - Configuration
    - General configuration
    - Edit
        - Memory: 1024 MB
        - Timeout: 30 seconds

* Testing the function from the AWS Console


* Pricing

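Cost of one invocation, reading the figures as the price per millisecond at 1024 MB (in USD) times a duration of 2000 ms:

0.0000000167 * 2000 = 0.0000334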
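As a sanity check, here is the same arithmetic in Python, extended to a batch of requests. The units (USD per millisecond, milliseconds per invocation) and the 10,000-request volume are assumptions made for illustration; check the current AWS pricing page for actual rates.

```python
PRICE_PER_MS_USD = 0.0000000167  # figure from the notes; assumed USD per ms at 1024 MB
DURATION_MS = 2000               # assumed duration of one invocation, in ms
INVOCATIONS = 10_000             # hypothetical request volume

cost_per_invocation = PRICE_PER_MS_USD * DURATION_MS
cost_for_batch = cost_per_invocation * INVOCATIONS

print(f'{cost_per_invocation:.7f} USD per invocation')
print(f'{cost_for_batch:.3f} USD per {INVOCATIONS} invocations')
```

This covers compute time only; per-request charges and image storage are not included in the sketch.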


### API Gateway: exposing the lambda function

* Creating and configuring the gateway

1. API Gateway
    1. Create API
    2. REST API
        1. Click Build
        2. API name: clothes-classification
        3. Click "Create API"
        4. Actions -> Create Resource
            1. Resource Name: "predict"
            2. Resource Path: "/predict"
            3. Click "Create Resource"
        5. For the resource
            1. Create POST Method
                1. Integration type: Lambda Function
                2. Lambda Region: us-east-1 (use the region where the function was created)
                3. Lambda Function: clothing-classification
        6. Actions: Deploy API
            1. Deployment stage: [New Stage]
            2. Stage name: test
            3. Click: "Deploy"


Test with url "https://rpa3mf7j86.execute-api.eu-west-1.amazonaws.com/test/predict"

### Summary

* AWS Lambda is a way of deploying models without having to worry about servers
* TensorFlow Lite is a lightweight alternative to TensorFlow that focuses only on inference
* To deploy your code, package it in a Docker container
* Expose the lambda function via API Gateway

### Deploying BentoML to AWS Lambda with bentoctl

```bash
# Build our bento 
bentoml build

# https://github.com/bentoml/bentoctl

# Make sure we are connected to aws
# aws configure
# export AWS_PROFILE=kasteion
aws s3 ls

# Install Bentoctl
pip install bentoctl

# https://github.com/bentoml/bentoctl/aws-lambda-deployment

# Install the AWS Lambda operator
bentoctl operator install aws-lambda

# Initialize deployment with bentoctl
mkdir deployment
cd deployment

bentoctl init
# name: credit-risk-mlzoomcamp
# operator:
#   name: aws-lambda
# template: terraform
# spec:
#   region: us-west-1
#   timeout: 10
#   memory_size: 512

# Install Terraform
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Build command: Build lambda image & push to remote repo for AWS
# run with --dryrun to only build the image to test locally 
bentoctl build -b credit_ristk_classifier:yq3a6zdu62jrydu5 -f deployment_config.yaml

# Init terraform state file
terraform init

# Check plan
terraform plan --var-file=bentoctl.tfvars

# Apply infrastructure changes
terraform apply --var-file=bentoctl.tfvars -auto-approve

# To take down the infrastructure from the AWS account
bentoctl destroy -f deployment_config.yaml
