# Get Hugging Face pretrained model

In [10]:
!pip install transformers --upgrade --quiet

In [11]:
import transformers
import torch
from transformers import pipeline

## Get pretrained model, test and save it

In [12]:
pretrained_classifier = pipeline("sentiment-analysis")

In [13]:
pretrained_classifier([
    "I really like this place!",
    "The food was bad!"
])

[{'label': 'POSITIVE', 'score': 0.9998599290847778},
 {'label': 'NEGATIVE', 'score': 0.9997907280921936}]

In [14]:
pretrained_classifier.save_pretrained("./model/")

## Check if we can successfully load our saved model

In [15]:
local_classifier = pipeline('sentiment-analysis', model="./model/", tokenizer="./model/")

In [16]:
local_classifier([
    "I really like this place!",
    "The food was bad!"
])

[{'label': 'POSITIVE', 'score': 0.9998599290847778},
 {'label': 'NEGATIVE', 'score': 0.9997907280921936}]

The locally saved model loads and returns the same predictions as the downloaded one.

## Build container for Lambda

Let's build a container that holds our Hugging Face model and the Lambda code that handles requests and forwards the data to the model.

In [18]:
print("look at the versions for our dependencies:")
print("")
!python --version
print(f"Torch version {torch.__version__}")
print(f"Transformers version {transformers.__version__}")

look at the versions for our dependencies:

Python 3.6.13
Torch version 1.7.1
Transformers version 4.9.2


### Create requirements.txt with dependencies

In [20]:
%%writefile requirements.txt
--find-links  https://download.pytorch.org/whl/torch_stable.html 
 
torch==1.7.1+cpu
transformers==4.9.2

Overwriting requirements.txt


### Create dockerfile
https://docs.aws.amazon.com/lambda/latest/dg/images-create.html

- Start from the official Lambda Docker base image.
- Copy requirements.txt into the container and install the dependencies.
- Copy the model we saved locally in the model/ folder into the container.
- Execute the Lambda handler function on start.

In [21]:
%%writefile Dockerfile
FROM public.ecr.aws/lambda/python:3.8

COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt 

COPY ./model/   ./model/
COPY ./app/app.py   ./
CMD ["app.handler"]

Overwriting Dockerfile
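
The Dockerfile copies `app/app.py` and starts `app.handler`, but the notebook does not show that file. The following is a minimal sketch of what such a handler could look like, not the original code. It assumes an API Gateway proxy event whose `body` is a JSON string with a `data` field, as in the request sent later in this notebook. The names `load_local_pipeline` and `make_handler` are hypothetical helpers introduced for illustration.

```python
import json


def load_local_pipeline():
    """Load the sentiment pipeline from the model/ folder copied into the image."""
    # Imported lazily so this module can be imported without loading PyTorch.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model="./model/", tokenizer="./model/")


def make_handler(load_classifier):
    """Build a Lambda handler that loads the classifier once and then reuses it."""
    cache = {}

    def handler(event, context):
        # Assumption: with an API Gateway proxy event the payload is a JSON string in "body".
        try:
            payload = json.loads(event.get("body") or "{}")
        except json.JSONDecodeError:
            payload = {}
        text = payload.get("data") if isinstance(payload, dict) else None
        if not text:
            return {
                "statusCode": 400,
                "body": json.dumps({"error": 'expected JSON like {"data": "some text"}'}),
            }

        # Load the model on the first valid request and keep it for later invocations.
        if "classifier" not in cache:
            cache["classifier"] = load_classifier()
        predictions = cache["classifier"](text)

        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(predictions),
        }

    return handler


# Entry point referenced by CMD ["app.handler"] in the Dockerfile.
handler = make_handler(load_local_pipeline)
```

Passing the loader into `make_handler` keeps the request parsing testable without the model: a stub classifier can be injected locally, while the deployed `handler` loads the real pipeline on first use.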


### Build dockerfile and push to ECR

In [22]:
!pip install sagemaker-studio-image-build --quiet

Make sure the IAM role this notebook runs under has a trust relationship with CodeBuild. The setup is described here:

https://github.com/aws-samples/sagemaker-studio-image-build-cli

In [None]:
!sm-docker build . --repository huggingface-on-lambda:1.0

## Deploy Container within a Lambda Function

Create a serverless.yml file that points to the Docker image URI and digest.

The function expects a POST request on the path "prediction".

More configuration can be found in the serverless file.

In [24]:
ACCOUNT_NUMBER = ""
REGION = ""
# {ACCOUNT_NUMBER}.dkr.ecr.{REGION}.amazonaws.com/huggingface-on-lambda
image_uri= ""
# sha256:f0e2ae3aee2cceb6d93d...
image_digest = ""
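
The two empty strings can be filled in by hand from the ECR console. As an alternative, here is a minimal sketch that builds the URI from the pattern in the comment above and looks up the digest with the ECR `describe_images` call. The helper names `ecr_image_uri` and `lookup_image_digest` are hypothetical, and the default tag `1.0` is an assumption taken from the `sm-docker build` command.

```python
def ecr_image_uri(account_number, region, repository="huggingface-on-lambda"):
    """Compose the ECR repository URI following the pattern shown in the notebook."""
    return f"{account_number}.dkr.ecr.{region}.amazonaws.com/{repository}"


def lookup_image_digest(ecr_client, repository="huggingface-on-lambda", tag="1.0"):
    """Return the sha256 digest of the tagged image using an ECR client."""
    response = ecr_client.describe_images(
        repositoryName=repository,
        imageIds=[{"imageTag": tag}],
    )
    return response["imageDetails"][0]["imageDigest"]


# Usage in the notebook (assumes boto3 is installed and the role may read from ECR):
#   import boto3
#   image_uri = ecr_image_uri(ACCOUNT_NUMBER, REGION)
#   image_digest = lookup_image_digest(boto3.client("ecr", region_name=REGION))
```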

__serverless file__:

In [None]:
print(f"""
service: huggingface-on-lambda

provider:
  name: aws
  region: eu-west-1 

functions:
  huggingface:
    image: {image_uri}@{image_digest}
    # Lambda timeout of 2 minutes (API Gateway still limits HTTP requests to 30 seconds)
    timeout: 120
    # keep 1 warm Lambda instance available
    provisionedConcurrency: 1
    # our model is less than 1GB, so 1024MB is enough
    memorySize: 1024 
    events:
      - http:
          path: prediction
          method: post

""")

Open AWS CloudShell and install the Serverless Framework:

    npm install serverless

Create the serverless.yml file in CloudShell:
    
    cat > serverless.yml

Paste the contents, then press Ctrl+D on an empty line to save and close the file.
To deploy the stack, run:

    npm install --prefix ./ serverless
    node_modules/serverless/bin/serverless.js deploy
    
    
            Output:
    
            Serverless: Packaging service...
            Serverless: WARNING: Function huggingface has timeout of 120 seconds, however, 
            it's attached to API Gateway so it's automatically limited to 30 seconds.
            Serverless: Uploading CloudFormation file to S3...
            Serverless: Uploading artifacts...
            Serverless: Validating template...
            Serverless: Updating Stack...
            Serverless: Checking Stack update progress...
            ........................
            Serverless: Stack update finished...
            Service Information
            service: huggingface-on-lambda
            stage: dev
            region: eu-west-1
            stack: huggingface-on-lambda-dev
            resources: 12
            api keys:
              None
            endpoints:
              POST - https://8af9ar02gi.execute-api.eu-west-1.amazonaws.com/dev/prediction
            functions:
              huggingface: huggingface-on-lambda-dev-huggingface
            layers:
              None
            Serverless: Removing old service artifacts from S3...

If the deployment succeeds, call the endpoint:

    curl -d '{"data":"some very much wow positive text!"}' -H "Content-Type: application/json" -X POST https://8af9ar02gi.execute-api.eu-west-1.amazonaws.com/dev/prediction
            
            Output:
            
            [{"label": "POSITIVE", "score": 0.9998674392700195}]
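
The same request can be sent from Python. This is a minimal sketch using only the standard library. It assumes the endpoint accepts the JSON body used in the curl call above, and the function names `build_prediction_request` and `predict` are hypothetical.

```python
import json
import urllib.request


def build_prediction_request(endpoint_url, text):
    """Build the POST request with the same JSON body as the curl call."""
    body = json.dumps({"data": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(endpoint_url, text, timeout=30):
    """Send the text to the deployed endpoint and return the parsed JSON response."""
    request = build_prediction_request(endpoint_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (replace the URL with the endpoint printed by the deploy step):
#   predict("https://<api-id>.execute-api.<region>.amazonaws.com/dev/prediction",
#           "some very much wow positive text!")
```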