## Build a Serverless Endpoint with SageMaker
### Requirements
* packages to install with pip
  + huggingface_hub==0.1.0
  + transformers==4.12
  + boto3
  + awscli
* set up the AWS config with an access key and secret access key (for example with `aws configure`)
* import the model classes and pipeline from transformers
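
The code in the next section refers to `ACCESS_KEY`, `SECRET_KEY`, and `role` without defining them. A minimal sketch of one way to supply them, assuming they are exported as environment variables. `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard names that boto3 also reads; `SAGEMAKER_ROLE_ARN` and the helper `load_aws_settings` are hypothetical names chosen for this example:

```python
import os


def load_aws_settings(env=None):
    """Return the credentials and role ARN used by the later code.

    Raises KeyError naming every missing variable, so a misconfigured
    environment fails before any AWS call is made.
    """
    env = os.environ if env is None else env
    names = {
        "ACCESS_KEY": "AWS_ACCESS_KEY_ID",
        "SECRET_KEY": "AWS_SECRET_ACCESS_KEY",
        "role": "SAGEMAKER_ROLE_ARN",  # hypothetical variable name
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

With the variables exported, `settings = load_aws_settings()` followed by `ACCESS_KEY = settings["ACCESS_KEY"]` (and likewise for `SECRET_KEY` and `role`) keeps the secrets out of the notebook itself.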

### Python code to create a SageMaker endpoint

# import the tokenizer, model, and pipeline classes from transformers
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import TextClassificationPipeline

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
pipe("I love Amazon SageMaker Studio Lab!")

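# initialize a SageMaker client and test it by listing endpoints
# ACCESS_KEY and SECRET_KEY must already hold your AWS credentials
import boto3
sm_client = boto3.client('sagemaker',
                        region_name='us-east-1',  # same region as the container image used below
                        aws_access_key_id=ACCESS_KEY,
                        aws_secret_access_key=SECRET_KEY)
response = sm_client.list_endpoints()

# the response is a dict; the endpoints themselves are under the 'Endpoints' key
len(response['Endpoints'])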
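
`list_endpoints` returns its results one page at a time, so a single call may not show every endpoint in a busy account. A hedged sketch that walks all pages with boto3's paginator. `list_endpoint_names` is a hypothetical helper, and it accepts any client object so it can be tried without AWS access:

```python
def list_endpoint_names(sagemaker_client):
    """Return the names of all endpoints, following pagination."""
    paginator = sagemaker_client.get_paginator("list_endpoints")
    names = []
    for page in paginator.paginate():
        # each page carries its endpoints under the "Endpoints" key
        names.extend(ep["EndpointName"] for ep in page["Endpoints"])
    return names
```

Calling `len(list_endpoint_names(sm_client))` then gives the total number of endpoints rather than the size of the first page.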

# define the model name and an endpoint_config_name
import time
ml_model_name = "text-classification-hugging-face"
timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
model_name = ml_model_name + '-model' + timestamp
endpoint_config_name = ml_model_name + '-epc' + timestamp
print(model_name)
print(endpoint_config_name)
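
The timestamp suffix keeps names unique, but it also adds 20 characters. A small sketch that builds a name the same way and checks it before any API call. It assumes the naming constraint is at most 63 characters, made of alphanumerics and hyphens, starting and ending with an alphanumeric; that is my reading of the SageMaker API reference and worth verifying there. `make_resource_name` is a hypothetical helper:

```python
import re
import time

# Assumed constraint: at most 63 characters, alphanumerics and hyphens,
# starting and ending with an alphanumeric character.
_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
_MAX_NAME_LENGTH = 63


def make_resource_name(base, suffix, now=None):
    """Build '<base><suffix>-YYYY-MM-DD-HH-MM-SS' (UTC) and validate it."""
    stamp = time.strftime("-%Y-%m-%d-%H-%M-%S", time.gmtime(now))
    name = base + suffix + stamp
    if len(name) > _MAX_NAME_LENGTH or not _NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid SageMaker resource name: {name!r}")
    return name
```

For the values above, `make_resource_name(ml_model_name, '-model')` yields a 58-character name, so there is little room left for a longer base name.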

# define container_config with the Docker image ECR URI, Mode, and Environment,
# then create the model with the SageMaker client. Check in the SageMaker console that the model was created
image_uri = "763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.9-transformers4.12-cpu-py38-ubuntu20.04"

container_config = {'Image': image_uri,
                    'Mode': 'SingleModel',
                    'Environment': {
                        'HF_MODEL_ID': 'distilbert-base-uncased-finetuned-sst-2-english',
                        'HF_TASK' : 'text-classification',
                        'SAGEMAKER_CONTAINER_LOG_LEVEL' : '20',
                        'SAGEMAKER_REGION' : 'us-east-1'
                    }
                   }

# role must hold the ARN of an IAM role that SageMaker can assume
response = sm_client.create_model(
    ModelName=model_name,
    PrimaryContainer=container_config,
    ExecutionRoleArn=role,
    EnableNetworkIsolation=False
)

# create a serverless endpoint configuration
endpoint_config_response = sm_client.create_endpoint_config(
   EndpointConfigName=endpoint_config_name,
   ProductionVariants=[
        {
            "ModelName": model_name,
            "VariantName": "AllTraffic",
            "ServerlessConfig": {
                # Specify MemorySizeInMB and MaxConcurrency in the serverless config object
                "MemorySizeInMB": 4096,
                "MaxConcurrency": 10
            }
        }
    ]
)

print('Endpoint configuration name: {}'.format(endpoint_config_name))
print('Endpoint configuration arn:  {}'.format(endpoint_config_response['EndpointConfigArn']))
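
The serverless settings are easy to get wrong by a digit. A minimal sketch that builds the same `ProductionVariants` entry and rejects unsupported values first. It assumes `MemorySizeInMB` accepts 1024 to 6144 in 1 GB steps; the upper bound on `MaxConcurrency` depends on account quotas, so only the lower bound is checked. `serverless_variant` is a hypothetical helper:

```python
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_variant(model_name, memory_mb=4096, max_concurrency=10):
    """Return one ProductionVariants entry for a serverless endpoint."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"MemorySizeInMB must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("MaxConcurrency must be at least 1")
    return {
        "ModelName": model_name,
        "VariantName": "AllTraffic",
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,
            "MaxConcurrency": max_concurrency,
        },
    }
```

The result can be passed as `ProductionVariants=[serverless_variant(model_name)]` to `create_endpoint_config`.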

# create the endpoint from endpoint_config_name, then check in the console that it has been created
endpoint_name = "studio-lab-ep" + '-epc' + timestamp
response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)
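
`create_endpoint` returns before the endpoint is ready, so invoking it straight away is likely to fail. Instead of watching the console, the status can be polled. A hedged sketch using `describe_endpoint`; `wait_for_endpoint` and its default intervals are assumptions made for this example:

```python
import time


def wait_for_endpoint(sagemaker_client, endpoint_name,
                      poll_seconds=15, timeout_seconds=900):
    """Poll describe_endpoint until the endpoint is InService.

    Raises RuntimeError if creation fails and TimeoutError if it takes too long.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return description
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"endpoint {endpoint_name} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"endpoint {endpoint_name} is still {status}")
        time.sleep(poll_seconds)
```

boto3 also provides a built-in waiter, `sm_client.get_waiter('endpoint_in_service').wait(EndpointName=endpoint_name)`, which may be simpler when custom error messages are not needed.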

# test by invoking the endpoint with the boto3 SageMaker runtime client
import json
import boto3
runtime = boto3.client("sagemaker-runtime",
                       region_name="us-east-1",  # same region as the endpoint
                       aws_access_key_id=ACCESS_KEY,
                       aws_secret_access_key=SECRET_KEY)

content_type = "application/json"

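# example request; the payload must always define "inputs"
data = {
   "inputs": "Happy Birthday to you!"
}

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=json.dumps(data)
)

# the response body is a stream; read and decode it to see the prediction
print(json.loads(response["Body"].read().decode("utf-8")))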

# delete the endpoint, endpoint config, and model, in reverse order of creation
sm_client.delete_endpoint(EndpointName=endpoint_name)
sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
sm_client.delete_model(ModelName=model_name)
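
If the test step raises, the three delete calls never run and the resources stay behind. One way to make the reverse-order cleanup automatic is `contextlib.ExitStack`, whose callbacks run last-in, first-out. A minimal sketch, assuming the resources already exist; `run_with_cleanup` and the `work` callable are hypothetical names for this example:

```python
from contextlib import ExitStack


def run_with_cleanup(sagemaker_client, model_name, endpoint_config_name,
                     endpoint_name, work):
    """Call work(), then delete the resources in reverse order of creation.

    The resources are assumed to have been created in the order
    model -> endpoint config -> endpoint, as in the code above.
    """
    with ExitStack() as stack:
        # callbacks run last-in, first-out, so register them in creation order
        stack.callback(sagemaker_client.delete_model, ModelName=model_name)
        stack.callback(sagemaker_client.delete_endpoint_config,
                       EndpointConfigName=endpoint_config_name)
        stack.callback(sagemaker_client.delete_endpoint,
                       EndpointName=endpoint_name)
        return work()
```

The deletes run whether `work()` returns normally or raises, which avoids leaving a billable endpoint behind after a failed test.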


### Set up a GitHub Actions workflow to build and push an image to Amazon ECR

#### Set up AWS
* create an access key and secret access key in AWS for this project
* create an ECR repository (each image has its own repository)
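
The repository can be created in the console or from Python. A short sketch with the boto3 ECR client; `create_ecr_repository` is a hypothetical helper, and it assumes the repository does not exist yet (the call raises an error if it does):

```python
def create_ecr_repository(ecr_client, repository_name):
    """Create the repository and return its URI (registry/repository)."""
    response = ecr_client.create_repository(repositoryName=repository_name)
    return response["repository"]["repositoryUri"]
```

For example, `create_ecr_repository(boto3.client("ecr", region_name="us-east-2"), "my-image")`, where `my-image` is a placeholder name and the region should be the one the workflow is configured for. The name passed here is the value to store in the `REPO_NAME` secret.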

#### Set up GitHub
* in Settings -> Secrets -> New repository secret, set the following secrets with the values from the AWS setup
  + REPO_NAME: the name of the AWS ECR repository you created
  + AWS_ACCESS_KEY_ID: AWS access key ID
  + AWS_SECRET_ACCESS_KEY: AWS secret access key
* create a file named main.yml in the .github/workflows directory under the repository root. The template for main.yml is the following:
```yml
    on:
      push:
        branches: [ main ]
      pull_request:
        branches: [ main ]

    name: AWS ECR push

    jobs:
      deploy:
        name: Deploy
        runs-on: ubuntu-latest

        steps:
        - name: Checkout
          uses: actions/checkout@v2

        - name: Configure AWS credentials
          uses: aws-actions/configure-aws-credentials@v1
          with:
            aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
            aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
            aws-region: us-east-2

        - name: Login to Amazon ECR
          id: login-ecr
          uses: aws-actions/amazon-ecr-login@v1

        - name: Build, tag, and push the image to Amazon ECR
          id: build-image
          env:
            ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
            ECR_REPOSITORY: ${{ secrets.REPO_NAME }}
            IMAGE_TAG: latest
          run: |
            # Build a docker container and push it to ECR 
            docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .
            echo "Pushing image to ECR..."
            docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
            echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT
```   


* commit your changes to GitHub, then check the build in the Actions tab and the updated Docker image in ECR
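
The ECR side of that check can also be done from Python. A hedged sketch that looks for the image carrying the `latest` tag the workflow pushes; `find_image_by_tag` is a hypothetical helper built on the boto3 ECR `describe_images` paginator:

```python
def find_image_by_tag(ecr_client, repository_name, tag="latest"):
    """Return the image detail dict for `tag`, or None if no image has it."""
    paginator = ecr_client.get_paginator("describe_images")
    for page in paginator.paginate(repositoryName=repository_name):
        for detail in page["imageDetails"]:
            # untagged images have no "imageTags" key
            if tag in detail.get("imageTags", []):
                return detail
    return None
```

The returned dict includes `imagePushedAt`, so comparing it with the time of the last commit shows whether the workflow actually replaced the image.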