## Container Structure

- NER
   - predictor.py: (Flask app for inference)
   - wsgi.py: (Wrapper around predictor)
   - nginx.conf: (Config for nginx front-end)
   - serve: (Program run when the container is hosted; launches the gunicorn server)
   - Note: there is no train script, because the model is pre-trained
- Dockerfile
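
To make the role of predictor.py concrete, here is a minimal sketch of the two routes a SageMaker hosting container answers: `GET /ping` for health checks and `POST /invocations` for inference. It is not the notebook's Flask code. It is a framework-free WSGI stand-in, and `make_app`, `call_locally`, and the `extract_entities` callable are hypothetical names. Only the `{"input": ...}` request and `{"output": ...}` response shapes come from the invocation cell in this notebook.

```python
import io
import json


def make_app(extract_entities):
    """Return a WSGI app; extract_entities(text) is a stand-in for the spaCy call."""

    def respond(start_response, status, payload):
        body = json.dumps(payload).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]

    def app(environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        path = environ.get("PATH_INFO", "/")

        # Health check: SageMaker polls GET /ping and expects HTTP 200.
        if method == "GET" and path == "/ping":
            return respond(start_response, "200 OK", {})

        # Inference: InvokeEndpoint bodies are forwarded to POST /invocations.
        if method == "POST" and path == "/invocations":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            raw = environ["wsgi.input"].read(length)
            try:
                text = json.loads(raw)["input"]
            except (ValueError, KeyError, TypeError):
                return respond(start_response, "400 Bad Request",
                               {"error": "expected a JSON object with an 'input' key"})
            return respond(start_response, "200 OK",
                           {"output": extract_entities(text)})

        return respond(start_response, "404 Not Found", {"error": "not found"})

    return app


def call_locally(app, method, path, body=b""):
    """Drive the app without a server; returns (status line, decoded JSON)."""
    captured = {}
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)), "wsgi.input": io.BytesIO(body)}
    chunks = app(environ, lambda status, headers: captured.update(status=status))
    return captured["status"], json.loads(b"".join(chunks))
```

Exercising the routes locally with `call_locally` before building the image is a cheap way to catch request-parsing mistakes that would otherwise only surface after the endpoint is deployed.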

## Push Docker Image to ECR

In [11]:
%%sh

# Algorithm name, also used as the ECR repository name
algorithm_name=sm-pretrained-spacy

cd container

#make serve executable
chmod +x NER/serve

account=$(aws sts get-caller-identity --query Account --output text)

# Region, defaults to us-east-1 if none is configured
region=$(aws configure get region)
region=${region:-us-east-1}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get a login password from ECR and pipe it to docker login for the registry host
aws ecr get-login-password --region "${region}" | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"

# Build the docker image locally with the image name and then push it to ECR
# with the full name.
#PACKER_LOG=1 packer build template.json

docker build  -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

Login Succeeded
Sending build context to Docker daemon  19.46kB
Step 1/10 : FROM python:3.8
 ---> 5e51aed29a27
Step 2/10 : RUN apt-get -y update && apt-get install -y --no-install-recommends          wget          python3          nginx          ca-certificates     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 12f2720fd348
Step 3/10 : RUN wget https://bootstrap.pypa.io/get-pip.py && python3 get-pip.py &&     pip install flask gevent gunicorn &&         rm -rf /root/.cache
 ---> Using cache
 ---> 4a585228e966
Step 4/10 : RUN pip install spacy
 ---> Using cache
 ---> 6383e3a4d669
Step 5/10 : RUN python -m spacy download en_core_web_sm
 ---> Using cache
 ---> 619ca72ad015
Step 6/10 : ENV PYTHONUNBUFFERED=TRUE
 ---> Using cache
 ---> 996403cb71b4
Step 7/10 : ENV PYTHONDONTWRITEBYTECODE=TRUE
 ---> Using cache
 ---> e9414edab125
Step 8/10 : ENV PATH="/opt/program:${PATH}"
 ---> Using cache
 ---> 785b5acb646b
Step 9/10 : COPY NER /opt/program
 ---> Using cache
 ---> ef7005d80f16
Step

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

EOF


CalledProcessError: Command 'b'\n# Name of algo -> ECR\nalgorithm_name=sm-pretrained-spacy\n [... remainder of the cell source shown above ...] \ndocker push ${fullname}\n'' returned non-zero exit status 1.
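
The shell cell resolves its region from the AWS CLI configuration with a fallback, while the later Python cells use `boto3.Session().region_name`. If the two disagree, the image is pushed to one region's registry and looked up in another. A small sketch of building both strings from a single set of values follows. The helper names `ecr_registry` and `ecr_image_uri` are hypothetical, and the host pattern assumes the standard `amazonaws.com` partition.

```python
def ecr_registry(account_id, region):
    """Registry host, the value to pass to `docker login`."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Full image URI, the value to pass to `docker tag` / `docker push`
    and later to the 'Image' field of the SageMaker container definition."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"
```

Called with the `account_id` and `region` values from the client setup cell, `ecr_image_uri(account_id, region, "sm-pretrained-spacy")` yields the same string that the model creation cell formats by hand.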

## SageMaker Client Setup

In [5]:
import boto3
from sagemaker import get_execution_role

sm_client = boto3.client(service_name='sagemaker')
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')

account_id = boto3.client('sts').get_caller_identity()['Account']
region = boto3.Session().region_name

s3_bucket = 'spacy-sagemaker-eu-west-1-bucket'
role = get_execution_role()

## Model Creation

In [6]:
from time import gmtime, strftime

model_name = 'spacy-nermodel-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
model_url = 's3://{}/spacy/'.format(s3_bucket)  # Model S3 URL (printed only; not passed to create_model)
image_uri = '{}.dkr.ecr.{}.amazonaws.com/sm-pretrained-spacy:latest'.format(account_id, region)
instance_type = 'ml.c5d.18xlarge'

print('Model name: ' + model_name)
print('Model data Url: ' + model_url)
print('Container image: ' + image_uri)

# Container definition; the image must already exist in ECR in this region
container = {
    'Image': image_uri
}

create_model_response = sm_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    Containers = [container])

print("Model Arn: " + create_model_response['ModelArn'])

Model name: spacy-nermodel-2022-10-22-12-49-20
Model data Url: s3://spacy-sagemaker-eu-west-1-bucket/spacy/
Container image: 206367865313.dkr.ecr.eu-west-1.amazonaws.com/sm-pretrained-spacy:latest


ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Requested image 206367865313.dkr.ecr.eu-west-1.amazonaws.com/sm-pretrained-spacy:latest not found.
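
The `CreateModel` failure above is what one would expect when the image was never pushed, and every later cell then fails in turn. One possible guard is to check ECR before calling `create_model`. The sketch below assumes an ECR client created with `boto3.client("ecr")` is passed in. `image_exists` is a hypothetical helper name, while `describe_images` and the two exception classes are part of the boto3 ECR client.

```python
def image_exists(ecr_client, repository, tag="latest"):
    """Return True if repository:tag is present in the client's region."""
    try:
        ecr_client.describe_images(
            repositoryName=repository,
            imageIds=[{"imageTag": tag}],
        )
    except (ecr_client.exceptions.RepositoryNotFoundException,
            ecr_client.exceptions.ImageNotFoundException):
        # Either the repository or the tag is missing: nothing to deploy yet.
        return False
    return True


# Possible use in the notebook (requires AWS credentials):
#   import boto3
#   if not image_exists(boto3.client("ecr"), "sm-pretrained-spacy"):
#       raise SystemExit("Push the image before creating the model")
```

Taking the client as a parameter keeps the function easy to exercise with a stub and makes the region being checked explicit.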

## Endpoint Config Creation

In [7]:
endpoint_config_name = 'spacy-ner-config' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print('Endpoint config name: ' + endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType':instance_type,
        'InitialInstanceCount':1,
        'InitialVariantWeight':1,
        'ModelName':model_name,
        'VariantName':'AllTraffic'
    }]
)

print("Endpoint config Arn: " + create_endpoint_config_response['EndpointConfigArn'])

Endpoint config name: spacy-ner-config2022-10-22-12-49-24


ClientError: An error occurred (ValidationException) when calling the CreateEndpointConfig operation: Could not find model "arn:aws:sagemaker:eu-west-1:206367865313:model/spacy-nermodel-2022-10-22-12-49-20".
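
The endpoint config name printed above runs the prefix straight into the timestamp (`spacy-ner-config2022-...`), whereas the model name has a hyphen between them. A small helper can keep the naming consistent across resources. `timestamped_name` is a hypothetical name; the timestamp format is the one the notebook already uses.

```python
from time import gmtime, strftime


def timestamped_name(prefix, when=None):
    """Join prefix and a UTC timestamp with exactly one hyphen."""
    stamp = strftime("%Y-%m-%d-%H-%M-%S", gmtime() if when is None else when)
    return f"{prefix.rstrip('-')}-{stamp}"
```

For instance, `timestamped_name("spacy-ner-config")` and `timestamped_name("spacy-nermodel-")` both produce names with a single hyphen before the date.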

## Endpoint Creation

In [8]:
%%time

import time

endpoint_name = 'spacy-ner-endpoint' #+ strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print('Endpoint name: ' + endpoint_name)

create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)
print('Endpoint Arn: ' + create_endpoint_response['EndpointArn'])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp['EndpointStatus']
print("Endpoint Status: " + status)

print('Waiting for {} endpoint to be in service...'.format(endpoint_name))
waiter = sm_client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName=endpoint_name)

Endpoint name: spacy-ner-endpoint


ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: Could not find endpoint configuration "arn:aws:sagemaker:eu-west-1:206367865313:endpoint-config/spacy-ner-config2022-10-22-12-49-24".
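
The cell above relies on the boto3 `endpoint_in_service` waiter. As an alternative sketch, an explicit polling loop over `describe_endpoint` can report the `FailureReason` when creation fails. `wait_for_endpoint` and its parameters are hypothetical, and the injectable `sleep` argument exists only so the loop can be tested without waiting.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=30,
                      timeout_seconds=1800, sleep=time.sleep):
    """Poll until the endpoint is InService; raise on failure or timeout."""
    waited = 0
    while True:
        resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            # FailureReason is only present when the endpoint has failed.
            reason = resp.get("FailureReason", "no reason given")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Endpoint {endpoint_name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

In the notebook this would be called as `wait_for_endpoint(sm_client, endpoint_name)` after `create_endpoint` succeeds.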

## Endpoint Invocation

In [9]:
import json
content_type = "application/json"
request_body = {"input": "This is a test with NER in America with Amazon and Microsoft in Seattle, writing random stuff."}

#Serialize data for endpoint
payload = json.dumps(request_body)

#Endpoint invocation
response = runtime_sm_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload)

#Parse results
result = json.loads(response['Body'].read().decode())['output']
result

ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Endpoint spacy-ner-endpoint of account 206367865313 not found.
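
Once an endpoint exists, the invocation steps can be wrapped in one function. The sketch below assumes the same request and response keys as the cell above (`input` and `output`) and takes the `sagemaker-runtime` client as an argument. `invoke_ner` is a hypothetical helper name.

```python
import json


def invoke_ner(runtime_client, endpoint_name, text):
    """Send text to the endpoint and return the decoded 'output' field."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"input": text}),
    )
    # response['Body'] is a streaming object; read it once and decode.
    return json.loads(response["Body"].read().decode("utf-8"))["output"]
```

In the notebook this would be called as `invoke_ner(runtime_sm_client, endpoint_name, "Amazon and Microsoft in Seattle")`.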