# BYOC Inference for paddleOCR

A trained model does nothing on its own. We now want to use the model to perform inference. For this example, that means extracting the text (with confidence scores and bounding boxes) from a given image. This section involves several steps:

- Create Model: create a SageMaker model from the training output.
- Create Endpoint Configuration: create a configuration defining an endpoint.
- Create Endpoint: use the configuration to create an inference endpoint.
- Perform Inference: perform inference on some input data using the endpoint.

## Deploy model
We now create a SageMaker Model from the training output. Using the model, we can create an Endpoint Configuration.


In [70]:
%%time

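import boto3
import sagemaker
from sagemaker import get_execution_role

# IAM role that SageMaker assumes to pull the image and read the model artifacts
role = get_execution_role()

# Low-level SageMaker client (control-plane calls) and a SageMaker SDK session
sage = boto3.Session().client(service_name='sagemaker')
sess = sagemaker.Session()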

CPU times: user 92.5 ms, sys: 0 ns, total: 92.5 ms
Wall time: 175 ms


In [71]:
account = sess.boto_session.client("sts").get_caller_identity()["Account"]
region = sess.boto_session.region_name

PROJECT_ID = "sagemaker-p-5an0os9jqfdi"

In [72]:
hosting_image = f'{account}.dkr.ecr.{region}.amazonaws.com/{PROJECT_ID}-inference-imagebuild:latest'
print('Inference image location: ',hosting_image)

Inference image location:  707684582322.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-p-5an0os9jqfdi-inference-imagebuild:latest


In [73]:
TrainingJobName = "sagemaker-p-5an0os9jqfdi-training-image-2022-04-19-13-35-07-735"
#TrainingJobName = "sagemaker-p-5an0os9jqfdi-training-image-2022-04-14-13-41-00-634"


In [75]:
info = sage.describe_training_job(TrainingJobName=TrainingJobName)
model_data = info['ModelArtifacts']['S3ModelArtifacts']
print('Model artifacts location: ',model_data)

primary_container = {
    'Image': hosting_image,
    'ModelDataUrl': model_data,
}


Model artifacts location:  s3://sagemaker-eu-west-1-707684582322/sagemaker-p-5an0os9jqfdi-training-image-2022-04-19-13-35-07-735/output/model.tar.gz


In [77]:
model_name="paddle-v1"

create_model_response = sage.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    PrimaryContainer = primary_container)

print(create_model_response['ModelArn'])

arn:aws:sagemaker:eu-west-1:707684582322:model/paddle-v1


## Create Endpoint Configuration

SageMaker hosting supports REST endpoints that serve multiple models, e.g. for A/B testing purposes. To support this, you create an endpoint configuration that describes the distribution of traffic across the models, whether split, shadowed, or sampled in some way. In addition, the endpoint configuration describes the instance type and initial instance count required for model deployment. Autoscaling is configured separately, per production variant.
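
The traffic split mentioned above can be sketched as a list of production variants. This is a minimal illustration, not the configuration used in this notebook: the second model name `paddle-v2` and the variant names are hypothetical, while `InitialVariantWeight` is the field that `create_endpoint_config` accepts for relative traffic weights.

```python
def build_ab_variants(model_a, model_b, weight_a=0.9, instance_type="ml.p3.2xlarge"):
    """Build a ProductionVariants list that splits traffic between two models."""
    if not 0.0 < weight_a < 1.0:
        raise ValueError("weight_a must be strictly between 0 and 1")
    # Settings shared by both variants
    shared = {"InstanceType": instance_type, "InitialInstanceCount": 1}
    return [
        {**shared, "ModelName": model_a, "VariantName": "VariantA",
         "InitialVariantWeight": weight_a},
        {**shared, "ModelName": model_b, "VariantName": "VariantB",
         "InitialVariantWeight": round(1.0 - weight_a, 6)},
    ]


# "paddle-v2" is a hypothetical second model used only for illustration
variants = build_ab_variants("paddle-v1", "paddle-v2")
for v in variants:
    print(v["VariantName"], v["ModelName"], v["InitialVariantWeight"])
```

The resulting list could be passed as `ProductionVariants=variants` to `sage.create_endpoint_config`. Each variant receives traffic in proportion to its weight relative to the sum of all weights.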


In [78]:
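import time

# Prefix shared by the endpoint configuration and endpoint names
job_name_prefix = "paddle"

# The leading '-' in the format string is why the names below contain '--'
timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
endpoint_config_name = job_name_prefix + '-epc-' + timestamp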
endpoint_config_response = sage.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType':'ml.p3.2xlarge',
        'InitialInstanceCount':1,
        'ModelName':model_name,
        'VariantName':'AllTraffic'}])

print('Endpoint configuration name: {}'.format(endpoint_config_name))
print('Endpoint configuration arn:  {}'.format(endpoint_config_response['EndpointConfigArn']))

Endpoint configuration name: paddle-epc--2022-04-19-13-48-37
Endpoint configuration arn:  arn:aws:sagemaker:eu-west-1:707684582322:endpoint-config/paddle-epc--2022-04-19-13-48-37
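
The names printed above contain a double hyphen because both the `'-epc-'` infix and the timestamp format contribute a `-`. If single-hyphen names are preferred, a small helper along these lines could be used. The helper name `timestamped_name` is hypothetical and not part of the original notebook.

```python
import time


def timestamped_name(prefix, kind, when=None):
    """Return '<prefix>-<kind>-YYYY-MM-DD-HH-MM-SS' using UTC time."""
    # Default to the current UTC time when no struct_time is supplied
    when = time.gmtime() if when is None else when
    return f"{prefix}-{kind}-" + time.strftime("%Y-%m-%d-%H-%M-%S", when)


# A fixed time (the Unix epoch) makes the output reproducible
print(timestamped_name("paddle", "epc", time.gmtime(0)))
# prints: paddle-epc-1970-01-01-00-00-00
```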


## Create Endpoint


In [79]:
%%time
import time

job_name_prefix = "paddle"

timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
endpoint_name = job_name_prefix + '-ep-' + timestamp
print('Endpoint name: {}'.format(endpoint_name))

endpoint_params = {
    'EndpointName': endpoint_name,
    'EndpointConfigName': endpoint_config_name,
}
endpoint_response = sage.create_endpoint(**endpoint_params)
print('EndpointArn = {}'.format(endpoint_response['EndpointArn']))

Endpoint name: paddle-ep--2022-04-19-13-48-50
EndpointArn = arn:aws:sagemaker:eu-west-1:707684582322:endpoint/paddle-ep--2022-04-19-13-48-50
CPU times: user 18.1 ms, sys: 0 ns, total: 18.1 ms
Wall time: 238 ms


Endpoint creation is asynchronous: the call above returns immediately while the endpoint is still being created. Once the endpoint reports ```EndpointStatus = InService```, congratulations! You now have a functioning inference endpoint. You can confirm the endpoint configuration and status by navigating to the "Endpoints" tab in the AWS SageMaker console. Finally, we create a runtime client from which we can invoke the endpoint.
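
One way to wait for the `InService` status is to poll `describe_endpoint`. The helper below is a sketch under that assumption. The function name, polling interval, and timeout are illustrative choices; the only client call it relies on is `describe_endpoint`, whose response carries `EndpointStatus` and, on failure, `FailureReason`.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService or has failed."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        desc = client.describe_endpoint(EndpointName=endpoint_name)
        status = desc["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = desc.get("FailureReason", "unknown")
            raise RuntimeError(f"Endpoint creation failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint still {status} after {timeout_seconds} s")
        time.sleep(poll_seconds)


# Usage with the client and name defined earlier in this notebook:
# wait_for_endpoint(sage, endpoint_name)
```

boto3 also ships a built-in waiter for this, `sage.get_waiter('endpoint_in_service').wait(EndpointName=endpoint_name)`, which may be preferable when custom error handling is not needed.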

## Perform Inference with image

In [86]:
import boto3
from botocore.config import Config

# Allow up to 120 s for a response and disable automatic retries
config = Config(
    read_timeout=120,
    retries={
        'max_attempts': 0
    }
)

In [90]:
import json
import base64

image_path = './test/new_hkid_front.jpg'

# Base64-encode the image bytes and wrap the resulting string as a JSON document
with open(image_path, "rb") as image_file:
    img_data = base64.b64encode(image_file.read())
data = img_data.decode("utf-8")
body = json.dumps(data).encode("utf-8")

In [91]:
sagemaker_runtime_client = boto3.client('sagemaker-runtime', config=config)

# Note: the body is a JSON-encoded base64 string, while ContentType declares raw JPEG bytes
response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="image/jpeg",
    Body=body)

result = json.loads(response["Body"].read())
print(result)

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
". See https://eu-west-1.console.aws.amazon.com/cloudwatch/home?region=eu-west-1#logEventViewer:group=/aws/sagemaker/Endpoints/paddle-ep--2022-04-19-13-48-50 in account 707684582322 for more information.

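The request above labels a JSON-encoded base64 string as `image/jpeg`. Whether that mismatch is what triggers the 500 response is not confirmed here; the CloudWatch logs linked in the error would tell. As a sketch, the two helpers below build requests whose `ContentType` agrees with the body. Which one the container accepts, and the JSON field name `image`, are assumptions that depend on the inference code inside the image.

```python
import base64
import json


def raw_image_request(image_bytes):
    """Send the JPEG bytes unchanged; ContentType matches the body."""
    return {"ContentType": "image/jpeg", "Body": image_bytes}


def base64_json_request(image_bytes):
    """Wrap the base64 text in a JSON object; ContentType matches the body."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    # "image" is a hypothetical field name, not taken from the container's code
    return {"ContentType": "application/json", "Body": json.dumps({"image": encoded})}


def decode_base64_json(body):
    """What a handler would have to do to recover the bytes from the JSON body."""
    return base64.b64decode(json.loads(body)["image"])


# Round-trip check with placeholder bytes (not a real JPEG)
sample = b"\xff\xd8\xff\xe0 placeholder image bytes"
request = base64_json_request(sample)
assert decode_base64_json(request["Body"]) == sample
```

Either dictionary can be unpacked into the call, for example `sagemaker_runtime_client.invoke_endpoint(EndpointName=endpoint_name, **raw_image_request(image_bytes))`.
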
## Perform Inference with url

In [92]:
import json

import boto3
from botocore.config import Config

# Allow up to 120 s for a response and disable automatic retries
config = Config(
    read_timeout=120,
    retries={
        'max_attempts': 0
    }
)

In [58]:
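import sagemaker
from sagemaker import get_execution_role

# Use the full module name here so that `sage` keeps referring to the boto3 SageMaker client
sess = sagemaker.Session()
WORK_DIRECTORY = "./test/new_hkid_front.jpg"  # local path of the test image

# S3 prefix
prefix = "DEMO-paddle-byo"
bucket = sess.default_bucket()

# S3 object key of the uploaded image (prefix + file name)
image_uri = f'{prefix}/new_hkid_front.jpg'

role = get_execution_role()

# Upload the image to s3://<bucket>/<prefix>/new_hkid_front.jpg
data_location = sess.upload_data(WORK_DIRECTORY, key_prefix=prefix)
print(data_location)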

s3://sagemaker-eu-west-1-707684582322/DEMO-paddle-byo/new_hkid_front.jpg


In [61]:

test_data = {
    'bucket' : bucket,
    'image_uri' : image_uri,
    'content_type': "application/json",
}
payload = json.dumps(test_data)
print(payload)

{"bucket": "sagemaker-eu-west-1-707684582322", "image_uri": "DEMO-paddle-byo/new_hkid_front.jpg", "content_type": "application/json"}
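
The bucket and key in this payload repeat information already contained in the S3 URI returned by `upload_data`. As a sketch, the payload could instead be derived from that URI so the two cannot drift apart. The helper name `payload_from_s3_uri` is hypothetical; the payload keys are the ones used in the cell above.

```python
import json
from urllib.parse import urlparse


def payload_from_s3_uri(s3_uri):
    """Build the JSON request payload from an s3://bucket/key URI."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri}")
    return json.dumps({
        "bucket": parsed.netloc,               # bucket name
        "image_uri": parsed.path.lstrip("/"),  # object key without the leading slash
        "content_type": "application/json",
    })


uri = "s3://sagemaker-eu-west-1-707684582322/DEMO-paddle-byo/new_hkid_front.jpg"
print(payload_from_s3_uri(uri))
```

With the notebook's variables this would be called as `payload = payload_from_s3_uri(data_location)`.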


In [66]:
import base64

def cvt_2_base64(file_name):
    """Return the base64-encoded contents of a file as bytes."""
    with open(file_name, "rb") as image_file:
        data = base64.b64encode(image_file.read())
    return data

In [62]:
sagemaker_runtime_client = boto3.client('sagemaker-runtime', config=config)

# The JSON payload tells the container where to find the image in S3
response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=payload)

result = json.loads(response["Body"].read())
print(result)

{'label': ['香港永久性居民身份', 'SAMP!', 'HONGKONSPERMANENTIDENTITYCARD', '而藥永晴', 'Z683365', 'LOK，Wing', 'ching', 'MPE', '2867', '3057', '2532', 'SAMPLE', '出生日期Dateo[Birh', 'Q3-06-1985', '女F', '大**AZ', 'SAMPLE', '鞍日期Dateollssue', '（06-96）', '26-11-18', 'Z683365（5）'], 'confidences': [16.18445587158203, 17.04817771911621, 16.762537002563477, 12.265478134155273, 15.969161987304688, 16.189250946044922, 17.28763771057129, 16.429025650024414, 17.817951202392578, 19.93827247619629, 21.21170997619629, 18.52018928527832, 15.540266036987305, 17.720809936523438, 14.004166603088379, 14.258501052856445, 18.13033103942871, 15.3755464553833, 15.028181076049805, 16.284420013427734, 16.89468765258789], 'bbox': [[[158.0, 10.0], [348.0, 11.0], [348.0, 31.0], [158.0, 30.0]], [[11.0, 21.0], [25.0, 21.0], [25.0, 66.0], [11.0, 66.0]], [[73.0, 36.0], [434.0, 36.0], [434.0, 52.0], [73.0, 52.0]], [[10.0, 63.0], [110.0, 67.0], [109.0, 91.0], [9.0, 87.0]], [[356.0, 65.0], [400.0, 65.0], [400.0, 78.0], [356.0, 78.0]], [[1
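
The response holds three lists: `label`, `confidences`, and `bbox`. Assuming they are parallel, so that index i in each list describes the same text region, they can be zipped into one record per region and optionally filtered by confidence. This is a sketch: the helper name and the threshold are illustrative, and the confidence values shown above are not probabilities, so any cut-off would need to be chosen empirically.

```python
def pair_ocr_results(result, min_confidence=None):
    """Combine the parallel lists into one dict per detected text region."""
    rows = []
    for text, conf, box in zip(result["label"], result["confidences"], result["bbox"]):
        if min_confidence is None or conf >= min_confidence:
            rows.append({"text": text, "confidence": conf, "bbox": box})
    return rows


# The first two regions from the response above
sample = {
    "label": ["香港永久性居民身份", "SAMP!"],
    "confidences": [16.18445587158203, 17.04817771911621],
    "bbox": [
        [[158.0, 10.0], [348.0, 11.0], [348.0, 31.0], [158.0, 30.0]],
        [[11.0, 21.0], [25.0, 21.0], [25.0, 66.0], [11.0, 66.0]],
    ],
}

for row in pair_ocr_results(sample, min_confidence=17.0):
    # Print the text, its rounded confidence, and the first corner of the box
    print(row["text"], round(row["confidence"], 2), row["bbox"][0])
```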

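## Clean up
When we're done with the endpoint, we can delete it and the backing instances will be released. Run the following cell to delete the endpoint, its endpoint configuration, and the model. Deletion fails with a ValidationException while the endpoint is still being created or updated, so wait until it has left the in-progress state first.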

In [33]:
print(endpoint_name)
sage.delete_endpoint(EndpointName=endpoint_name)
sage.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
sage.delete_model(ModelName=model_name)

paddle-ep--2022-04-14-14-12-46


ClientError: An error occurred (ValidationException) when calling the DeleteEndpoint operation: Cannot update in-progress endpoint "arn:aws:sagemaker:eu-west-1:707684582322:endpoint/paddle-ep--2022-04-14-14-12-46".
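
The `ValidationException` above occurs because the endpoint was still in progress when deletion was requested. A sketch of a more defensive clean-up checks the status first and then deletes each resource by its own name. The helper name and the set of in-progress statuses are assumptions chosen for illustration; the client calls are the same ones used in the cell above plus `describe_endpoint`.

```python
# Statuses during which the endpoint is treated as busy (an assumption for this sketch)
IN_PROGRESS = {"Creating", "Updating", "SystemUpdating", "RollingBack", "Deleting"}


def delete_inference_resources(client, endpoint_name, endpoint_config_name, model_name):
    """Delete endpoint, endpoint config and model; refuse while the endpoint is busy."""
    status = client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
    if status in IN_PROGRESS:
        raise RuntimeError(f"Endpoint is {status}; retry once it has settled")
    # Each resource is deleted by its own name, not by the endpoint name
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    client.delete_model(ModelName=model_name)
    return status


# Usage with the names defined earlier in this notebook:
# delete_inference_resources(sage, endpoint_name, endpoint_config_name, model_name)
```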