# PaddleOCR Inference

In [None]:
!cat Dockerfile

In [None]:
# First, build the Docker image and push it with the helper script
!sh build_and_push.sh

# Inference

A trained model does nothing on its own. We now want to use the model to perform inference. For this example, that means detecting and recognizing the text in a given image. This section involves several steps:

- Create Model: create a SageMaker Model that points at the inference container.
- Create Endpoint Configuration: create a configuration defining an endpoint.
- Create Endpoint: use the configuration to create an inference endpoint.
- Perform Inference: perform inference on some input data using the endpoint.

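## Create Model
We now create a SageMaker Model. In this notebook the model references only the container image built above (the `ModelDataUrl` entry is left commented out). Using the model, we can create an Endpoint Configuration.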


In [1]:
%%time

import boto3
from time import gmtime, strftime

#sage = boto3.Session().client(service_name='sagemaker') 
sage = boto3.client('sagemaker')

from sagemaker import get_execution_role
role = get_execution_role()

model_name="paddle-v0"
print(model_name)

#info = sage.describe_training_job(TrainingJobName=train.latest_training_job.name)
#model_data = info['ModelArtifacts']['S3ModelArtifacts']

#model_data = train.model_data
#print(model_data)

account = boto3.client('sts').get_caller_identity()['Account']
region = boto3.Session().region_name

image = "{}.dkr.ecr.{}.amazonaws.com/paddle".format(account, region)

# hosting_image = "847380964353.dkr.ecr.us-west-2.amazonaws.com/paddle"
primary_container = {
    'Image': image,
    #'ModelDataUrl': model_data,
}

create_model_response = sage.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    PrimaryContainer = primary_container)

print(create_model_response['ModelArn'])

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml
paddle-v0
arn:aws:sagemaker:us-west-2:373127939256:model/paddle-v0
CPU times: user 1.23 s, sys: 113 ms, total: 1.34 s
Wall time: 1.99 s


## Create Endpoint Configuration
SageMaker hosting supports REST endpoints that serve multiple models, e.g. for A/B testing purposes. To support this, you create an endpoint configuration that describes how traffic is distributed across the models, whether split, shadowed, or sampled in some way. In addition, the endpoint configuration describes the instance type and initial instance count required for model deployment.
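To make the traffic-distribution idea concrete, here is a minimal sketch of how a `ProductionVariants` list for an A/B split could be assembled. It only builds the request payload and does not call AWS. The helper names `build_production_variants` and `traffic_shares`, the `variant-` naming scheme, and the second model name `paddle-v1` are assumptions made for illustration; `InitialVariantWeight` is the field SageMaker uses for relative traffic weights.

```python
def build_production_variants(model_weights, instance_type="ml.g4dn.xlarge"):
    """Build a ProductionVariants list that splits traffic across models.

    model_weights maps a SageMaker model name to a relative traffic weight.
    """
    if not model_weights:
        raise ValueError("at least one model is required")
    variants = []
    for model_name, weight in model_weights.items():
        if weight < 0:
            raise ValueError("weights must be non-negative")
        variants.append({
            "VariantName": "variant-" + model_name,  # hypothetical naming scheme
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": float(weight),
        })
    return variants


def traffic_shares(variants):
    """Return the fraction of requests each variant would receive."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    if total == 0:
        raise ValueError("at least one variant needs a positive weight")
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# "paddle-v1" is a hypothetical second model used only for this example.
variants = build_production_variants({"paddle-v0": 9, "paddle-v1": 1})
print(traffic_shares(variants))
```

The resulting list could be passed as `ProductionVariants` to `create_endpoint_config`, provided every named model already exists. The cell below uses a single variant that receives all traffic.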


In [2]:
import time

job_name_prefix = "paddle"

timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
endpoint_config_name = job_name_prefix + '-epc-' + timestamp
endpoint_config_response = sage.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType':'ml.g4dn.xlarge',
        'InitialInstanceCount':1,
        'ModelName':model_name,
        'VariantName':'AllTraffic'}])

print('Endpoint configuration name: {}'.format(endpoint_config_name))
print('Endpoint configuration arn:  {}'.format(endpoint_config_response['EndpointConfigArn']))

Endpoint configuration name: paddle-epc--2024-10-13-11-31-44
Endpoint configuration arn:  arn:aws:sagemaker:us-west-2:373127939256:endpoint-config/paddle-epc--2024-10-13-11-31-44


## Create Endpoint
Lastly, we create the endpoint that serves the model by specifying a name and the configuration defined above. The end result is an endpoint that can be validated and incorporated into production applications. This takes 9-11 minutes to complete.

In [3]:
%%time
import time

timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
endpoint_name = job_name_prefix + '-ep-' + timestamp
print('Endpoint name: {}'.format(endpoint_name))

endpoint_params = {
    'EndpointName': endpoint_name,
    'EndpointConfigName': endpoint_config_name,
}
endpoint_response = sage.create_endpoint(**endpoint_params)
print('EndpointArn = {}'.format(endpoint_response['EndpointArn']))

Endpoint name: paddle-ep--2024-10-13-11-31-50
EndpointArn = arn:aws:sagemaker:us-west-2:373127939256:endpoint/paddle-ep--2024-10-13-11-31-50
CPU times: user 1.97 ms, sys: 4.14 ms, total: 6.11 ms
Wall time: 534 ms


`create_endpoint` returns as soon as the request is accepted, while the endpoint itself is still being created. Once the endpoint reports ```EndpointStatus = InService```, congratulations! You now have a functioning inference endpoint. You can confirm the endpoint configuration and status by navigating to the "Endpoints" tab in the AWS SageMaker console. Finally, we create a runtime object from which we can invoke the endpoint.
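The status check can also be done from the notebook rather than the console. The following is a minimal polling sketch, assuming a SageMaker client such as `sage` from the cells above. The helper name `wait_for_endpoint` and its `delay`, `max_attempts`, and `sleep` parameters are illustrative choices, not part of the original notebook; `describe_endpoint` and its `EndpointStatus` field are the real API.

```python
import time


def wait_for_endpoint(client, endpoint_name, delay=30, max_attempts=40, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService or has failed."""
    for _ in range(max_attempts):
        description = client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        print("EndpointStatus = {}".format(status))
        if status == "InService":
            return status
        if status == "Failed":
            reason = description.get("FailureReason", "unknown")
            raise RuntimeError("Endpoint creation failed: {}".format(reason))
        sleep(delay)  # still Creating (or another transitional state)
    raise TimeoutError("Endpoint {} did not reach InService in time".format(endpoint_name))
```

In the notebook this would be called as `wait_for_endpoint(sage, endpoint_name)`. boto3 also ships a built-in waiter for the same purpose: `sage.get_waiter('endpoint_in_service').wait(EndpointName=endpoint_name)`.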

## Perform Inference

In [4]:
%%time
import boto3
from botocore.config import Config
from sagemaker.session import Session

config = Config(
    read_timeout=120,
    retries={
        'max_attempts': 0
    }
)

#from boto3.session import Session
import json

sagemaker_session = Session()
bucket = sagemaker_session.default_bucket()

WORK_DIRECTORY = "./input/data"
# S3 prefix
prefix = "DEMO-paddle-byo"

data_location = sagemaker_session.upload_data(WORK_DIRECTORY, key_prefix=prefix)

#bucket = sess.default_bucket()
image_uri = 'DEMO-paddle-byo/test/1.jpg'
test_data = {
    'bucket' : bucket,
    'image_uri' : image_uri,
    'content_type': "application/json",
}
payload = json.dumps(test_data)
print(payload)

sagemaker_runtime_client = boto3.client('sagemaker-runtime', config=config)
#session = Session(sagemaker_runtime_client)

#     runtime = session.client("runtime.sagemaker",config=config)
response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=payload)

result = json.loads(response["Body"].read())
print(result)

{"bucket": "sagemaker-us-west-2-373127939256", "image_uri": "DEMO-paddle-byo/test/1.jpg", "content_type": "application/json"}
{'label': [[[266.0, 7.0], [366.0, 7.0], [366.0, 27.0], [266.0, 27.0]]], 'confidences': [['重慈湾酱院', 0.7750508189201355]], 'bbox': [[[[10.0, 9.0], [115.0, 9.0], [115.0, 23.0], [10.0, 23.0]], ['弗教系湾露', 0.561428427696228]]], 'shape': [32, 373, 3]}
CPU times: user 336 ms, sys: 8.98 ms, total: 345 ms
Wall time: 5.31 s
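The response is a plain dictionary, so the recognized strings can be pulled out with a few lines of Python. The sketch below assumes the layout seen in the single sample output above, where `confidences` holds `[text, score]` pairs; other versions of the container may return a different structure. The helper name `extract_texts` and the `min_confidence` threshold are illustrative.

```python
def extract_texts(result, min_confidence=0.5):
    """Return (text, score) pairs whose score meets the threshold.

    Assumes result["confidences"] is a list of [text, score] pairs,
    as in the sample response printed above.
    """
    texts = []
    for text, score in result.get("confidences", []):
        if score >= min_confidence:
            texts.append((text, score))
    return texts


# Values copied from the sample response above (other keys omitted).
sample = {
    "confidences": [["重慈湾酱院", 0.7750508189201355]],
    "shape": [32, 373, 3],
}
for text, score in extract_texts(sample, min_confidence=0.7):
    print("{} ({:.2f})".format(text, score))
```

Filtering on the score is a simple way to discard low-confidence recognitions before passing the text to downstream code.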


# Clean up
When we're done with the endpoint, we can delete it and the backing instances will be released. Run the following cell to delete the endpoint, its endpoint configuration, and the model.

In [5]:
import boto3
sage = boto3.client('sagemaker')

print(endpoint_name)
sage.delete_endpoint(EndpointName=endpoint_name)
sage.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
sage.delete_model(ModelName=model_name)

paddle-ep--2024-10-13-11-31-50


{'ResponseMetadata': {'RequestId': '061c172e-4e83-46aa-847d-7db7946cd4b6',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '061c172e-4e83-46aa-847d-7db7946cd4b6',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Sun, 13 Oct 2024 11:47:57 GMT',
   'content-length': '0'},
  'RetryAttempts': 0}}