# Predict With Our Amazon Comprehend Custom Classifier Model

<img src="img/comprehend.png" width="80%" align="left">

## Note: Amazon Comprehend is currently supported only in a subset of regions:

* US East (N. Virginia), US East (Ohio), US West (Oregon)
* Canada (Central)
* Europe (London), Europe (Ireland), Europe (Frankfurt)
* Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney)

Check https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/ for the current list and any updates.
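As a quick sanity check before running the cells below, the region list above can be turned into a lookup. This is a minimal sketch: the region codes are the standard AWS codes for the region names listed above, while `SUPPORTED_REGIONS` and `is_supported_region` are hypothetical names introduced here, not part of boto3. The list may be out of date, so treat the linked page as authoritative.

```python
# Region codes for the regions listed above (assumption: this snapshot may be out of date).
SUPPORTED_REGIONS = {
    "us-east-1",       # US East (N. Virginia)
    "us-east-2",       # US East (Ohio)
    "us-west-2",       # US West (Oregon)
    "ca-central-1",    # Canada (Central)
    "eu-west-2",       # Europe (London)
    "eu-west-1",       # Europe (Ireland)
    "eu-central-1",    # Europe (Frankfurt)
    "ap-south-1",      # Asia Pacific (Mumbai)
    "ap-northeast-2",  # Asia Pacific (Seoul)
    "ap-northeast-1",  # Asia Pacific (Tokyo)
    "ap-southeast-1",  # Asia Pacific (Singapore)
    "ap-southeast-2",  # Asia Pacific (Sydney)
}


def is_supported_region(region_name):
    """Return True if region_name appears in the list above."""
    return region_name in SUPPORTED_REGIONS


print(is_supported_region("us-east-1"))
```

In the notebook, the same check could be applied to the `region` variable defined in the first code cell.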

In [1]:
import boto3
import sagemaker
import pandas as pd

sess = sagemaker.Session()
bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = boto3.Session().region_name

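from botocore.config import Config

# Retry throttled or transient failures up to 10 times with adaptive backoff.
config = Config(retries={"max_attempts": 10, "mode": "adaptive"})

# Pass the retry configuration to the client; otherwise it has no effect.
comprehend = boto3.Session().client(service_name="comprehend", region_name=region, config=config)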

In [2]:
%store -r comprehend_training_job_arn

In [3]:
try:
    comprehend_training_job_arn
except NameError:
    print("***************************************************************************")
    print("[ERROR] PLEASE WAIT FOR THE PREVIOUS NOTEBOOK TO FINISH *******************")
    print("[ERROR] OR THIS NOTEBOOK WILL NOT RUN PROPERLY ****************************")
    print("***************************************************************************")

In [4]:
print(comprehend_training_job_arn)

arn:aws:comprehend:us-east-1:211125778552:document-classifier/Amazon-Customer-Reviews-Classifier-1708039740


In [5]:
%store -r comprehend_endpoint_arn

In [6]:
try:
    comprehend_endpoint_arn
except NameError:
    print("***************************************************************************")
    print("[ERROR] PLEASE WAIT FOR THE PREVIOUS NOTEBOOK TO FINISH *******************")
    print("[ERROR] OR THIS NOTEBOOK WILL NOT RUN PROPERLY ****************************")
    print("***************************************************************************")

In [7]:
print(comprehend_endpoint_arn)

arn:aws:comprehend:us-east-1:211125778552:document-classifier-endpoint/comprehend-inference-ep-16-00-08-02
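Both stored values are ARNs, which encode the service, region, account, and resource name in a fixed colon-separated layout. The sketch below splits one apart, which can help confirm that the endpoint lives in the same region as the client. `parse_arn` is a hypothetical helper written for this illustration, and the sample ARN is copied from the output above.

```python
def parse_arn(arn):
    """Split an ARN of the form arn:partition:service:region:account:resource."""
    prefix, partition, service, region, account_id, resource = arn.split(":", 5)
    if prefix != "arn":
        raise ValueError("Not an ARN: {!r}".format(arn))
    # For Comprehend, the resource part looks like "<resource-type>/<resource-name>".
    resource_type, _, resource_name = resource.partition("/")
    return {
        "partition": partition,
        "service": service,
        "region": region,
        "account_id": account_id,
        "resource_type": resource_type,
        "resource_name": resource_name,
    }


endpoint_arn = (
    "arn:aws:comprehend:us-east-1:211125778552:"
    "document-classifier-endpoint/comprehend-inference-ep-16-00-08-02"
)
parts = parse_arn(endpoint_arn)
print(parts["region"], parts["resource_type"], parts["resource_name"])
```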


# Describe Endpoint

In [8]:
describe_response = comprehend.describe_endpoint(EndpointArn=comprehend_endpoint_arn)
print(describe_response)

{'EndpointProperties': {'EndpointArn': 'arn:aws:comprehend:us-east-1:211125778552:document-classifier-endpoint/comprehend-inference-ep-16-00-08-02', 'Status': 'CREATING', 'ModelArn': 'arn:aws:comprehend:us-east-1:211125778552:document-classifier/Amazon-Customer-Reviews-Classifier-1708039740', 'DesiredModelArn': 'arn:aws:comprehend:us-east-1:211125778552:document-classifier/Amazon-Customer-Reviews-Classifier-1708039740', 'DesiredInferenceUnits': 1, 'CurrentInferenceUnits': 0, 'CreationTime': datetime.datetime(2024, 2, 16, 0, 8, 2, 611000, tzinfo=tzlocal()), 'LastModifiedTime': datetime.datetime(2024, 2, 16, 0, 8, 2, 611000, tzinfo=tzlocal())}, 'ResponseMetadata': {'RequestId': '4cc6eb2e-aee9-4055-ad98-f02185d6bc02', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '4cc6eb2e-aee9-4055-ad98-f02185d6bc02', 'content-type': 'application/x-amz-json-1.1', 'content-length': '536', 'date': 'Fri, 16 Feb 2024 00:08:30 GMT'}, 'RetryAttempts': 0}}
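The raw response is hard to scan, so it can help to pull out only the fields that indicate readiness. The following is a minimal sketch, assuming the response shape shown in the output above; `summarize_endpoint` is a hypothetical helper, and `sample_response` is a trimmed copy of that output rather than a live API call.

```python
def summarize_endpoint(describe_response):
    """Extract the status and inference-unit counts from a describe_endpoint response."""
    props = describe_response["EndpointProperties"]
    return {
        "status": props["Status"],
        "desired_units": props["DesiredInferenceUnits"],
        "current_units": props["CurrentInferenceUnits"],
        # The endpoint can serve requests only once it reports IN_SERVICE.
        "ready": props["Status"] == "IN_SERVICE",
    }


# Trimmed copy of the response printed above.
sample_response = {
    "EndpointProperties": {
        "Status": "CREATING",
        "DesiredInferenceUnits": 1,
        "CurrentInferenceUnits": 0,
    }
}
summary = summarize_endpoint(sample_response)
print(summary)
```

Here `current_units` is still 0 while `desired_units` is 1, which matches an endpoint that is still being created.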


# Check Endpoint Status

In [9]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/comprehend/v2/home?region={}#classifier-details/{}/endpoints/{}/details">Comprehend Model Endpoint</a></b>'.format(
            region, comprehend_training_job_arn, comprehend_endpoint_arn
        )
    )
)

In [10]:
import time

max_time = time.time() + 3 * 60 * 60  # 3 hours
while time.time() < max_time:
    describe_response = comprehend.describe_endpoint(EndpointArn=comprehend_endpoint_arn)
    status = describe_response["EndpointProperties"]["Status"]
    print("Endpoint: {}".format(status))

    # Endpoints report FAILED (not IN_ERROR) when creation does not succeed.
    if status in ("IN_SERVICE", "FAILED"):
        break

    time.sleep(15)

Endpoint: CREATING
Endpoint: CREATING
Endpoint: CREATING
Endpoint: CR
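The polling loop above prints one line per poll, which produces long runs of identical output. A possible variant, sketched below, prints only when the status changes and returns the final status. It assumes the terminal endpoint statuses are `IN_SERVICE` and `FAILED`; `wait_for_endpoint` and its parameters are hypothetical names, and the `describe` argument stands in for `comprehend.describe_endpoint` so the function can be exercised without AWS access.

```python
import time


def wait_for_endpoint(describe, endpoint_arn, timeout_s=3 * 60 * 60, poll_s=15, sleep=time.sleep):
    """Poll until the endpoint reaches a terminal status or the timeout passes.

    `describe` is any callable with the same keyword argument and response
    shape as `comprehend.describe_endpoint`.
    """
    deadline = time.monotonic() + timeout_s
    last_status = None
    while time.monotonic() < deadline:
        status = describe(EndpointArn=endpoint_arn)["EndpointProperties"]["Status"]
        if status != last_status:
            # Print only on a change to avoid a wall of identical lines.
            print("Endpoint: {}".format(status))
            last_status = status
        if status in ("IN_SERVICE", "FAILED"):
            return status
        sleep(poll_s)
    raise TimeoutError("Endpoint {} did not settle within {} s".format(endpoint_arn, timeout_s))
```

In the notebook this could be called as `wait_for_endpoint(comprehend.describe_endpoint, comprehend_endpoint_arn)`.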

# [INFO] _Feel free to continue to the next workshop section while this notebook is running._

# Predict with Endpoint

In [11]:
txt = """I loved it!  I will recommend this to everyone."""

response = comprehend.classify_document(Text=txt, EndpointArn=comprehend_endpoint_arn)

import json

print(json.dumps(response, indent=2, default=str))

{
  "Classes": [
    {
      "Name": "5",
      "Score": 0.9940044283866882
    },
    {
      "Name": "1",
      "Score": 0.0027988224755972624
    },
    {
      "Name": "2",
      "Score": 0.001681410358287394
    }
  ],
  "ResponseMetadata": {
    "RequestId": "2525d098-1585-442e-8797-55f6959743a6",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "x-amzn-requestid": "2525d098-1585-442e-8797-55f6959743a6",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "138",
      "date": "Fri, 16 Feb 2024 00:13:15 GMT"
    },
    "RetryAttempts": 0
  }
}
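The response lists several candidate classes with scores. To reduce it to a single prediction, take the class with the highest score. This is a minimal sketch: `top_class` is a hypothetical helper, `sample_response` copies the `Classes` list from the output above, and reading the class names as star ratings is an assumption based on the classifier's name.

```python
def top_class(response):
    """Return (name, score) for the highest-scoring class in a classify_document response."""
    best = max(response["Classes"], key=lambda c: c["Score"])
    return best["Name"], best["Score"]


# "Classes" copied from the response printed above.
sample_response = {
    "Classes": [
        {"Name": "5", "Score": 0.9940044283866882},
        {"Name": "1", "Score": 0.0027988224755972624},
        {"Name": "2", "Score": 0.001681410358287394},
    ]
}
name, score = top_class(sample_response)
print("Predicted class: {} (score {:.3f})".format(name, score))
# prints: Predicted class: 5 (score 0.994)
```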


In [12]:
txt = """It's OK."""

response = comprehend.classify_document(Text=txt, EndpointArn=comprehend_endpoint_arn)

import json

print(json.dumps(response, indent=2, default=str))

{
  "Classes": [
    {
      "Name": "3",
      "Score": 0.9960570335388184
    },
    {
      "Name": "2",
      "Score": 0.0020790337584912777
    },
    {
      "Name": "4",
      "Score": 0.001374129904434085
    }
  ],
  "ResponseMetadata": {
    "RequestId": "25980521-d12b-4951-a587-ea321f7ca9f0",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "x-amzn-requestid": "25980521-d12b-4951-a587-ea321f7ca9f0",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "138",
      "date": "Fri, 16 Feb 2024 00:13:16 GMT"
    },
    "RetryAttempts": 0
  }
}


In [13]:
txt = """Really bad.  I hope they don't make this anymore."""

response = comprehend.classify_document(Text=txt, EndpointArn=comprehend_endpoint_arn)

import json

print(json.dumps(response, indent=2, default=str))

{
  "Classes": [
    {
      "Name": "2",
      "Score": 0.7871689200401306
    },
    {
      "Name": "1",
      "Score": 0.19478803873062134
    },
    {
      "Name": "5",
      "Score": 0.011940845288336277
    }
  ],
  "ResponseMetadata": {
    "RequestId": "cb21df8f-010b-47cc-8062-8ba86d1ed76e",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "x-amzn-requestid": "cb21df8f-010b-47cc-8062-8ba86d1ed76e",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "136",
      "date": "Fri, 16 Feb 2024 00:13:16 GMT"
    },
    "RetryAttempts": 0
  }
}
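Unlike the first two predictions, this one splits its score between classes 2 and 1. One simple way to flag such cases is the gap between the best and second-best scores. The sketch below is illustrative only: `prediction_margin` is a hypothetical helper, the sample data copies the `Classes` lists from the outputs above, and any threshold for "uncertain" would be a choice for the reader to make.

```python
def prediction_margin(response):
    """Return the gap between the best and second-best class scores."""
    scores = sorted((c["Score"] for c in response["Classes"]), reverse=True)
    if len(scores) < 2:
        # With zero or one class there is no runner-up to compare against.
        return scores[0] if scores else 0.0
    return scores[0] - scores[1]


# "Classes" copied from the "It's OK." and "Really bad." responses above.
confident = {
    "Classes": [
        {"Name": "3", "Score": 0.9960570335388184},
        {"Name": "2", "Score": 0.0020790337584912777},
        {"Name": "4", "Score": 0.001374129904434085},
    ]
}
split = {
    "Classes": [
        {"Name": "2", "Score": 0.7871689200401306},
        {"Name": "1", "Score": 0.19478803873062134},
        {"Name": "5", "Score": 0.011940845288336277},
    ]
}
print("margin (It's OK.):    {:.3f}".format(prediction_margin(confident)))
print("margin (Really bad.): {:.3f}".format(prediction_margin(split)))
```

The margin for "Really bad." is roughly 0.59, compared with roughly 0.99 for "It's OK.", which reflects the model hesitating between one and two stars.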


# Release Resources

In [24]:
try:
    comprehend.delete_document_classifier(DocumentClassifierArn=comprehend_training_job_arn)
    print(f'Deleted document classifier at {comprehend_training_job_arn}')

    comprehend.delete_endpoint(EndpointArn=comprehend_endpoint_arn)
    print(f'Deleted comprehend endpoint at {comprehend_endpoint_arn}')
except Exception as error:
    print(error)

An error occurred (ResourceNotFoundException) when calling the DeleteEndpoint operation: Could not find specified endpoint.
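The error above means the endpoint was already gone when the cell ran. Because both deletes share one `try` block, a failure in either call also skips whatever follows it. A possible alternative, sketched below, wraps each delete separately and treats "not found" as already cleaned up. `delete_quietly` is a hypothetical helper. It reads the error code from the exception's `response` attribute, which is where botocore's `ClientError` stores it; this duck-typed check is an assumption made so the sketch runs without importing botocore.

```python
def delete_quietly(delete_call, **kwargs):
    """Call a delete operation; return False if the resource is already gone."""
    try:
        delete_call(**kwargs)
        return True
    except Exception as error:
        # botocore's ClientError exposes the service error code under
        # error.response["Error"]["Code"]; other exceptions lack it.
        code = getattr(error, "response", {}).get("Error", {}).get("Code")
        if code == "ResourceNotFoundException":
            return False
        # Anything else (permissions, resource in use, ...) should surface.
        raise
```

In the notebook this could be used as `delete_quietly(comprehend.delete_endpoint, EndpointArn=comprehend_endpoint_arn)` followed by `delete_quietly(comprehend.delete_document_classifier, DocumentClassifierArn=comprehend_training_job_arn)`. Deleting the endpoint first is likely the safer order, since a classifier that still backs an endpoint may be reported as in use.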


In [14]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>

In [15]:
%%javascript

try {
    Jupyter.notebook.save_checkpoint();
    Jupyter.notebook.session.delete();
}
catch(err) {
    // NoOp
}

<IPython.core.display.Javascript object>