# Comprehend Custom Classification Training & Deployment

This notebook trains and deploys an Amazon Comprehend custom classifier. The same steps can be run from any IDE, such as Cloud9; a notebook is used here to give users a glimpse of the SageMaker experience.


Get the execution role for the notebook instance. This is the IAM role that you created for your notebook instance. You pass this role to the Comprehend training job so that it can read the training data.

Important Note: Run this notebook in us-east-1, because the sample S3 bucket and data are located in us-east-1.

In [None]:
from sagemaker import get_execution_role
role = get_execution_role()
role

Start the custom classification training job. This starts a training job in the Comprehend service and produces the trained model.
Note: You can find the bucket name in the Resources section of the CloudFormation stack in the AWS console by searching for 'AWS::S3::Bucket'. The bucket name is in the 'Physical ID' column.

In [None]:
import boto3

client = boto3.client('comprehend')

response = client.create_document_classifier(
    DocumentClassifierName='email-classifications-sample',  # Name of the classifier
    DataAccessRoleArn=role,
    InputDataConfig={
        'S3Uri': 's3://<Bucket Name from CloudFormation stack resource section>/Comprehend_Training_Data.csv'
    },  # Read-only bucket holding the sample data. Point this at your own bucket to use your own data.
    LanguageCode='en'
)
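
For reference, Comprehend's CSV format for multi-class training expects one document per row, with the class label in the first column and the document text in the second, and no header row. The sketch below shows one way to produce such a file. The labels, sample texts, and the helper name `write_training_csv` are hypothetical and are not taken from the sample dataset.

```python
import csv
import io


def write_training_csv(rows, fileobj):
    """Write (label, text) pairs as headerless two-column CSV rows."""
    writer = csv.writer(fileobj, lineterminator="\n")
    for label, text in rows:
        # Keep each document on a single line by collapsing internal whitespace.
        writer.writerow([label, " ".join(text.split())])


# Hypothetical labels and emails, for illustration only.
sample_rows = [
    ("TRANSACTION_STATUS", "Can you send the status of my transaction?"),
    ("ACCOUNT_UPDATE", "Please change the phone number\non my account."),
]

buffer = io.StringIO()
write_training_csv(sample_rows, buffer)
print(buffer.getvalue())
```

In practice you would write to a real file and upload it to the S3 location referenced by `S3Uri`.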

Check the status of the training job. This may take up to 20 minutes. Please wait until you see the message "Training Completed".

In [None]:
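import time

modelarn = response["DocumentClassifierArn"]
print("Training started")
# Poll the classifier status until training finishes or fails.
while True:
    response_des = client.describe_document_classifier(
        DocumentClassifierArn=modelarn
    )
    train_status = response_des['DocumentClassifierProperties']['Status']
    print(train_status)
    if train_status == 'TRAINED':
        break
    if train_status in ('IN_ERROR', 'STOPPED'):
        # Surface the failure instead of polling forever.
        message = response_des['DocumentClassifierProperties'].get('Message', '')
        raise RuntimeError("Training ended with status " + train_status + ": " + message)
    time.sleep(30)

print("Training Completed")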

Create an endpoint for the trained model.

In [None]:
# Create an endpoint for the trained classifier
response_ep = client.create_endpoint(
    EndpointName='email-classifications',
    ModelArn=modelarn,
    DesiredInferenceUnits=1,  # Raise this value to provision more than one inference unit.
    #ClientRequestToken='string',
    Tags=[
        {
            'Key': 'Name',
            'Value': 'email classification'
        },
    ],
    DataAccessRoleArn=role
)
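
Endpoint creation is asynchronous, so the endpoint is not usable until its status reaches `IN_SERVICE`. The following is a minimal sketch of a polling helper built on the `describe_endpoint` call. The helper name `wait_for_endpoint` and its default poll interval and timeout are assumptions chosen for illustration, not part of the original notebook.

```python
import time


def wait_for_endpoint(client, endpoint_arn, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is IN_SERVICE.

    Raises RuntimeError if creation fails and TimeoutError if it takes too long.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        props = client.describe_endpoint(EndpointArn=endpoint_arn)["EndpointProperties"]
        status = props["Status"]
        if status == "IN_SERVICE":
            return props
        if status == "FAILED":
            raise RuntimeError("Endpoint creation failed: " + props.get("Message", "no message"))
        if time.monotonic() >= deadline:
            raise TimeoutError("Endpoint still " + status + " after " + str(timeout_seconds) + " seconds")
        time.sleep(poll_seconds)


# Usage in this notebook (needs AWS credentials, so it is left commented out):
# wait_for_endpoint(client, response_ep["EndpointArn"])
```

Taking the client as a parameter keeps the helper independent of how the Comprehend client was created.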

Check the Endpoint ARN

In [None]:
response_ep
eparn=response_ep["EndpointArn"]
eparn

Once the endpoint status is IN_SERVICE, test the trained model by sending some sample sentences.

In [None]:
response_cd = client.classify_document(
    Text='Can you send the status of the transaction id:278960001',
    EndpointArn=eparn
)
response_cd
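
For a multi-class classifier, the response carries a `Classes` list whose entries each have a `Name` and a `Score`. A small sketch of picking the most likely class is shown below. The helper name `top_class` and the example class names and scores are hypothetical and do not come from the sample model.

```python
def top_class(classification_response):
    """Return (name, score) of the highest-scoring class, or None if there are none."""
    classes = classification_response.get("Classes", [])
    if not classes:
        return None
    best = max(classes, key=lambda c: c["Score"])
    return best["Name"], best["Score"]


# Hypothetical response in the shape returned for a multi-class classifier.
# The class names and scores are made up for illustration.
example_response = {
    "Classes": [
        {"Name": "ACCOUNT_UPDATE", "Score": 0.07},
        {"Name": "TRANSACTION_STATUS", "Score": 0.91},
    ]
}
print(top_class(example_response))
```

In the notebook, the same helper could be applied to `response_cd`.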

This endpoint will now be used to classify emails coming from customers via Amazon WorkMail.