# Deploy KanjuTech Speech Language Detection Model Package from AWS Marketplace 

KanjuTech's Speech Language Detection model provides secure, end-to-end, real-time language identification. The model accurately detects the spoken language from just three words.

This sample notebook shows you how to deploy the **KanjuTech Speech Language Detection Model** using Amazon SageMaker.

> **Note**: This reference notebook cannot run unless you make the suggested changes in the notebook.

## Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open it from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role you use has the **AmazonSageMakerFullAccess** policy attached.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions, and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    2. Or your AWS account already has a **KanjuTech Speech Language Detection Model** subscription. If so, skip the step [Subscribe to the model package](#1.-Subscribe-to-the-model-package).
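
One way to check the three AWS Marketplace permissions listed above is the IAM policy simulator. The sketch below is an illustration, not part of the original notebook: the helper name `missing_marketplace_permissions` is hypothetical, and it assumes the caller is allowed to invoke `iam:SimulatePrincipalPolicy`.

```python
REQUIRED_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def missing_marketplace_permissions(iam_client, role_arn, actions=REQUIRED_ACTIONS):
    """Return the actions that the IAM policy simulator does not report as allowed."""
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=role_arn,
        ActionNames=list(actions),
    )
    return [
        result["EvalActionName"]
        for result in response["EvaluationResults"]
        if result["EvalDecision"] != "allowed"
    ]


# Example usage inside the notebook (requires AWS credentials):
#   import boto3
#   from sagemaker import get_execution_role
#   print(missing_marketplace_permissions(boto3.client("iam"), get_execution_role()))
```

An empty list suggests the role has all three permissions. The simulator only evaluates IAM policies, so it cannot confirm that your organization permits Marketplace subscriptions.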

## Contents:
1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Create an endpoint and perform real-time inference](#2.-Create-an-endpoint-and-perform-real-time-inference)
    1. [Create an endpoint](#A.-Create-an-endpoint)
    2. [Create input payload](#B.-Create-input-payload)
    3. [Perform real-time inference](#C.-Perform-real-time-inference)
    4. [Delete endpoint and model](#D.-Delete-endpoint-and-model)
3. [Troubleshooting](#3.-Troubleshooting)
4. [Questions](#4.-Questions)
    
We recommend using an ml.g4dn.xlarge instance for real-time inference.

## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page **KanjuTech Speech Language Detection Model**.
1. On the AWS Marketplace listing, click on the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **"Accept Offer"** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click on the **Continue to configuration** button and choose a **region**, you will see a **Product Arn** displayed. This is the model package ARN that you need to specify while creating a deployable model using Boto3. Copy the ARN corresponding to your region and specify it in the following cell.

In [None]:
model_package_arn = "<Specify the Model package ARN that corresponds to your AWS region>"
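
A common mistake at this step is leaving the placeholder in place or pasting an ARN for a different region. The following sketch checks the value before deployment. The helper name `check_model_package_arn` and the example ARN in the usage note are hypothetical. The pattern assumes the usual `arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>` layout.

```python
import re

# Pattern for arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>
_MODEL_PACKAGE_ARN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):sagemaker:(?P<region>[a-z0-9-]+):"
    r"(?P<account>\d{12}):model-package/(?P<name>.+)$"
)


def check_model_package_arn(arn, session_region):
    """Return the parsed ARN fields, or raise ValueError with a readable message."""
    match = _MODEL_PACKAGE_ARN.match(arn)
    if match is None:
        raise ValueError(f"Not a SageMaker model package ARN: {arn!r}")
    fields = match.groupdict()
    if fields["region"] != session_region:
        raise ValueError(
            f"ARN is for region {fields['region']}, but the session uses {session_region}"
        )
    return fields
```

In the notebook you could call `check_model_package_arn(model_package_arn, boto3.Session().region_name)` before creating the model.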

In [None]:
import sagemaker as sage
from sagemaker import ModelPackage
from sagemaker import get_execution_role
import boto3
import s3fs

In [None]:
role = get_execution_role()

sagemaker_session = sage.Session()

#bucket = 's3://<Name-of-your-existing-S3-bucket>' # Write the name of your S3 bucket where you store your input files and want to save the output
runtime = boto3.client("sagemaker-runtime")  # SageMaker Runtime client used to invoke the endpoint

real_time_content_type = "audio/mp4"
accept = "text/xml"

## 2. Create an endpoint and perform real-time inference

See [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html) if you want to understand how real-time inference with Amazon SageMaker works.

### A. Create an endpoint

In [None]:
model_name = "kanjutech-language-detection" # Write the endpoint name

In [None]:
# Specify instance type
real_time_inference_instance_type = "ml.g4dn.xlarge"

> **Note**: We recommend using an ml.g4dn.xlarge instance for real-time inference.

In [None]:
# Create a deployable model from the model package.
model = ModelPackage(role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session)

# Deploy the model
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name)

# Wait until it prints "!" after "----------"

Once the endpoint has been created, you can perform real-time inference.

If you get an error here, see the [Troubleshooting](#3.-Troubleshooting) section.

**WARNING!** 

**Remember to** [**Delete your endpoint and resources**](#D.-Delete-endpoint-and-model) whenever you finish your work with real-time inference to stop incurring charges!

For more information, please visit this [page](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints-delete-resources.html).
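
If the notebook kernel restarts during deployment, or you want to confirm that the endpoint is ready before invoking it, you can poll its status yourself. The sketch below is an illustration under stated assumptions: the helper name `wait_for_endpoint` and its default intervals are hypothetical, and `sm_client` is expected to be a `boto3.client("sagemaker")` instance.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService or has failed."""
    for _ in range(max_polls):
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        sleep(poll_seconds)  # still Creating or Updating, so wait and retry
    raise TimeoutError(f"Endpoint {endpoint_name} not InService after {max_polls} polls")


# Example usage inside the notebook (requires AWS credentials):
#   import boto3
#   wait_for_endpoint(boto3.client("sagemaker"), model_name)
```

boto3 also ships a built-in waiter, `get_waiter("endpoint_in_service")`, which serves the same purpose. The explicit loop is shown here only because it makes the status values visible.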

### B. Create input payload

For this example, we detect the language from audio stored in Amazon S3.
> **Note**: The duration of the audio/chunk needs to be less than 15 sec.
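
For recordings longer than the limit, one option is to plan consecutive chunks that each stay under 15 seconds and send them one at a time. The sketch below only computes the time windows. The helper name `chunk_boundaries` and the 14-second default, chosen to leave a safety margin, are assumptions. Cutting the audio itself requires an audio tool that is not shown here.

```python
def chunk_boundaries(total_seconds, max_chunk_seconds=14.0):
    """Split [0, total_seconds] into consecutive (start, end) windows of limited length."""
    total_seconds = float(total_seconds)
    if total_seconds <= 0 or max_chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    boundaries = []
    start = 0.0
    while start < total_seconds:
        # The last window is shortened so it ends exactly at the end of the recording.
        end = min(start + max_chunk_seconds, total_seconds)
        boundaries.append((start, end))
        start = end
    return boundaries
```

For a 30-second recording, `chunk_boundaries(30)` yields three windows, the last of which is 2 seconds long.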

In [None]:
bucket = 's3://kanjutech-transcription-speaker-diarization'

In [None]:
# Specify S3 folders
endpoint_input = bucket+'/'+'endpoint-audio' # Your folder on the S3 bucket where you store input audio

In [None]:
fs = s3fs.S3FileSystem()
fs_ls = fs.ls(endpoint_input)
paths = list(filter(lambda k: '.' in k, fs_ls))
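
The `'.' in k` filter above keeps any path that contains a dot anywhere, including in the bucket or folder name. A stricter alternative, sketched below, matches on the file extension instead. The helper name `select_audio_paths` and the extension list are hypothetical and should be adjusted to the formats you actually upload.

```python
from pathlib import PurePosixPath

# Hypothetical extension list; adjust it to the formats you actually upload.
AUDIO_EXTENSIONS = {".m4a", ".mp4"}


def select_audio_paths(paths, extensions=AUDIO_EXTENSIONS):
    """Keep only paths whose file name ends with one of the given extensions."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in extensions]
```

In the notebook, `paths = select_audio_paths(fs_ls)` could replace the `filter` call.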

In [None]:
# For this example, we process only one file from paths
input_file_path = paths[0]

In [None]:
# Convert audio to bytes data
with fs.open(input_file_path, "rb") as f:
    data = f.read()

### C. Perform real-time inference

Invoke the endpoint for real-time inference.
> **Note**: The duration of the audio/chunk needs to be less than 15 sec.

In [None]:
# Request language detection
results = runtime.invoke_endpoint(
    EndpointName=model_name,
    Body=data, 
    ContentType=real_time_content_type,  
    Accept=accept,  
)

# Print detected language
print(results['Body'].read().decode('utf-8'))
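
Because the request sets `Accept` to `text/xml`, you may want to work with the response as XML rather than printing it. The response schema is not documented in this notebook, so the sketch below makes no assumption about element names: it returns the decoded text and, when the body is well-formed XML, the parsed root element. The helper name `parse_response_body` is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_response_body(raw_bytes):
    """Return (text, root); root is None when the body is not well-formed XML."""
    text = raw_bytes.decode("utf-8")
    try:
        root = ET.fromstring(raw_bytes)
    except ET.ParseError:
        root = None
    return text, root


# Example usage inside the notebook:
#   text, root = parse_response_body(results["Body"].read())
```

The response body is a stream that can be read only once, so call this helper instead of the `print` statement rather than after it.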

### D. Delete endpoint and model

Now that you have successfully performed a real-time inference, you no longer need the endpoint. You can terminate the endpoint to avoid being charged.

In [None]:
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
model.delete_model()
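
The cell above relies on the `model` object, which is lost if the notebook kernel restarts. As a fallback, the resources can be looked up from the endpoint name alone. The sketch below is an illustration: the helper name `delete_endpoint_resources` is hypothetical, and `sm_client` is expected to be a `boto3.client("sagemaker")` instance.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint, its endpoint configuration, and the models it references."""
    # Look up the configuration and models before deleting anything.
    config_name = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for name in model_names:
        sm_client.delete_model(ModelName=name)
    return config_name, model_names


# Example usage inside the notebook (requires AWS credentials):
#   import boto3
#   delete_endpoint_resources(boto3.client("sagemaker"), "kanjutech-language-detection")
```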

For more information, please visit this [page](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints-delete-resources.html).

## 3. Troubleshooting

### Cannot create already existing endpoint configuration

This error occurs when the user interrupts the inference deployment and tries to rerun it. To restart the deployment, first delete the previously created configurations. You can find this command in the [Delete endpoint and model](#D.-Delete-endpoint-and-model) cell.

Please wait for the deployment to complete. This process may take several minutes.

### ResourceLimitExceeded

If you receive an error due to the lack of a quota for your instance type, you can increase it by sending a request:
1. Open the **Amazon SageMaker** [**Service Quotas**](https://console.aws.amazon.com/servicequotas/home/services/sagemaker/quotas) page.
2. Filter **Service quotas** by "ml.g4dn.xlarge for endpoint usage" for real-time inference.
3. Select the quota and click the **Request increase at account-level** button.
4. Enter the total amount you want the quota to be and click the **Request** button.
5. Wait until AWS Support increases your quotas for this instance type.

> **Note**: To speed up the processing of your request, please indicate in your correspondence with AWS Support that this type of instance is required for this product.

For more information about requesting a quota increase, visit this [page](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html).
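
Before sending a request, you can check the current quota value programmatically. The sketch below is an illustration: the helper name `find_sagemaker_quota` is hypothetical, `sq_client` is expected to be a `boto3.client("service-quotas")` instance, and the quota name is the one used in the console filter above.

```python
def find_sagemaker_quota(sq_client, quota_name):
    """Return the SageMaker quota entry whose QuotaName matches, or None if absent."""
    kwargs = {"ServiceCode": "sagemaker"}
    while True:
        page = sq_client.list_service_quotas(**kwargs)
        for quota in page["Quotas"]:
            if quota["QuotaName"] == quota_name:
                return quota
        token = page.get("NextToken")
        if not token:
            return None  # all pages searched without a match
        kwargs["NextToken"] = token


# Example usage inside the notebook (requires AWS credentials):
#   import boto3
#   quota = find_sagemaker_quota(
#       boto3.client("service-quotas"), "ml.g4dn.xlarge for endpoint usage"
#   )
#   print(quota["Value"] if quota else "quota not found")
```

A value of 0 means no instance of this type can be used for endpoints until the increase is granted.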

## 4. Questions

If you have any questions about our product, feel free to email us at aws@kanju.tech or schedule a [meeting](https://calendly.com/kanjutech).