# Deploy Cochl.Sense Model Package from AWS Marketplace

[Cochl.Sense](https://cochl.ai) is an AI-powered audio recognition API that analyzes audio and detects various sound events such as music, speech, sirens, alarms, and more.

This notebook demonstrates how to deploy and use the Cochl.Sense model from AWS Marketplace using Amazon SageMaker.

## Contents

1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Set up the environment](#2.-Set-up-the-environment)
3. [Real-time Inference](#3.-Real-time-Inference)
4. [Asynchronous Inference](#4.-Asynchronous-Inference)
5. [Batch Transform](#5.-Batch-Transform)
6. [Clean up](#6.-Clean-up)

## Usage Instructions

You can run this notebook one cell at a time by pressing Shift+Enter.

## 1. Subscribe to the model package

To subscribe to the model package:

1. Open the model package listing page: **Cochl.Sense**
2. On the AWS Marketplace listing, click **Continue to Subscribe**.
3. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
4. Click **Continue to configuration** and choose a **region**. A **Product Arn** is then displayed. This is the model package ARN that you need to specify when creating a deployable model with the SageMaker Python SDK.

Copy the ARN corresponding to your region and assign it to `model_package_arn` in Step 2.

## 2. Set up the environment

### Prerequisites

This notebook requires Python 3.8+ and the `boto3` and `sagemaker` packages. If you're running this notebook locally (not on SageMaker), you may need to set up a virtual environment and install the required packages.

#### Option 1: Using virtual environment (recommended for local development)

```bash
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install required packages
pip install -r requirements.txt
```

#### Option 2: Install directly in notebook

Uncomment and run the `pip install` cell that follows the AWS CLI configuration below to install the packages directly.

#### AWS CLI Configuration (required for local development)

If you're running this notebook locally, you need to configure AWS credentials:

```bash
aws configure
```

You will be prompted to enter:
- AWS Access Key ID
- AWS Secret Access Key
- Default region name (e.g., `us-east-1`)
- Default output format (e.g., `json`)

In [None]:
# Uncomment and run if packages are not installed (skip if running on SageMaker)
# !pip install boto3 sagemaker

In [None]:
# Specify the model package ARN from AWS Marketplace
model_package_arn = "<your-model-package-arn>"  # Replace with your ARN from Step 1
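
Because the ARN is region-specific, it can help to check it before deploying. The sketch below assumes the usual ARN layout `arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>`. The helper names `parse_model_package_arn` and `arn_matches_region` are illustrative and not part of any SDK.

```python
def parse_model_package_arn(arn):
    """Split a SageMaker model package ARN into its named parts."""
    # Assumed layout: arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    resource_type, _, resource_name = parts[5].partition("/")
    if resource_type != "model-package" or not resource_name:
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account_id": parts[4],
        "name": resource_name,
    }


def arn_matches_region(arn, region):
    """Return True when the ARN's region equals the region you plan to deploy in."""
    return parse_model_package_arn(arn)["region"] == region
```

Once `region` is known (see the session cell below), `arn_matches_region(model_package_arn, region)` should return `True`. The unreplaced placeholder string raises `ValueError`.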

In [None]:
import boto3
import json
import time
import sagemaker
from sagemaker import ModelPackage, get_execution_role

In [None]:
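# Initialize SageMaker session
sagemaker_session = sagemaker.Session()
# get_execution_role() resolves the role when running inside SageMaker;
# for local runs, assign the ARN of an IAM role with SageMaker permissions instead
role = get_execution_role()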
region = boto3.Session().region_name
bucket = sagemaker_session.default_bucket()

print(f"SageMaker Role: {role}")
print(f"Region: {region}")
print(f"Default Bucket: {bucket}")
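
When the notebook runs locally, `get_execution_role()` may not find a role. The sketch below shows one possible fallback, assuming the SDK raises `ValueError` when the caller identity is not a role. The helper `resolve_role` and the environment variable `SAGEMAKER_ROLE_ARN` are hypothetical names chosen for this example.

```python
import os


def resolve_role(get_role, env_var="SAGEMAKER_ROLE_ARN"):
    """Return the execution role, falling back to an environment variable.

    get_role: zero-argument callable such as sagemaker.get_execution_role.
    env_var:  hypothetical variable holding an IAM role ARN for local runs.
    """
    try:
        return get_role()
    except ValueError:
        # Assumption: outside SageMaker the SDK raises ValueError because the
        # caller identity is an IAM user rather than a role.
        fallback = os.environ.get(env_var)
        if not fallback:
            raise RuntimeError(
                f"No execution role found; set {env_var} to an IAM role ARN"
            )
        return fallback
```

With this helper, the role line above could read `role = resolve_role(get_execution_role)`.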

### Create a deployable model from the model package

In [None]:
model_name = f"cochl-sense-model-{int(time.time())}"

model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session,
    name=model_name
)

print(f"Model name: {model_name}")

## 3. Real-time Inference

Real-time inference is suitable for low-latency, interactive use cases. The maximum payload size is **25 MB** with a processing time limit of **60 seconds**.

### Deploy the endpoint

In [None]:
# Deploy a real-time endpoint
realtime_endpoint_name = f"cochl-sense-realtime-{int(time.time())}"
instance_type = "ml.g4dn.xlarge"  # GPU instance for faster inference

predictor = model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    endpoint_name=realtime_endpoint_name
)

print(f"Endpoint deployed: {realtime_endpoint_name}")

### Wait for the endpoint to be ready

Endpoint deployment typically takes about **7 minutes**. You can check the endpoint status in the [SageMaker Console](https://console.aws.amazon.com/sagemaker/home#/endpoints). Proceed to the next step once the endpoint status is `InService`.
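
As an alternative to watching the console, the status check can be scripted. The sketch below uses the Boto3 SageMaker waiter `endpoint_in_service`. The function name `wait_for_endpoint` and the default delay and attempt counts are assumptions made for this example. The client is passed in, so the block itself does not need AWS access to load.

```python
def wait_for_endpoint(sagemaker_client, endpoint_name, delay_seconds=30, max_attempts=40):
    """Block until the endpoint is InService, then return its status."""
    # The waiter polls DescribeEndpoint and raises if the endpoint fails
    # or the attempts run out.
    waiter = sagemaker_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"]
```

For example, `wait_for_endpoint(boto3.client("sagemaker"), realtime_endpoint_name)` returns once the endpoint is ready. With the defaults it gives up after about 20 minutes.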

### Prepare input data

The input must be raw audio binary data. Supported formats:
- `audio/mp3`
- `audio/wav`
- `audio/ogg`
- `application/octet-stream` (auto-detection)
- `audio/x-raw; rate={sample_rate}; format={sample_format}; channels={num_channels}` (for raw PCM)

In [None]:
# Load sample audio file
rt_audio_file_path = "data/sample.mp3"
rt_content_type = "audio/mp3"

with open(rt_audio_file_path, "rb") as f:
    rt_audio_data = f.read()

# Initialize clients
rt_runtime_client = boto3.client("sagemaker-runtime")
rt_sagemaker_client = boto3.client("sagemaker")

print(f"Loaded audio file: {rt_audio_file_path}")
print(f"File size: {len(rt_audio_data)} bytes")
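
The content type can also be derived from the file name instead of being hard-coded. The sketch below maps the extensions of the supported formats listed above and fills in the raw PCM template. The helper names are illustrative, and the format token `s16le` mentioned in the usage note is only a placeholder, since the accepted `format` values are not listed here.

```python
from pathlib import Path

# Extension-to-content-type map built from the supported formats listed above.
_CONTENT_TYPES = {".mp3": "audio/mp3", ".wav": "audio/wav", ".ogg": "audio/ogg"}


def content_type_for(path):
    """Pick a content type from the file extension, else fall back to auto-detection."""
    return _CONTENT_TYPES.get(Path(path).suffix.lower(), "application/octet-stream")


def raw_pcm_content_type(sample_rate, sample_format, num_channels):
    """Fill in the raw PCM template: audio/x-raw; rate=...; format=...; channels=..."""
    return (
        f"audio/x-raw; rate={sample_rate}; "
        f"format={sample_format}; channels={num_channels}"
    )
```

For the sample file, `content_type_for(rt_audio_file_path)` gives `audio/mp3`. A call such as `raw_pcm_content_type(16000, "s16le", 1)` produces the header string for headerless PCM data, provided the format token is one the model accepts.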

### Invoke the endpoint

In [None]:
# Invoke the endpoint
rt_response = rt_runtime_client.invoke_endpoint(
    EndpointName=realtime_endpoint_name,
    ContentType=rt_content_type,
    Body=rt_audio_data
)

rt_result = json.loads(rt_response["Body"].read().decode("utf-8"))
print(json.dumps(rt_result, indent=2))

### Invoke with sensitivity control (optional)

You can adjust the detection sensitivity using `CustomAttributes` as a JSON string:

- `default_sensitivity`: Default sensitivity for all tags. Range: [-2, 2] (integer). Default: 0
- `tags_sensitivity`: Per-tag sensitivity adjustment. Range: [-2, 2] (integer)

Sensitivity can be set globally or individually per tag:
- Positive values make tags more likely to be reported (more tags appear); negative values make them less likely (fewer tags appear).
- If certain tags are not being detected frequently, try increasing the sensitivity.
- If you experience too many false detections, lowering the sensitivity may help.

In [None]:
# Invoke with custom sensitivity settings
rt_custom_attributes = json.dumps({
    "default_sensitivity": 1,
    "tags_sensitivity": {"Drum": -2, "Sing": 2}
})

rt_response = rt_runtime_client.invoke_endpoint(
    EndpointName=realtime_endpoint_name,
    ContentType=rt_content_type,
    CustomAttributes=rt_custom_attributes,
    Body=rt_audio_data
)

rt_result = json.loads(rt_response["Body"].read().decode("utf-8"))
print(json.dumps(rt_result, indent=2))
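
Since the sensitivity values must be integers in [-2, 2], a small validation step before the call can catch mistakes early. The helper below is a sketch with a hypothetical name, `build_sensitivity_attributes`. It only checks the range and type described above and serializes the same JSON structure used in the cell.

```python
import json


def build_sensitivity_attributes(default_sensitivity=0, tags_sensitivity=None):
    """Validate sensitivity values and serialize them for CustomAttributes."""
    tags = dict(tags_sensitivity or {})
    for name, value in [("default_sensitivity", default_sensitivity), *tags.items()]:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not -2 <= value <= 2:
            raise ValueError(
                f"{name}: sensitivity must be an integer in [-2, 2], got {value!r}"
            )
    payload = {"default_sensitivity": default_sensitivity}
    if tags:
        payload["tags_sensitivity"] = tags
    return json.dumps(payload)
```

`build_sensitivity_attributes(1, {"Drum": -2, "Sing": 2})` yields the same string as the `json.dumps` call above, while a value such as `3` or `1.5` raises `ValueError` before any request is sent.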

### Delete the real-time endpoint

Delete the endpoint to avoid incurring charges when not in use.

In [None]:
# Delete the real-time endpoint
rt_sagemaker_client.delete_endpoint(EndpointName=realtime_endpoint_name)
print(f"Endpoint deleted: {realtime_endpoint_name}")
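
`delete_endpoint` removes the endpoint but leaves its endpoint configuration in the account. The sketch below could be used in place of the cell above to remove both. The function name `delete_endpoint_and_config` is an assumption for this example. The calls `describe_endpoint`, `delete_endpoint`, and `delete_endpoint_config` are standard Boto3 SageMaker client operations.

```python
def delete_endpoint_and_config(sagemaker_client, endpoint_name):
    """Delete an endpoint together with the endpoint configuration it uses."""
    # Look up the configuration name before the endpoint is deleted.
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    sagemaker_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name
```

The same helper applies to the asynchronous endpoint created later in this notebook.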

## 4. Asynchronous Inference

Asynchronous inference is suitable for large files or when you don't need immediate results. The maximum payload size is **1 GB** with a processing time limit of **60 minutes**.
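
The payload limits quoted in this notebook (25 MB for real-time, 1 GB for asynchronous inference) suggest a simple size-based choice between the two modes. The sketch below is one way to express it. It assumes decimal megabytes, which is the more conservative reading, and the function name and return labels are illustrative.

```python
# Limits quoted in this notebook, interpreted conservatively as decimal units.
REALTIME_LIMIT_BYTES = 25 * 1000 * 1000
ASYNC_LIMIT_BYTES = 1000 * 1000 * 1000


def choose_inference_mode(payload_bytes):
    """Suggest an inference mode based on the payload size in bytes."""
    if payload_bytes <= REALTIME_LIMIT_BYTES:
        return "realtime"
    if payload_bytes <= ASYNC_LIMIT_BYTES:
        return "async"
    # Larger files would need to be split before submission.
    return "too-large"
```

For a local file, `choose_inference_mode(os.path.getsize(path))` gives a first indication. Processing time limits still apply independently of size.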

### Create an asynchronous inference endpoint

In [None]:
from sagemaker.async_inference import AsyncInferenceConfig

async_endpoint_name = f"cochl-sense-async-{int(time.time())}"
async_output_path = f"s3://{bucket}/cochl-sense/async-output/"
async_failure_path = f"s3://{bucket}/cochl-sense/async-failure/"

# Create async inference config
async_config = AsyncInferenceConfig(
    output_path=async_output_path,
    failure_path=async_failure_path,
    max_concurrent_invocations_per_instance=4
)

# Deploy async endpoint
async_predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name=async_endpoint_name,
    async_inference_config=async_config
)

print(f"Async endpoint deployed: {async_endpoint_name}")

### Wait for the async endpoint to be ready

Endpoint deployment typically takes about **7 minutes**. You can check the endpoint status in the [SageMaker Console](https://console.aws.amazon.com/sagemaker/home#/endpoints). Proceed to the next step once the endpoint status is `InService`.

### Upload input file to S3

In [None]:
# Initialize clients and load audio file
async_s3_client = boto3.client("s3")
async_runtime_client = boto3.client("sagemaker-runtime")
async_sagemaker_client = boto3.client("sagemaker")

async_audio_file_path = "data/sample.mp3"

# Upload audio file to S3
async_s3_input_key = "cochl-sense/async-input/sample.mp3"
async_s3_input_location = f"s3://{bucket}/{async_s3_input_key}"

async_s3_client.upload_file(async_audio_file_path, bucket, async_s3_input_key)
print(f"Uploaded to: {async_s3_input_location}")

### Invoke the async endpoint

In [None]:
# Invoke async endpoint
async_response = async_runtime_client.invoke_endpoint_async(
    EndpointName=async_endpoint_name,
    InputLocation=async_s3_input_location,
    ContentType="audio/mp3",
    Accept="application/json"
)

async_output_location = async_response["OutputLocation"]
print(f"Output will be stored at: {async_output_location}")

### Wait for and retrieve the result

In [None]:
# Poll for the result
import urllib.parse

async_parsed_url = urllib.parse.urlparse(async_output_location)
async_output_bucket = async_parsed_url.netloc
async_output_key = async_parsed_url.path.lstrip("/")

async_max_retries = 60
async_retry_interval = 5

for i in range(async_max_retries):
    try:
        async_obj = async_s3_client.get_object(Bucket=async_output_bucket, Key=async_output_key)
        async_result = json.loads(async_obj["Body"].read().decode("utf-8"))
        print("Inference completed!")
        print(json.dumps(async_result, indent=2))
        break
    except async_s3_client.exceptions.NoSuchKey:
        print(f"Waiting for result... ({i+1}/{async_max_retries})")
        time.sleep(async_retry_interval)
else:
    print("Timeout waiting for async inference result.")
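
The loop above only watches the output location, so a failed request looks the same as a slow one until the timeout. The sketch below also checks a failure location. It assumes the `invoke_endpoint_async` response carries a `FailureLocation` field next to `OutputLocation`. The helper names and the returned status labels are choices made for this example.

```python
import json
import time
from urllib.parse import urlparse


def _try_get_text(s3_client, uri):
    """Return the object body as text, or None if the object does not exist yet."""
    parsed = urlparse(uri)
    try:
        obj = s3_client.get_object(Bucket=parsed.netloc, Key=parsed.path.lstrip("/"))
    except s3_client.exceptions.NoSuchKey:
        return None
    return obj["Body"].read().decode("utf-8")


def poll_async_result(s3_client, output_uri, failure_uri,
                      max_retries=60, interval=5, sleep=time.sleep):
    """Return ("ok", result), ("failed", error_text) or ("timeout", None)."""
    for _ in range(max_retries):
        body = _try_get_text(s3_client, output_uri)
        if body is not None:
            return "ok", json.loads(body)
        error = _try_get_text(s3_client, failure_uri)
        if error is not None:
            return "failed", error
        sleep(interval)
    return "timeout", None
```

A call could look like `poll_async_result(async_s3_client, async_response["OutputLocation"], async_response["FailureLocation"])`.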

### Delete the async endpoint

In [None]:
# Delete the async endpoint
async_sagemaker_client.delete_endpoint(EndpointName=async_endpoint_name)
print(f"Async endpoint deleted: {async_endpoint_name}")

## 5. Batch Transform

Batch Transform is suitable for running inference on large datasets stored in S3.

For datasets with mixed audio formats (mp3, wav, ogg), use `application/octet-stream` as the ContentType to enable auto-detection.

### Upload batch input data to S3

In [None]:
# Initialize clients and set paths
batch_s3_client = boto3.client("s3")
batch_sagemaker_client = boto3.client("sagemaker")

batch_audio_file_path = "data/sample.mp3"

batch_input_prefix = "cochl-sense/batch-input/"
batch_output_prefix = "cochl-sense/batch-output/"

batch_input_path = f"s3://{bucket}/{batch_input_prefix}"
batch_output_path = f"s3://{bucket}/{batch_output_prefix}"

# Upload sample file
batch_s3_client.upload_file(batch_audio_file_path, bucket, f"{batch_input_prefix}sample.mp3")
print(f"Batch input path: {batch_input_path}")
print(f"Batch output path: {batch_output_path}")

### Create and run a batch transform job

In [None]:
batch_transform_job_name = f"cochl-sense-batch-{int(time.time())}"

# Create batch transform job
batch_transformer = model.transformer(
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    output_path=batch_output_path,
    max_payload=100,  # Max payload size in MB
    max_concurrent_transforms=1,
)

# Start the transform job
batch_transformer.transform(
    data=batch_input_path,
    data_type="S3Prefix",
    content_type="application/octet-stream",  # Auto-detection for mixed formats
    split_type="None",
    job_name=batch_transform_job_name,
    wait=False
)

print(f"Batch transform job started: {batch_transform_job_name}")

### Monitor the batch transform job

The batch transform job typically takes about **7 minutes** to provision and start processing. You can check the job status in the [SageMaker Console](https://console.aws.amazon.com/sagemaker/home#/transform-jobs). Proceed to the next step once the job status is `Completed`.
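
The job status can also be polled from code rather than the console. The sketch below repeatedly calls `describe_transform_job` until the job reaches a terminal state. The function name `wait_for_transform_job`, the polling interval, and the poll limit are assumptions for this example.

```python
import time


def wait_for_transform_job(sagemaker_client, job_name,
                           poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll until the job reaches a terminal state; return (status, failure_reason)."""
    for _ in range(max_polls):
        description = sagemaker_client.describe_transform_job(TransformJobName=job_name)
        status = description["TransformJobStatus"]
        if status in ("Completed", "Failed", "Stopped"):
            # FailureReason is only present when the job failed.
            return status, description.get("FailureReason")
        sleep(poll_seconds)
    raise TimeoutError(f"Transform job {job_name} did not finish in time")
```

For example, `wait_for_transform_job(batch_sagemaker_client, batch_transform_job_name)` returns `("Completed", None)` on success. Calling `batch_transformer.wait()` from the SageMaker Python SDK is a shorter alternative when the transformer object is still in scope.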

### Retrieve batch transform results

In [None]:
# List and display batch transform output files
batch_response = batch_s3_client.list_objects_v2(Bucket=bucket, Prefix=batch_output_prefix)

if "Contents" in batch_response:
    for obj in batch_response["Contents"]:
        batch_key = obj["Key"]
        print(f"\n=== {batch_key} ===")
        batch_output_obj = batch_s3_client.get_object(Bucket=bucket, Key=batch_key)
        batch_result = json.loads(batch_output_obj["Body"].read().decode("utf-8"))
        print(json.dumps(batch_result, indent=2))
else:
    print("No output files found.")
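
A single `list_objects_v2` call returns at most 1,000 keys, so larger batches need pagination. The sketch below uses the Boto3 paginator for `list_objects_v2` and keeps only keys ending in `.out`, the suffix Batch Transform appends to each input file name. The generator name `iter_output_keys` is illustrative.

```python
def iter_output_keys(s3_client, bucket, prefix, suffix=".out"):
    """Yield every key under the prefix that ends with the given suffix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages without matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(suffix):
                yield obj["Key"]
```

Iterating `iter_output_keys(batch_s3_client, bucket, batch_output_prefix)` can replace the `list_objects_v2` call in the cell above when the output prefix holds many files.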

## 6. Clean up

Delete the model and clean up resources to avoid incurring unnecessary charges.

In [None]:
# Delete the model
cleanup_sagemaker_client = boto3.client("sagemaker")
cleanup_sagemaker_client.delete_model(ModelName=model_name)
print(f"Model deleted: {model_name}")

### Optional: Clean up S3 data

In [None]:
# Uncomment the following lines to delete S3 data

# cleanup_s3_client = boto3.client("s3")

# # Delete async input and output
# cleanup_s3_client.delete_object(Bucket=bucket, Key=async_s3_input_key)
# cleanup_s3_client.delete_object(Bucket=async_output_bucket, Key=async_output_key)

# # Delete batch input/output
# for prefix in [batch_input_prefix, batch_output_prefix]:
#     cleanup_response = cleanup_s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
#     if "Contents" in cleanup_response:
#         for obj in cleanup_response["Contents"]:
#             cleanup_s3_client.delete_object(Bucket=bucket, Key=obj["Key"])

# print("S3 data cleaned up.")

### Unsubscribe from the model package

If you no longer need the model, you can cancel your AWS Marketplace subscription:

1. Navigate to the **Machine Learning** tab on the [Your Software subscriptions page](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_ind498).
2. Locate the listing whose subscription you want to cancel, and then click **Cancel Subscription**.

## Output Format Reference

The inference result is returned in JSON format:

```json
{
  "metadata": {
    "content_type": "audio/mp3",
    "length_sec": 3.012,
    "size_byte": 12345,
    "name": "sagemaker_input"
  },
  "data": [
    {
      "tags": [
        {
          "name": "Drum",
          "probability": 0.51982635
        }
      ],
      "start_time": 0,
      "end_time": 2
    }
  ]
}
```

For available tags, see: https://docs.cochl.ai/sense/home/soundtags/
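
To work with the result programmatically, the nested structure above can be flattened into one row per detected tag. The sketch below assumes the field names shown in the sample JSON. The function name `tags_above` and the default threshold of 0.5 are choices made for this example.

```python
def tags_above(result, threshold=0.5):
    """Return (start_time, end_time, tag_name, probability) for confident tags."""
    hits = []
    for window in result.get("data", []):
        for tag in window.get("tags", []):
            if tag["probability"] >= threshold:
                hits.append(
                    (window["start_time"], window["end_time"],
                     tag["name"], tag["probability"])
                )
    return hits
```

Applied to the sample output above, `tags_above(result)` returns a single row for the `Drum` tag between 0 and 2 seconds, and raising the threshold to 0.6 returns an empty list.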