# Inference Notebook
This notebook can be used to deploy your trained model onto SageMaker and perform inference. In the notebook you will:
* Build the inference container
* Deploy the model to SageMaker
* Spot check the inference results

## Build the inference container
In the terminal, run the following commands to build the inference container and push it to ECR.

**Note:** In a production environment, you will likely want to build the container in a CI/CD pipeline by running the script as a job in GitHub, GitLab, or another CI/CD platform.

```bash
$ cd ..
$ . ./script/build-serve-container.sh inference
```

Copy the output URI from the terminal output and paste it into the variable `image_uri` below.

In [3]:
image_uri = '<FILL IN>'
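Before moving on, it can help to check that the pasted value looks like an ECR image URI. The following is a minimal sketch, not part of the build script: the helper name `looks_like_ecr_uri` and the example URI are hypothetical, and the pattern assumes the common `<account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]` shape (digest references such as `@sha256:...` are not covered).

```python
import re

# Assumed shape: <12-digit account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]
ECR_URI_PATTERN = re.compile(
    r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?"
    r"/[a-z0-9._/-]+(:[A-Za-z0-9._-]+)?$"
)


def looks_like_ecr_uri(uri: str) -> bool:
    """Return True if the string matches the assumed ECR image URI shape."""
    return ECR_URI_PATTERN.match(uri) is not None


# Hypothetical example value; replace with the URI printed by the build script.
example_uri = "123456789012.dkr.ecr.us-east-1.amazonaws.com/inference:latest"
print(looks_like_ecr_uri(example_uri))
```

A check like `assert looks_like_ecr_uri(image_uri)` catches a forgotten placeholder before the model is created.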

## Initialize SageMaker Session
Use the SageMaker SDK to initialize a SageMaker session, grabbing the default bucket and role.

In [11]:
import sagemaker
import boto3
import os

sess = sagemaker.Session()
# SageMaker session bucket: used for uploading data, models, and logs.
# SageMaker creates this bucket automatically if it does not exist.
sagemaker_session_bucket = None
if sagemaker_session_bucket is None and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()
 
try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']
 
sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)
 
print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")

In [None]:
from datetime import datetime

# Specify the model data location in S3. If you trained outside of SageMaker, you need to upload the model artifact to S3 yourself.
# If you trained in SageMaker, the model artifact from the training job will already have been uploaded to the default bucket.
MODEL_DATA = '<OUTPUT FROM TRAINING JOB>'

MODEL_NAME = f'parler-tts-model-{datetime.now().strftime("%Y-%m-%d-%H-%M-%S")}'

# Define the SageMaker model: the inference container image plus the model artifact in S3
model = sagemaker.Model(
    image_uri=image_uri,
    role=role,
    name=MODEL_NAME,
    model_data=MODEL_DATA,
    env={
        'NVIDIA_VISIBLE_DEVICES': 'all',
        'CUDA_VISIBLE_DEVICES': '0'
    }
)

In [12]:
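# Deploy the model to a real-time endpoint named after the model.
# With wait=False the call returns immediately, while the endpoint is still being created.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.2xlarge',
    endpoint_name=MODEL_NAME,
    environment={
        'NVIDIA_VISIBLE_DEVICES': 'all',
        'CUDA_VISIBLE_DEVICES': '0'
    },
    wait=False
)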
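Because the deployment is started with `wait=False`, the endpoint is not yet ready when the call returns, and invoking it too early fails. The sketch below shows one way to poll until the endpoint is in service. The helper name `wait_for_endpoint` and its default poll interval and timeout are assumptions for illustration; it expects a boto3 `sagemaker` client and uses its `describe_endpoint` call.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService.

    Raises RuntimeError if creation fails and TimeoutError if the
    (assumed) timeout elapses first.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = response.get("FailureReason", "unknown reason")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint {endpoint_name} still {status} after timeout")
        time.sleep(poll_seconds)
```

In this notebook it could be called as `wait_for_endpoint(boto3.client('sagemaker'), MODEL_NAME)`. boto3 also ships a built-in waiter, `boto3.client('sagemaker').get_waiter('endpoint_in_service')`, which serves the same purpose.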

## Test out the endpoint

In this section, we test the endpoint by sending a sample request. The endpoint is configured to return a JSON payload with the audio file as a base64-encoded string.

In [9]:
import boto3
import json

# Create a SageMaker runtime client
runtime_client = boto3.client('sagemaker-runtime')

# Sample input data
text = "Your Text Here"
description = "Your Description Here"
# Format the input as a JSON payload
input_data = [{"text": text, "description": description}]

# Invoke the endpoint
response = runtime_client.invoke_endpoint(
    EndpointName=MODEL_NAME,
    ContentType='application/json',
    Body=json.dumps(input_data)
)
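The exact shape of the response body depends on how the inference container serializes its output. As a hedged sketch, the helper below accepts either a raw base64 body or a JSON body that wraps the base64 string. The function name `extract_audio_base64` and the `"audio"` field name are hypothetical; check your container's output format before relying on them.

```python
import json


def extract_audio_base64(body: bytes, key: str = "audio") -> str:
    """Return the base64 audio string from a response body.

    Handles a raw base64 body, a JSON string, a JSON object with
    the (hypothetical) `key` field, or a list whose first item is one of those.
    """
    text = body.decode("utf-8").strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return text  # Not JSON: assume the body itself is base64
    if isinstance(parsed, list) and parsed:
        parsed = parsed[0]  # Batched response: take the first item
    if isinstance(parsed, dict):
        return parsed[key]
    if isinstance(parsed, str):
        return parsed
    return text  # e.g. a digits-only base64 string that parsed as a number
```

Note that `response['Body']` is a stream and can only be read once, so read it into a variable first if you want to inspect it. The returned string can then be passed to `base64.b64decode`.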

## Convert to WAV file
The next cell decodes the base64-encoded response stream and writes the audio to a WAV file.

In [10]:
import base64
import numpy as np
from scipy.io.wavfile import write  # Import the write function

# Decode the audio since it's in base64. Note that .read() returns a bytes object
decoded_audio = base64.b64decode(response['Body'].read())

# Convert the decoded bytes to a numpy array
audio_array = np.frombuffer(decoded_audio, dtype=np.float32)

# Clip to [-1, 1] so out-of-range samples do not wrap around, then scale to the int16 range
scaled_audio = (np.clip(audio_array, -1.0, 1.0) * 32767).astype(np.int16)

# Write to WAV file
output_filename = "output.wav"
sample_rate = 44100  # Make sure this matches your model's output sample rate
write(output_filename, sample_rate, scaled_audio)

print(f"Audio saved to {output_filename}")


Audio saved to output.wav
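If SciPy is not available, the same conversion can be sketched with the Python standard library only. This is an alternative, not the notebook's own method: the helper name `float32_base64_to_wav` is hypothetical, and it assumes mono audio encoded as float32 samples in the machine's native byte order, with a default sample rate of 44100 Hz.

```python
import array
import base64
import sys
import wave


def float32_base64_to_wav(audio_b64, path, sample_rate=44100):
    """Decode base64 float32 samples and write them as a 16-bit mono WAV file.

    Returns the number of samples written.
    """
    samples = array.array("f")  # native-order 32-bit floats (assumption)
    samples.frombytes(base64.b64decode(audio_b64))

    # Clip to [-1, 1] before scaling so loud samples do not overflow int16
    pcm = array.array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV stores samples as little-endian

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono (assumption)
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```

As with the SciPy version, the sample rate passed in must match the model's output sample rate, otherwise the audio plays back too fast or too slow.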
