# AWS Marketplace Product Usage Demonstration - Model Packages

## Using Model Package ARN with Amazon SageMaker APIs

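This sample notebook demonstrates how to use the Source Separation model package, listed on AWS Marketplace, with Amazon SageMaker: it creates a model from the package ARN, runs a Batch Transform job, and serves live inference from an endpoint.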



## Set up the environment

In [27]:
import sagemaker as sage
from sagemaker import get_execution_role
import zipfile
import os

role = get_execution_role()

# S3 prefixes
common_prefix = "source_separation"
batch_inference_input_prefix = common_prefix + "/batch-inference-input-data"

### Create the session

The session remembers our connection parameters to Amazon SageMaker. We'll use it to perform all of our Amazon SageMaker operations.

In [5]:
sagemaker_session = sage.Session()

## Create Model

Now we use the Model Package ARN to create a model.

In [28]:
modelpackage_arn = 'arn:aws:sagemaker:us-east-2:057799348421:model-package/source-separation-v11570291536-75ed8128ecee95e142ec4404d884ecad'
print("Using model package arn " + modelpackage_arn)

Using model package arn arn:aws:sagemaker:us-east-2:057799348421:model-package/source-separation-v11570291536-75ed8128ecee95e142ec4404d884ecad
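Model package ARNs carry a region, and the package generally has to be used from a session in that same region. The following is a minimal sketch for checking this, assuming the standard `arn:partition:service:region:account-id:resource` layout; `parse_model_package_arn` is a hypothetical helper, not part of the SageMaker SDK.

```python
def parse_model_package_arn(arn):
    """Split a SageMaker model package ARN into its named components."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("Not a valid ARN: " + arn)
    _, partition, service, region, account_id, resource = parts
    resource_type, _, resource_name = resource.partition("/")
    if service != "sagemaker" or resource_type != "model-package":
        raise ValueError("Not a SageMaker model package ARN: " + arn)
    return {
        "partition": partition,
        "region": region,
        "account_id": account_id,
        "name": resource_name,
    }


arn_info = parse_model_package_arn(
    "arn:aws:sagemaker:us-east-2:057799348421:model-package/"
    "source-separation-v11570291536-75ed8128ecee95e142ec4404d884ecad"
)
print(arn_info["region"])  # us-east-2
```

In the notebook, `arn_info["region"]` could be compared with `sagemaker_session.boto_region_name` before creating the model.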


In [29]:
from sagemaker import ModelPackage
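# Note: RealTimePredictor is the SageMaker Python SDK v1 class; SDK v2 renamed it to sagemaker.predictor.Predictor.

def predict_wrapper(endpoint, session):
    # Build the predictor that deploy() returns for the endpoint
    return sage.RealTimePredictor(endpoint, session, content_type='application/x-recordio-protobuf')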

model = ModelPackage(role=role,
                     model_package_arn=modelpackage_arn,
                     sagemaker_session=sagemaker_session,
                     predictor_cls=predict_wrapper)

## Batch Transform Job

Now let's use the model to run a batch inference job on multiple audio files.

Add your input audio files to the "data/transform" folder.

Create a "batch-transform-output" folder in the "data" directory before running the cells below, if it does not already exist.
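The folder setup described above can also be scripted. This is a small sketch using only the standard library; `ensure_data_dirs` is a hypothetical helper, and the default folder names are taken from the instructions above.

```python
import os


def ensure_data_dirs(base_dir="data", subdirs=("transform", "batch-transform-output")):
    """Create the working folders if they are missing and return their paths."""
    paths = []
    for name in subdirs:
        path = os.path.join(base_dir, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        paths.append(path)
    return paths
```

Calling `ensure_data_dirs()` from the notebook's working directory creates "data/transform" and "data/batch-transform-output"; the input audio files still have to be copied into "data/transform" by hand.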

In [1]:
TRANSFORM_WORKDIR = "data/transform"

transform_input = sagemaker_session.upload_data(TRANSFORM_WORKDIR, key_prefix=batch_inference_input_prefix)
print("Transform input uploaded to " + transform_input)


In [6]:
bucket = sagemaker_session.default_bucket()
output_path = 's3://{}/{}/batch-transform-output'.format(bucket, common_prefix)

# One ml.m4.xlarge instance; each audio file is sent to the model as a single record
transformer = model.transformer(1, 'ml.m4.xlarge', strategy='SingleRecord', output_path=output_path)
transformer.transform(transform_input, content_type='application/x-recordio-protobuf')
transformer.wait()

print("Batch Transform output saved to " + transformer.output_path)

....................Starting the inference server with 4 workers.
[2019-10-31 08:34:17 +0000] [11] [INFO] Starting gunicorn 19.9.0
[2019-10-31 08:34:17 +0000] [11] [INFO] Listening at: unix:/tmp/gunicorn.sock (11)
[2019-10-31 08:34:17 +0000] [11] [INFO] Using worker: gevent
[2019-10-31 08:34:17 +0000] [15] [INFO] Booting worker with pid: 15
[2019-10-31 08:34:17 +0000] [16] [INFO] Booting worker with pid: 16
[2019-10-31 08:34:17 +0000] [17] [INFO] Booting worker with pid: 17
[2019-10-31 08:34:17 +0000] [18] [INFO] Booting worker with pid: 18
Testing...
2019-10-31 08:34:32.776086: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
169.254.255.130 - - [31/Oct/2019:08:34:33 +0000] "GET /ping HTTP/1.1" 200 1 "-" "Go-http-client/1.1"
169.254.255.130 - - [31/Oct/2019:08:34:33 +0000] "GET /executio

### Inspect the Batch Transform Output in S3

In [7]:
import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket(sagemaker_session.default_bucket())
prefix = common_prefix + "/batch-transform-output/"
output_dir = 'data/batch-transform-output'
os.makedirs(output_dir, exist_ok=True)  # make sure the download folder exists

# Download every object under the prefix; each result is saved as a zip archive
for i, object_summary in enumerate(my_bucket.objects.filter(Prefix=prefix), start=1):
    file_name = object_summary.key.split('/')[-1]
    print(file_name)
    my_bucket.download_file(object_summary.key, '{}/output-{}.zip'.format(output_dir, i))

mix2.mp3.out
mix3.mp3.out
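Batch Transform names each result after its input object with `.out` appended, which is why the listing shows `mix2.mp3.out` for an input named `mix2.mp3`. A small sketch for mapping an output key back to its input file name follows; `input_name_from_output_key` is a hypothetical helper, and the key in the example is assembled from the prefix used above.

```python
import posixpath


def input_name_from_output_key(key, suffix=".out"):
    """Return the input file name that a Batch Transform output key refers to."""
    name = posixpath.basename(key)  # S3 keys always use forward slashes
    if name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


print(input_name_from_output_key("source_separation/batch-transform-output/mix2.mp3.out"))  # mix2.mp3
```

This could be used to name the downloaded archives after their inputs instead of numbering them.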


In [8]:
output_dir = 'data/batch-transform-output'
for file in os.listdir(output_dir):
    if not file.endswith('.zip'):
        continue  # skip folders left by a previous extraction
    print(file)
    with zipfile.ZipFile(os.path.join(output_dir, file), 'r') as zip_ref:
        zip_ref.extractall(os.path.join(output_dir, file.split('.')[0]))

output-1.zip
output-2.zip
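Before extracting, it can help to check which files an archive contains. The sketch below uses only the standard library; `list_zip_members` is a hypothetical helper, and because the layout of this model's archives is not documented here, the demonstration builds an in-memory archive with made-up member names.

```python
import io
import zipfile


def list_zip_members(zip_source):
    """Return the member names of a zip archive given a path or file-like object."""
    with zipfile.ZipFile(zip_source, "r") as zip_ref:
        return zip_ref.namelist()


# Illustration only: an in-memory archive with placeholder member names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo_zip:
    demo_zip.writestr("output/example-1.wav", b"placeholder")
    demo_zip.writestr("output/example-2.wav", b"placeholder")
buffer.seek(0)
members = list_zip_members(buffer)
print(members)
```

In the notebook, the same call would take a path such as 'data/batch-transform-output/output-1.zip'.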


## Live Inference Endpoint

Now we demonstrate the creation of an endpoint for live inference on a single audio file.

Add your input audio file to the "data/inference" folder.

In [11]:
predictor = model.deploy(1, 'ml.m4.xlarge', endpoint_name='source-separation-inference')

-----------!

### Choose some data and use it for a prediction

To run inference on a single file, place it in the "data/inference" folder and set the `input_file` variable to its name.


In [22]:
INFERENCE_WORKDIR = "data/inference/"

input_file = "drake-toosie_slide1.mp3"  # Edit the input file name here

INFERENCE_FILE = INFERENCE_WORKDIR + input_file

# Read the audio file as raw bytes and send it to the endpoint
with open(INFERENCE_FILE, 'rb') as file:
    audio_bytes = file.read()

source_separation_output = predictor.predict(audio_bytes)

### Retrieve the zip file from the bytes output

In [23]:
with open('data/output.zip', 'wb') as file:
    file.write(source_separation_output)
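If the endpoint were to return something other than an archive, writing it to `output.zip` would produce a file that only fails later, at extraction time. The following hedged sketch checks the bytes first; `save_zip_bytes` is a hypothetical helper, and it assumes the response is the raw zip payload, as in the cell above.

```python
import io
import os
import zipfile


def save_zip_bytes(payload, destination):
    """Write payload to destination only if it looks like a zip archive."""
    if not zipfile.is_zipfile(io.BytesIO(payload)):
        raise ValueError("Response is not a zip archive; first bytes: %r" % payload[:20])
    with open(destination, "wb") as out_file:
        out_file.write(payload)
    return os.path.getsize(destination)  # number of bytes written
```

In the notebook this would be called as `save_zip_bytes(source_separation_output, 'data/output.zip')`.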

### Extract the output files from the zip file

In [24]:
with zipfile.ZipFile('data/output.zip', 'r') as zip_ref:
    zip_ref.extractall('data/')

### List the output files received

In [None]:
print(os.listdir('data/output'))

### Cleanup endpoint


In [30]:
predictor.delete_endpoint()

ClientError: An error occurred (ValidationException) when calling the DeleteEndpointConfig operation: Could not find endpoint configuration "arn:aws:sagemaker:us-east-2:075178354542:endpoint-config/source-separation-inference".
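The error above indicates that the endpoint configuration no longer existed when the cleanup ran. A possible way to make cleanup tolerant of already-deleted resources is sketched below. `delete_endpoint_if_present` is a hypothetical helper; `client` is expected to behave like `boto3.client("sagemaker")`, and two things are assumptions based on the error shown above: that the configuration shares the endpoint's name, and that a missing resource is reported as a `ValidationException` whose message contains "Could not find".

```python
def delete_endpoint_if_present(client, endpoint_name, config_name=None):
    """Delete an endpoint and its configuration, tolerating resources that are already gone."""
    # Assumption: the endpoint configuration has the same name as the endpoint.
    config_name = config_name or endpoint_name
    steps = [
        ("endpoint", client.delete_endpoint, {"EndpointName": endpoint_name}),
        ("endpoint_config", client.delete_endpoint_config, {"EndpointConfigName": config_name}),
    ]
    outcome = {}
    for label, call, kwargs in steps:
        try:
            call(**kwargs)
            outcome[label] = "deleted"
        except Exception as err:
            # botocore's ClientError exposes the service error under err.response["Error"].
            error = getattr(err, "response", {}).get("Error", {})
            missing = (
                error.get("Code") == "ValidationException"
                and "Could not find" in error.get("Message", "")
            )
            if not missing:
                raise  # anything else is a real failure
            outcome[label] = "not found"
    return outcome


# Example call (requires AWS credentials, so it is left commented out):
# import boto3
# print(delete_endpoint_if_present(boto3.client("sagemaker"), "source-separation-inference"))
```

Matching on the message text is fragile; it is used here only because the same error code can also signal other validation problems.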