# Audio classification model inference

* Model - pretrained fastai2 xresnet18 using fastai2 audio library

**fastai2_audio**

The additional requirements of the fastai2_audio package will be dealt with below, using a clone of the following repo:

https://github.com/rbracco/fastai2_audio

The demo was run and tested on a SageMaker Notebook instance deployed as per the instructions outlined [here](https://forums.fast.ai/t/platform-amazon-sagemaker-aws/66020).

Note - the above link is only accessible as part of the ongoing fastai course for the time being.


## Note re dependencies

These are set up using the LifeCycle Configuration for the notebook instance. The original fastai2 LifeCycle Configuration provided by Matt McClean (https://forums.fast.ai/t/fastai2-sagemaker/66444/6) has been modified to also install the fastai2 audio GitHub repo:

`!pip install git+https://github.com/mikful/fastai2_audio.git`


As such, the installations of fastai2 and fastai2 audio below are not necessary if using this LifeCycle Configuration.

However, libsndfile must still be installed to avoid an OSError (this will need to be added to the LifeCycle Configuration at some point):

`!conda install -c conda-forge libsndfile --yes`
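
To confirm that the install fixed the problem, we can check whether the SoundFile package is able to load libsndfile. The snippet below is a minimal sketch; `soundfile_status` is a hypothetical helper name, and it assumes the OSError comes from the `soundfile` package that Librosa depends on.

```python
def soundfile_status():
    """Report whether the SoundFile package can load the libsndfile library."""
    try:
        # Importing soundfile raises OSError when libsndfile cannot be found
        import soundfile
    except ImportError:
        return "soundfile package is not installed"
    except OSError as err:
        return f"libsndfile could not be loaded: {err}"
    return f"ok, libsndfile version: {soundfile.__libsndfile_version__}"


print(soundfile_status())
```

If the message reports that libsndfile could not be loaded, run the conda command above and restart the kernel before importing the audio library.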

## Install fastai2

**Note: not required if using fastai2 + fastai2 audio LifeCycle Config**

In [None]:
# In SageMaker we need to run this as a shell command, i.e. with '!' in front of 'pip'
#!pip install fastai2

## Install the fastai2_audio library

We need to install the fastai2_audio library to the local kernel/environment for the analysis

**Note: not required if using fastai2 + fastai2 audio LifeCycle Config**

In [4]:
# In Colab we need to run this as a shell command, i.e. with '!' in front of 'pip'

#!pip install git+https://github.com/mikful/fastai2_audio.git

In [2]:
# Solving an OSError problem with Librosa SoundFile dependency (libsndfile)
# SageMaker/GCP Only

!conda install -c conda-forge libsndfile --yes

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/ec2-user/SageMaker/.env/fastai2

  added / updated specs:
    - libsndfile


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    gettext-0.19.8.1           |       h5e8e0c9_1         3.5 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         3.5 MB

The following NEW packages will be INSTALLED:

  gettext            conda-forge/linux-64::gettext-0.19.8.1-h5e8e0c9_1
  libflac            conda-forge/linux-64::libflac-1.3.3-he1b5a44_0
  libogg             conda-forge/linux-64::libogg-1.3.2-h516909a_1002
  libsndfile         conda-forge/linux-64::libsndfile-1.0.28-he1b5a44_1000
  libvorbis          conda-forge/linux-64::libvorbis-1.3.6-he1b5a44_2
  python_abi         conda-forge/l

## Load Pretrained Model (from Colab) and Perform Inference

In [3]:
from fastai2.vision.all import *
from fastai2_audio.core import *
from fastai2_audio.augment import *

In [None]:
def get_x(r): return r['fname']
def get_y(r): return r['labels'].split(',') # split labels on ','
dblock = DataBlock(get_x = get_x, get_y = get_y)
dsets = dblock.datasets(df_combined)
dsets.train[0]

In [None]:
DBMelSpec = SpectrogramTransformer(mel=True, to_db=True)

In [None]:
cfg = AudioConfig.BasicMelSpectrogram()
aud2spec = AudioToSpec.from_cfg(cfg)
aud2spec.settings

In [None]:
# Now let's change the settings to see the impact
aud2spec = DBMelSpec(sample_rate=16000, f_max=None, f_min=20, n_mels=128, n_fft=1024, hop_length=128, top_db=90)
aud2spec.settings

In [None]:
item_tfms = [RemoveSilence(), CropSignal(3000, pad_mode='Repeat'), aud2spec, MaskTime(num_masks=1, size=100), MaskFreq(num_masks=1, size=10)]
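
To get a feel for the input the model receives, the spectrogram size implied by these settings can be worked out by hand. This is a back-of-the-envelope sketch with hypothetical helper names. It assumes the audio is at 16 kHz when it reaches `CropSignal`, that the crop duration is given in milliseconds, and that the STFT pads the signal so that frames are centred (the torchaudio default).

```python
def crop_length_in_samples(duration_ms, sample_rate):
    """Number of samples in a clip lasting duration_ms milliseconds."""
    return duration_ms * sample_rate // 1000


def centered_stft_frames(n_samples, hop_length):
    """Frame count of an STFT that pads the signal so frames are centred."""
    return n_samples // hop_length + 1


n_mels = 128
n_samples = crop_length_in_samples(3000, 16000)             # 48000 samples
n_frames = centered_stft_frames(n_samples, hop_length=128)  # 376 frames
print((n_mels, n_frames))  # (128, 376)
```

Under these assumptions each clip becomes a single-channel 128 x 376 image, which is why the model further down is created with `n_in=1`.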

In [None]:
# Needs redefining for the test only

dblock = DataBlock(blocks=(AudioBlock, MultiCategoryBlock),
                    splitter=RandomSplitter(),
                    get_x=get_x,
                    get_y=get_y,
                    item_tfms = item_tfms)

# dsets = dblock.datasets(df_curated)
dsets = dblock.datasets(df_combined)
dsets.train[0]
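
Conceptually, `get_y` turns the comma-separated label string into a list, and the multi-category target is then a vector with a 1 for every label present. The following is a plain-Python sketch of that idea, not fastai2's implementation; the file names, label names and the helpers `build_vocab` and `multi_hot` are made up for illustration.

```python
def get_y(row):
    """Split the comma-separated label string into a list of labels."""
    return row["labels"].split(",")


def build_vocab(rows):
    """Collect every distinct label, sorted so positions are stable."""
    return sorted({label for row in rows for label in get_y(row)})


def multi_hot(labels, vocab):
    """Encode a list of labels as a 0/1 vector over the vocabulary."""
    return [1.0 if name in labels else 0.0 for name in vocab]


# Hypothetical rows standing in for df_combined
rows = [
    {"fname": "a.wav", "labels": "Bark,Meow"},
    {"fname": "b.wav", "labels": "Purr"},
]
vocab = build_vocab(rows)
print(vocab)                              # ['Bark', 'Meow', 'Purr']
print(multi_hot(get_y(rows[0]), vocab))   # [1.0, 1.0, 0.0]
```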

In [None]:
dls = dblock.dataloaders(df_combined, bs=32) # bs= batch_size

In [None]:
### Load pretrained 1-channel xresnet18 with multi-label accuracy

# Custom CNN model created from a pretrained xresnet18 (smaller model for inference speed)
# 1 input channel (the spectrogram) and 80 output nodes (one per label)
# BCEWithLogitsLossFlat = fastai's flattening wrapper around torch.nn.BCEWithLogitsLoss (binary cross-entropy)
# accuracy_multi = accuracy metric for multi-label targets

model = create_cnn_model(xresnet18, n_in=1, n_out=80, pretrained=True)

learn = Learner(dls, model, BCEWithLogitsLossFlat(), metrics=accuracy_multi) # pass custom model to Learner

# Note: learn.load() appends '.pth' to the name it is given
learn.load('xresnet50-stage-2-model-finetuned.pth')


In [None]:
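# Define the test file paths (relative to the notebook, under ../data/test/)
df_fname = '../data/test/' + df_fnames.fname
print(df_fname)

# Create a test DataLoader (note: it is built from df_fnames, not from df_fname above)
dl = learn.dls.test_dl(df_fnames)

# Predict using test-time augmentation (TTA)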
preds, targs = learn.tta(dl=dl)
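
For a multi-label model, each row of `preds` holds one score per label, and a threshold decides which labels count as present. Below is a minimal sketch of that decoding step, assuming the scores are probabilities between 0 and 1 and that the label names are available as a list (in fastai2 this is typically `learn.dls.vocab`). The helper name, labels, numbers and the 0.5 threshold are illustrative only.

```python
def decode_predictions(probs, vocab, threshold=0.5):
    """Return the labels whose predicted probability reaches the threshold."""
    return [label for label, p in zip(vocab, probs) if p >= threshold]


vocab = ["Bark", "Meow", "Purr"]   # hypothetical label names
probs = [0.91, 0.08, 0.62]         # hypothetical probabilities for one clip
print(decode_predictions(probs, vocab))  # ['Bark', 'Purr']
```

Raising the threshold trades recall for precision, so it is worth tuning on a validation set rather than fixing it at 0.5.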

## Export the model and upload to S3

Now that our trained model is loaded, we will export it using the Learner method `export()` and upload the exported model to S3.

In [None]:
learn.export()

Now let's create a tarfile for our model.

In [None]:
import tarfile
with tarfile.open(path/'model.tar.gz', 'w:gz') as f:
    f.add(path/'export.pkl', arcname='model.pkl')
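
Before uploading, it can be worth checking that the archive really contains `model.pkl` at its root, since that is the name given by `arcname`. The sketch below reproduces the packaging step in a temporary directory with a placeholder file instead of a real exported learner; `archive_members` is a hypothetical helper.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_members(archive_path):
    """List the file names stored inside a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Placeholder standing in for the real export.pkl
    (tmp / "export.pkl").write_bytes(b"placeholder")
    with tarfile.open(tmp / "model.tar.gz", "w:gz") as tar:
        tar.add(str(tmp / "export.pkl"), arcname="model.pkl")
    members = archive_members(tmp / "model.tar.gz")

print(members)  # ['model.pkl']
```

The same `archive_members` call can be pointed at the real `model.tar.gz` created above.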

In [None]:
import sagemaker

role = sagemaker.get_execution_role()
sess = sagemaker.Session()

In [None]:
prefix = 'audio-app-mf-ct'

Now we will upload the model to the default S3 bucket for SageMaker.

In [None]:
model_location = sess.upload_data(str(path/'model.tar.gz'), key_prefix=prefix)
model_location
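
`upload_data` returns the S3 URI of the uploaded file, which is the value later passed as `model_data`. As a rough illustration of the form that URI takes when a single file is uploaded, here is a small sketch; the helper name and the bucket name are made up, and the real bucket is whatever the session's default bucket is in your account.

```python
from pathlib import PurePosixPath


def uploaded_model_uri(bucket, key_prefix, local_path):
    """Build the S3 URI for a single file uploaded under key_prefix."""
    filename = PurePosixPath(local_path).name
    return f"s3://{bucket}/{key_prefix}/{filename}"


# Hypothetical bucket name, used for illustration only
uri = uploaded_model_uri("my-example-bucket", "audio-app-mf-ct", "models/model.tar.gz")
print(uri)  # s3://my-example-bucket/audio-app-mf-ct/model.tar.gz
```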

## Script for model inference

SageMaker invokes the main function defined within your training script for training. When deploying your trained model to an endpoint, `model_fn()` is called to determine how to load your trained model. `model_fn()`, along with a few other functions listed below, is called to enable predictions on SageMaker.

### [Predicting Functions](https://github.com/aws/sagemaker-pytorch-containers/blob/master/src/sagemaker_pytorch_container/serving.py)
* `model_fn(model_dir)` - loads your model.
* `input_fn(serialized_input_data, content_type)` - deserializes the request data for predict_fn.
* `predict_fn(input_data, model)` - calls the model on the data deserialized by input_fn.
* `output_fn(prediction_output, accept)` - serializes the predictions from predict_fn.

Here is the full code in a file `serve.py` showing implementations of the 4 key functions:

In [None]:
!pygmentize scripts/serve.py
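
To make the request flow concrete, here is a simplified sketch of how the four functions fit together. It is not the contents of `scripts/serve.py`: the model is replaced by a stand-in function so that the flow can be run without SageMaker or fastai2, and the example URL and the fields of the returned dictionary are made up.

```python
import json

JSON_CONTENT_TYPE = "application/json"


def model_fn(model_dir):
    """Stand-in loader; a real script would load the exported learner from model_dir."""
    def fake_model(url):
        # Placeholder prediction, not a real model output
        return {"source": url, "labels": ["placeholder"]}
    return fake_model


def input_fn(serialized_input_data, content_type=JSON_CONTENT_TYPE):
    """Deserialize the JSON request body and pull out the 'url' field."""
    if content_type != JSON_CONTENT_TYPE:
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(serialized_input_data)["url"]


def predict_fn(input_data, model):
    """Run the model on the deserialized input."""
    return model(input_data)


def output_fn(prediction_output, accept=JSON_CONTENT_TYPE):
    """Serialize the prediction as JSON."""
    if accept != JSON_CONTENT_TYPE:
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps(prediction_output)


# Simulate a single request passing through the four functions
model = model_fn("/opt/ml/model")  # directory is unused by the stand-in
body = json.dumps({"url": "https://example.com/clip.wav"})
response = output_fn(predict_fn(input_fn(body), model))
print(response)
```

The hosting container calls `model_fn` once at start-up and then `input_fn`, `predict_fn` and `output_fn` in that order for every request.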

## Deploy locally to test

Before deploying to Amazon SageMaker we want to verify that the endpoint is working properly. The Amazon SageMaker Python SDK allows us to deploy locally to the Notebook instance using Docker. We will create the model and then set the parameter `instance_type` to `local`, telling the SDK to deploy locally.

In [None]:
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(model_data=model_location,
                     role=role,
                     framework_version='1.4.0',
                     entry_point='serve.py', 
                     source_dir='scripts')

Now that we have created the model we will deploy locally to test. It may take a while to run the first time as we need to download a Docker image to our notebook instance.

In [None]:
predictor = model.deploy(initial_instance_count=1, instance_type='local')

Now we can test out our endpoint. We will download a cat image from the internet and save it locally.

In [None]:
! [ -d tmp ] || mkdir tmp
! wget -q -O tmp/british-shorthair.jpg https://cdn1-www.cattime.com/assets/uploads/2011/12/file_2744_british-shorthair-460x290-460x290.jpg

In [None]:
img = Image.open('tmp/british-shorthair.jpg')
img

Now we can call our local endpoint to ensure it is working and provides us the correct result.

In [None]:
from sagemaker.predictor import json_serializer, json_deserializer

predictor.accept = 'application/json'
predictor.content_type = 'application/json'

predictor.serializer = json_serializer
predictor.deserializer = json_deserializer

response = predictor.predict( { "url": "https://cdn1-www.cattime.com/assets/uploads/2011/12/file_2744_british-shorthair-460x290-460x290.jpg" })

print(response)

Once you are happy that the endpoint is working successfully you can shut it down.

In [None]:
predictor.delete_endpoint()

## Deploy to SageMaker

Once we have verified that the script is working successfully on our locally deployed endpoint we can deploy our model to Amazon SageMaker so that it can be used in a production application. The code is almost exactly the same as deploying locally, except that when we call `model.deploy()` we will change the instance type to a valid Amazon SageMaker instance type (e.g. `ml.m5.xlarge`).

In [None]:
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(model_data=model_location,
                     role=role,
                     framework_version='1.4.0',
                     entry_point='serve.py', 
                     source_dir='scripts')

Now let's deploy our SageMaker endpoint. It will take a few minutes to provision.

In [None]:
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')

In [None]:
img = Image.open('tmp/british-shorthair.jpg')
img

Now let's test our remote endpoint running on SageMaker hosting services.

In [None]:
from sagemaker.predictor import json_serializer, json_deserializer

predictor.accept = 'application/json'
predictor.content_type = 'application/json'

predictor.serializer = json_serializer
predictor.deserializer = json_deserializer

response = predictor.predict( { "url": "https://cdn1-www.cattime.com/assets/uploads/2011/12/file_2744_british-shorthair-460x290-460x290.jpg" })

print(response)

## Optional: delete endpoint

If you do not want to keep the endpoint up and running then remember to delete it to avoid incurring further costs.

In [None]:
predictor.delete_endpoint()
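
Since a forgotten endpoint keeps incurring costs, one option is to wrap its use in a context manager so that `delete_endpoint()` runs even if a prediction raises an error. This is only a sketch of the pattern: `temporary_endpoint` is a hypothetical helper, and `FakePredictor` is a stand-in that lets the example run without SageMaker.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor and always delete its endpoint afterwards."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()


class FakePredictor:
    """Stand-in exposing the two methods used in this notebook."""

    def __init__(self):
        self.deleted = False

    def predict(self, payload):
        return {"echo": payload}

    def delete_endpoint(self):
        self.deleted = True


fake = FakePredictor()
with temporary_endpoint(fake) as p:
    result = p.predict({"url": "https://example.com/clip.wav"})

print(result, fake.deleted)
```

With a real deployment, the object returned by `model.deploy(...)` would be passed in place of `FakePredictor()`.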