## Usage

This notebook walks through installing this repo on an external server to run training and inference.
To begin, we first need to clone the repo.

We will also need to install the [torchaudio-contrib package](https://github.com/keunwoochoi/torchaudio-contrib). This is as simple as cloning the repo and using `pip` to install it.

Next, we'll do some cleanup and move the repo into the root folder. This step is optional.

Next, we'll install the other required packages. Note that tensorboardX is optional. If you want to install tensorboardX, you'll also need to install TensorFlow.

Once the UrbanSound8K archive is downloaded, we need to untar it. Just replace `{urbansound8k_downloaded_file}` with the name of the file.
Optionally, we can remove the downloaded file here as well.
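
If you would rather unpack the archive from Python than from the shell, the standard-library `tarfile` module can do it. The sketch below is a minimal example under that assumption; the helper name `extract_archive` is made up for illustration and is not part of this repo.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir=".", remove_archive=False):
    """Unpack a tar archive into dest_dir and optionally delete the archive."""
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Mode "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none)
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(dest_dir)

    # Optional cleanup of the downloaded file
    if remove_archive:
        archive_path.unlink()
    return dest_dir
```

Call it with the name of the downloaded file in place of `{urbansound8k_downloaded_file}`, and pass `remove_archive=True` if you want the archive deleted afterwards.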

## Preparing the config file

The config file is used to build the training model. The only thing you will *need* to change is the path to the dataset, which is located in `data["path"]`.
You may also want to change the number of epochs (`train["epochs"]`) when testing. Running the training with 10 epochs took about 10 minutes on a GPU. (You are running this on a GPU, right? :))

In [1]:
json_config = {
    "name"          :   "Urban Testing",
    "data"          :   {
                            "type"      :   "CSVDataManager",
                            "path"      :   "UrbanSound8K",
                            "format"    :   "audio",
                            "loader"    :   {
                                                "shuffle"       : True,
                                                "batch_size"    : 24,
                                                "num_workers"   : 4,
                                                "drop_last"     : True
                                            },
                            "splits"    :   {
                                                "train" : [1,2,3,4,5,6,7,8,9], 
                                                "val"   : [10]                                            
                                            }
                        },
    "transforms"    :   {
                            "type"      :   "AudioTransforms",
                            "args"      :   {
                                                "channels"       : "avg",
                                                "noise"    : [0.3, 0.001],
                                                "crop"     : [0.4, 0.25]
                                            }
                        },
    "optimizer"     :   {
                            "type"      :   "Adam",
                            "args"      :   {
                                                "lr"            : 0.002,
                                                "weight_decay"  : 0.01,
                                                "amsgrad"       : True
                                            }
                        },
    "lr_scheduler"   :   {
                            "type"      :   "StepLR",
                            "args"      :   {
                                                "step_size" : 10,
                                                "gamma"     : 0.5
                                            }
                        },
    "model"         :   {
                            "type"      :   "AudioCRNN"
                        },
    "train"         :   {
                            "loss"      :   "nll_loss",
                            "epochs"    :   100,
                            "save_dir"  :   "saved_cv/",
                            "save_p"    :   1,
                            "verbosity" :   2,
                            
                            "monitor"   :   "min val_loss",
                            "early_stop":   8,
                            "tbX"       :   True
                        },
    "metrics"       :   "classification_metrics"

}
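
To see what the `lr_scheduler` block above implies, recall that PyTorch's `StepLR` multiplies the learning rate by `gamma` once every `step_size` epochs. The sketch below reproduces that rule in plain Python with the values from the config. The function name `step_lr` is made up for illustration, and the snippet does not call PyTorch itself.

```python
def step_lr(base_lr, step_size, gamma, epoch):
    """Learning rate at a given epoch under a StepLR-style schedule."""
    return base_lr * gamma ** (epoch // step_size)


# Values taken from the "optimizer" and "lr_scheduler" blocks of the config
for epoch in (0, 9, 10, 20, 30):
    print(epoch, step_lr(0.002, 10, 0.5, epoch))
```

With these settings the rate stays at 0.002 for the first 10 epochs and is halved at every tenth epoch after that.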

Next, we'll write this config out as JSON so that the model can read the updated file.

In [2]:
import json
# Use the same file name that is passed to run.py with -c below
with open('my-config.json', 'w') as json_file:
    json.dump(json_config, json_file)
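
Before launching a long run, it can help to read the file back and confirm the two settings mentioned earlier. The sketch below is a hypothetical helper, not part of the repo: it loads a config file, optionally overrides the epoch count for a quick test, and warns if the dataset path does not exist.

```python
import json
import os


def prepare_config(config_path, epochs=None):
    """Load a JSON config, optionally override train.epochs, and save it back."""
    with open(config_path) as json_file:
        config = json.load(json_file)

    if epochs is not None:
        config["train"]["epochs"] = epochs  # e.g. 10 for a quick test run

    # data.path must point at the extracted UrbanSound8K folder
    if not os.path.isdir(config["data"]["path"]):
        print("Warning: dataset path not found:", config["data"]["path"])

    with open(config_path, "w") as json_file:
        json.dump(config, json_file, indent=4)
    return config
```

Pass it the name of the file you just wrote, for example with `epochs=10` for a short test run.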

## Training
Finally, we can start training the model. We'll pass three parameters. The first is the action we want to take, which here is `train`; you can use `train` to train the model or `eval` to evaluate it. The `-c` parameter is the config file we just created, and `--cfg` is the layer configuration of the model.

In [3]:
!python3 run.py train -c my-config.json --cfg crnn.cfg

Traceback (most recent call last):
  File "run.py", line 7, in <module>
    import data as data_module
  File "/work/data/__init__.py", line 1, in <module>
    from .data_manager import *
  File "/work/data/data_manager.py", line 2, in <module>
    import os, cv2
ModuleNotFoundError: No module named 'cv2'
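
The traceback above means the `cv2` module could not be imported in this environment. `cv2` is provided by the OpenCV Python bindings, which are usually installed with `pip install opencv-python`. As a rough sketch, the following check lists which modules from a given set are missing before you start a run. The module list is an assumption based on the packages mentioned in this notebook, not the repo's actual requirements, and `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list: adjust it to match the repo's real dependencies
print(missing_modules(["cv2", "torch", "torchaudio", "tensorboardX"]))
```

An empty list means every module in the set can be located; anything printed still needs to be installed.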


## Inference

After we have trained the model, we can run inference with it.
Call `run.py` with two parameters. The first is a path to a sample audio file. For this example, we'll use a random audio sample from the UrbanSound8K dataset. The second, passed with `-r`, is the path to the model checkpoint. It will look something like `saved_cv/{timestamp}/checkpoints/model_best.pth`.

In [4]:
!python run.py UrbanSound8K/audio/fold10/100795-3-0-0.wav -r saved_cv/0515_171217/checkpoints/model_best.pth

dog_bark 0.9858338236808777


The model classified the supplied audio as a dog bark with a confidence of about 98.6%.
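
If you want to use the prediction in a script rather than read it off the screen, the printed line can be split into a label and a score. This is a small sketch that assumes the output always has the `label score` shape shown above; `parse_prediction` is a hypothetical helper, not a function from the repo.

```python
def parse_prediction(line):
    """Split a 'label score' line into (label, confidence as a float)."""
    label, score = line.strip().rsplit(maxsplit=1)
    return label, float(score)


label, confidence = parse_prediction("dog_bark 0.9858338236808777")
print(f"{label}: {confidence:.1%}")
```

Splitting from the right keeps the label intact even if it contains spaces.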