# Running Animal-Spot
## Important info
My name is David Kebert. This notebook is part of a [GitHub repository](https://github.com/Davidkeebler/Orca-Detection) for a project carried out between February and April 2024. [Here's a link to my blog!](https://medium.com/@davidkebert1)

## Introduction
[Animal-Spot](https://github.com/ChristianBergler/ANIMAL-SPOT) is an open-source machine learning framework for the detection of bioacoustic signals. This notebook downloads the repository and installs the necessary versions of its required packages. It also contains commands that run the training, prediction, and evaluation scripts. This is the third notebook in the series. Run it after you have retrieved the data, run the preprocessing, and moved the configuration files from this repository into their proper directories and renamed them.

## Setup and dependencies

In [None]:
# This repository assumes you are using Google Colab and Google Drive to replicate the result, so we mount Google Drive before anything else.
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Navigate to a convenient directory in your Google Drive (for example with %cd) before executing the git clone below. Then replace the config files in the TRAINING, PREDICTION, and EVALUATION subfolders
# with the config files from my repository, renaming each one to simply 'config'.
# !git clone https://github.com/ChristianBergler/ANIMAL-SPOT.git
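
The config replacement described in the comments above can also be scripted. This is a minimal sketch, not part of Animal-Spot: the `install_configs` helper and the source filenames in the example call are hypothetical placeholders, so substitute the actual names of the config files from this repository.

```python
import shutil
from pathlib import Path


def install_configs(config_sources, animal_spot_root):
    """Copy each config file into its ANIMAL-SPOT subfolder under the name 'config'.

    config_sources maps a subfolder name (e.g. 'TRAINING') to the path of the
    config file that should be installed there.
    """
    installed = []
    for subfolder, source in config_sources.items():
        target_dir = Path(animal_spot_root) / subfolder
        if not target_dir.is_dir():
            raise FileNotFoundError(f"Missing subfolder: {target_dir}")
        target = target_dir / "config"
        shutil.copyfile(source, target)  # overwrites any existing 'config'
        installed.append(target)
    return installed


# Example call (hypothetical source paths; adjust them to your Drive layout):
# install_configs(
#     {
#         "TRAINING": "/content/drive/MyDrive/Orca-Detection/training_config",
#         "PREDICTION": "/content/drive/MyDrive/Orca-Detection/prediction_config",
#         "EVALUATION": "/content/drive/MyDrive/Orca-Detection/evaluation_config",
#     },
#     "/content/drive/MyDrive/ANIMAL-SPOT",
# )
```

Copying rather than moving leaves the originals in place, so the step can be repeated if a config needs to be restored.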

In [None]:
# These exact versions of torch, torchvision, and torchaudio are required for the repository to run.
!pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 TorchAudio==0.11.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html

Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.11.0+cu113
  Downloading https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-linux_x86_64.whl (1637.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 GB 363.2 kB/s eta 0:00:00
Collecting torchvision==0.12.0+cu113
  Downloading https://download.pytorch.org/whl/cu113/torchvision-0.12.0%2Bcu113-cp310-cp310-linux_x86_64.whl (22.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.3/22.3 MB 67.7 MB/s eta 0:00:00
Collecting TorchAudio==0.11.0+cu113
  Downloading https://download.pytorch.org/whl/cu113/torchaudio-0.11.0%2Bcu113-cp310-cp310-linux_x86_64.whl (2.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 45.2 MB/s eta 0:00:00
Installing collected packages: torch, torchvision, TorchAudio
  Attempting uninstall: torch
    Found ex
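
Because the pinned versions matter, it can help to confirm what is actually installed before moving on. The sketch below uses the standard-library `importlib.metadata` module; the `check_versions` helper is a hypothetical name introduced here, and the expected version strings are copied from the pip command above.

```python
from importlib import metadata

# Pinned versions taken from the pip command above.
REQUIRED = {
    "torch": "1.11.0+cu113",
    "torchvision": "0.12.0+cu113",
    "torchaudio": "0.11.0+cu113",
}


def check_versions(required):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


# Prints one line per mismatch; prints nothing when every pin matches.
for name, (expected, installed) in check_versions(REQUIRED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

If Colab had already imported an older torch before the install, restarting the runtime is usually needed before the new version is picked up.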

In [None]:
# Colab instances do not have tensorboardx by default, so we need this command to install it.
!python -m pip install tensorboardx

Collecting tensorboardx
  Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl (101 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 101.7/101.7 kB 25.1 kB/s eta 0:00:00
Installing collected packages: tensorboardx
Successfully installed tensorboardx-2.6.2.2


In [None]:
# Ensuring the rest of the dependencies are installed.
!pip install resampy
!pip install pillow
!pip install Soundfile
!pip install scikit-image
!pip install six
!pip install opencv-python

Collecting resampy
  Downloading resampy-0.4.3-py3-none-any.whl (3.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 13.3 MB/s eta 0:00:00
Installing collected packages: resampy
ERROR: Operation cancelled by user

## Executing scripts

The following code cells execute the training, prediction, and evaluation scripts. The paths below will only work without modification if you cloned Animal-Spot directly into the top level of your Google Drive (`MyDrive`).

Training takes a long time and is resource-intensive. It is further complicated by the need to generate spectrograms for each audio file in the training data: in my experience a TPU instance is the best choice for training, but it is very slow at generating the spectrograms. For this reason, I recommend training the model for one epoch on a powerful GPU instance to cache all of the spectrograms, then deleting the runtime and starting over with a TPU.

Prediction and evaluation are much less resource-intensive and, in my experience, ran fine on any GPU instance.

### Training

The training script looks in its target directory for audio files to train on and generates splits for training, validation, and testing. It then begins training the model and writes the result of each epoch to the log. The script is configured to stop on its own after 8 epochs without an improvement in validation accuracy. If you interrupt training early, the script will not produce the final animal-spot.pk file, so you have to wait for it to finish.
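
To make the stopping rule concrete, here is a generic sketch of patience-based early stopping. It is an illustration of the idea only, not Animal-Spot's own code; the `stopping_epoch` function and the accuracy values are made up for the example, and the default of 8 mirrors the patience described above.

```python
def stopping_epoch(val_accuracies, patience=8):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs pass without the
    validation accuracy beating the best value seen so far.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy > best:
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # the patience limit was never reached


# Made-up accuracies: the best value occurs at epoch 3, so with patience=8
# the rule triggers at epoch 11.
history = [0.80, 0.85, 0.90] + [0.89] * 10
print(stopping_epoch(history))
```

With this rule the run always continues for at least 8 epochs past the best one, which is why the wait can feel long.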

If you don't want to wait for the program to run 8 epochs without improvement, there is a workaround: make sure the script has saved the checkpoint file from the epoch you want to be the final one, delete the other checkpoints in the directory, and then configure training to resume from that checkpoint and to finish after 1 more epoch.
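
A cautious way to do the checkpoint cleanup is to move the unwanted checkpoints into a backup folder instead of deleting them. The helper below is a hypothetical sketch: `keep_only_checkpoint` is not part of Animal-Spot, and the checkpoint filename you pass in must be whatever name the training script actually wrote to your checkpoint directory.

```python
import shutil
from pathlib import Path


def keep_only_checkpoint(checkpoint_dir, keep_name, backup_dir):
    """Move every file except `keep_name` from checkpoint_dir into backup_dir."""
    checkpoint_dir = Path(checkpoint_dir)
    backup_dir = Path(backup_dir)
    if not (checkpoint_dir / keep_name).is_file():
        raise FileNotFoundError(f"{keep_name} not found in {checkpoint_dir}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(checkpoint_dir.iterdir()):
        if path.is_file() and path.name != keep_name:
            shutil.move(str(path), str(backup_dir / path.name))
            moved.append(path.name)
    return moved
```

The training config log further down lists `max_train_epochs` and `start_from_scratch` keys, which are presumably the settings involved in resuming; check the Animal-Spot documentation for their exact semantics before changing them.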

The notebooks included in this repository will organize and name the files appropriately so that training will run without any issues. Getting that to work was a big part of the challenge of this project!

In [None]:
# Start Training
!python /content/drive/MyDrive/ANIMAL-SPOT/TRAINING/start_training.py /content/drive/MyDrive/ANIMAL-SPOT/TRAINING/config

2024-04-13 22:10:38,391 - training animal-spot - INFO - Config Data: {'src_dir': '/content/drive/MyDrive/ANIMAL-SPOT/ANIMAL-SPOT', 'debug': '', 'data_dir': '/content/drive/MyDrive/acoustic-sandbox/unpacked-detection/split/bignoise', 'cache_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Cache/', 'model_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Models/', 'checkpoint_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Checkpoints/', 'log_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Logs/', 'summary_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Summaries/', 'noise_dir': '/content/drive/MyDrive/acoustic-sandbox/unpacked-detection/split', 'start_from_scratch': '', 'max_train_epochs': '60', 'epochs_per_eval': '1', 'batch_size': '4', 'num_workers': '8', 'lr': '10e-6', 'beta1': '0.5', 'lr_patience_epochs': '4', 'lr_decay_factor': '0.5', 'early_stopping_patience_epochs': '8', 'filter_broken_audio': '', 'sequence_len': '2500', 'freq_compression': 'linear', 'n_freq_bins': '256', 'n_fft': '2048', 'hop_length': '180', 's

### Prediction

The prediction script will attempt to label the noise/target segments in the target .wav file or all .wav files in the target directory. To change which files prediction is being run on, change the target directory in the prediction config file.

The prediction config file contains some important parameters. I have configured the model to label 2.5-second windows of the data and to advance the window by 0.5 seconds after each prediction. The window length matters because the training script is configured to truncate or extend clips to 2.5 seconds. You might consider making the step size smaller to get finer time resolution, or lowering the prediction threshold to catch more vocalizations at the cost of more false positives.
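
The windowing and thresholding can be sketched in a few lines. This is a simplified illustration, not Animal-Spot's implementation: the function names are hypothetical, dropping a trailing partial window is an assumption, and so is the strict `>` comparison against the threshold (the framework may handle both differently).

```python
def sliding_windows(duration_s, sequence_len_s, hop_s):
    """Return (start, end) times in seconds for every full window in the file."""
    windows = []
    index = 0
    # ASSUMPTION: a trailing window that would run past the end is dropped.
    while index * hop_s + sequence_len_s <= duration_s:
        start = index * hop_s
        windows.append((start, start + sequence_len_s))
        index += 1
    return windows


def label(prob, threshold):
    """Binary decision: 1 (target) if the probability exceeds the threshold."""
    return 1 if prob > threshold else 0


# A 4-second file with 2.5-second windows and a 0.5-second step:
print(sliding_windows(4.0, 2.5, 0.5))
# [(0.0, 2.5), (0.5, 3.0), (1.0, 3.5), (1.5, 4.0)]
```

Halving the step to 0.25 seconds yields 7 windows for the same file instead of 4, so prediction takes proportionally longer. Lowering the threshold does not change the windows, only how many of them are labeled as target.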

My prediction config file is set to run prediction on one of the unsplit audio files from the original dataset. To run it on a different file, open the config file in your Google Drive and point the `input_file` entry at the file you want to analyze.
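
If you switch input files often, the edit can be scripted instead of done by hand. The sketch below ASSUMES the config is a plain-text file with one `key=value` pair per line and `#` for comments; verify that against your own config file first. `set_config_value` is a hypothetical helper, not an Animal-Spot function.

```python
from pathlib import Path


def set_config_value(config_path, key, value):
    """Rewrite the first uncommented `key=...` line in a plain-text config file.

    ASSUMPTION: one `key=value` pair per line, with `#` starting a comment.
    Returns True if the key was found and updated, False otherwise.
    """
    path = Path(config_path)
    lines = path.read_text().splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            path.write_text("\n".join(lines) + "\n")
            return True
    return False


# Example call (the .wav path here is the one shown in my prediction log):
# set_config_value(
#     "/content/drive/MyDrive/ANIMAL-SPOT/PREDICTION/config",
#     "input_file",
#     "/content/drive/MyDrive/acoustic-sandbox/unpacked-detection/wav/60012.wav",
# )
```

The function returns `False` rather than appending a new line when the key is missing, so a typo in the key name cannot silently add a stray setting.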

In [None]:
# Start prediction
!python /content/drive/MyDrive/ANIMAL-SPOT/PREDICTION/start_prediction.py /content/drive/MyDrive/ANIMAL-SPOT/PREDICTION/config

2024-04-15 21:30:08,270 - training animal-spot - INFO - Config Data: {'src_dir': '/content/drive/MyDrive/ANIMAL-SPOT/ANIMAL-SPOT', 'debug': '', 'model_path': '/content/drive/MyDrive/ANIMAL-SPOT/Models/ANIMAL-SPOT-FINAL.pk', 'log_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Logs/', 'output_dir': '/content/drive/MyDrive/ANIMAL-SPOT/Outputs/', 'sequence_len': '3', 'hop': '0.5', 'threshold': '0.15', 'batch_size': '1', 'num_workers': '1', 'visualize': '', 'latent_extract': '', 'input_file': '/content/drive/MyDrive/acoustic-sandbox/unpacked-detection/wav/60012.wav'}
2024-04-15 21:30:08,271 - training animal-spot - INFO - Start Prediction!!!
21:30:26|D|Model successfully load via torch load: /content/drive/MyDrive/ANIMAL-SPOT/Models/ANIMAL-SPOT-FINAL.pk
21:30:26|I|Sequential(
  (encoder): ResidualEncoder(
    (conv1): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu1):

### Evaluation

This script uses the output log from the prediction script above to create an annotation file that you can use to segment a spectrogram in [Raven Lite 2](https://store.birds.cornell.edu/products/raven-lite-2-0-free-license), a free spectrogram analysis program. Importing the selection table generated by this script into Raven Lite 2 along with the targeted .wav file lets you see how the model has segmented the data!

Make sure the log file is the only file in the target folder; otherwise the script will raise an error.

In [None]:
!python /content/drive/MyDrive/ANIMAL-SPOT/EVALUATION/start_evaluation.py /content/drive/MyDrive/ANIMAL-SPOT/EVALUATION/config

Streaming output truncated to the last 5000 lines.
21:32:09|D|start extract=4343850

21:32:09|D|end extract=4476149

21:32:09|I|time=98.5-101.5, pred=0, prob=0.02973049506545067

2024-04-15 21:38:48,253 - training animal-spot - DEBUG - time=98.5-101.5, pred=0, prob=0.02973049506545067
21:32:09|D|start extract=4365900

21:32:09|D|end extract=4498199

21:32:09|I|time=99.0-102.0, pred=0, prob=0.13252592086791992

2024-04-15 21:38:48,253 - training animal-spot - DEBUG - time=99.0-102.0, pred=0, prob=0.13252592086791992
21:32:09|D|start extract=4387950

21:32:09|D|end extract=4520249

21:32:09|I|time=99.5-102.5, pred=1, prob=0.18665997684001923

2024-04-15 21:38:48,253 - training animal-spot - DEBUG - time=99.5-102.5, pred=1, prob=0.18665997684001923
21:32:10|D|start extract=4410000

21:32:10|D|end extract=4542299

21:32:10|I|time=100.0-103.0, pred=1, prob=0.18301688134670258

2024-04-15 21:38:48,253 - training animal-spot - DEBUG - time=100.0-103.0, pred=1, prob=0.18301688134
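
To show roughly what this conversion involves, the sketch below parses `time=..., pred=..., prob=...` lines like the ones above, merges overlapping positive windows, and formats a tab-separated selection table. It is not Animal-Spot's evaluation script: the function names are hypothetical, the merge rule is an assumption, and the column headers and frequency bounds follow the usual Raven selection table layout as placeholders.

```python
import re

LINE_RE = re.compile(r"time=([\d.]+)-([\d.]+), pred=(\d), prob=([\d.eE+-]+)")


def parse_predictions(log_lines):
    """Extract (start, end, pred, prob) tuples, skipping repeated windows."""
    seen = set()
    rows = []
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        start, end = float(match.group(1)), float(match.group(2))
        if (start, end) in seen:  # each window is logged twice in the output
            continue
        seen.add((start, end))
        rows.append((start, end, int(match.group(3)), float(match.group(4))))
    return rows


def merge_positive(rows):
    """ASSUMPTION: overlapping or touching pred == 1 windows form one segment."""
    segments = []
    for start, end, pred, _prob in sorted(rows):
        if pred != 1:
            continue
        if segments and start <= segments[-1][1]:
            segments[-1][1] = max(segments[-1][1], end)
        else:
            segments.append([start, end])
    return [tuple(segment) for segment in segments]


def to_selection_table(segments, low_hz=0.0, high_hz=22050.0):
    """Format segments as tab-separated text; frequency bounds are placeholders."""
    header = ["Selection", "View", "Channel", "Begin Time (s)",
              "End Time (s)", "Low Freq (Hz)", "High Freq (Hz)"]
    lines = ["\t".join(header)]
    for number, (start, end) in enumerate(segments, start=1):
        lines.append("\t".join([str(number), "Spectrogram 1", "1",
                                f"{start:.3f}", f"{end:.3f}",
                                f"{low_hz:.1f}", f"{high_hz:.1f}"]))
    return "\n".join(lines) + "\n"


sample_log = [
    "21:32:09|I|time=98.5-101.5, pred=0, prob=0.02973049506545067",
    "21:32:09|I|time=99.0-102.0, pred=0, prob=0.13252592086791992",
    "21:32:09|I|time=99.5-102.5, pred=1, prob=0.18665997684001923",
    "21:32:10|I|time=100.0-103.0, pred=1, prob=0.18301688134670258",
]
print(merge_positive(parse_predictions(sample_log)))
# [(99.5, 103.0)]
```

For the four sample windows, the two positive ones overlap and collapse into a single segment from 99.5 to 103.0 seconds.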

## Conclusion

In this notebook, we:
- Installed the packages with the correct version numbers required for Animal-Spot to work
- Used Animal-Spot to train a model that can identify orca vocalizations and background noise in a raw .wav file from a hydrophone
- Ran prediction on an unseen .wav file with our model
- Converted the prediction log file into a selection table that can be used to segment the data in Raven Lite