# Train a ResNet-based fish (grouper) detection model


## Summary

This notebook was used to create a fish (grouper) detection model trained on labeled data provided by Jim Locasio, Max Fullmer, and others from the Mote Lab.

The model, a ResNet-based CNN, is trained on spectrograms of 20-second audio files identified as containing a fish vocalization.



In [None]:
import os
from pathlib import Path

import torch
import fastai
import fastai.vision.all as fai_vision
import pandas as pd
import fastai.callback.all as fai_callback
import fastai.callback.tensorboard as tb

In [None]:
print(torch.__version__, fastai.__version__)

In [None]:
torch.cuda.set_device('cuda:3')

## Prepare data loader for training images

In [None]:
data_dir = Path('/mnt/store/data/assets/black-grouper/data/training/spec/all-wave-samples_snr-1-whole')
image_files = data_dir.glob('**/*.png')

In [None]:
# Build the list of sample files for the training set.
# Only include samples that either:
# - are not cut off and do not overlap (i.e. "clean"), or
# - are 'call-0' -> no call, just environmental noise
labels = {}

for image_file in image_files:
    if 'clean' not in str(image_file) and 'call-0' not in str(image_file):
        continue
    
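    # Get "call-*" from the path, then split out the trailing integer.
    # The label needs to be an int so it can become a Tensor type.
    try:
        labels[str(image_file)] = int(str(image_file.parent).split('/')[-2].split('-')[-1])
    except (IndexError, ValueError):
        # No parsable "call-N" directory at that position: fall back to class 0
        labels[str(image_file)] = 0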
    
print(len(labels))
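
The label parsing above can be sketched as a small standalone helper. This is a minimal illustration, assuming the layout implied by the loop (the grandparent directory of each image is named like `call-N`); `label_from_path` and the example paths are hypothetical and not part of the original notebook.

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Return N from a grandparent directory named like 'call-N', else 0."""
    # parts of the directory that contains the image file
    parts = PurePosixPath(path).parent.parts
    try:
        # parts[-2] is the grandparent directory of the image
        return int(parts[-2].rsplit('-', 1)[-1])
    except (IndexError, ValueError):
        # no such directory, or no trailing integer: fall back to class 0
        return 0


# Hypothetical example paths, for illustration only
print(label_from_path('/data/call-3/clean/sample_001.png'))
print(label_from_path('/data/call-0/sample_002.png'))
```

Catching only `IndexError` and `ValueError` keeps the fallback to class 0 while letting unrelated errors surface.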

In [None]:
# Put the data in a DataFrame to ease training setup
df = pd.DataFrame({'fname': list(labels.keys()), 'label': list(labels.values())})
df

### Distribution of classes

The labels are severely imbalanced. There are three general ways to handle this:

- Just use the imbalanced classes as they are
- Draw an equal number of samples from every class (this would severely limit the training set size)
- Create synthetic data so that every class has the same number of samples

Here, we draw a fixed number of samples (50) from every class with replacement. This undersamples the classes with many samples and oversamples, by repeating files, the classes with few.
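
As a minimal sketch of this balancing step, the toy example below draws the same number of rows from each class with `groupby(...).sample(..., replace=True)`. The toy file names, class sizes, `n=4`, and `random_state=0` are made up for illustration; the notebook itself uses `n=50`.

```python
import pandas as pd

# Toy, made-up data: class 0 has 9 files, class 1 has 2, class 2 has 1
toy = pd.DataFrame({
    'fname': [f'file_{i}.png' for i in range(12)],
    'label': [0] * 9 + [1] * 2 + [2] * 1,
})

# Draw 4 rows per class; replace=True lets small classes repeat their files
balanced = toy.groupby('label').sample(n=4, replace=True, random_state=0)

counts = balanced['label'].value_counts().sort_index()
print(counts)
```

Every class ends up with four rows. Class 2 has a single file, so all four of its rows are copies of that file.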

In [None]:
df.groupby('label').count().plot.bar()

In [None]:
df.groupby('label').count()

In [None]:
# Set to True to redraw the balanced sample; otherwise the saved sample is reloaded
if False:
    df_sub = df.groupby('label').sample(n=50, replace=True)
    df_sub.to_csv('fish-sounds-resnet101-balanced-samples-n50.csv', index=False)
else:
    df_sub = pd.read_csv('fish-sounds-resnet101-balanced-samples-n50.csv')

In [None]:
df_sub

In [None]:
# valid_pct: 20% of the samples are held out for validation
# fn_col: column in the DataFrame that contains the file name
# label_col: column in the DataFrame that contains the label
# num_workers: number of worker processes used for loading data.
# - Workers exchange data through shared memory (/dev/shm), which currently has a low default size.
# - Keep num_workers = 1
loader = fai_vision.ImageDataLoaders.from_df(
    df_sub,
    path='/',
    valid_pct=0.2,
    seed=666,
    fn_col=0,
    label_col=1,
    num_workers=1,
    bs=16
)
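
Because `df_sub` is drawn with `replace=True`, a file name can occur in several rows, and a random 80/20 split may then place copies of one file on both sides of the split, which can inflate validation accuracy. The sketch below checks for this in plain pandas. It illustrates a seeded random hold-out under assumed column names; it is not fastai's own splitting code, and `split_overlap` and the toy data are hypothetical.

```python
import pandas as pd


def split_overlap(df, valid_pct=0.2, seed=666, fn_col='fname'):
    """Randomly hold out valid_pct of the rows; return (train, valid, shared names)."""
    # A unique index is needed so that drop() removes only the held-out rows
    df = df.reset_index(drop=True)
    valid = df.sample(frac=valid_pct, random_state=seed)
    train = df.drop(valid.index)
    # File names that appear on both sides of the split
    shared = set(train[fn_col]) & set(valid[fn_col])
    return train, valid, shared


# Toy, made-up sample in which 'a' and 'e' were drawn twice
toy = pd.DataFrame({
    'fname': ['a', 'a', 'b', 'c', 'd', 'e', 'e', 'f', 'g', 'h'],
    'label': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})
train, valid, shared = split_overlap(toy)
print(len(train), len(valid), sorted(shared))
```

If `shared` is non-empty for the real sample, one option is to split on unique file names before oversampling.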

### Check out some samples

`show_batch` displays a grid of samples with their labels, a very nice fastai feature for sanity-checking the data.


In [None]:
loader.show_batch(nrows=3, ncols=3)

## Train the model

- ResNet101 pretrained on ImageNet
- Use accuracy as the metric
- Keep fastai's default learning parameters (none are adjusted here)
- Training for ~25 epochs is fast and gives reasonable results (accuracy in the high 80s, percent)
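
For reference, accuracy here means the fraction of samples whose highest-scoring class matches the label. The plain-Python sketch below shows that computation on made-up scores; it is an illustration of the definition, not fastai's implementation.

```python
def accuracy(scores, targets):
    """Fraction of rows whose highest-scoring class index equals the target."""
    # index of the largest score in each row
    predicted = [max(range(len(row)), key=row.__getitem__) for row in scores]
    correct = sum(p == t for p, t in zip(predicted, targets))
    return correct / len(targets)


# Made-up scores for 4 samples and 3 classes
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
targets = [1, 0, 1, 1]
print(accuracy(scores, targets))  # 3 of 4 correct -> 0.75
```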

In [None]:
# define the model (a fastai Learner; newer fastai releases rename cnn_learner to vision_learner)
model = fai_vision.cnn_learner(
    loader,
    fai_vision.resnet101,
    metrics=fai_vision.accuracy,
    normalize=True,  # normalizes inputs with the pretrained model's statistics to aid training; many libraries leave this to the user
    pretrained=True
)

In [None]:
# train
model.fine_tune(25)

In [None]:
# model.path is the base path for exporting the model; it defaults to the loader's path ('/'), which has permission issues
model.path = Path('.')
model.path

In [None]:
model.export(fname='fish-sounds-resnet101-balanced-samples-n50')
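
A minimal sketch of how the exported file might be used for inference, assuming it is loaded from the directory it was exported to. `predict_spectrogram` and its `image_path` argument are hypothetical; `load_learner` and `Learner.predict` are fastai calls.

```python
def predict_spectrogram(image_path,
                        model_file='fish-sounds-resnet101-balanced-samples-n50'):
    """Load the exported learner on the CPU and classify one spectrogram image."""
    # Imported lazily so this sketch can be defined without fastai installed
    from fastai.vision.all import load_learner

    learner = load_learner(model_file, cpu=True)
    # predict returns the decoded label, its class index, and per-class probabilities
    label, label_idx, probs = learner.predict(image_path)
    return str(label), float(probs[label_idx])
```

The input should be a spectrogram PNG produced the same way as the training images.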