# Training a binary classifier to detect dolphin sounds

In this notebook, we use the spectrogram database previously created in the `database_creation_pamguard` notebook to train a Ketos model. This first model is a binary classifier that takes 3-second spectrograms and assigns each one to one of two classes: 1 (dolphin click trains present) or 0 (no click trains).

We will then use this model as a detector on acoustic recordings (.wav files several minutes long) to spot dolphin click trains.
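Running a classifier as a detector usually means sliding a fixed-length window along the recording and classifying each segment in turn. The sketch below only computes the window start times. The 3-second window comes from the text above; the 1.5-second step and the `window_starts` helper are assumptions made for illustration and are not part of the Ketos API.

```python
def window_starts(recording_duration, window=3.0, step=1.5):
    """Return the start times (in seconds) of all full windows that fit in the recording."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    starts = []
    n = 0
    # Multiply by n instead of accumulating the step to avoid floating-point drift
    while n * step + window <= recording_duration:
        starts.append(n * step)
        n += 1
    return starts


# A hypothetical 10-second recording scanned with 3 s windows and a 1.5 s step
print(window_starts(10.0))
```

Overlapping windows (step smaller than the window) reduce the chance that a click train is split across two segments, at the cost of more segments to classify.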

The content of this notebook is partly drawn from the Ketos documentation. The comments and code were adapted to our current ClickLearn project.


## Contents:

[1. Mounting drive, importing the libraries and setting the paths](#section1)  
[2. Creating the data feed and the model](#section2)  
[3. Creating and training the Neural Network](#section3)  
 

<a id="section1"></a>
## 1. Mounting drive, importing the libraries and setting the paths




Mounting Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


Installing ketos

In [None]:
!pip install ketos

Collecting ketos
  Downloading ketos-2.4.0.tar.gz (181 kB)
Collecting datetime_glob
  Downloading datetime-glob-1.0.8.tar.gz (10 kB)
Collecting pint
  Downloading Pint-0.18-py2.py3-none-any.whl (209 kB)
Collecting version-parser
  Downloading version_parser-1.0.1.tar.gz (4.4 kB)
Collecting lexery>=1.0.0
  Downloading lexery-1.1.1.tar.gz (5.4 kB)
Building wheels for collected packages: ketos, datetime-glob, lexery, version-parser
  Building wheel for ketos (setup.py) ... done
  Created wheel for ketos: filename=ketos-2.4.0-py3-none-any.whl size=226286 sha256=b50a399580e1247e2bfe4ded44c77f798cadc0ed7ab87425f25a7bd58ab03d37
  Stored in directory: /root/.cache/pip/wheels/a9/74/ec/07b9dbb748e2c8bd8c2e71833a4ff12edd76f4e6b5864fdbcb
  Building wheel for datetime-glob (setup.py) ... done
  Created wheel for datetime-glob: filename=datetime_glob-1.0.8-

In [None]:
from ketos.data_handling import selection_table as sl
import ketos.data_handling.database_interface as dbi
from ketos.data_handling.parsing import load_audio_representation
from ketos.audio.spectrogram import MagSpectrogram
from ketos.audio.audio_loader import AudioFrameLoader
from ketos.neural_networks.resnet import ResNetInterface
from ketos.data_handling.data_feeding import BatchGenerator
import ketos.neural_networks.dev_utils.detection as det

from google.colab import files




Identifying and listing the GPU device:


In [None]:
%tensorflow_version 2.x
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


In [None]:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Num GPUs Available:  1


<a id="section2"></a>
## 2. Creating the data feed and the model

Opening the Ketos database (spectrograms)

In [None]:
db = dbi.open_file("gdrive/MyDrive/database_ketos_02.h5", 'r')

In [None]:
db

File(filename=gdrive/MyDrive/database_ketos_02.h5, title='', mode='r', root_uep='/', filters=Filters(complevel=0, shuffle=False, bitshuffle=False, fletcher32=False, least_significant_digit=None))
/ (RootGroup) ''
/test (Group) ''
/test/data (Table(1112,), fletcher32, shuffle, zlib(1)) ''
  description := {
  "data": Float32Col(shape=(86, 8162), dflt=0.0, pos=0),
  "filename": StringCol(itemsize=100, shape=(), dflt=b'', pos=1),
  "id": UInt32Col(shape=(), dflt=0, pos=2),
  "label": UInt8Col(shape=(), dflt=0, pos=3),
  "offset": Float64Col(shape=(), dflt=0.0, pos=4)}
  byteorder := 'little'
  chunkshape := (1,)
/train (Group) ''
/train/data (Table(5455,), fletcher32, shuffle, zlib(1)) ''
  description := {
  "data": Float32Col(shape=(86, 8162), dflt=0.0, pos=0),
  "filename": StringCol(itemsize=100, shape=(), dflt=b'', pos=1),
  "id": UInt32Col(shape=(), dflt=0, pos=2),
  "label": UInt8Col(shape=(), dflt=0, pos=3),
  "offset": Float64Col(shape=(), dflt=0.0, pos=4)}
  byteorder := 'little

Assigning names to the training and validation tables

In [None]:
# CAUTION: do not rename these variables when using the Ketos build; the names train_data and val_data are unfortunately hard-coded
train_data = dbi.open_table(db, "/train/data")
val_data = dbi.open_table(db, "/test/data")

In [None]:
print(train_data.shape, val_data.shape)

(5455,) (1112,)


In [17]:
train_data[0][0].shape

(86, 8162)
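A rough size estimate shows why the spectrograms are read in batches and resized before training rather than loaded all at once. The sketch below uses only the table sizes, the spectrogram shape and the `Float32Col` type printed above. It ignores the zlib compression on disk and the other columns, so it should be read as an order-of-magnitude estimate.

```python
n_train, n_val = 5455, 1112       # table sizes printed above
spec_shape = (86, 8162)           # shape of one stored spectrogram
bytes_per_value = 4               # Float32Col: 4 bytes per value

values_per_spec = spec_shape[0] * spec_shape[1]
bytes_per_spec = values_per_spec * bytes_per_value
total_bytes = (n_train + n_val) * bytes_per_spec

print(f"one spectrogram: {bytes_per_spec / 1e6:.1f} MB")
print(f"whole database (uncompressed): {total_bytes / 1e9:.1f} GB")
```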

Custom function for batch loaders:

In [None]:
from skimage.transform import resize

In [18]:
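def transform_batch_custom(X, Y):
    '''Resize each spectrogram to 86x86 and one-hot encode the labels so that the Ketos training loop can handle them.'''
    # Interpolate every (86, 8162) spectrogram in the batch down to a square (86, 86) image
    temp = resize(X, (X.shape[0], 86, 86))
    # Add a trailing channel axis: (batch, 86, 86, 1)
    x = temp.reshape(temp.shape[0], temp.shape[1], temp.shape[2], 1)
    # Convert the integer labels (0 or 1) to one-hot vectors of length 2
    y = tf.one_hot(Y['label'], depth=2, axis=1).numpy()
    return x, y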

train_generator = BatchGenerator(batch_size=128, data_table=train_data,
                                 output_transform_func=transform_batch_custom,
                                 shuffle=True, refresh_on_epoch_end=True)

val_generator = BatchGenerator(batch_size=128, data_table=val_data,
                                output_transform_func=transform_batch_custom,
                                shuffle=True, refresh_on_epoch_end=False)
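To make the label encoding and the batch size more concrete, here is a small pure-Python sketch. `one_hot` mimics what `tf.one_hot(..., depth=2)` produces for integer labels, and `batches_per_epoch` counts how many batches of 128 cover each table. Both helpers are hypothetical names introduced for illustration, and whether the Ketos `BatchGenerator` yields the final, smaller batch is an assumption here.

```python
import math


def one_hot(labels, depth=2):
    """Pure-Python equivalent of one-hot encoding integer class labels."""
    return [[1.0 if k == label else 0.0 for k in range(depth)] for label in labels]


def batches_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once (the last one may be smaller)."""
    return math.ceil(n_samples / batch_size)


# Label 0 (no click train) becomes [1, 0]; label 1 (click train) becomes [0, 1]
print(one_hot([0, 1, 1]))

# Training and validation table sizes from the database, with batch_size=128
print(batches_per_epoch(5455, 128), batches_per_epoch(1112, 128))
```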

`recipe.json` contains the model architecture.

In [19]:
resnet = ResNetInterface.build_from_recipe_file("gdrive/MyDrive/recipe.json")

In [20]:
resnet.train_generator = train_generator
resnet.val_generator = val_generator

<a id="section3"></a>
## 3. Creating and training the Neural Network

## Training #1

In [None]:
resnet.checkpoint_dir = "gdrive/MyDrive/content/models/checkpoints"

In [None]:
# Train the model for 10 epochs
resnet.train_loop(n_epochs=10, verbose=True)


Epoch: 1 
train_loss: 0.1511719971895218
train_CategoricalAccuracy: 0.868 train_Precision: 0.964 train_Recall: 0.776 
val_loss: 0.41088739037513733
val_CategoricalAccuracy: 0.574 val_Precision: 1.000 val_Recall: 0.038 


Epoch: 2 
train_loss: 0.08048659563064575
train_CategoricalAccuracy: 0.933 train_Precision: 0.971 train_Recall: 0.898 
val_loss: 0.33597445487976074
val_CategoricalAccuracy: 0.653 val_Precision: 1.000 val_Recall: 0.218 


Epoch: 3 
train_loss: 0.05518920347094536
train_CategoricalAccuracy: 0.964 train_Precision: 0.984 train_Recall: 0.946 
val_loss: 0.20393653213977814
val_CategoricalAccuracy: 0.797 val_Precision: 1.000 val_Recall: 0.541 


Epoch: 4 
train_loss: 0.03286391869187355
train_CategoricalAccuracy: 0.981 train_Precision: 0.981 train_Recall: 0.983 
val_loss: 0.13546355068683624
val_CategoricalAccuracy: 0.866 val_Precision: 1.000 val_Recall: 0.697 


Epoch: 5 
train_loss: 0.017276326194405556
train_CategoricalAccuracy: 0.995 train_Precision: 0.996 train_Recall:

In [None]:
resnet.save_model('gdrive/MyDrive/content/models/Flipper_01_10epochs.kt',audio_repr_file='/content/gdrive/MyDrive/content/spec_config_spectrogram_02.json')


In [None]:
# Unsure whether saving from a specific checkpoint works properly, so trying other checkpoint names.
# This one is probably wrong:
# resnet.save_model('gdrive/MyDrive/content/models/Flipper_01_05epochs2.kt', checkpoint_name = 'cp-0005.ckpt.data-00000-of-00001', audio_repr_file='/content/gdrive/MyDrive/content/spec_config_spectrogram_02.json')


In [None]:
# Unsure whether saving from a specific checkpoint works properly, so trying other checkpoint names.
# This one is probably wrong:
# resnet.save_model('gdrive/MyDrive/content/models/Flipper_01_05epochs3.kt', checkpoint_name = 'cp-0005.ckpt.index', audio_repr_file='/content/gdrive/MyDrive/content/spec_config_spectrogram_02.json')
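The two file names tried above are the pieces TensorFlow writes for a single checkpoint: an `.index` file and one or more `.data-XXXXX-of-XXXXX` shards, which share a common prefix (here `cp-0005.ckpt`). TensorFlow's own weight-loading functions take that prefix rather than either file name. Whether the `checkpoint_name` argument of `save_model` expects the same prefix is an assumption that has not been verified against Ketos. The `checkpoint_prefix` helper below is hypothetical and only shows how to derive the prefix.

```python
import re

# Suffixes TensorFlow appends to a checkpoint prefix when writing weights
_SHARD_SUFFIX = re.compile(r"\.(index|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(filename):
    """Strip the '.index' or '.data-XXXXX-of-XXXXX' suffix from a checkpoint file name."""
    return _SHARD_SUFFIX.sub("", filename)


for name in ("cp-0005.ckpt.index", "cp-0005.ckpt.data-00000-of-00001"):
    print(name, "->", checkpoint_prefix(name))
```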


## Training #2

In [None]:
# Another attempt, this time saving a checkpoint after every epoch
resnet.train_loop(n_epochs=5, verbose=True, checkpoint_freq = 1)


Epoch: 1 
train_loss: 0.15004906058311462
train_CategoricalAccuracy: 0.876 train_Precision: 0.966 train_Recall: 0.791 
val_loss: 0.3647066652774811
val_CategoricalAccuracy: 0.670 val_Precision: 1.000 val_Recall: 0.256 


Epoch: 2 
train_loss: 0.0661311224102974
train_CategoricalAccuracy: 0.953 train_Precision: 0.977 train_Recall: 0.933 
val_loss: 0.41072288155555725
val_CategoricalAccuracy: 0.639 val_Precision: 1.000 val_Recall: 0.185 


Epoch: 3 
train_loss: 0.034738730639219284
train_CategoricalAccuracy: 0.983 train_Precision: 0.985 train_Recall: 0.982 
val_loss: 0.38999128341674805
val_CategoricalAccuracy: 0.613 val_Precision: 1.000 val_Recall: 0.126 


Epoch: 4 
train_loss: 0.017882484942674637
train_CategoricalAccuracy: 0.994 train_Precision: 0.993 train_Recall: 0.996 
val_loss: 0.25229063630104065
val_CategoricalAccuracy: 0.756 val_Precision: 1.000 val_Recall: 0.449 


Epoch: 5 
train_loss: 0.009140498004853725
train_CategoricalAccuracy: 0.998 train_Precision: 0.998 train_Recall

In [None]:
resnet.save_model('gdrive/MyDrive/content/models/Flipper_03_05epochs.kt',audio_repr_file='/content/gdrive/MyDrive/content/spec_config_spectrogram_02.json')


In [None]:
# Continue training the same model (Flipper_03)
resnet.train_loop(n_epochs=1, verbose=True, checkpoint_freq = 1)


Epoch: 1 
train_loss: 0.003192993113771081
train_CategoricalAccuracy: 1.000 train_Precision: 1.000 train_Recall: 1.000 
val_loss: 0.08566919714212418
val_CategoricalAccuracy: 0.925 val_Precision: 1.000 val_Recall: 0.831 



In [None]:
resnet.save_model('gdrive/MyDrive/content/models/Flipper_03_07epochs.kt',audio_repr_file='/content/gdrive/MyDrive/content/spec_config_spectrogram_02.json')


In [None]:
# Continue training the same model (Flipper_03)
resnet.train_loop(n_epochs=1, verbose=True, checkpoint_freq = 1)


Epoch: 1 
train_loss: 0.0021612700074911118
train_CategoricalAccuracy: 1.000 train_Precision: 1.000 train_Recall: 1.000 
val_loss: 0.27281829714775085
val_CategoricalAccuracy: 0.743 val_Precision: 0.665 val_Recall: 0.846 
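The logs above repeatedly show a validation precision of 1.000 alongside a low recall. The sketch below recalls how these metrics follow from confusion-matrix counts and why that pattern keeps accuracy low. The counts are hypothetical, chosen only for illustration, and `classification_metrics` is not a Ketos function.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts: the model flags very few segments, but every flag is correct
acc, prec, rec = classification_metrics(tp=19, fp=0, fn=481, tn=600)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

With no false positives, precision is perfect, but the many missed click trains (false negatives) pull recall and accuracy down. Such a model is too conservative rather than inaccurate when it does fire.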



## Training #3

In [None]:
resnet.checkpoint_dir = "gdrive/MyDrive/content/models/checkpoints"

In [None]:
resnet.early_stopping_monitor

{'baseline': 0.5,
 'decreasing': True,
 'delta': 0.1,
 'max_epochs': None,
 'metric': 'val_loss',
 'min_epochs': 5,
 'period': 10}

In [None]:
# Change the early stopping rules to monitor val_CategoricalAccuracy instead of val_loss
resnet.early_stopping_monitor = {'baseline': None,  # whether a specific target value is aimed at
 'decreasing': False,        # direction that counts as an improvement of the metric (False: an increase)
 'delta': 0.01,              # minimum change between 2 epochs that counts as an improvement
 'max_epochs': None,         # not documented
 'metric': 'val_CategoricalAccuracy',   # the metric monitored for early stopping
 'min_epochs': 3,            # epoch from which monitoring starts
 'period': 3}                # number of epochs allowed without improvement before stopping

In [None]:
resnet.early_stopping_monitor

{'baseline': None,
 'decreasing': False,
 'delta': 0.01,
 'max_epochs': None,
 'metric': 'val_CategoricalAccuracy',
 'min_epochs': 3,
 'period': 3}
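To illustrate how `delta`, `min_epochs`, `period` and `decreasing` interact, here is a minimal sketch of a generic patience-based stopping rule. It follows the interpretation given in the comments above and is not the Ketos implementation; `baseline` and `max_epochs` are deliberately left out, and the accuracy history is made up.

```python
def stopping_epoch(history, delta=0.01, min_epochs=3, period=3, decreasing=False):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = None
    epochs_without_improvement = 0
    for epoch, value in enumerate(history, start=1):
        if best is None:
            improved = True                      # first epoch sets the reference value
        elif decreasing:
            improved = value < best - delta      # lower is better (e.g. a loss)
        else:
            improved = value > best + delta      # higher is better (e.g. an accuracy)
        if improved:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        # Only stop once monitoring has started and patience is exhausted
        if epoch >= min_epochs and epochs_without_improvement >= period:
            return epoch
    return None


# Hypothetical validation accuracies, one per epoch
print(stopping_epoch([0.56, 0.63, 0.70, 0.68, 0.70, 0.705]))
```

In this made-up history the metric stops improving by more than `delta` after epoch 3, so the rule fires three epochs later.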

In [None]:
resnet.train_loop(n_epochs=12, verbose=True, checkpoint_freq = 1, early_stopping = True)


Epoch: 1 
train_loss: 0.13590770959854126
train_CategoricalAccuracy: 0.887 train_Precision: 0.927 train_Recall: 0.850 
val_loss: 0.40722736716270447
val_CategoricalAccuracy: 0.564 val_Precision: 1.000 val_Recall: 0.015 


Focus metric val_CategoricalAccuracy

Epoch: 2 
train_loss: 0.07647981494665146
train_CategoricalAccuracy: 0.934 train_Precision: 0.959 train_Recall: 0.913 
val_loss: 0.47175076603889465
val_CategoricalAccuracy: 0.633 val_Precision: 1.000 val_Recall: 0.172 


Focus metric val_CategoricalAccuracy

Epoch: 3 
train_loss: 0.040567848831415176
train_CategoricalAccuracy: 0.974 train_Precision: 0.974 train_Recall: 0.976 
val_loss: 0.3874521255493164
val_CategoricalAccuracy: 0.699 val_Precision: 1.000 val_Recall: 0.321 


Focus metric val_CategoricalAccuracy

Epoch: 4 
train_loss: 0.023791393265128136
train_CategoricalAccuracy: 0.989 train_Precision: 0.990 train_Recall: 0.989 
val_loss: 0.3266814947128296
val_CategoricalAccuracy: 0.684 val_Precision: 1.000 val_Recall: 0.287 

## Training #4

In [None]:
resnet.checkpoint_dir = "gdrive/MyDrive/content/models/checkpoints_flipper05"

In [None]:
# Change the early stopping rules to monitor val_CategoricalAccuracy instead of val_loss
resnet.early_stopping_monitor = {'baseline': None,  # whether a specific target value is aimed at
 'decreasing': False,        # direction that counts as an improvement of the metric (False: an increase)
 'delta': 0.01,              # minimum change between 2 epochs that counts as an improvement
 'max_epochs': None,         # not documented
 'metric': 'val_CategoricalAccuracy',   # the metric monitored for early stopping
 'min_epochs': 3,            # epoch from which monitoring starts
 'period': 3}                # number of epochs allowed without improvement before stopping

In [None]:
resnet.early_stopping_monitor

{'baseline': None,
 'decreasing': False,
 'delta': 0.01,
 'max_epochs': None,
 'metric': 'val_CategoricalAccuracy',
 'min_epochs': 3,
 'period': 3}

In [None]:
resnet.train_loop(n_epochs=14, verbose=True, checkpoint_freq = 1, early_stopping = True)


Epoch: 1 
train_loss: 0.13925044238567352
train_CategoricalAccuracy: 0.880 train_Precision: 0.936 train_Recall: 0.828 
val_loss: 0.3605610132217407
val_CategoricalAccuracy: 0.634 val_Precision: 1.000 val_Recall: 0.174 


Focus metric val_CategoricalAccuracy

Epoch: 2 
train_loss: 0.0708041712641716
train_CategoricalAccuracy: 0.943 train_Precision: 0.961 train_Recall: 0.929 
val_loss: 0.4112125337123871
val_CategoricalAccuracy: 0.581 val_Precision: 1.000 val_Recall: 0.054 


Focus metric val_CategoricalAccuracy

Epoch: 3 
train_loss: 0.032577551901340485
train_CategoricalAccuracy: 0.981 train_Precision: 0.976 train_Recall: 0.988 
val_loss: 0.442448228597641
val_CategoricalAccuracy: 0.557 val_Precision: 0.000 val_Recall: 0.000 


Focus metric val_CategoricalAccuracy

Epoch: 4 
train_loss: 0.015243718400597572
train_CategoricalAccuracy: 0.994 train_Precision: 0.993 train_Recall: 0.996 
val_loss: 0.4204346239566803
val_CategoricalAccuracy: 0.573 val_Precision: 1.000 val_Recall: 0.036 


F

In [None]:
#save

## Training #6

In [21]:
resnet.checkpoint_dir = "gdrive/MyDrive/content/models/checkpoints_flipper06"

In [22]:
#Original settings
resnet.early_stopping_monitor

{'baseline': 0.5,
 'decreasing': True,
 'delta': 0.1,
 'max_epochs': None,
 'metric': 'val_loss',
 'min_epochs': 5,
 'period': 10}

In [23]:
# Change the early stopping rules to monitor val_CategoricalAccuracy instead of val_loss
resnet.early_stopping_monitor = {'baseline': None,  # whether a specific target value is aimed at
 'decreasing': False,        # direction that counts as an improvement of the metric (False: an increase)
 'delta': 0.01,              # minimum change between 2 epochs that counts as an improvement
 'max_epochs': None,         # not documented
 'metric': 'val_CategoricalAccuracy',   # the metric monitored for early stopping
 'min_epochs': 3,            # epoch from which monitoring starts
 'period': 3}                # number of epochs allowed without improvement before stopping

In [24]:
resnet.early_stopping_monitor

{'baseline': None,
 'decreasing': False,
 'delta': 0.01,
 'max_epochs': None,
 'metric': 'val_CategoricalAccuracy',
 'min_epochs': 3,
 'period': 3}

In [None]:
resnet.train_loop(n_epochs=14, verbose=True, checkpoint_freq = 1, early_stopping = True)


Epoch: 1 
train_loss: 0.12664449214935303
train_CategoricalAccuracy: 0.901 train_Precision: 0.936 train_Recall: 0.862 
val_loss: 0.5386421084403992
val_CategoricalAccuracy: 0.557 val_Precision: 1.000 val_Recall: 0.092 


Focus metric val_CategoricalAccuracy
