<a href="https://colab.research.google.com/github/Tyler-Pickett/HumpbackWhale_Bioacoustics/blob/main/model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install librosa
import librosa
import librosa.display
import soundfile as sf

import os
import pathlib

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from IPython import display as ipd

from sklearn.model_selection import train_test_split, StratifiedKFold

import tensorflow as tf
import tensorflow_hub as hub
import keras
from tensorflow.keras.layers.experimental import preprocessing
from keras import layers
from keras.layers import Activation, Dense, Dropout, Conv2D, Flatten, MaxPooling2D, GlobalMaxPooling2D, GlobalAveragePooling1D, AveragePooling2D, Input, Add
from keras.models import Sequential
from keras.optimizers import SGD
# Set seed for experiment reproducibility
seed = 42
tf.random.set_seed(seed)
np.random.seed(seed)



To convert a FLAC file to WAV for the NOAA humpback_whale/1 model, run:
flac -df --delete-input-file --preserve-modtime --keep-foreign-metadata <path to file>
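
If many recordings need converting, the flac command above could be wrapped in a small helper. This is only a sketch: the function names `flac_decode_command` and `convert_directory` are made up for illustration, and it assumes the `flac` binary is on the PATH. Note that `--delete-input-file` removes each original FLAC file after a successful decode, so the sketch defaults to a dry run that only returns the commands.

```python
import subprocess
from pathlib import Path


def flac_decode_command(flac_path):
    """Build the flac call quoted above for a single file (hypothetical helper)."""
    return [
        "flac", "-df",               # -d: decode to WAV, -f: overwrite existing output
        "--delete-input-file",       # removes the .flac after a successful decode
        "--preserve-modtime",
        "--keep-foreign-metadata",
        str(flac_path),
    ]


def convert_directory(directory, dry_run=True):
    """Return (and optionally run) one decode command per .flac file in `directory`."""
    commands = [flac_decode_command(p) for p in sorted(Path(directory).glob("*.flac"))]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # assumes the flac binary is installed
    return commands
```

Calling `convert_directory(some_folder)` first and inspecting the returned list is a cheap way to confirm which files would be touched before passing `dry_run=False`.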

In [None]:
from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
from google.colab import auth
auth.authenticate_user()

In [None]:
from google.cloud import storage

In [None]:
audio = '/content/drive/MyDrive/Colab Notebooks/capstone/Cross_02_060203_071428.d20_7.wav'

In [None]:
yamnet_model = hub.load('https://tfhub.dev/google/yamnet/1')
yamnet_model

<tensorflow.python.saved_model.load.Loader._recreate_base_user_object.<locals>._UserObject at 0x7f532299b1d0>

The model classifies 3.92-second context windows of audio as containing or not containing humpback whale sounds. It is intended to be applied as a detector by scoring every context window in a set of underwater passive acoustic monitoring data.

The model feeds a PCEN-normalized spectrogram through a ResNet-50 architecture to a single logistic output unit.

https://tfhub.dev/google/humpback_whale/1

Inputs

    waveform, a float32 Tensor of shape [batch_size, num_samples, num_channels], where it is required that num_channels = 1, but where batch_size and num_samples may take the caller's preferred values on each call.
        Each audio clip in the batch (slice [batch_index, :, 0]) should contain 10 kHz PCM float32 audio.
            The training data left plenty of headroom; the level of clips with humpback present was typically 0.003 RMS, 0.02 peak, much "quieter" than consumer digital audio.
            Although the model is relatively insensitive to input gain variations as wide as +/-20 dB, users may wish to apply linear scaling to match the levels the model saw in training.
    context_step_samples, an int64 Tensor of shape [], the hop length (in samples) at which to slide the scoring context window over waveform.
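
The linear scaling mentioned for the waveform input can be illustrated with plain Python. This is a minimal sketch, not part of the model's API: `rms` and `scale_to_rms` are hypothetical helpers, the 0.003 target is the typical training RMS quoted above, and the test tone is synthetic.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def scale_to_rms(samples, target_rms=0.003):
    """Linearly scale samples to target_rms; returns (scaled samples, gain in dB)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples), 0.0  # pure silence: nothing to scale
    gain = target_rms / current
    gain_db = 20.0 * math.log10(gain)
    return [s * gain for s in samples], gain_db


# Synthetic 100 Hz tone at 10 kHz with a level closer to consumer audio.
loud = [0.3 * math.sin(2 * math.pi * 100 * n / 10_000) for n in range(10_000)]
scaled, gain_db = scale_to_rms(loud)
print(f"RMS before: {rms(loud):.4f}, after: {rms(scaled):.4f}, gain: {gain_db:.1f} dB")
```

The reported gain in dB gives a quick sense of whether a recording already falls inside the +/-20 dB range the model is said to tolerate.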

Advanced Usage

Model attributes allow isolated reuse of parts of the model, in accord with the Reusable SavedModels interface. The callable attributes exposed are:

    front_end, which can be called on a waveform Tensor as described in the score signature inputs to produce a PCEN-normalized spectrogram of shape [batch_size, num_stft_bins, num_channels], where num_channels = 64 is fixed and where num_stft_bins depends on the number of input samples.
    features, which when called on a PCEN spectrogram slice of shape [batch_size, 128, 64] produces feature vectors of shape [batch_size, 2048]. (These might be useful for detecting other audio event types in the HARP data or similar underwater passive acoustic monitoring datasets, but the model developers have not yet validated this through experiment.)
    logits, which, when called on the same type of input as features, outputs the log odds of the input spectrogram containing humpback vocalization.
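
Since logits returns log odds rather than a probability, the value has to pass through the logistic (sigmoid) function before it can be read as "probability of humpback present". A small stand-alone sketch of that conversion follows; `log_odds_to_probability` is a hypothetical name, and the input values are arbitrary.

```python
import math


def log_odds_to_probability(log_odds):
    """Logistic function, branched so exp() never overflows for large inputs."""
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Log odds of 0 mean even odds; positive values favour "humpback present".
for value in (-4.0, 0.0, 4.0):
    print(value, round(log_odds_to_probability(value), 3))
```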


In [None]:
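noaa_model = hub.load('https://tfhub.dev/google/humpback_whale/1')

# decode_wav returns float32 audio of shape [num_samples, num_channels] and the
# file's sample rate (discarded here; the model expects 10 kHz mono audio).
waveform, _ = tf.audio.decode_wav(tf.io.read_file(audio))
waveform = tf.expand_dims(waveform, 0)  # makes a batch of size 1

# Run the model stage by stage: waveform -> PCEN spectrogram -> features / logits.
pcen_spectrogram = noaa_model.front_end(waveform)
context_window = pcen_spectrogram[:, :128, :]  # first 128 frames only (one context window)
features = noaa_model.features(context_window)
logits = noaa_model.logits(context_window)
probabilities = tf.nn.sigmoid(logits)  # log odds -> probability

print({
    'pcen_spectrogram': pcen_spectrogram,
    'features': features,
    'logits': logits,
    'probabilities': probabilities,
})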


{'pcen_spectrogram': <tf.Tensor: shape=(1, 2497, 64), dtype=float32, numpy=
array([[[0.34337044, 0.352898  , 0.3364389 , ..., 0.31550467,
         0.30421436, 0.3076837 ],
        [0.01286328, 0.11843181, 0.3996929 , ..., 0.36540306,
         0.38193512, 0.3762561 ],
        [0.02342975, 0.06558275, 0.18811762, ..., 0.48939407,
         0.4787593 , 0.2955898 ],
        ...,
        [0.14307153, 0.22290611, 0.5149369 , ..., 0.1615628 ,
         0.2538408 , 0.39627314],
        [0.13941622, 0.06648052, 0.13821816, ..., 0.22090185,
         0.3079667 , 0.3320781 ],
        [0.1844101 , 0.17328954, 0.07394493, ..., 0.21915352,
         0.30805433, 0.1828376 ]]], dtype=float32)>, 'features': <tf.Tensor: shape=(1, 2048), dtype=float32, numpy=
array([[1.6633583 , 0.77532035, 1.2360727 , ..., 2.1375327 , 1.1309146 ,
        2.7635047 ]], dtype=float32)>, 'logits': <tf.Tensor: shape=(1, 1), dtype=float32, numpy=array([[2.268429]], dtype=float32)>, 'probabilities': <tf.Tensor: shape=(1, 1), dtyp

Outputs

    scores, a float32 Tensor of shape [batch_size, num_windows, num_classes], where it will always be true that num_classes = 1, where batch_size will equal the one from the input, and where num_windows is determined by num_samples and context_step_samples.


In [None]:
metadata_fn = noaa_model.signatures['metadata']
metadata = metadata_fn()
print(metadata)

{'class_names': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Mn'], dtype=object)>, 'input_sample_rate': <tf.Tensor: shape=(), dtype=int64, numpy=10000>, 'context_width_samples': <tf.Tensor: shape=(), dtype=int64, numpy=39124>}
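
The metadata values above are enough to estimate the context window length in seconds and how many windows (num_windows) a clip would yield for a given context_step_samples. The sketch below assumes that only full windows are scored, with no padding at the end of the clip; that is an assumption, not something the model documentation quoted here states. `window_starts` is a hypothetical helper, and the 60-second clip length and 1-second hop are made-up example values.

```python
def window_starts(num_samples, context_width_samples, context_step_samples):
    """Start offsets (in samples) of every full context window, assuming no padding."""
    if num_samples < context_width_samples:
        return []
    last_start = num_samples - context_width_samples
    return list(range(0, last_start + 1, context_step_samples))


# Values printed by the metadata signature above.
input_sample_rate = 10_000
context_width_samples = 39_124

# Hypothetical clip: 60 s of audio scored with a 1 s hop.
num_samples = 60 * input_sample_rate
starts = window_starts(num_samples, context_width_samples,
                       context_step_samples=input_sample_rate)

window_seconds = context_width_samples / input_sample_rate
print(f"window length: {window_seconds:.4f} s")
print(f"num_windows: {len(starts)}")
print(f"last window starts at {starts[-1] / input_sample_rate:.1f} s")
```

Dividing each start offset by `input_sample_rate` gives the time in seconds at which the corresponding score's window begins, which is handy when lining detections up against the recording.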


Acknowledgements

The model developers thank NOAA Pacific Islands Fisheries Science Center for collecting and sharing the data and for their partnership in the model development process, which included providing the initial training labels as well as labels for several rounds of active learning that improved candidate models.

Regarding the dataset, please also refer to the funding and acknowledgement sections in Allen et al.