<a href="https://colab.research.google.com/github/BenUCL/Reef-acoustics-and-AI/blob/main/Code/Produce_a_custom_pretrained_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Train the CNN on your audio, save this as a pretrained model**

This script provides an example of training the CNN on the minibatch files which can be created with the 'CNN minibatch creation' script. This uses a small subset of the Indonesian dataset.

This outputs a saved version of the model, such that it can be used as a new custom pretrained version of th CNN to extract features from your own audio. This is in the form of four files which are saved to the '/Results/trained_CNN_saved_model/' folder. These are called:


1.   custom_pretrained_CNN.ckpt.meta
2.   custom_pretrained_CNN.ckpt.index
3. custom_pretrained_CNN.ckpt.data-00000-of-00001
4. checkpoint


To use this custom pretrained CNN, you will need to:

1.   Duplicate the Audioset folder and name this 'Custom Audioset'  
2. Delete the vggish_model.ckpt file and copy in the four new files.
3. Open the 'AudiosetAnalysis.py' file in this folder and replace the line: *self.checkpoint_path = 'vggish_model.ckpt'* with: *self.checkpoint_path = 'custom_pretrained_CNN.ckpt'*
4. Save this.
5. You can then open the 'Feature extraction with pretrained CNN.pynb' and change the path given to the *vggish_files =* line to the path for your new Audioset folder.
6. This will now extract features on your audio files exactly as before, but this time it uses the custom pretrained CNN you have created








# **Using Colabs free GPU feature**

Google colab provides free GPU access (with some limits), see here: https://research.google.com/colaboratory/faq.html

This can be used to significantly increase training speed. To switch this on go to 'Runtime' at the top and change type to 'GPU'.


### **If you use this code, please cite Williams et al (2023)**

In [1]:
# Connect your Google Drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
!pip install numpy==1.21.5 resampy==0.2.2 tensorflow==1.15 tf_slim==1.1.0 six==1.15.0 soundfile==0.10.3.post1

""" As package versions began updating this threw errors on the smoke test. 
For a faster download versions could be removed but this may throw errors. 
As of 17/10/22 it gives the below output, but, the smoketest codeblock passes:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-probability 0.16.0 requires gast>=0.3.2, but you have gast 0.2.2 which is incompatible.
kapre 0.3.7 requires tensorflow>=2.0.0, but you have tensorflow 1.15.0 which is incompatible.
Successfully installed gast-0.2.2 keras-applications-1.0.8 llvmlite-0.32.1 numba-0.49.1 numpy-1.21.5 resampy-0.2.2 soundfile-0.10.3.post1 tensorboard-1.15.0 tensorflow-1.15.0 tensorflow-estimator-1.15.1 tf-slim-1.1.0
WARNING: The following packages were previously imported in this runtime:
  [numpy]
You must restart the runtime in order to use newly installed versions. """

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting numpy==1.21.5
  Downloading numpy-1.21.5-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
[K     |████████████████████████████████| 15.7 MB 5.1 MB/s 
[?25hCollecting resampy==0.2.2
  Downloading resampy-0.2.2.tar.gz (323 kB)
[K     |████████████████████████████████| 323 kB 60.0 MB/s 
[?25hCollecting tensorflow==1.15
  Downloading tensorflow-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl (412.3 MB)
[K     |████████████████████████████████| 412.3 MB 27 kB/s 
[?25hCollecting tf_slim==1.1.0
  Downloading tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
[K     |████████████████████████████████| 352 kB 69.0 MB/s 
Collecting soundfile==0.10.3.post1
  Downloading SoundFile-0.10.3.post1-py2.py3-none-any.whl (21 kB)
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading tensorboard-1.15.0-py3-none-any.whl (3.8 MB)
[K     |████████████████████████████████| 3.8 MB 55.0 MB/s



In [3]:
# Should output 'Looks good to me at the bottom!'
%cd /content/drive/MyDrive/Reef soundscapes with AI/Audioset
!python vggish_smoke_test.py

/content/drive/MyDrive/Reef soundscapes with AI/Audioset
Instructions for updating:
non-resource variables are not supported in the long term

Testing your install of VGGish

Log Mel Spectrogram example:  [[-4.47297436 -4.29457354 -4.14940631 ... -3.9747003  -3.94774997
  -3.78687669]
 [-4.48589533 -4.28825497 -4.139964   ... -3.98368686 -3.94976505
  -3.7951698 ]
 [-4.46158065 -4.29329706 -4.14905953 ... -3.96442484 -3.94895483
  -3.78619839]
 ...
 [-4.46152626 -4.29365061 -4.14848608 ... -3.96638113 -3.95057575
  -3.78538167]
 [-4.46152595 -4.2936572  -4.14848104 ... -3.96640507 -3.95059567
  -3.78537143]
 [-4.46152565 -4.29366386 -4.14847603 ... -3.96642906 -3.95061564
  -3.78536116]]
2022-10-19 17:40:39.605886: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-10-19 17:40:39.723109: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there mu

In [4]:
#From original vggish_train_demo.py script on github
from __future__ import print_function

from random import shuffle

import numpy as np
import tensorflow.compat.v1 as tf
import tf_slim as slim

import vggish_input
import vggish_params
import vggish_slim

#Modules added by Ben
import os #for handling directories
import glob #for dealing with files in dir
import pandas as pd #for saving output at end in dataframe
import sklearn
import math
import pickle
from sklearn.model_selection import train_test_split #added for train/test split
from numpy import loadtxt #addded so predictions can be output to CSV file
from datetime import datetime #added to append time to csv output file name to prevent overwriting

Instructions for updating:
non-resource variables are not supported in the long term


**Set paths to access modules and pickle files, also set CNN parameters.**

Two classes are used here, increase _NUM_CLASSES if needed. A batch size of 16 was used as larger batches can cause a memory error on colab depending on which GPU you are  allocated. The network trains for 5 epochs currently to save computation time, the final study used UCL's computing cluster to train for 50 epochs on the full datasets.

In [10]:
#which repeat of the cross-val is this? (1-8):
repeat = 1 # Used to set seed for train/val/test split

### Change paths if you re-structure folders

output_name = 'custom_pretrained_CNN.ckpt' # what to call the ckpt file output


# Path to the location where your audio file are stored:
audio_dir = r'/content/drive/MyDrive/Reef soundscapes with AI/audio_dir' 

# Path to folder containing vggish setup files and 'AudiosetAnalysis' downloaded from sarebs supplementary
vggish_files = r'/content/drive/MyDrive/Reef soundscapes with AI/Audioset' 

# Output folder for results:
results_dir = r'/content/drive/MyDrive/Reef soundscapes with AI/Results/trained_CNN_saved_model/' 
ckpt_file_dir = r'/content/drive/MyDrive/Reef soundscapes with AI/Results/trained_CNN_saved_model/' 

#Set the directories where logmel-spectrograms will be stored for train, test and validation sets:
pickle_trainfiles_dir = r'/content/drive/MyDrive/Reef soundscapes with AI/Results/minibatches_train/'
pickle_valfiles_dir = r'/content/drive/MyDrive/Reef soundscapes with AI/Results/minibatches_val/'
pickle_testfiles_dir = r'/content/drive/MyDrive/Reef soundscapes with AI/Results/minibatches_test/'

#how many classes?:
_NUM_CLASSES = 2

#name a column for each class e.g 'class1', 'class2', or 'healthy', 'degraded'
col_names = 'Healthy','Degraded', 'True class'

#Batch size:
batch_size = 16 # larger batches can cause a memory error on the NN script on colab depending on which GPU you are  allocated 

# Number of epochs.
num_epochs = 10 # set less to 1 to run a quick demo



In [12]:
#### Some final set up
# Find number of minibatches for networks for loop
minibatches = [filename for filename in os.listdir(pickle_trainfiles_dir) if filename.startswith("train_minibatch")]
num_minibatches = len(minibatches) #this takes the last digit of the last pickle files, denoting how many minibatches there are

# Get number of train/test/val minibatches
num_train_batches = int(len(os.listdir(pickle_trainfiles_dir)))
print('Number of train minibatches found: ' + str(num_train_batches))
num_val_batches = int(len(os.listdir(pickle_valfiles_dir)))
print('Number of validation minibatches found: ' + str(num_val_batches))
num_test_batches = int(len(os.listdir(pickle_testfiles_dir)))
print('Number of test minibatches found: ' + str(num_test_batches))

os.chdir(vggish_files) 

# Used to find averages of accuracy score across minibatches later
def Average(lst):
    return sum(lst) / len(lst)

print('Cross validation combination: ' + str(repeat))

Number of train minibatches found: 8
Number of validation minibatches found: 1
Number of test minibatches found: 1
Cross validation combination: 1


# **Run the neural network**

In [8]:
#RUN THIS BLOCK ONLY ONCE PER SESSION - otherwise it will error

flags = tf.app.flags

flags.DEFINE_boolean(
    'train_vggish', True,
    'If True, allow VGGish parameters to change during training, thus '
    'fine-tuning VGGish. If False, VGGish parameters are fixed, thus using '
    'VGGish as a fixed feature extractor.')

flags.DEFINE_string(
    'checkpoint', 'vggish_model.ckpt',
    'Path to the VGGish checkpoint file.')

FLAGS = flags.FLAGS

'An exception has occurred, use %tb to see the full traceback.' error will occur, fear not, this just means its finished 

In [13]:
%%timeit
"""To train 5 epochs on the 123x1min files in the of training data this process
takes up to 80 minutes on a CPU. Depending which GPU Colab provides you this can 
take < 5min on colabs GPU. The final study used NVIDIA A100 GPU's which 
generally provide the highest speed as of 2022."""


### Train NN, output results
r"""This uses the VGGish model definition within a larger model which adds two 
layers on top, and then trains this larger model. 

We input log-mel spectrograms (X_train) calculated above with associated labels 
(y_train), and feed the batches into the model. Once the model is trained, it 
is then executed on the validation and log-mel spectrograms (X_validation, 
X_test), and the accuracy is output for each.

This version of the CNN then saves .ckpt files which are saved versions of
this model which can be used as a pretrained feature extractor, created using 
your own audio"""

def main(X):   
  with tf.Graph().as_default(), tf.Session() as sess:
    # Define VGGish.
    embeddings = vggish_slim.define_vggish_slim(training=FLAGS.train_vggish)
    
    
    # Define a shallow classification model and associated training ops on top
    # of VGGish.
    with tf.variable_scope('mymodel'):
      # Add a fully connected layer with 100 units. Add an activation function
      # to the embeddings since they are pre-activation.
      num_units = 100
      fc = slim.fully_connected(tf.nn.relu(embeddings), num_units)

      logits= slim.fully_connected(                                      
          fc, _NUM_CLASSES, activation_fn=None, scope='logits')
      probabilities = tf.sigmoid(logits, name='probabilities')
    
      # Add training ops.
      with tf.variable_scope('train'):
        global_step = tf.train.create_global_step()

        # Labels are assumed to be fed as a batch multi-hot vectors, with
        # a 1 in the position of each positive class label, and 0 elsewhere.
        labels_input = tf.placeholder(
            tf.float32, shape=(None, _NUM_CLASSES), name='labels')

        # Cross-entropy label loss.
        xent = tf.nn.softmax_cross_entropy_with_logits( 
            logits=logits, labels=labels_input, name='xent')     
        loss = tf.reduce_mean(xent, name='loss_op')
        tf.summary.scalar('loss', loss)

        # We use the same optimizer and hyperparameters as used to train VGGish.
        optimizer = tf.train.AdamOptimizer(
            learning_rate= vggish_params.LEARNING_RATE,     
            epsilon=vggish_params.ADAM_EPSILON)
        train_op = optimizer.minimize(loss, global_step=global_step)

    # Initialize all variables in the model, and then load the pre-trained
    # VGGish checkpoint.
    sess.run(tf.global_variables_initializer())         ### this starts the session appaz
    vggish_slim.load_vggish_slim_checkpoint(sess, FLAGS.checkpoint)

    
    features_input = sess.graph.get_tensor_by_name(
        vggish_params.INPUT_TENSOR_NAME)
    
    # The training loop.
    saver = tf.train.Saver()####
    all_loss = []
    for epoch in range(num_epochs):
            validation_accuracy_scores = []
            test_accuracy_scores = []
            test_batch_scores = []
            val_batch_scores = []
            epoch_loss = 0
            i=0
            while i < num_minibatches: 
                print('mini batch'+str(i))
                train_pickle_file = pickle_trainfiles_dir + 'train_minibatch_' + str(i)
                with open(train_pickle_file, "rb") as fp:   # Unpickling
                  batch = pickle.load(fp)
                batch_x, batch_y = zip(*batch)

                _, c = sess.run([train_op, loss], feed_dict={features_input: batch_x, labels_input: batch_y})
                epoch_loss += c
                i+=1
            #print no. of epochs and loss
            print('Epoch', epoch+1, 'completed out of', num_epochs,', loss:',epoch_loss) 

    all_loss.sort()
    print('Lowest loss: ' + str(all_loss[:1]))
    os.chdir(ckpt_file_dir)
    saver.save(sess, output_name)  ## double space into for loop to get each epoch
    
tf.app.run(main)   

I1019 17:55:10.762954 140493836892032 saver.py:1284] Restoring parameters from vggish_model.ckpt


mini batch0
Epoch 1 completed out of 10 , loss: 0.8381498456001282
mini batch0
Epoch 2 completed out of 10 , loss: 0.6784347295761108
mini batch0
Epoch 3 completed out of 10 , loss: 0.5777193307876587
mini batch0
Epoch 4 completed out of 10 , loss: 0.5069913268089294
mini batch0
Epoch 5 completed out of 10 , loss: 0.4453549087047577
mini batch0
Epoch 6 completed out of 10 , loss: 0.38863372802734375
mini batch0
Epoch 7 completed out of 10 , loss: 0.33752354979515076
mini batch0
Epoch 8 completed out of 10 , loss: 0.29111260175704956
mini batch0
Epoch 9 completed out of 10 , loss: 0.2465088814496994
mini batch0
Epoch 10 completed out of 10 , loss: 0.20115140080451965
Lowest loss: []


SystemExit: ignored

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
