# **Handcrafted Features & Fully Connected NN**

The following code uses the [pyAudioAnalysis](https://github.com/tyiannak/pyAudioAnalysis) library to extract handcrafted features from audio data and trains a fully connected neural network on them. On the test set, the model achieves an accuracy of 74% and a macro F1 score of 72.5%.
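
For reference, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The NumPy sketch below shows the arithmetic on a toy confusion matrix. It assumes rows are true classes and columns are predictions, as in scikit-learn's `confusion_matrix`; the helper name `macro_f1_from_confusion` and the toy numbers are illustrative and not part of the notebook.

```python
import numpy as np

def macro_f1_from_confusion(conf_matrix):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    cm = np.asarray(conf_matrix, dtype=float)
    tp = np.diag(cm)              # correct predictions per class
    fp = cm.sum(axis=0) - tp      # predicted as class k but belonging elsewhere
    fn = cm.sum(axis=1) - tp      # belonging to class k but predicted elsewhere
    denom = 2 * tp + fp + fn
    # Classes with no true and no predicted samples get F1 = 0
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(denom), where=denom > 0)
    return f1.mean()

# Toy two-class example (made-up numbers)
toy = [[8, 2],
       [1, 9]]
score = macro_f1_from_confusion(toy)
```

Here class 0 has F1 = 16/19 and class 1 has F1 = 18/21, and the macro score is their plain average.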

In [1]:
# Importing the drive module from google.colab library
from google.colab import drive

# Mounting the Google Drive to the Colab environment
drive.mount('/content/drive')

project_path = '/content/drive/My Drive/GitHub/MarineMammalSoundClassification/'
%cd /content/drive/My Drive/GitHub/MarineMammalSoundClassification/

Mounted at /content/drive
/content/drive/My Drive/GitHub/MarineMammalSoundClassification


In [2]:
!pip install eyed3
!pip install pydub
# !pip install pyAudioAnalysis

Collecting eyed3
  Downloading eyed3-0.9.7-py3-none-any.whl (246 kB)
Collecting coverage[toml]<6.0.0,>=5.3.1 (from eyed3)
  Downloading coverage-5.5-cp310-cp310-manylinux1_x86_64.whl (238 kB)
Collecting deprecation<3.0.0,>=2.1.0 (from eyed3)
  Downloading deprecation-2.1.0-py2.py3-none-any.whl (11 kB)
Collecting filetype<2.0.0,>=1.0.7 (from eyed3)
  Downloading filetype-1.2.0-py2.py3-none-any.whl (19 kB)
Installing collected packages: filetype, deprecation, coverage, eyed3
Successfully installed coverage-5.5 deprecation-2.1.0 eyed3-0.9.7 filetype-1.2.0
Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Installing collected packages: pydub
Successfully installed pydub-0.25.1


In [3]:
import warnings
warnings.filterwarnings("ignore")

import os
from pyAudioAnalysis import MidTermFeatures as aF
import pandas as pd
import numpy as np

## **Extract Handcrafted Features**

We used the pyAudioAnalysis library, specifically the `MidTermFeatures.directory_feature_extraction` function, to calculate the handcrafted features for all directories in each set: train, validation, and test. The results were saved in separate CSV files for each set in the `handcrafted_features` directory to make it easier to retrieve them later.
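
As far as the pyAudioAnalysis documentation describes it, short-term features are computed on short frames, summarized by their mean and standard deviation over longer mid-term windows, and then averaged over the whole file to give one vector per recording. The NumPy sketch below illustrates that aggregation on a random stand-in matrix. It is not the library's implementation, and `aggregate_mid_term` is a hypothetical name.

```python
import numpy as np

def aggregate_mid_term(short_term, frames_per_window, frames_per_step):
    """Summarize a (n_features, n_frames) matrix into one vector of length 2 * n_features."""
    n_features, n_frames = short_term.shape
    stats = []
    for start in range(0, n_frames, frames_per_step):
        window = short_term[:, start:start + frames_per_window]
        # Mid-term statistics: mean and standard deviation of each short-term feature
        stats.append(np.concatenate([window.mean(axis=1), window.std(axis=1)]))
    # Long-term average across all mid-term windows of the file
    return np.mean(stats, axis=0)

rng = np.random.default_rng(0)
short_term = rng.normal(size=(4, 100))   # stand-in for real short-term features
# A 1 s mid-term window over a 0.05 s short-term step spans 20 frames
file_vector = aggregate_mid_term(short_term, frames_per_window=20, frames_per_step=20)
```

With 4 short-term features, the result has 8 entries: 4 averaged means followed by 4 averaged standard deviations.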


In [4]:
def create_csv_with_features(set_name):
    """
    Extracts audio features from directories of audio files and saves them to a CSV file.

    Args:
    set_name (str): The name of the dataset (subdirectory in 'data_split') to process.

    This function assumes the following directory structure:
    - data_split/
        - set_name/
            - class1/
            - class2/
            - ...

    For each class directory, the function extracts audio features using
    the directory_feature_extraction function and saves the results in a CSV file
    located in the 'handcrafted_features' directory.
    """

    set_dir = os.path.join("data_split", set_name)
    set_classes = os.listdir(set_dir)
    dirs = [os.path.join(set_dir, c) for c in set_classes]

    # Define parameters for feature extraction
    m_win, m_step, s_win, s_step = 1, 1, 0.1, 0.05

    features = []

    for d in dirs:
        # Extract feature matrix, file names, and feature names for the directory
        f, files, fn = aF.directory_feature_extraction(d, m_win, m_step, s_win, s_step)

        # Get the class name from the directory path
        class_name = os.path.basename(d)

        # Keep only the file name (the comprehension variable no longer shadows the feature matrix `f`)
        files = [os.path.basename(path) for path in files]

        # Extend feature list with class name and file name
        extended_f = [[class_name, b] + a.tolist() for a, b in zip(f, files)]
        features.extend(extended_f)

    col_names = ['class', 'file'] + fn
    features_df = pd.DataFrame(features, columns=col_names)

    # Save the DataFrame to a CSV file in the 'handcrafted_features' directory
    features_df.to_csv(os.path.join('handcrafted_features', f'{set_name}_features.csv'), sep='\t', header=True)

In [5]:
os.makedirs('handcrafted_features', exist_ok=True)

for set_name in ['train', 'val', 'test']:
    create_csv_with_features(set_name)

Analyzing file 1 of 46: data_split/train/AtlanticSpottedDolphin/61025001.wav
Analyzing file 2 of 46: data_split/train/AtlanticSpottedDolphin/61025002.wav
Analyzing file 3 of 46: data_split/train/AtlanticSpottedDolphin/61025003.wav
Analyzing file 4 of 46: data_split/train/AtlanticSpottedDolphin/61025004.wav
Analyzing file 5 of 46: data_split/train/AtlanticSpottedDolphin/61025006.wav
Analyzing file 6 of 46: data_split/train/AtlanticSpottedDolphin/61025007.wav
Analyzing file 7 of 46: data_split/train/AtlanticSpottedDolphin/61025008.wav
Analyzing file 8 of 46: data_split/train/AtlanticSpottedDolphin/61025009.wav
Analyzing file 9 of 46: data_split/train/AtlanticSpottedDolphin/6102500A.wav
Analyzing file 10 of 46: data_split/train/AtlanticSpottedDolphin/6102500B.wav
Analyzing file 11 of 46: data_split/train/AtlanticSpottedDolphin/6102500D.wav
Analyzing file 12 of 46: data_split/train/AtlanticSpottedDolphin/6102500E.wav
Analyzing file 13 of 46: data_split/train/AtlanticSpottedDolphin/6102500F

## **Fully Connected Neural Network**

A fully connected neural network is a stack of dense layers in which every neuron in one layer is connected to every neuron in the next layer. The [TensorFlow](https://www.tensorflow.org/) library was used to set up and train the model.
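
To make "fully connected" concrete: each dense layer computes a matrix product between its input and a weight matrix, adds a bias, and applies an activation. The NumPy sketch below shows one ReLU layer followed by a softmax output layer. The sizes and the `dense_forward` name are arbitrary choices for illustration, and this is not how Keras implements its layers internally.

```python
import numpy as np

def dense_forward(x, weights, bias, activation=None):
    """One fully connected layer: every input feature feeds every output unit."""
    z = x @ weights + bias
    if activation == "relu":
        return np.maximum(z, 0.0)
    if activation == "softmax":
        # Subtract the row maximum for numerical stability
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    return z

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5))   # 2 samples, 5 input features
hidden = dense_forward(x, rng.normal(size=(5, 8)), np.zeros(8), "relu")
probs = dense_forward(hidden, rng.normal(size=(8, 3)), np.zeros(3), "softmax")
```

Each row of `probs` sums to 1 and can be read as class probabilities for one sample.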

In [6]:
# Use the Keras API bundled with TensorFlow throughout, with each import listed once
import tensorflow as tf
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

In [7]:
# Functions for each step of the model training and evaluation process

def load_data(file_path, sep='\t'):
    """
    Loads data from a CSV file and prepares it for training the model.

    Args:
    file_path (str): The path to the CSV file containing the data.
    sep (str, optional): The delimiter of the CSV file. Defaults to '\t'.

    Returns:
    tuple: A tuple containing:
        - X (numpy.ndarray): The feature matrix.
        - y (numpy.ndarray): The one-hot encoded labels.
        - encoder (LabelEncoder): The label encoder fitted on the class labels.
    """
    df = pd.read_csv(file_path, sep=sep)
    X = np.array(df.iloc[:, 3:].values.tolist())
    encoder = LabelEncoder()
    y = encoder.fit_transform(df['class'])
    y = to_categorical(y, num_classes=28)
    return X, y, encoder

def create_model(initial_dimensionality, input_shape, num_classes, batch_norm=True, dropout=False):
    """
    Creates a neural network model with decreasing dimensionality.

    Args:
    initial_dimensionality (int): The number of units in the first dense layer.
    input_shape (int): The shape of the input data.
    num_classes (int): The number of output classes.
    batch_norm (bool, optional): Whether to include batch normalization layers. Defaults to True.
    dropout (bool, optional): Whether to include dropout layers. Defaults to False.

    Returns:
    tensorflow.keras.Sequential: The compiled Keras model.
    """
    model = Sequential()
    model.add(layers.Dense(initial_dimensionality, activation='relu', input_shape=(input_shape,)))
    if batch_norm:
        model.add(layers.BatchNormalization())
    if dropout:
        model.add(layers.Dropout(0.1))

    dim = initial_dimensionality
    while dim > 64:
        dim //= 2
        model.add(layers.Dense(dim, activation='relu'))
        if batch_norm:
            model.add(layers.BatchNormalization())
        if dropout:
            model.add(layers.Dropout(0.1))

    model.add(layers.Dense(num_classes, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

def train_model(model, trainX, trainY, valX, valY, epochs=100, batch_size=32, patience=5):
    """
    Trains the given model using the provided training and validation data.

    Args:
    model (tensorflow.keras.Model): The Keras model to be trained.
    trainX (numpy.ndarray): Training data features.
    trainY (numpy.ndarray): Training data labels.
    valX (numpy.ndarray): Validation data features.
    valY (numpy.ndarray): Validation data labels.
    epochs (int, optional): The number of epochs to train the model. Defaults to 100.
    batch_size (int, optional): The batch size to use during training. Defaults to 32.
    patience (int, optional): The number of epochs with no improvement after which training will be stopped. Defaults to 5.

    Returns:
    tensorflow.keras.callbacks.History: The history object that holds training and validation loss and accuracy values.
    """
    # Define callbacks
    early_stopping = EarlyStopping(monitor='val_accuracy', patience=patience, restore_best_weights=True)
    reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.1, patience=5, min_lr=1e-5)

    history = model.fit(trainX, trainY,
                        validation_data=(valX, valY),
                        epochs=epochs,
                        batch_size=batch_size,
                        callbacks=[early_stopping, reduce_lr])
    return history

def evaluate_model(model, testX, testY):
    """
    Evaluates the given model using the provided test data.

    Args:
    model (tensorflow.keras.Model): The Keras model to be evaluated.
    testX (numpy.ndarray): Test data features.
    testY (numpy.ndarray): Test data labels.

    Returns:
    tuple: A tuple containing:
        - conf_matrix (numpy.ndarray): The confusion matrix of the test predictions.
        - accuracy (float): The accuracy score of the test predictions.
        - f1 (float): The F1 score of the test predictions.
    """
    test_predictions = np.argmax(model.predict(testX), axis=1)
    test_true = np.argmax(testY, axis=1)
    conf_matrix = confusion_matrix(test_true, test_predictions)
    accuracy = accuracy_score(test_true, test_predictions)
    f1 = f1_score(test_true, test_predictions, average='macro')
    return conf_matrix, accuracy, f1


def save_model(model, file_path):
    """
    Saves the given model to the specified file path.

    Args:
    model (tensorflow.keras.Model): The Keras model to be saved.
    file_path (str): The path where the model will be saved.
    """
    model.save(file_path)

def load_model(file_path):
    """
    Loads a Keras model from the specified file path.

    Args:
    file_path (str): The path from where the model will be loaded.

    Returns:
    tensorflow.keras.Model: The loaded Keras model.
    """
    return tf.keras.models.load_model(file_path)
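
The `while dim > 64` loop in `create_model` halves the layer width until it reaches 64 (or drops below it), so the depth of the network follows from `initial_dimensionality`. The small sketch below reproduces only that width schedule, without building a model; `layer_widths` is a hypothetical helper and not part of the notebook.

```python
def layer_widths(initial_dimensionality, floor=64):
    """Hidden-layer widths produced by halving while the width exceeds `floor`."""
    widths = [initial_dimensionality]
    dim = initial_dimensionality
    while dim > floor:
        dim //= 2
        widths.append(dim)
    return widths

widths_2048 = layer_widths(2048)   # six hidden layers
widths_512 = layer_widths(512)     # four hidden layers
```

Starting from 2048 units yields hidden layers of 2048, 1024, 512, 256, 128 and 64 units; starting from 512 yields 512, 256, 128 and 64.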

In [8]:
import utils.tyiannak_utilities as ut
import IPython


trainX, trainY, _ = load_data('handcrafted_features/train_features.csv')
valX, valY, _ = load_data('handcrafted_features/val_features.csv')

def process_pipeline(model_name, initial_units=2048, batch_norm=True, dropout=False, epochs=100, batch_size=32, patience=5):
    """
    Performs the entire pipeline of processing, training, evaluating, and saving a neural network model.

    Args:
    model_name (str): The name to be used for saving the model and HTML output.
    initial_units (int, optional): The number of units in the first dense layer. Defaults to 2048.
    batch_norm (bool, optional): Whether to include batch normalization layers. Defaults to True.
    dropout (bool, optional): Whether to include dropout layers. Defaults to False.
    epochs (int, optional): The number of epochs to train the model. Defaults to 100.
    batch_size (int, optional): The batch size to use during training. Defaults to 32.
    patience (int, optional): The number of epochs with no improvement after which training will be stopped. Defaults to 5.
    """
    # Create and train the model
    model = create_model(initial_units, trainX.shape[1], 28, batch_norm, dropout)
    train_model(model, trainX, trainY, valX, valY, epochs, batch_size, patience)

    # Create directory for saving models if it doesn't exist
    if not os.path.exists('models/NN'):
        os.makedirs('models/NN')

    # Save the trained model
    model_path = os.path.join('models/NN', model_name+'.keras')
    save_model(model, model_path)

    # Load the saved model
    loaded_model = load_model(model_path)

    # Load test data and evaluate the model
    testX, testY, encoder = load_data('handcrafted_features/test_features.csv')
    conf_matrix, accuracy, f1 = evaluate_model(loaded_model, testX, testY)
    print(f'Test Accuracy: {accuracy:.2f}')
    print(f'Test F1 score: {f1:.2f}')

    # Generate and display HTML report for classification results
    output_html = os.path.join('models/NN', model_name+'.html')
    labels = list(encoder.classes_)
    ut.plotly_classification_results(conf_matrix, labels, output_html)

    return IPython.display.HTML(filename=output_html)
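
Note that `load_data` fits a new `LabelEncoder` on each split, so the class-to-index mapping agrees across train, validation, and test only if every split contains the same set of class names. A minimal sketch of one way to guard against a mismatch is to encode every split against the class list taken from the training set. It uses plain NumPy rather than `LabelEncoder`, and the function name and the three class names are made up for illustration.

```python
import numpy as np

def encode_with_classes(labels, classes):
    """One-hot encode labels against a fixed, sorted class list (hypothetical helper)."""
    classes = np.asarray(classes)
    labels = np.asarray(labels)
    idx = np.searchsorted(classes, labels)
    idx_clipped = np.clip(idx, 0, len(classes) - 1)
    if not np.array_equal(classes[idx_clipped], labels):
        raise ValueError("found a label that is not in the training classes")
    return np.eye(len(classes), dtype=np.float32)[idx]

train_labels = ["Beluga", "Narwhal", "Walrus", "Beluga"]   # made-up class names
test_labels = ["Walrus", "Beluga"]                         # "Narwhal" is absent here
classes = np.unique(train_labels)                          # sorted, like LabelEncoder.classes_
test_onehot = encode_with_classes(test_labels, classes)
```

With the shared class list, "Walrus" keeps index 2 in the test split. An encoder fitted on the test labels alone would assign it index 1.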



In [9]:
process_pipeline('2048_150_32_20', 2048, True, False, 150, 32, 20)


In [10]:
process_pipeline('1024_150_32_20', 1024, True, False, 150, 32, 20)


In [11]:
process_pipeline('512_150_32_20', 512, True, False, 150, 32, 20)


In [12]:
process_pipeline('2048_150_32_30', 2048, True, False, 150, 32, 30)


In [13]:
process_pipeline('2048_150_16_20', 2048, True, False, 150, 16, 20)


In [14]:
process_pipeline('2048_200_32_20', 2048, True, False, 200, 32, 20)


In [15]:
process_pipeline('2048_250_32_20', 2048, True, False, 250, 32, 20)


In [16]:
process_pipeline('512_200_32_20', 512, True, False, 200, 32, 20)


In [17]:
process_pipeline('512_200_32_30', 512, True, False, 200, 32, 30)


In [18]:
process_pipeline('512_200_16_20', 512, True, False, 200, 16, 20)

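
The run names above appear to follow the pattern `units_epochs_batch_patience`. Since `process_pipeline` writes `models/NN/<model_name>.keras` and `models/NN/<model_name>.html`, two runs that share a name overwrite each other's files. A small sketch of deriving the name from the arguments, assuming that naming pattern; `run_name` is a hypothetical helper.

```python
def run_name(initial_units, epochs, batch_size, patience):
    """Build a run name of the form units_epochs_batch_patience."""
    return f"{initial_units}_{epochs}_{batch_size}_{patience}"

name = run_name(2048, 250, 32, 20)
```

The result could then be passed as the first argument, for example `process_pipeline(name, 2048, True, False, 250, 32, 20)`.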