# Phoneme Classifier Testing: Noisy
### Author: Cathal Ó Faoláin
### 15:51, 02/08/2024

The goal of this work is to understand how we can use predicted inner hair cell (IHC) potentials, such as those predicted by WavIHC, introduced in the paper "WaveNet-based approximation of a cochlear filtering and hair cell transduction model". Feature encoders designed to use these predicted IHC potentials are evaluated against other state-of-the-art feature encoders to understand how discriminating they are over a range of Signal-to-Noise Ratios (SNRs).

This notebook tests the feature encoders we shall evaluate. We have 12 feature encoders:

- Contrastive Predictive Coding (CPC)
- CPC-80
- Wav2vec2.0
- Wav2vec2.0-80
- Autoregressive Predictive Coding (APC)
- IHC CPC
- IHC CPC 80
- IHC Wav2vec2
- IHC Wav2vec2 80
- IHC Extract
- IHC Extract 512
- IHC Extract 2

The feature encoders CPC, Wav2vec2.0 and APC are based on the designs used in their respective papers. Any context encoders that try to model longer-term dependencies have been removed, so there are no transformers or Recurrent Neural Networks (RNNs). This lets us evaluate how discriminating the features themselves are.

IHC CPC and IHC Wav2vec2 are adapted feature encoders that take predicted IHC potentials as input rather than the signal alone. Each is inspired by its namesake model.

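The testing here is over an SNR range of -10 dB to 30 dB. There are two main classes of noise added: amplitude-modulated and steady. The types of additive noise, matching the `noise_types` list set below, are:
- White Noise (Steady)
- Air conditioner
- Dog bark
- Street music
- Children playing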

This should give us a good indication of how well our new features are able to separate speech information from a noisy environment or background. 
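As a rough illustration of what testing at a given SNR involves, the sketch below scales a noise signal so that the speech-to-noise power ratio matches a target value in dB before adding it to the speech. This is a generic approach and not necessarily how `TrainEvalFunctions.test_best` mixes noise internally; the names `mix_at_snr` and `measured_snr_db` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the notebook's code.
    """
    # Tile or trim the noise so that it matches the speech length.
    if len(noise) < len(speech):
        repeats = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, repeats)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for the gain.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


def measured_snr_db(speech: np.ndarray, mixture: np.ndarray) -> float:
    """Recover the SNR of a mixture, given the clean speech it was built from."""
    residual = mixture - speech
    return float(10 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2)))
```

Under these assumptions, calling `mix_at_snr(speech, noise, snr)` for each value in the SNR list would produce one noisy copy of an utterance per condition, with lower SNR values corresponding to proportionally louder noise.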

## Imports

In [1]:
import torch
from torch import nn
import librosa
import time
from torch.nn.utils.rnn import pad_packed_sequence, pack_padded_sequence
from torch.utils.data import DataLoader, Dataset, IterableDataset
import torchaudio
import pandas as pd
import numpy as np
import sys
import yaml
import math
import scipy.signal as signal
from dataclasses import dataclass, field
from typing import List, Tuple
import torch.nn.functional as F
from pathlib import Path
import pickle

In [2]:
sys.path.append('./IHCApproxNH/')
from classes import WaveNet
from utils import utils
from Encoders import FeatureEncoders 
from TIMIT_utils import TIMIT_utils
from Train_TestFunctions import TrainEvalFunctions

## Set Global Variables

In [3]:
#Save location for the results
dir_results=Path('Results/Noisy')
dir_results.mkdir(parents=True, exist_ok=True)

#Noise variables
SNRs=[30, 25, 20, 15, 10, 5, 0, -5, -10]
noise_types=['White', 'air_conditioner', 'dog_bark', 'street_music', 'children_playing']

## Noise Test : Fully Trained Clean Models

This section tests models that were trained on the full clean dataset and evaluated during training on the validation dataset. These models have never been exposed to noise in their training, so this gives us their overall performance in unseen noise. Both original (signal or mel spectrogram input) models and IHC-input models are tested here over the range of SNRs provided and over all the different noise types.
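The cells in this section write one pickle per noise type and SNR, each holding a dictionary of accuracies keyed by model name. A minimal sketch of how those files could be gathered into a single table for comparison is shown here. It assumes the stored accuracies are plain floats, and `collect_accuracies` is a hypothetical helper rather than part of the project code.

```python
import pickle
from pathlib import Path

import pandas as pd


def collect_accuracies(root, noise, snrs, filename="original_models.pkl"):
    """Return a DataFrame with one row per result key and one column per SNR.

    Hypothetical helper: expects files at <root>/<noise>/<snr>/<filename>,
    each containing a pickled dict mapping result keys to accuracies.
    """
    columns = {}
    for snr in snrs:
        path = Path(root) / noise / str(snr) / filename
        if not path.is_file():
            continue  # this SNR has not been tested yet
        with open(path, "rb") as f:
            columns[snr] = pickle.load(f)
    # A dict of dicts becomes columns (outer keys) by rows (inner keys).
    return pd.DataFrame(columns)
```

For example, `collect_accuracies("Results/Noisy", "White", SNRs)` would give a table with keys such as `Whisper-Clean` as rows and the tested SNRs as columns, which is a convenient shape for plotting accuracy against SNR.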

### Original Models Testing

In [4]:
#original_models=[ "MelSimple_MLP", "MelSimple", "Wav2vec2", "Wav2vec2_80", "CPC_80", "CPC",  "SIG_Extract", "SIG_Extract_512", "SIG_Extract_2.0", "Whisper", "Whisper_80", "SIG_Extract_3.0"]

original_models=["Whisper", "Whisper_80", "SIG_Extract_3.0"]

for noise in noise_types:

    dir_snr = Path("./Results/Noisy/{}".format(noise))
    dir_snr.mkdir(parents=True, exist_ok=True)
    print("Making Directory: %s" % dir_snr)
    
    for snr in SNRs:
        dir_snr = Path("./Results/Noisy/{}/{}".format(noise, snr))
        dir_snr.mkdir(parents=True, exist_ok=True)
        print("Making Directory: %s" % dir_snr)
        
        test_accuracies={}

        #Reload any old results so that we can continue testing if required
        test_location='Results/Noisy/{}/{}/original_models.pkl'.format(noise, snr)
        if(Path(test_location).is_file()):
            with open(test_location, 'rb') as f:
                test_accuracies = pickle.load(f)
            
        for model in original_models:
            print("=============================")
            print("Starting Noisy Testing for: %s for %s noise at an SNR of %d" %(model, noise, snr))
            
            #Test the model on the noisy dataset
            #The elapsed time is stored as test_time so the imported time module is not shadowed
            test_accu, test_loss, unique_phonemes, test_time=TrainEvalFunctions.test_best(model, distributed=False, noise=True, noise_type=noise, SNR=snr)

            test_accuracies["{}-Clean".format(model)]=test_accu

            print("==============================")
            print("")
            print("")

        with open(test_location, 'wb') as f:
            pickle.dump(test_accuracies, f)

    


Making Directory: Results/Noisy/White
Making Directory: Results/Noisy/White/30
Starting Noisy Testing for: Whisper for White noise at an SNR of 30
> Initialising model: Whisper
**************************************************************
Testing Best Model found at: Model Checkpoints/Whisper Checkpoints/best_Whisper_checkpoint.pth.tar
**************************************************************
> Initialising model: Whisper
> Setting: Test Mode
Loading model checkpoint
+---------------------------------------------+
Using Test Data. Test Accuracy :
Testing For: | Batchsize 4 | Steps: 40
Evaluation accuracy:  0.5575, Phoneme Error Rate:  0.4425, Loss :  0.8816, Time:  16.0137s, Time per sample:  0.4003s
+---------------------------------------------+


Starting Noisy Testing for: Whisper_80 for White noise at an SNR of 30
> Initialising model: Whisper_80
**************************************************************
Testing Best Model found at: Model Checkpoints/Whisper_80 Checkpoin

### IHC Models Testing

Only the models that performed well on k-Fold stability testing are tested here, as we want to give our models a fair chance.

In [5]:
#IHC_models=["IHC_Cpc", "IHC_Wav2vec2", "IHC_Extract_512", "IHC_Extract", "IHC_Cpc_80", "IHC_Wav2vec2_80",   "IHC_Extract_2.0", "IHC_Extract_3.0"]

IHC_models=["IHC_Extract_3.0"]
            
noise_types=['White','air_conditioner', 'dog_bark', 'street_music', 'children_playing']
    
for noise in noise_types:

    dir_snr = Path("./Results/Noisy/{}".format(noise))
    dir_snr.mkdir(parents=True, exist_ok=True)
    print("Making Directory: %s" % dir_snr)
    
    for snr in SNRs:
        dir_snr = Path("./Results/Noisy/{}/{}".format(noise, snr))
        dir_snr.mkdir(parents=True, exist_ok=True)
        print("Making Directory: %s" % dir_snr)
        
        test_accuracies={}

        #Reload any old results so that we can continue testing if required
        test_location='Results/Noisy/{}/{}/IHC_models.pkl'.format(noise, snr)
        if(Path(test_location).is_file()):
            with open(test_location, 'rb') as f:
                test_accuracies = pickle.load(f)
            
        for model in IHC_models:
            print("=============================")
            print("Starting Noisy Testing for: %s for %s noise at an SNR of %d" %(model, noise, snr))
            
            #Test the model on the noisy dataset
            #The elapsed time is stored as test_time so the imported time module is not shadowed
            test_accu, test_loss, unique_phonemes, test_time=TrainEvalFunctions.test_best(model, distributed=False, noise=True, noise_type=noise, SNR=snr)

            test_accuracies["{}-Clean".format(model)]=test_accu

            print("==============================")
            print("")
            print("")

        with open(test_location, 'wb') as f:
            pickle.dump(test_accuracies, f)


Making Directory: Results/Noisy/White
Making Directory: Results/Noisy/White/30
Starting Noisy Testing for: IHC_Extract_3.0 for White noise at an SNR of 30
> Initialising model: IHC_Extract_3.0
**************************************************************
Testing Best Model found at: Model Checkpoints/IHC_Extract_3.0 Checkpoints/best_IHC_Extract_3.0_checkpoint.pth.tar
**************************************************************
> Initialising model: IHC_Extract_3.0
> Setting: Test Mode
Loading model checkpoint
+---------------------------------------------+
Using Test Data. Test Accuracy :
Testing For: | Batchsize 4 | Steps: 237
Evaluation accuracy:  0.5018, Phoneme Error Rate:  0.4982, Loss :  1.3331, Time:  61.0056s, Time per sample:  0.2574s
+---------------------------------------------+


Making Directory: Results/Noisy/White/25
Starting Noisy Testing for: IHC_Extract_3.0 for White noise at an SNR of 25
> Initialising model: IHC_Extract_3.0
**************************************

## K-Fold Model Stability Testing in Noise

Because robustness is the main concern in noise, this section tests the models saved at each of the k folds, each of which was evaluated on its held-out fold. Like the fully trained models, these were trained and evaluated only on clean data. This tests the overall variability of the models and how differently they learn to generalise. All IHC models are tested here, not just the ones that performed well during the k-fold stability testing. This should confirm that leaving them out was a sound idea.
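One way to quantify this variability is to take the mean and standard deviation of the per-fold accuracies for each model. The sketch below assumes a results dictionary keyed as `<model>-kFold-<k>`, which is the key format the cells in this section write, and assumes the stored accuracies can be converted to floats. `summarise_folds` is a hypothetical helper, not part of the project code.

```python
import re
from collections import defaultdict
from statistics import mean, stdev


def summarise_folds(test_accuracies):
    """Group '<model>-kFold-<k>' entries by model and return (mean, std) per model.

    Hypothetical helper for illustration. Keys that do not follow the
    k-fold naming pattern are ignored.
    """
    per_model = defaultdict(list)
    for key, accuracy in test_accuracies.items():
        match = re.fullmatch(r"(.+)-kFold-(\d+)", key)
        if match:
            per_model[match.group(1)].append(float(accuracy))
    return {
        # The sample standard deviation needs at least two folds.
        model: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for model, values in per_model.items()
    }
```

A small standard deviation across folds at a given SNR would suggest that a model's noise robustness does not depend strongly on which training split it saw.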

### Original Models Testing

In [6]:
noise_types=["White", 'air_conditioner', 'dog_bark', 'street_music', 'children_playing']

In [7]:
#original_models=[ "MelSimple_MLP", "MelSimple", "Wav2vec2", "Wav2vec2_80", "CPC_80", "CPC",  "SIG_Extract", "SIG_Extract_512", "SIG_Extract_2.0", "Whisper", "Whisper_80", "SIG_Extract_3.0"]

original_models=["Whisper", "Whisper_80", "SIG_Extract_3.0"]

    
for noise in noise_types:
    dir_kfold=Path("./Results/Noisy/k-Fold Stability/")
    dir_kfold.mkdir(parents=True, exist_ok=True)
    print("Making Directory: %s" % dir_kfold)
    dir_snr = Path("./Results/Noisy/k-Fold Stability/{}".format(noise))
    dir_snr.mkdir(parents=True, exist_ok=True)
    print("Making Directory: %s" % dir_snr)
    
    for snr in SNRs:
        dir_snr = Path("./Results/Noisy/k-Fold Stability/{}/{}".format(noise, snr))
        dir_snr.mkdir(parents=True, exist_ok=True)
        print("Making Directory: %s" % dir_snr)
        

        test_accuracies={}
        
        #Reload any old results so that we can continue testing if required
        test_location='Results/Noisy/k-Fold Stability/{}/{}/original_models.pkl'.format(noise, snr)
        if(Path(test_location).is_file()):
            with open(test_location, 'rb') as f:
                test_accuracies = pickle.load(f)
            
        for model in original_models:
            print("=============================")
            print("Starting Noisy k-Fold Model Testing for: %s for %s noise at an SNR of %d" %(model, noise, snr))
            for k in range(5):
                
            
                #Test the model on the noisy dataset
                #The elapsed time is stored as test_time so the imported time module is not shadowed
                test_accu, test_loss, unique_phonemes, test_time=TrainEvalFunctions.test_best(model, distributed=False, Kfold_eval=True, kInt=k, noise=True, noise_type=noise, SNR=snr)

                test_accuracies["{}-kFold-{}".format(model, k+1)]=test_accu

            print("==============================")
            print("")
            print("")

        #Saved for every SNR
        with open(test_location, 'wb') as f:
            pickle.dump(test_accuracies, f)


Making Directory: Results/Noisy/k-Fold Stability
Making Directory: Results/Noisy/k-Fold Stability/White
Making Directory: Results/Noisy/k-Fold Stability/White/30
Starting Noisy k-Fold Model Testing for: Whisper for White noise at an SNR of 30
> Initialising model: Whisper
**************************************************************
Testing Best Model found at: Model Checkpoints/Whisper Checkpoints/kFold Eval 0/best_Whisper_checkpoint.pth.tar
**************************************************************
> Initialising model: Whisper
> Setting: Test Mode
Loading model checkpoint
+---------------------------------------------+
Using Test Data. Test Accuracy :
Testing For: | Batchsize 4 | Steps: 40
Evaluation accuracy:  0.5453, Phoneme Error Rate:  0.4547, Loss :  0.9598, Time:  15.9626s, Time per sample:  0.3991s
+---------------------------------------------+
> Initialising model: Whisper
**************************************************************
Testing Best Model found at: Model

### IHC Models Testing

In [9]:
#IHC_models=["IHC_Cpc", "IHC_Wav2vec2", "IHC_Extract_512", "IHC_Extract", "IHC_Cpc_80", "IHC_Wav2vec2_80",   "IHC_Extract_2.0", "IHC_Extract_3.0"]

IHC_models=["IHC_Extract_3.0"]
    
for noise in noise_types:
    dir_kfold=Path("./Results/Noisy/k-Fold Stability/")
    dir_kfold.mkdir(parents=True, exist_ok=True)
    print("Making Directory: %s" % dir_kfold)
    dir_snr = Path("./Results/Noisy/k-Fold Stability/{}".format(noise))
    dir_snr.mkdir(parents=True, exist_ok=True)
    print("Making Directory: %s" % dir_snr)
    
    for snr in SNRs:
        dir_snr = Path("./Results/Noisy/k-Fold Stability/{}/{}".format(noise, snr))
        dir_snr.mkdir(parents=True, exist_ok=True)
        print("Making Directory: %s" % dir_snr)
        

        IHC_test_accuracies={}
        
        #Reload any old results so that we can continue testing if required
        test_location='Results/Noisy/k-Fold Stability/{}/{}/IHC_models.pkl'.format(noise, snr)

        if(Path(test_location).is_file()):
            with open(test_location, 'rb') as f:
                IHC_test_accuracies = pickle.load(f)
            
        for model in IHC_models:
            print("=============================")
            print("Starting Noisy k-Fold Model Testing for: %s for %s noise at an SNR of %d" %(model, noise, snr))
            
            for k in range(5):
                
            
                #Test the model on the noisy dataset
                #The elapsed time is stored as test_time so the imported time module is not shadowed
                test_accu, test_loss, unique_phonemes, test_time=TrainEvalFunctions.test_best(model, distributed=False, Kfold_eval=True, kInt=k, noise=True, noise_type=noise, SNR=snr)

                IHC_test_accuracies["{}-kFold-{}".format(model, k+1)]=test_accu

            print("==============================")
            print("")
            print("")

        #Saved for every SNR
        with open(test_location, 'wb') as f:
            pickle.dump(IHC_test_accuracies, f)

Making Directory: Results/Noisy/k-Fold Stability
Making Directory: Results/Noisy/k-Fold Stability/White
Making Directory: Results/Noisy/k-Fold Stability/White/30
Starting Noisy k-Fold Model Testing for: IHC_Extract_3.0 for White noise at an SNR of 30
> Initialising model: IHC_Extract_3.0
**************************************************************
Testing Best Model found at: Model Checkpoints/IHC_Extract_3.0 Checkpoints/kFold Eval 0/best_IHC_Extract_3.0_checkpoint.pth.tar
**************************************************************
> Initialising model: IHC_Extract_3.0
> Setting: Test Mode
Loading model checkpoint
+---------------------------------------------+
Using Test Data. Test Accuracy :
Testing For: | Batchsize 4 | Steps: 237


KeyboardInterrupt: 