# Description

This notebook demonstrates ABR hearing threshold detection and evaluation using a sound level regression (SLR) method.<br>
The method is calibrated here on the [ING](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000194) data set that was used for training the convolutional neural networks.
The calibrated model is then applied to both ING and GMC data.

The notebook shows how to use the ABR_Threshold_Detector module to
+ train/calibrate a threshold detector on the training data set provided by [Ingham et al.](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000194) (ING data) and estimate the thresholds
+ save the trained model
+ load the model
+ apply the trained threshold estimator to data from the [German Mouse Clinic](https://www.mouseclinic.de/) (GMC data) and to data provided by [Ingham et al.](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000194)
+ evaluate the estimated thresholds by comparing them with a ground truth (manually assessed thresholds)

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline 

In [None]:
# Import from IPython.display; importing display from IPython.core.display is deprecated
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

# Load libraries

In [None]:
import os

import pandas as pd
import numpy as np

from ABR_ThresholdFinder_SLR import ABR_Threshold_Detector_multi_stimulus
from ABR_ThresholdFinder_SLR.evaluations import evaluate_classification_against_ground_truth, plot_evaluation_curve_for_specific_stimulus

# Definitions

In [None]:
"""Set the path to the saving location of the models"""
path2models = '../models'
"""Set the path to the data files, for example '../data'"""
path2data = ''

"""Name the columns containing the ABR wave time series data"""
timeseries_columns = ['t%d' %i for i in range(1000)] 

# Load GMC data

In [None]:
GMC_data = pd.read_csv(os.path.join(path2data, 'GMC', 'GMC_abr_curves.csv'), low_memory=False)

# Load ING data

In [None]:
ING_data = pd.read_csv(os.path.join(path2data, 'ING', 'ING_abr_curves.csv'))

# Create the ING calibration data set

In [None]:
# The file can only be loaded with allow_pickle=True;
# pass the flag directly instead of temporarily patching np.load
train_mice = np.load(os.path.join(path2data, 'ING', 'ING_train_mice.npy'), allow_pickle=True)
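
Object arrays saved with `np.save` are pickled, which is why loading such a file requires `allow_pickle=True`. The following is a minimal sketch of that behaviour; the array contents and the file name are made up for illustration, and the actual structure of `ING_train_mice.npy` is an assumption.

```python
import os
import tempfile

import numpy as np

# Hypothetical mouse IDs stored as an object array (assumption: the real file is similar)
example_ids = np.array(['mouse_a', 'mouse_b', 'mouse_c'], dtype=object)

with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, 'example_train_mice.npy')
    np.save(example_path, example_ids)  # object arrays are pickled on save

    # With the default allow_pickle=False, np.load refuses to unpickle object arrays
    try:
        np.load(example_path)
        load_failed = False
    except ValueError:
        load_failed = True

    # Passing the flag per call leaves np.load itself unchanged
    loaded_ids = np.load(example_path, allow_pickle=True)
```

Only enable `allow_pickle` for files from a trusted source, since unpickling can execute arbitrary code.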

In [None]:
"""
Setting the calibration data set that corresponds to the ING data set 
with which the neural networks were trained
"""
dataset1 = ING_data[ING_data.mouse_id.isin(train_mice)][['mouse_id', 'frequency', 'sound_level', 'threshold'] 
                     + timeseries_columns]
dataset1.head()
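
The row filter and column selection used for **dataset1** can be tried on a toy frame. This is only a sketch: the values, the two time series columns (instead of 1000) and the list `toy_train_mice` are invented for illustration.

```python
import pandas as pd

# Toy stand-in for ING_data with two time series columns (assumption)
toy_columns = ['t%d' % i for i in range(2)]
toy_data = pd.DataFrame({
    'mouse_id': [1, 1, 2, 3],
    'frequency': [6000, 6000, 6000, 12000],
    'sound_level': [20, 40, 20, 20],
    'threshold': [30, 30, 25, 50],
    't0': [0.1, 0.2, 0.3, 0.4],
    't1': [0.0, 0.1, 0.2, 0.3],
})
toy_train_mice = [1, 3]  # hypothetical training mouse IDs

# Keep only the rows of training mice, then select metadata and time series columns
toy_subset = toy_data[toy_data.mouse_id.isin(toy_train_mice)][
    ['mouse_id', 'frequency', 'sound_level', 'threshold'] + toy_columns
]
```

Mouse 2 is not in `toy_train_mice`, so its row is dropped, while both sound levels of mouse 1 are kept.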

# Train threshold detector on the ING calibration data set
Train threshold detector on **dataset1** and save the trained model.

In [None]:
"""Initialize the threshold detector"""
threshold_detector = ABR_Threshold_Detector_multi_stimulus(max_deg = 4,
                                                           threshold_level = 4.0,
                                                           karwgs_random_forest = None,
                                                           number_of_workers = 10)

"""
Train the threshold detector on dataset1 and compute the thresholds for dataset1,
the parameters given to the function are a pandas data frame containing the data and the 
names of the columns of the data frame
"""
thresholds1 = threshold_detector.fit_and_predict_data_set(mouse_id = 'mouse_id',
                                                          sound_level = 'sound_level',
                                                          frequency = 'frequency',
                                                          time_series = timeseries_columns, 
                                                          data = dataset1)

"""Save the trained threshold detector"""
threshold_detector.save_model_to_file(file_name = os.path.join(path2models, 'INGcalibrated_threshold_det.pkl'))

# Threshold detection on GMC data

Load threshold detector from file, apply it to the [German Mouse Clinic](https://www.mouseclinic.de/) data and evaluate the results.

In [None]:
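# Copy the selection so that adding a column below does not trigger a SettingWithCopyWarning
GMC_data2 = GMC_data[['mouse_id', 'frequency', 'sound_level', 'threshold'] + timeseries_columns].copy()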
GMC_data2.head()

In [None]:
"""Load threshold detector from file"""
threshold_detector = ABR_Threshold_Detector_multi_stimulus(file_name = os.path.join(path2models, 'INGcalibrated_threshold_det.pkl'))

"""
Use the loaded threshold detector to predict thresholds on GMC data,
the parameters given to the function are a pandas data frame containing the data and the 
names of the columns of the data frame
"""
GMC_thresholds2 = threshold_detector.predict_new(mouse_id = 'mouse_id',
                                                 sound_level = 'sound_level',
                                                 frequency = 'frequency',
                                                 time_series = timeseries_columns, 
                                                 data = GMC_data2)

"""Append the threshold values to the result data"""
GMC_data2['slr_estimated_thr'] = GMC_thresholds2 

## Save estimations

In [None]:
GMC_data2save = GMC_data2[['mouse_id', 'frequency', 'threshold', 'slr_estimated_thr']].drop_duplicates()
GMC_data2save.to_csv('../reports/GMC_data_INGcalibrated_SLR_estimations.csv', index=False)
GMC_data2save.head()
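
The effect of `drop_duplicates()` in the cell above can be sketched on toy data. The sketch assumes that the estimated threshold is identical for all sound-level rows of one mouse and frequency, which the call above suggests but which is not verified here; all values are invented.

```python
import pandas as pd

# Toy results: one row per measured sound level, with the threshold columns repeated
toy_results = pd.DataFrame({
    'mouse_id': [1, 1, 1, 2, 2],
    'frequency': [6000, 6000, 6000, 6000, 6000],
    'sound_level': [20, 40, 60, 20, 40],
    'threshold': [30, 30, 30, 25, 25],
    'slr_estimated_thr': [35, 35, 35, 25, 25],
})

# Dropping sound_level and de-duplicating leaves one row per mouse and frequency
toy_summary = toy_results[
    ['mouse_id', 'frequency', 'threshold', 'slr_estimated_thr']
].drop_duplicates()
```

The first occurrence of each combination is kept, so the summary retains the original row labels of those first rows.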

## Evaluate thresholds by comparing them with a 'ground truth' (a human-set threshold in this case)

In [None]:
# 5dB buffer
evaluation = evaluate_classification_against_ground_truth(GMC_data2, 5, 
                                                          frequency = 'frequency',
                                                          mouse_id = 'mouse_id',
                                                          sound_level = 'sound_level',
                                                          threshold_estimated = 'slr_estimated_thr',
                                                          threshold_ground_truth = 'threshold')

evaluation

In [None]:
# 10dB buffer
evaluation = evaluate_classification_against_ground_truth(GMC_data2, 10, 
                                                          frequency = 'frequency',
                                                          mouse_id = 'mouse_id',
                                                          sound_level = 'sound_level',
                                                          threshold_estimated = 'slr_estimated_thr',
                                                          threshold_ground_truth = 'threshold')

evaluation
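
The internals of `evaluate_classification_against_ground_truth` are not shown in this notebook. As a rough intuition for what a dB buffer can mean, the sketch below counts an estimate as agreeing with the ground truth when it deviates by at most the buffer. The helper `fraction_within_buffer` and all values are hypothetical and are not the library's implementation.

```python
import pandas as pd


def fraction_within_buffer(summary, buffer_db,
                           estimated='slr_estimated_thr',
                           ground_truth='threshold'):
    """Hypothetical helper: share of rows whose estimate lies within +/- buffer_db of the ground truth."""
    deviation = (summary[estimated] - summary[ground_truth]).abs()
    return float((deviation <= buffer_db).mean())


# Invented thresholds in dB; absolute deviations are 5, 0, 15 and 10
toy_summary = pd.DataFrame({
    'threshold': [30, 25, 50, 40],
    'slr_estimated_thr': [35, 25, 65, 30],
})

within_5 = fraction_within_buffer(toy_summary, 5)
within_10 = fraction_within_buffer(toy_summary, 10)
```

Under this reading, widening the buffer from 5 dB to 10 dB can only keep or increase the share of agreeing estimates.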

# Threshold detection on ING data

Load threshold detector from file, apply it to the data set provided by [Ingham et al.](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000194) and evaluate the results.

In [None]:
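# Copy the selection so that adding a column below does not trigger a SettingWithCopyWarning
ING_data2 = ING_data[['mouse_id', 'frequency', 'sound_level', 'threshold'] + timeseries_columns].copy()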
ING_data2.head()

In [None]:
"""Load threshold detector from file"""
threshold_detector = ABR_Threshold_Detector_multi_stimulus(file_name = os.path.join(path2models, 'INGcalibrated_threshold_det.pkl'))

"""
Use loaded threshold detector to predict thresholds on ING data,
the parameters given to the function are a pandas data frame containing the data and the 
names of the columns of the data frame
"""
ING_thresholds2 = threshold_detector.predict_new(mouse_id = 'mouse_id',
                                                 sound_level = 'sound_level',
                                                 frequency = 'frequency',
                                                 time_series = timeseries_columns, 
                                                 data = ING_data2)

"""Append the threshold values to the result data"""
ING_data2['slr_estimated_thr'] = ING_thresholds2 

## Save estimations

In [None]:
ING_data2save = ING_data2[['mouse_id', 'frequency', 'threshold', 'slr_estimated_thr']].drop_duplicates()
ING_data2save.to_csv('../reports/ING_data_INGcalibrated_SLR_estimations.csv', index=False)
ING_data2save.head()

## Evaluate thresholds by comparing them with a 'ground truth' (a human-set threshold in this case)

In [None]:
# 5dB buffer
evaluation = evaluate_classification_against_ground_truth(ING_data2, 5, 
                                                          frequency = 'frequency',
                                                          mouse_id = 'mouse_id',
                                                          sound_level = 'sound_level',
                                                          threshold_estimated = 'slr_estimated_thr',
                                                          threshold_ground_truth = 'threshold')

evaluation

In [None]:
# 10dB buffer
evaluation = evaluate_classification_against_ground_truth(ING_data2, 10, 
                                                          frequency = 'frequency',
                                                          mouse_id = 'mouse_id',
                                                          sound_level = 'sound_level',
                                                          threshold_estimated = 'slr_estimated_thr',
                                                          threshold_ground_truth = 'threshold')

evaluation

