<a href="https://colab.research.google.com/github/phumipatc/CU_Submissions/blob/master/AI/Sound_to_Dementia.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Dataset: DementiaBank**
https://dementia.talkbank.org/


English Pitt Corpus: Cookie theft task
* https://dementia.talkbank.org/access/English/Pitt.html
* Dementia vs control



Preparing environment

In [None]:
%pip install openl3
%pip install pandas
%pip install scikit-learn
%pip install matplotlib
%pip install seaborn

In [1]:
def tryImportColab():
  try:
    import google.colab
    return True
  except ImportError:
    return False

runningInColab = tryImportColab()
runningInColab

False

In [2]:
if runningInColab:
  from google.colab import drive
  drive.mount('/content/drive')

# **Audio Embedding**
# OpenL3
* https://openl3.readthedocs.io/en/latest/tutorial.html
* http://www.justinsalamon.com/uploads/4/3/9/4/4394963/cramer_looklistenlearnmore_icassp_2019.pdf

# AudioSet
* https://github.com/tensorflow/models/tree/master/research/audioset
* Use VGGish
* Or, https://tfhub.dev/google/vggish/1

# Other embedding models
* https://tfhub.dev/s?module-type=audio-embedding


In [3]:
%pwd

'd:\\Code\\CU\\AI'

In [4]:
import pandas as pd

address_sample_csv_path = 'sound-dementia-data/ADReSS-M-train/'
if runningInColab:
  address_sample_csv_path = 'drive/MyDrive/' + address_sample_csv_path

# Load the ground-truth table, then drop incomplete and duplicated rows.
# .copy() guarantees the cleaned frame is independent of the original.
address_sample_original_df = pd.read_csv(address_sample_csv_path + "training-groundtruth.csv")
address_sample_clean_df = address_sample_original_df.dropna().drop_duplicates().copy()

# Cleaning process
## In the gender column, map "Female" to 0 and every other value to 1.
## Note: the match is exact, so labels spelled differently (e.g. lowercase)
## would all become 1, leaving a constant column with undefined (NaN) correlations.
address_sample_clean_df['gender'] = address_sample_clean_df['gender'].apply(lambda x: 0 if x == "Female" else 1)
## In the dx column, map "Control" to 0 and every other value (e.g. "ProbableAD") to 1
address_sample_clean_df['dx'] = address_sample_clean_df['dx'].apply(lambda x: 0 if x == "Control" else 1)

# Keep only the numeric columns
numeric_df = address_sample_clean_df.select_dtypes(include=['float64', 'int64'])

# Calculate the correlation matrix
numeric_df.corr()


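|  | age | gender | educ | dx | mmse |
|---|---|---|---|---|---|
| age | 1.0 | NaN | -0.229109 | 0.26158 | -0.27389 |
| gender | NaN | NaN | NaN | NaN | NaN |
| educ | -0.229109 | NaN | 1.0 | -0.350735 | 0.34784 |
| dx | 0.26158 | NaN | -0.350735 | 1.0 | -0.797276 |
| mmse | -0.27389 | NaN | 0.34784 | -0.797276 | 1.0 |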
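
The all-NaN (blank) `gender` row and column are what pandas produces when a column has zero variance: Pearson correlation divides by the standard deviation, so a constant column has no defined correlation. The sketch below reproduces the effect on a made-up toy frame. The values and the lowercase labels are assumptions chosen for illustration, not taken from the dataset; it only shows one plausible way an exact-match mapping could collapse a column to a single value.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values): gender labels are lowercase on purpose
toy = pd.DataFrame({
    "age": [60, 72, 68, 75],
    "gender": ["female", "male", "female", "male"],
    "mmse": [29, 18, 27, 20],
})

# Exact-match mapping: nothing equals "Female", so every row becomes 1
exact = toy["gender"].apply(lambda x: 0 if x == "Female" else 1)

# Case-insensitive mapping keeps both classes
robust = toy["gender"].str.strip().str.lower().map({"female": 0, "male": 1})

corr_exact = toy.assign(gender=exact).corr()
corr_robust = toy.assign(gender=robust).corr()

print(exact.nunique(), robust.nunique())
print(corr_exact.loc["age", "gender"], corr_robust.loc["age", "gender"])
```

Checking `address_sample_original_df['gender'].unique()` before mapping would show which spellings the CSV actually uses.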


In [5]:
import openl3
import os
import soundfile as sf

address_sample_path = 'sound-dementia-data/ADReSS-M-train/train/'
if runningInColab:
  address_sample_path = 'drive/MyDrive/' + address_sample_path

# os.listdir returns entries in arbitrary order, so sort them to get a
# reproducible file order.
address_sample_list = [address_sample_path + fName for fName in sorted(os.listdir(address_sample_path))]
address_sample_list

['sound-dementia-data/ADReSS-M-train/train/adrso002.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso003.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso004.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso005.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso006.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso007.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso008.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso009.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso010.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso011.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso012.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso013.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso014.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso015.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso016.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso017.mp3',
 'sound-dementia-data/ADReSS-M-train/train/adrso018.mp3',
 'sound-dement

Before embedding the audio, every recording needs to have the same length, so we zero-pad each file to the duration of the longest one.

In [6]:
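import numpy as np

def pad_audio(audio, sr, duration):
  """Zero-pad a 1-D (mono) signal at the end so that it lasts `duration` seconds.

  Signals that are already long enough are returned unchanged.
  """
  padding_samples = int(duration * sr) - len(audio)
  if padding_samples <= 0:
    return audio
  return np.pad(audio, (0, padding_samples), 'constant')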
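
As a small worked example of the padding arithmetic, the sketch below pads two synthetic clips to the length of the longer one. The helper `pad_to_duration` is a hypothetical stand-in that mirrors the padding function above, and the sample rate of 4 Hz is deliberately unrealistic so the arrays stay readable.

```python
import numpy as np

def pad_to_duration(audio, sr, duration):
    """Zero-pad a mono signal at the end up to `duration` seconds (hypothetical helper)."""
    missing = int(duration * sr) - len(audio)
    return audio if missing <= 0 else np.pad(audio, (0, missing), "constant")

sr = 4                     # tiny made-up sample rate
short_clip = np.ones(6)    # 6 samples at 4 Hz = 1.5 s
long_clip = np.ones(10)    # 10 samples at 4 Hz = 2.5 s
max_duration = max(len(short_clip), len(long_clip)) / sr

padded = [pad_to_duration(c, sr, max_duration) for c in (short_clip, long_clip)]
print([len(p) for p in padded])
print(padded[0])
```

Note that the appended zeros are embedded along with the speech, so short recordings contribute long stretches of silence to their features.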

In [7]:
max_duration = 0
for sample in address_sample_list:
  audio, sr = sf.read(sample)
  max_duration = max(max_duration, len(audio) / sr)
max_duration

268.4602721088435

In [8]:
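result = []
for sample in address_sample_list:
  audio, sr = sf.read(sample)
  # Downmix multi-channel recordings to mono by averaging the channels
  if len(audio.shape) > 1:
    audio = audio.mean(axis=1)
  audio = pad_audio(audio, sr, max_duration)
  # embedding: one row per analysis frame; timestamps: the time associated with each frame
  embedding, timestamps = openl3.get_audio_embedding(audio, sr)
  result.append([embedding, timestamps])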
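
Each entry of `result` holds a 2-D embedding with one row per analysis frame, while the classifiers used later expect one fixed-length vector per recording. One common way to bridge the two is to average over the frame axis. The sketch below uses random arrays in place of real OpenL3 output; the shapes (5 frames, 8 dimensions) and the helper name `pool_embedding` are made up for illustration, and this is an alternative to consider rather than what the notebook does.

```python
import numpy as np

def pool_embedding(embedding):
    """Average a (frames, dims) embedding over time into a single (dims,) vector."""
    return np.asarray(embedding).mean(axis=0)

rng = np.random.default_rng(0)
# Stand-ins for per-recording embeddings: 3 recordings, 5 frames, 8 dims each
fake_result = [rng.normal(size=(5, 8)) for _ in range(3)]

# One row per recording, one column per embedding dimension
X_pooled = np.stack([pool_embedding(e) for e in fake_result])
print(X_pooled.shape)
```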



In [None]:
result

# **Classification**
# Classics
* https://scikit-learn.org/stable/supervised_learning.html
* Logistic regression, Support Vector Classification, Decision Tree, Random Forest, Neural Net, AdaBoost, Naïve Bayes
* https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html
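
A rough sketch of how several of these classic classifiers could be compared under the same conditions, using 5-fold cross-validation. The data here is synthetic (`make_classification`), standing in for the embedding features and diagnosis labels, so the printed scores say nothing about the dementia task itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for (embedding, diagnosis) pairs
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=0)

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}

scores = {}
for name, model in models.items():
    # Mean accuracy over 5 folds (stratified by default for classifiers)
    scores[name] = cross_val_score(model, X_demo, y_demo, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

With a small dataset, cross-validated scores tend to be more stable than a single train/test split.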

# Classification heads
* https://www.isca-speech.org/archive/pdfs/interspeech_2021/gauder21_interspeech.pdf
* Neural networks - Conv1D (k=1), Conv1D (k=3), Global Average
* https://www.isca-speech.org/archive/pdfs/interspeech_2021/wang21ca_interspeech.pdf
* Neural networks - Conv - Conv1D - Softmax
* Others
* https://www.tensorflow.org/tutorials/images/transfer_learning#add_a_classification_head

**Classic - Logistic Regression**

In [None]:
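# Use logistic regression to predict the diagnosis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Build a 2-D feature matrix X: one row per sample, one column per feature

X = []
y = []
for i in range(len(result)):
  # result[i][0] is the (frames x features) embedding; only its first frame is used here
  X.append(result[i][0][0])
  # Assumes row i of the cleaned table belongs to the i-th file in address_sample_list
  y.append(address_sample_clean_df['dx'].iloc[i])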
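
Pairing features and labels by position only works if the file list and the table rows are in the same order and no rows were dropped during cleaning. A safer pattern is to look each label up by the recording's file name. The sketch below shows the idea on toy data; the column name `file_id` and the `dx` values are hypothetical, so the real identifier column of `training-groundtruth.csv` would need to be substituted.

```python
import os
import pandas as pd

# Toy ground-truth table; "file_id" is a hypothetical column name
labels_df = pd.DataFrame({
    "file_id": ["adrso002", "adrso003", "adrso005"],
    "dx": [1, 0, 1],
})

# Toy file list in a different order, including a file with no label
files = ["train/adrso005.mp3", "train/adrso003.mp3", "train/adrso004.mp3"]

# Series indexed by file id, so labels can be looked up by name
dx_by_id = labels_df.set_index("file_id")["dx"]

kept_files, y_aligned = [], []
for path in files:
    file_id = os.path.splitext(os.path.basename(path))[0]
    if file_id in dx_by_id.index:  # skip recordings without a label
        kept_files.append(path)
        y_aligned.append(int(dx_by_id[file_id]))

print(kept_files)
print(y_aligned)
```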

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(random_state=0).fit(X_train, y_train)

In [None]:
# Save the fitted model; the with-block closes the file handle afterwards
import pickle
with open('logistic_regression_model.sav', 'wb') as f:
  pickle.dump(clf, f)
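
For completeness, a minimal sketch of the round trip: saving a fitted model and loading it back for prediction. It trains on a tiny made-up dataset and writes to a temporary directory under the hypothetical file name `model_demo.sav`, so it does not touch the model file saved above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny made-up training set, only to have a fitted model to save
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X_toy, y_toy)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model_demo.sav")  # hypothetical file name
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions
same_predictions = np.array_equal(model.predict(X_toy), restored.predict(X_toy))
print(same_predictions)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that wrote them.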

In [None]:
y_pred = clf.predict(X_test)
accuracy_score(y_test, y_pred)

The confusion matrix is shown below.

In [None]:
# confusion matrix
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='g')
plt.xlabel('Predicted')
plt.ylabel('Truth')
plt.show()
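
Beyond the heatmap, the four cells of a binary confusion matrix can be turned into sensitivity and specificity, which are often more informative than accuracy for a screening task. The sketch below uses made-up labels rather than the notebook's `y_test` and `y_pred`.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 0 = Control, 1 = ProbableAD
y_true_demo = [0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 1, 1, 1, 0, 0]

# For binary labels [0, 1], ravel() yields tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1]).ravel()

sensitivity = tp / (tp + fn)  # share of ProbableAD cases that were detected
specificity = tn / (tn + fp)  # share of Control cases correctly identified
print(tn, fp, fn, tp)
print(round(sensitivity, 3), round(specificity, 3))
```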