## Legacy Recurrent Neural Network project: .wav audio files are loaded from their paths, converted into MFCC vectors, and fed to an RNN

The following functions collect the file paths, filter them by emotion, and pass the files to the MFCC extraction.

- Files are gathered from two directories (the CREMA-D and SAVEE datasets).

- For the CREMA-D dataset, n random cases of each emotion are selected as input.

- For the SAVEE dataset, all cases are used as input.

In [1]:
import matplotlib.pyplot as plt
%matplotlib inline
import IPython.display as ipd
import librosa
import librosa.display
import os
import soundfile as sf
import json
import random
from sklearn import preprocessing
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import Activation
import datetime
from sklearn.metrics import confusion_matrix

#### Getting the paths

This function takes a directory and returns the full path of every audio file in it.

In [2]:
def get_files_from_path(directory):
    # Build the full path of every entry in the directory
    path_files = []
    for name in os.listdir(directory):
        path_files.append(os.path.join(directory, name))
    return path_files

Given a list of emotion codes, a list of audio paths, and a function that extracts the code from a path, this function returns only the paths for the requested emotions.

In [3]:
def extract_paths_for_emotions_keys(emotions_code, files_path, get_code):
    paths = []
    emotions_set = set(emotions_code)
    for code_file in files_path:
        if (get_code(code_file) in emotions_set):
            paths.append(code_file)
    return paths

This function loads the file and computes its MFCCs, averaged over time into a 40-element vector.

In [4]:
def features_extractor(file_name):
    audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
    mfccs_features = librosa.feature.mfcc(y=audio,sr=sample_rate,n_mfcc=40)
    mfccs_scaled_features = np.mean(mfccs_features.T, axis=0)
    return mfccs_scaled_features
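
As a rough illustration of the averaging step, the sketch below uses a small made-up matrix in place of real MFCCs. The values and the 3x4 shape are assumptions chosen for readability; with `n_mfcc=40`, `librosa.feature.mfcc` returns an array with 40 rows and one column per frame. Transposing and averaging over axis 0 collapses the time axis and leaves one value per coefficient.

```python
import numpy as np

# Toy stand-in for an MFCC matrix: 3 coefficients (rows) x 4 frames (columns).
# A real call with n_mfcc=40 would produce 40 rows instead of 3.
toy_mfccs = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 2.0, 2.0],
    [-1.0, 1.0, -1.0, 1.0],
])

# Same operation as in features_extractor: average each coefficient over time.
toy_scaled = np.mean(toy_mfccs.T, axis=0)

print(toy_scaled.shape)  # one mean per coefficient
print(toy_scaled)        # means are 2.5, 1.0 and 0.0
```

Because the mean is taken over all frames, clips of different lengths all map to a vector of the same size, at the cost of discarding how the coefficients evolve over time.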

This function saves the MFCCs, together with a reference to the source audio, in a JSON file.

In [5]:
def save_elements_in_json(examples_saved, name):
    json_files = []
    json_file = {}
    index = 0
    for file in examples_saved:
        json_file = {"id": index, "features":[str(elem) for elem in file[0]] ,"code":file[1], "path":file[2]}
        json_files.append(json_file)
        index += 1
    json_object = json.dumps(json_files)
    with open(f"{name}.json", "w") as outfile:
        outfile.write(json_object)

This function loads the MFCCs and their emotion codes from a JSON file (the stored audio path is not returned).

In [6]:
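def load_elements_from_json(name):
    # The context manager closes the file once it has been parsed
    with open(f"{name}.json") as f:
        data = json.load(f)
    examples = []
    for element in data:
        # Features were stored as strings, so convert them back to floats
        examples.append(([float(feature) for feature in element["features"]], element["code"]))
    return examples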
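
The save/load pair can be checked with a small round trip. The sketch below is self-contained and only mirrors the two functions; the record, the temporary directory, and the `toy_examples.json` file name are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical example record: (features, code, path), as built by get_features.
toy_examples = [([0.5, -1.25, 3.0], "ANG", "1001_DFA_ANG_XX.wav")]

# Save: features are stringified, mirroring save_elements_in_json.
records = [
    {"id": i, "features": [str(v) for v in feats], "code": code, "path": path}
    for i, (feats, code, path) in enumerate(toy_examples)
]
tmp_dir = tempfile.mkdtemp()
json_path = os.path.join(tmp_dir, "toy_examples.json")
with open(json_path, "w") as outfile:
    json.dump(records, outfile)

# Load: convert the strings back to floats, mirroring load_elements_from_json.
with open(json_path) as infile:
    loaded = [
        ([float(v) for v in rec["features"]], rec["code"])
        for rec in json.load(infile)
    ]

print(loaded)
```

Storing the features as strings is what makes NumPy floats serializable here; converting with `float(v)` before dumping would be an alternative that keeps the JSON numeric.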

This function returns a list of MFCC vectors computed from a list of paths. Files of 44 bytes or less are skipped: `st_size` is measured in bytes, and 44 bytes is the typical size of a bare WAV header, so such files carry no audio. At the end it prints the per-code counts accumulated in the dictionary.

In [7]:
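def get_features(datas_file, get_code, files_filters=None):
    # Avoid a shared mutable default: create a fresh counter dict per call
    if files_filters is None:
        files_filters = {}
    examples = []
    for data_file in datas_file:
        # Skip files of 44 bytes or less (st_size is in bytes)
        if os.stat(data_file).st_size > 44:
            code = get_code(data_file)
            feature = features_extractor(data_file)
            # .get avoids a KeyError for codes not pre-registered in the dict
            files_filters[code] = files_filters.get(code, 0) + 1
            examples.append((feature, code, data_file))
    print(files_filters)
    return examples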

Selects up to `quantity` examples with the given code from a shuffled list of examples and assigns them a new label.

In [105]:
def select_elements(examples, code, quantity, new_code):
    random.shuffle(examples)
    elements = []
    counter = 1
    for example in examples:
        if (counter > quantity):
            break
        if code == example[1]:
            elements.append((example[0],new_code))
            counter = counter + 1
    return elements

def get_code_crema_d(path):
    return path[107:110]

def get_code_savee(path):
    return path[96]
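
Slicing at fixed positions such as `path[107:110]` only works while the absolute path keeps exactly the same length. A possible alternative, sketched below, reads the code from the file name instead. The `_by_name` helpers are hypothetical; the CREMA-D pattern follows the `1001_DFA_ANG_XX.wav` name that appears in this notebook, while the SAVEE pattern (`DC_a01.wav`) is an assumption about how that dataset names its files.

```python
from pathlib import PureWindowsPath

def get_code_crema_d_by_name(path):
    # Hypothetical alternative: CREMA-D names look like 1001_DFA_ANG_XX.wav,
    # so the emotion code is the third underscore-separated field.
    return PureWindowsPath(path).stem.split("_")[2]

def get_code_savee_by_name(path):
    # Hypothetical alternative, ASSUMING SAVEE names such as DC_a01.wav:
    # take the part after the underscore and drop the trailing digits.
    return PureWindowsPath(path).stem.split("_")[1].rstrip("0123456789")

crema_code = get_code_crema_d_by_name("C:\\data\\AudioWAV\\1001_DFA_ANG_XX.wav")
savee_code = get_code_savee_by_name("C:\\data\\ALL\\DC_a01.wav")
print(crema_code, savee_code)  # ANG a
```

Unlike `path[96]`, which always yields a single character, the SAVEE helper would return any two-letter code whole, so the emotion filter list would need to use the same convention.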

In [423]:
files_path = get_files_from_path(f"{os.getcwd()}\\..\\Datasets\\AudioWav")
emotions_code = ["NEU", "FEA","ANG"]
datas_files = extract_paths_for_emotions_keys(emotions_code, files_path, get_code_crema_d)

In [424]:
files_path_s = get_files_from_path(f"{os.getcwd()}\\..\\Datasets\\ALL")
emotions_code_s = ["a", "f","n"]
datas_files_s = extract_paths_for_emotions_keys(emotions_code_s, files_path_s, get_code_savee) 

The following cells compute the MFCCs for every path in each list.

In [459]:
examples = []
files_filters = dict()
files_filters["NEU"] = 0
files_filters["FEA"] = 0
files_filters["ANG"] = 0
files_filters["a"] = 0
files_filters["f"] = 0
files_filters["n"] = 0

In [460]:
start = datetime.datetime.now()
examples = get_features(datas_files, get_code_crema_d, files_filters)
examples_s = get_features(datas_files_s, get_code_savee, files_filters)
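elapsed = datetime.datetime.now() - start
# total_seconds() covers the whole interval; .microseconds holds only the
# sub-second remainder (which is what the 0.4 s recorded below reflects)
print(f"elapsed {round(elapsed.total_seconds(), 2)} s")
print(files_filters)

{'NEU': 1087, 'FEA': 1271, 'ANG': 1270, 'a': 0, 'f': 0, 'n': 0}
{'NEU': 1087, 'FEA': 1271, 'ANG': 1270, 'a': 60, 'f': 60, 'n': 120}
elapsed 0.4 s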
{'NEU': 1087, 'FEA': 1271, 'ANG': 1270, 'a': 60, 'f': 60, 'n': 120}


In [461]:
es = examples + examples_s
entries = []
for example in es:
    entries.append((example[0], example[1]))

The entries are filtered to a fixed number of samples per emotion, and each selected entry is relabeled to indicate whether stress is present or not.

In [462]:
datas = select_elements(entries, 'NEU', 896,"without_stress")
datas += select_elements(entries, 'ANG', 550, "stress")
datas += select_elements(entries, 'FEA', 550, "stress")
datas += select_elements(entries, 'a', 60, "stress")
datas += select_elements(entries, 'f', 60, "stress")
datas += select_elements(entries, 'n', 120, "without_stress")
random.shuffle(datas)
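
The relabeling in the calls above amounts to a mapping from emotion code to stress label. The sketch below restates it as two dictionaries (the names `STRESS_LABELS` and `QUOTAS` are hypothetical) and adds up how many samples each label is asked for. These are upper bounds, since `select_elements` stops early if fewer matching examples exist.

```python
# Emotion code -> stress label, as used in the select_elements calls above.
STRESS_LABELS = {
    "NEU": "without_stress",
    "ANG": "stress",
    "FEA": "stress",
    "a": "stress",
    "f": "stress",
    "n": "without_stress",
}

# Requested sample counts per code, copied from the calls above.
QUOTAS = {"NEU": 896, "ANG": 550, "FEA": 550, "a": 60, "f": 60, "n": 120}

# Sum the requested counts per stress label.
totals = {}
for code, quantity in QUOTAS.items():
    label = STRESS_LABELS[code]
    totals[label] = totals.get(label, 0) + quantity

print(totals)  # {'without_stress': 1016, 'stress': 1220}
```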

In [463]:
X = []
y = []
for data in datas:
    X.append(data[0])
    y.append(data[1])

In [464]:
labelencoder=preprocessing.LabelEncoder()
y = to_categorical(labelencoder.fit_transform(y))
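
To make the encoding step concrete, the sketch below reproduces it with NumPy alone on three made-up labels. It is an approximation of what `LabelEncoder` followed by `to_categorical` yields, not a replacement for them: labels are indexed in sorted order, so `stress` becomes class 0 and `without_stress` class 1, and each index is then expanded into a one-hot row.

```python
import numpy as np

toy_labels = ["stress", "without_stress", "stress"]

# np.unique sorts the distinct labels and returns each sample's class index,
# which matches LabelEncoder's sorted ordering.
classes, indices = np.unique(toy_labels, return_inverse=True)

# One row of the identity matrix per class index gives the one-hot encoding,
# the same layout to_categorical produces.
one_hot = np.eye(len(classes))[indices]

print(classes)
print(one_hot)
```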

The data is split into a training set and a test set by proportion (0.8 and 0.2).

In [465]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size =0.2,random_state=0)

In [466]:
print(len(X_train),len(X_test),len(y_train),len(y_test))

1788 448 1788 448
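
The sizes printed above can be reproduced by hand. The sketch assumes that every requested quota was filled and that scikit-learn rounds a fractional test size up to a whole sample; both assumptions are consistent with the recorded output.

```python
import math

n_samples = 896 + 550 + 550 + 60 + 60 + 120  # quotas requested earlier
test_size = 0.2

# Assumption: the test share is rounded up, and the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_samples, n_train, n_test)  # 2236 1788 448
```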


In [467]:
#"C:\\Users\\bacs2\\Downloads\\Taller De Grado\\Previous\\Datasets\\AudioWAV\\1001_DFA_ANG_XX.wav"

In [468]:
y = np.array(y)

In [469]:
X_train = np.array(X_train)
X_test = np.array(X_test)
y_train = np.array(y_train)
y_test = np.array(y_test)

In [470]:
# build the model
num_labels = y.shape[1]
dim_entrada = (X_train.shape[1],1)

# define the model
model = Sequential()
model.add(LSTM(units=50,input_shape= dim_entrada))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_labels,activation='softmax'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'],optimizer='adam')
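
An LSTM layer consumes 3-D input of shape `(batch, timesteps, features)`. With `input_shape=(40, 1)`, each of the 40 MFCC means is treated as one timestep carrying a single feature. Whether Keras accepts the 2-D array directly may depend on the version, so the sketch below shows the explicit reshape on a random stand-in for `X_train` (the array and its 5 samples are made up).

```python
import numpy as np

# Stand-in for X_train: 5 samples of 40 MFCC means each (random values).
rng = np.random.default_rng(0)
toy_X = rng.normal(size=(5, 40))

# Add a trailing feature axis: (samples, 40) -> (samples, 40, 1).
toy_X_3d = np.expand_dims(toy_X, axis=-1)

print(toy_X.shape, "->", toy_X_3d.shape)
print("per-sample shape:", toy_X_3d.shape[1:])
```

The per-sample shape printed last is the tuple that `dim_entrada` holds in the cell above.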

In [471]:
# number of epochs and batch size
num_epochs = 50
num_batch_size = 32
start = datetime.datetime.now()
   
model.fit(X_train, y_train, batch_size=num_batch_size,epochs=num_epochs, validation_data=(X_test, y_test))
duration = datetime.datetime.now() - start

Epoch 1/50
...
Epoch 50/50


In [472]:
test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"val_loss: {test_accuracy[0]}", f"val_accuracy: {test_accuracy[1]}")

val_loss: 0.46012476086616516 val_accuracy: 0.7321428656578064


In [473]:
y_values = model.predict(X_test)
y_prediction=[([1,0] if i[0]>i[1] else [0,1]) for i in y_values]



In [474]:
y_i = len(y_values)
i = 0
true_values = 0
while (i < y_i):
    true_values += (1 if (y_test[i][0] == y_prediction[i][0] or y_test[i][1] == y_prediction[i][1]) else 0)
    i = i + 1 

In [475]:
print(f"The algorithm was correct {true_values} times out of {y_i} cases.")

The algorithm was correct 328 times out of 448 cases.
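
The same count can be obtained without an explicit loop by comparing class indices. The sketch below uses made-up targets and predictions and also builds a 2x2 confusion matrix by hand. `confusion_matrix` from scikit-learn, which this notebook imports but never calls, uses the same layout (rows for true classes, columns for predicted ones). One difference from the cells above: on an exact tie `np.argmax` picks class 0, whereas the list comprehension picks class 1.

```python
import numpy as np

# Toy one-hot targets and softmax-like outputs (made-up values).
toy_y_test = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
toy_y_values = np.array([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7], [0.2, 0.8]])

# Convert one-hot rows and probability rows to class indices.
true_classes = np.argmax(toy_y_test, axis=1)
pred_classes = np.argmax(toy_y_values, axis=1)

# Number of correct predictions, equivalent to the while loop above.
hits = int(np.sum(true_classes == pred_classes))

# 2x2 confusion matrix: rows are true classes, columns are predicted classes.
matrix = np.zeros((2, 2), dtype=int)
for t, p in zip(true_classes, pred_classes):
    matrix[t, p] += 1

print(hits, "of", len(true_classes))
print(matrix)
```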


In [422]:
# serialize model to JSON
model_json = model.to_json()
with open("model_5.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("model_5.h5")
print("Saved model to disk")

Saved model to disk


In [162]:
#0.8125
'''
accuracy 
val_loss: 0.4896196722984314 val_accuracy: 0.7209821343421936
val_loss: 0.5390793085098267 val_accuracy: 0.7388392686843872
val_loss: 0.3872809410095215 val_accuracy: 0.8102678656578064 model_2 
val_loss: 0.41717106103897095 val_accuracy: 0.8035714030265808 model_3
val_loss: 0.4412716329097748 val_accuracy: 0.7767857313156128
val_loss: 0.43090111017227173 val_accuracy: 0.7901785969734192
val_loss: 0.4307461678981781 val_accuracy: 0.8013392686843872
val_loss: 0.44158974289894104 val_accuracy: 0.7410714030265808
val_loss: 0.5022664070129395 val_accuracy: 0.7455357313156128
val_loss: 0.4658648669719696 val_accuracy: 0.7700892686843872
val_loss: 0.46482276916503906 val_accuracy: 0.7633928656578064
val_loss: 0.46192440390586853 val_accuracy: 0.765625
val_loss: 0.4675053656101227 val_accuracy: 0.7388392686843872
val_loss: 0.42560023069381714 val_accuracy: 0.7924107313156128
val_loss: 0.44755128026008606 val_accuracy: 0.7700892686843872
val_loss: 0.4769285023212433 val_accuracy: 0.7522321343421936
val_loss: 0.4537601172924042 val_accuracy: 0.7544642686843872
val_loss: 0.4555659890174866 val_accuracy: 0.7611607313156128
val_loss: 0.459335058927536 val_accuracy: 0.75                model_4
val_loss: 0.4552021026611328 val_accuracy: 0.7723214030265808
val_loss: 0.46012476086616516 val_accuracy: 0.7321428656578064
'''
