This work uses DeepUai, an original module created to simplify ML operations.

### _imports_ & HELLO WORLD

In [1]:
import sys
sys.path.append('.')
from models import DeepUaiDataset
DeepUaiDataset.hello_world()

H3LL0 W0RLD


### Available Datasets

In [2]:
print('Available datasets:\n', DeepUaiDataset.available_datasets())

Available datasets:
 ['deglut-audios-wav', 'deglut-audios-statistics2']


Two datasets are available: the swallowing audio recordings and a statistics version of those same recordings.

### Removing Outliers

The first step is to use an anomaly detector to separate the valid audio recordings from the invalid ones.

DeepUai makes it easy to create a new **dataset** containing only the valid data. Because that generated **dataset** holds only the statistics, it is then used to select the matching valid audio files, since the other Machine Learning models will be applied to the valid audio itself.

In [3]:
import os, shutil
from sklearn.ensemble import IsolationForest
from models.mlns.outlier import DeepUaiOutlierDetection

clf = IsolationForest(contamination=.5)
deepuai = DeepUaiOutlierDetection(clf=clf, name='iforest-standand',
                                  ds_name='deglut-audios-statistics2')
y = deepuai.execute()

CREATE [200]
[UPDATE: 200]
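
The internals of `DeepUaiOutlierDetection` are not shown in this notebook. As a rough illustration of what the wrapped classifier does, the sketch below calls scikit-learn directly on synthetic data (the array `X` is a made-up stand-in, not the real statistics dataset). With `contamination=0.5`, the threshold sits at the median anomaly score of the training data, so roughly half of the rows end up labeled as outliers.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic stand-in for the statistics dataset: 200 rows, 8 features (assumption).
X = rng.normal(size=(200, 8))

# contamination=0.5 places the decision threshold at the median anomaly score
# of the training data, so about half of the rows are flagged as outliers.
clf = IsolationForest(contamination=0.5, random_state=0)
labels = clf.fit_predict(X)  # +1 = inlier, -1 = outlier

n_inliers = int((labels == 1).sum())
n_outliers = int((labels == -1).sum())
print(f'inliers: {n_inliers}, outliers: {n_outliers}')
```

A contamination of 0.5 is a strong prior: it discards about half of the data no matter how clean it is, so the value is worth revisiting if the share of invalid recordings is known to be smaller.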


In [4]:
stats_inliers_ds = deepuai.create_inliers_ds()
print(stats_inliers_ds)

wav_ds = DeepUaiDataset('deglut-audios-wav')
print(wav_ds)

# A set gives constant-time membership tests in the copy loop below.
inliers_fnames = {os.path.basename(fpath).split('.')[0]
                  for fpath in stats_inliers_ds.filepaths}

wav_inliers_ds_name = 'deglut-audios-wav-inliers'
wav_inliers_ds_path = DeepUaiDataset._get_path(wav_inliers_ds_name)
os.makedirs(wav_inliers_ds_path, exist_ok=True)

for fpath in wav_ds.filepaths:
    basename = os.path.basename(fpath)
    fname = basename.split('.')[0]
    if fname in inliers_fnames:
        shutil.copyfile(fpath, os.path.join(wav_inliers_ds_path, basename))
wav_inliers_ds = DeepUaiDataset(wav_inliers_ds_name)
print(wav_inliers_ds)

deglut-audios-statistics2-inliers, 245 itens, 0.03344154357910156 MB
deglut-audios-wav, 490 itens, 955.6767692565918 MB
deglut-audios-wav-inliers, 245 itens, 506.48830795288086 MB
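
The matching step above pairs each statistics file with its audio file by comparing file names up to the first dot. A minimal, self-contained sketch of that idea follows; `stem` and `select_matching` are hypothetical helpers and the file names are invented for illustration.

```python
import os

def stem(path: str) -> str:
    """Return the file name up to its first dot, as the cell above does."""
    return os.path.basename(path).split('.')[0]

def select_matching(candidates, reference):
    """Keep the candidate paths whose stem also appears among the reference paths."""
    wanted = {stem(p) for p in reference}  # a set gives O(1) membership tests
    return [p for p in candidates if stem(p) in wanted]

# Hypothetical file names, for illustration only.
stats_inliers = ['stats/a01.json', 'stats/a03.json']
wav_files = ['wav/a01.wav', 'wav/a02.wav', 'wav/a03.wav']
print(select_matching(wav_files, stats_inliers))  # ['wav/a01.wav', 'wav/a03.wav']
```

Splitting on the first dot means a name such as `a01.v2.wav` is reduced to `a01`; if file names may contain extra dots, `os.path.splitext` would be the safer choice.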


### Splitting the Inliers into Time Windows

In [5]:
import json
from scipy.io import wavfile

def get_rate(ds: DeepUaiDataset):
    fname, _ = next(ds.samples(fnames=True))
    wavfilepath = os.path.join(ds.path, fname)
    rate, _ = wavfile.read(wavfilepath)
    return rate

def wav2json_windows(ds: DeepUaiDataset, window_duration = 5): # seconds
    new_ds_name = f'{ds.name}-{window_duration}s'
    new_ds_path = DeepUaiDataset._get_path(new_ds_name)
    if not os.path.exists(new_ds_path): os.makedirs(new_ds_path)
    
    rate = get_rate(ds)
    window_size = int(rate * window_duration)
    step_size = window_size // 2
    for fname, sample in ds.samples(fnames=True):
        fname = fname.split('.')[0]
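        # Note: stop starts at step_size, so each chunk holds step_size samples
        # (window_duration / 2) and consecutive chunks do not overlap.
        i, start, stop = 0, 0, step_size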
        while stop < len(sample):
            with open(os.path.join(new_ds_path, f'{fname}-{i}.json'), 'w') as file:
                json.dump(sample[start:stop].tolist(), file)
            i += 1
            start += step_size
            stop += step_size
    new_ds = DeepUaiDataset(name=new_ds_name)
    return new_ds

inliers_ds_name = 'deglut-audios-wav-inliers'
inliers_ds = DeepUaiDataset(inliers_ds_name)
windowed_inliers_ds = wav2json_windows(ds=inliers_ds)
windowed_inliers_ds_name = windowed_inliers_ds.name
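
If full-length, half-overlapping windows are preferred over the non-overlapping half-length chunks that `wav2json_windows` writes (its `stop` index starts at `step_size`), one possible variant is sketched below. `split_windows` is a hypothetical helper, not part of DeepUai, and the toy signal stands in for a real recording.

```python
import numpy as np

def split_windows(signal, window_size, step_size):
    """Return windows of window_size samples, advancing by step_size each time."""
    windows = []
    start = 0
    # Stop once a full window no longer fits; a trailing partial window is dropped.
    while start + window_size <= len(signal):
        windows.append(signal[start:start + window_size])
        start += step_size
    return windows

signal = np.arange(10)
windows = split_windows(signal, window_size=4, step_size=2)
for w in windows:
    print(w.tolist())
```

With `step_size = window_size // 2`, every sample except those at the edges appears in two windows. Switching to this variant would change the number and length of the windows, so the counts printed later in the notebook would no longer apply.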

### Classifying Time Windows with Anomaly Detection

#### Extracting Features from the Time Windows

Each time window holds about 110 thousand samples, which makes it impractical to apply the Isolation Forest to the raw data. The same transformation used in the first anomaly detection step, a set of summary statistics, is applied instead.

In [6]:
import numpy as np

def compute_statistics2(y: list):
    y = np.asarray(y)
    mean = np.mean(y)
    std_dev = np.std(y)
    min_val = np.min(y)
    max_val = np.max(y)
    median = np.median(y)
    q1 = np.percentile(y, 25)
    q3 = np.percentile(y, 75)
    energy = np.sum(y ** 2)
    statistics = {'Mean': mean, 'Standard Deviation': std_dev,
                  'Minimum': min_val, 'Maximum': max_val,
                  'Median': median, 'Q1': q1, 'Q3': q3, 'Energy': energy}
    return [float(s) for s in statistics.values()]

windowed_inliers_ds_name = 'deglut-audios-wav-inliers-5s'
windowed_inliers_ds = DeepUaiDataset(windowed_inliers_ds_name)

windows_stats_ds_name = f'{windowed_inliers_ds_name}-statistics2'
windows_stats_ds_path = DeepUaiDataset._get_path(windows_stats_ds_name)
if not os.path.exists(windows_stats_ds_path): os.makedirs(windows_stats_ds_path)
for fname, window in windowed_inliers_ds.samples(fnames=True):
    with open(os.path.join(windows_stats_ds_path, fname), 'w') as file:
        json.dump(compute_statistics2(window), file)
windows_stats_ds = DeepUaiDataset(windows_stats_ds_name)
print(windows_stats_ds)

deglut-audios-wav-inliers-5s-statistics2, 2293 itens, 0.20721149444580078 MB
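
As a small worked example of the transformation, the sketch below computes the same eight statistics for a four-sample toy signal. `summarize` is a condensed, hypothetical restatement of `compute_statistics2` that casts the input to float first.

```python
import numpy as np

def summarize(y):
    """Eight summary statistics, in the same order as compute_statistics2."""
    y = np.asarray(y, dtype=float)
    return [float(np.mean(y)), float(np.std(y)), float(np.min(y)), float(np.max(y)),
            float(np.median(y)), float(np.percentile(y, 25)),
            float(np.percentile(y, 75)), float(np.sum(y ** 2))]

features = summarize([1, 2, 3, 4])
print(len(features))  # 8 values, regardless of the window length
print(features)
```

For `[1, 2, 3, 4]` the mean and median are 2.5, the quartiles are 1.75 and 3.25, and the energy is 1 + 4 + 9 + 16 = 30. Whatever the window length, the Isolation Forest sees only these eight numbers.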


#### Classifying the Windows

In [7]:
from sklearn.ensemble import IsolationForest
from models.mlns.outlier import DeepUaiOutlierDetection

clf = IsolationForest(n_estimators=4)
deepuai = DeepUaiOutlierDetection(clf=clf, name='iforest-windows',
                                  ds_name=windows_stats_ds.name)
y = deepuai.execute()
classified_windows_ds = deepuai.create_classified_ds()
print(classified_windows_ds)

CREATE [200]
[UPDATE: 200]
deglut-audios-wav-inliers-5s-statistics2-classified, 2293 itens, 0.20721149444580078 MB


### Using Deep Learning

The idea here is to treat the anomaly-detection labels as if they had been assigned by a domain expert, and to train a Deep Learning classifier on the labeled windows.

In [8]:
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
from sklearn.preprocessing import normalize

ds = classified_windows_ds
train_part = .75
x, y = ds.classified_samples()
x, y = np.asarray(list(x)), np.asarray(list(y))
x = normalize(x, axis=0)
limit = int(len(x) * train_part)
X_train, y_train = x[:limit], y[:limit]
X_test, y_test = x[limit:], y[limit:]
print('shape:', X_train.shape[1])
classes = np.unique(y, return_counts=True)
print(classes)

model = Sequential()
model.add(Dense(units=128, activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=1, activation='sigmoid'))  # For binary classification, use sigmoid activation

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test), shuffle=True)
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Loss: {loss}, Accuracy: {accuracy}')

2024-05-06 23:04:19.617599: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-05-06 23:04:19.618220: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-05-06 23:04:19.621503: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-05-06 23:04:19.667946: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


shape: 8
(array([0, 1]), array([ 693, 1600]))
Epoch 1/50
54/54 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.5816 - loss: 0.6838 - val_accuracy: 1.0000 - val_loss: 0.4879
Epoch 2/50
54/54 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6256 - loss: 0.6555 - val_accuracy: 0.9599 - val_loss: 0.5291
Epoch 3/50
54/54 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6683 - loss: 0.6371 - val_accuracy: 0.9495 - val_loss: 0.4666
Epoch 4/50
54/54 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6570 - loss: 0.6285 - val_accuracy: 0.9408 - val_loss: 0.4618
Epoch 5/50
54/54 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6413 - loss: 0.6301 - val_accuracy: 0.9477 - val_loss: 0.4285
Epoch 6/50
54/54 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6690 - loss: 0.6137 - val_accuracy: 0.9495 - val_loss: 0.4352
Epoch 7/50
54/54 ━━━━━━━━━━━━━━━━━━━━

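The training cell slices the data in stored order and normalizes it before splitting. A validation accuracy of 1.0000 in the first epoch, next to a training accuracy of about 0.58, may be a sign that the last quarter of the data has a different class balance from the rest; this is only a guess, since the ordering returned by `classified_samples()` is not shown. One way to rule it out is a shuffled, stratified split with the scaling fitted on the training rows only. The sketch below uses scikit-learn's `train_test_split` and swaps `normalize` for `StandardScaler`; the data is synthetic and only mirrors the class counts printed above.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in with the class counts printed above (693 zeros, 1600 ones).
y = np.concatenate([np.zeros(693, dtype=int), np.ones(1600, dtype=int)])
x = rng.normal(size=(len(y), 8))

# Shuffle and keep the class ratio the same in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    x, y, train_size=0.75, stratify=y, shuffle=True, random_state=0)

# Fit the scaling on the training rows only, then apply it to both partitions.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

print('train class ratio:', y_train.mean())
print('test class ratio: ', y_test.mean())
```

If the two ratios match and the validation accuracy stays close to the training accuracy, the ordering was not the cause. Note also that windows cut from the same recording can land on both sides of a random split, so splitting by recording would be a stricter test.
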
### Conclusion

After normalizing the dataset, the model reached good accuracy: validation accuracy stays between roughly 0.94 and 0.96 from the second epoch onward in the log above. It also took some wrestling with file handling to make the package cross-platform.

The next step for DeepUai is distributed training, that is, doing the setup on the local machine and running the training jobs on a remote instance.

DeepUai is available in this repository: https://github.com/fabiorx1/deepuai3