## 3. Model Training and Evaluation

- Classifying (classifying urban sounds with deep learning)
  * Build the model -> keras
  * Training -> model.fit
  * Prediction -> write helper functions
  * Validation -> reuse the functions written for prediction
 
**Build a three-layer Multilayer Perceptron (MLP) neural network with Keras, then train and evaluate the model**

In [1]:
# Load the preprocessed data that was saved in notebook 2

%store -r x_train 
%store -r x_test 
%store -r y_train 
%store -r y_test 
%store -r yy 
%store -r le

In [5]:
import tensorflow as tf
import keras

print(tf.__version__)

2.1.0


Using TensorFlow backend.


In [7]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.layers import LSTM
from sklearn import metrics 

num_labels = yy.shape[1] # 10
filter_size = 2

# Construct model 
model = keras.Sequential([
    keras.layers.Dense(256, activation='relu', input_shape=(40,)),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(num_labels, activation='softmax')
])

In [8]:
# Compile the model
# Use categorical_crossentropy, the most common loss for multi-class classification (a lower value means a better model)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
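
As a rough illustration of what `categorical_crossentropy` measures, the sketch below computes the loss by hand for a single one-hot label. The helper function and both probability vectors are made up for illustration; this is plain Python, not the Keras implementation.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Clip to avoid log(0); only the true class contributes for one-hot labels
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

num_classes = 10
y_true = [0.0] * num_classes
y_true[7] = 1.0  # hypothetical sample whose true class index is 7

# A model guessing uniformly gives every class probability 0.1
uniform = [1.0 / num_classes] * num_classes
# A confident, correct prediction: 0.91 on the true class, 0.01 elsewhere
confident = [0.01] * num_classes
confident[7] = 0.91

print(round(categorical_crossentropy(y_true, uniform), 4))    # ln(10), about 2.3026
print(round(categorical_crossentropy(y_true, confident), 4))  # -ln(0.91), about 0.0943
```

With 10 classes, a model that has learned nothing should sit near a loss of ln(10) and an accuracy of about 10%, which is a useful baseline when reading the training output.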

In [9]:
model.summary() # Print a summary of the model architecture

# Compute the accuracy before training
score = model.evaluate(x_test, y_test, verbose=0)
accuracy = 100*score[1]

print("Pre-training accuracy: %.4f%%" % accuracy)

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 256)               10496     
_________________________________________________________________
dropout_3 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 256)               65792     
_________________________________________________________________
dropout_4 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 10)                2570      
Total params: 78,858
Trainable params: 78,858
Non-trainable params: 0
_________________________________________________________________
Pre-training accuracy: 12.7075%
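
The parameter counts in the summary can be reproduced by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below checks that arithmetic; the helper name `dense_params` is made up for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input features, two hidden layers, output classes
layer_sizes = [40, 256, 256, 10]
counts = [dense_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [10496, 65792, 2570]
print(sum(counts))  # 78858
```

Dropout layers add no parameters, which is why they show 0 in the summary.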


In [11]:
# Training
# Train the model for 100 epochs; performance should keep improving until it reaches a plateau
# Keep the batch size small, since a large batch size can reduce the model's ability to generalize
from keras.callbacks import ModelCheckpoint 
from datetime import datetime 

num_epochs = 100
num_batch_size = 32
checkpoint_path = './saved_models/weights.best.basic_mlp.hdf5'

# Save checkpoints during training
# With save_best_only=True the model is saved automatically whenever val_loss improves, so the best model can be reused later
checkpointer = keras.callbacks.ModelCheckpoint(checkpoint_path,
                                                verbose=1, save_best_only=True)
start = datetime.now()

# Train the model
model.fit(x_train, y_train, 
          batch_size=num_batch_size, 
          epochs=num_epochs, 
          validation_data=(x_test, y_test), 
          callbacks=[checkpointer], 
          verbose=1)

duration = datetime.now() - start
print("Training completed in time: ", duration)

Train on 6985 samples, validate on 1747 samples
Epoch 1/100

Epoch 00001: val_loss improved from inf to 2.14515, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 2/100

Epoch 00002: val_loss improved from 2.14515 to 1.93710, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 3/100

Epoch 00003: val_loss improved from 1.93710 to 1.80177, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 4/100

Epoch 00004: val_loss improved from 1.80177 to 1.67735, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 5/100

Epoch 00005: val_loss improved from 1.67735 to 1.58717, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 6/100

Epoch 00006: val_loss improved from 1.58717 to 1.45580, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 7/100

Epoch 00007: val_loss improved from 1.45580 to 1.31757, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 8/100

Epoch 00008: val_loss improved from 1.31757 to 


Epoch 00033: val_loss improved from 0.57514 to 0.56319, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 34/100

Epoch 00034: val_loss improved from 0.56319 to 0.56054, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 35/100

Epoch 00035: val_loss did not improve from 0.56054
Epoch 36/100

Epoch 00036: val_loss improved from 0.56054 to 0.54813, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 37/100

Epoch 00037: val_loss did not improve from 0.54813
Epoch 38/100

Epoch 00038: val_loss improved from 0.54813 to 0.54495, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 39/100

Epoch 00039: val_loss improved from 0.54495 to 0.53134, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 40/100

Epoch 00040: val_loss improved from 0.53134 to 0.50820, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 41/100

Epoch 00041: val_loss did not improve from 0.50820
Epoch 42/100

Epoch 00042: val_loss did not 


Epoch 00069: val_loss did not improve from 0.45187
Epoch 70/100

Epoch 00070: val_loss did not improve from 0.45187
Epoch 71/100

Epoch 00071: val_loss improved from 0.45187 to 0.44554, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 72/100

Epoch 00072: val_loss improved from 0.44554 to 0.43901, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 73/100

Epoch 00073: val_loss did not improve from 0.43901
Epoch 74/100

Epoch 00074: val_loss improved from 0.43901 to 0.43436, saving model to ./saved_models/weights.best.basic_mlp.hdf5
Epoch 75/100

Epoch 00075: val_loss did not improve from 0.43436
Epoch 76/100

Epoch 00076: val_loss did not improve from 0.43436
Epoch 77/100

Epoch 00077: val_loss did not improve from 0.43436
Epoch 78/100

Epoch 00078: val_loss did not improve from 0.43436
Epoch 79/100

Epoch 00079: val_loss did not improve from 0.43436
Epoch 80/100

Epoch 00080: val_loss did not improve from 0.43436
Epoch 81/100

Epoch 00081: val_loss did n
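
The "improved" and "did not improve" messages in the log follow a simple rule: a checkpoint is written only when the monitored `val_loss` is strictly lower than the best value seen so far. A minimal sketch of that rule, using a made-up helper name and hypothetical loss values (not the values from this run):

```python
def simulate_save_best_only(val_losses):
    """Return the 1-based epochs at which a checkpoint would be written."""
    best = float("inf")
    saved_epochs = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strict improvement on the monitored value
            best = loss
            saved_epochs.append(epoch)
    return saved_epochs, best

# Hypothetical validation losses, not taken from the log above
val_losses = [2.10, 1.90, 1.95, 1.70, 1.72, 1.71, 1.60]
saved_epochs, best = simulate_save_best_only(val_losses)

print(saved_epochs)  # [1, 2, 4, 7]
print(best)          # 1.6
```

One consequence is that the checkpoint file holds the weights of the best epoch, while the `model` object in memory holds the weights of the last epoch. Restoring the best weights before evaluating would presumably be done with `model.load_weights(checkpoint_path)`.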

In [12]:
import os 

checkpoint_dir = os.path.dirname(checkpoint_path)

!ls {checkpoint_dir}

weights.best.basic_mlp.hdf5


In [13]:
# Evaluate the model. The training and test accuracies differ by only about 0.04,
# which suggests the model is not severely overfitted => a well-trained model!

# Accuracy on the training set
score = model.evaluate(x_train, y_train, verbose=0)
print("Training Accuracy: ", score[1])

# Accuracy on the test set
score = model.evaluate(x_test, y_test, verbose=0)
print("Testing Accuracy: ", score[1])

Training Accuracy:  0.9248389601707458
Testing Accuracy:  0.8809387683868408
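
To make the two accuracies more concrete, the sketch below converts them back into sample counts, using the set sizes reported in the training log (6985 training samples, 1747 test samples), and computes the gap between them.

```python
train_acc, n_train = 0.9248389601707458, 6985
test_acc, n_test = 0.8809387683868408, 1747

# Accuracy is correct / total, so multiplying back recovers the counts
train_correct = round(train_acc * n_train)
test_correct = round(test_acc * n_test)
gap = train_acc - test_acc

print(train_correct, n_train - train_correct)  # 6460 525
print(test_correct, n_test - test_correct)     # 1539 208
print(round(gap, 4))                           # 0.0439
```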


In [14]:
# Prediction => define a function that loads a given .wav file and extracts the features used for prediction
import librosa 
import numpy as np 

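def extract_feature(file_name):
    """Return the time-averaged 40 MFCCs of an audio file with shape (1, 40), or None on failure."""
    try:
        audio_data, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=40)
        # Average each coefficient over time: (40, n_frames) -> (40,)
        mfccsscaled = np.mean(mfccs.T, axis=0)

    except Exception as e:
        print("Error encountered while parsing file: ", file_name, e)
        return None

    # Add a batch dimension so the result can be passed straight to the model
    return np.array([mfccsscaled])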
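
The shape handling in `extract_feature` can be checked without any audio file. The sketch below substitutes a random array for the output of `librosa.feature.mfcc`, which has shape `(n_mfcc, n_frames)`; the frame count of 173 is an arbitrary assumption, since it depends on the clip length.

```python
import numpy as np

n_mfcc, n_frames = 40, 173  # 173 frames is an arbitrary, made-up clip length
rng = np.random.default_rng(0)
mfccs = rng.normal(size=(n_mfcc, n_frames))  # stand-in for librosa.feature.mfcc output

# Average each coefficient over time: (40, n_frames) -> (40,)
mfccs_scaled = np.mean(mfccs.T, axis=0)

# Wrap in a batch dimension so the shape matches input_shape=(40,)
features = np.array([mfccs_scaled])

print(mfccs.shape, mfccs_scaled.shape, features.shape)  # (40, 173) (40,) (1, 40)
```

Averaging over time discards the temporal ordering of the frames, which is what lets clips of different lengths map to the same fixed-size input.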

In [15]:
def print_prediction(file_name):
    prediction_feature = extract_feature(file_name)

    # model.predict returns the softmax probabilities with shape (1, num_labels)
    # (predict_classes and predict_proba were removed in later TensorFlow releases)
    predicted_proba = model.predict(prediction_feature)[0]
    predicted_class = le.inverse_transform([np.argmax(predicted_proba)])
    print("The predicted class is:", predicted_class[0], '\n')

    # Print the probability assigned to every class
    for i in range(len(predicted_proba)):
        category = le.inverse_transform(np.array([i]))
        print(category[0], "\t\t : ", format(predicted_proba[i], '.32f'))

In [16]:
# Validation
# Load the files inspected in notebook 1 and run predictions on them

import IPython.display as ipd

filename = './UrbanSound8K/audio/fold1/180937-7-2-1.wav' # jackhammer (rock drill) sound
ipd.Audio(filename)
print_prediction(filename)

The predicted class is: jackhammer 

air_conditioner 		 :  0.00000007911994970299929264001548
car_horn 		 :  0.00000876384729053825139999389648
children_playing 		 :  0.00000047031798544594494160264730
dog_bark 		 :  0.00000001747014977127037127502263
drilling 		 :  0.00002442583172523882240056991577
engine_idling 		 :  0.00000000209675699025524409080390
gun_shot 		 :  0.00000000283887136021121477824636
jackhammer 		 :  0.99996364116668701171875000000000
siren 		 :  0.00000091087048303961637429893017
street_music 		 :  0.00000160393108217249391600489616


In [17]:
filename = './UrbanSound8K/audio/fold5/100852-0-0-0.wav' # air conditioner sound
ipd.Audio(filename)
print_prediction(filename)

The predicted class is: air_conditioner 

air_conditioner 		 :  0.99999964237213134765625000000000
car_horn 		 :  0.00000000202525374248807565891184
children_playing 		 :  0.00000000560530599713615629298147
dog_bark 		 :  0.00000000050791471029043577800621
drilling 		 :  0.00000026281065856892382726073265
engine_idling 		 :  0.00000006883440306637567118741572
gun_shot 		 :  0.00000000000583838966755623189897
jackhammer 		 :  0.00000001565264184932857460808009
siren 		 :  0.00000000000237798626276375379973
street_music 		 :  0.00000000162685298565889979727217


In [18]:
filename = './UrbanSound8K/audio/fold7/101848-9-0-0.wav' # street music sound
ipd.Audio(filename)
print_prediction(filename)

The predicted class is: street_music 

air_conditioner 		 :  0.00524505879729986190795898437500
car_horn 		 :  0.00082799512892961502075195312500
children_playing 		 :  0.04531287774443626403808593750000
dog_bark 		 :  0.00366795668378472328186035156250
drilling 		 :  0.00080098351463675498962402343750
engine_idling 		 :  0.00529348896816372871398925781250
gun_shot 		 :  0.00067102635512128472328186035156
jackhammer 		 :  0.00991755258291959762573242187500
siren 		 :  0.00024422770366072654724121093750
street_music 		 :  0.92801880836486816406250000000000


In [19]:
filename = './UrbanSound8K/audio/fold10/100648-1-0-0.wav' # car horn sound
ipd.Audio(filename)
print_prediction(filename)

The predicted class is: street_music 

air_conditioner 		 :  0.03475644811987876892089843750000
car_horn 		 :  0.03183532506227493286132812500000
children_playing 		 :  0.17433208227157592773437500000000
dog_bark 		 :  0.14379122853279113769531250000000
drilling 		 :  0.08944501727819442749023437500000
engine_idling 		 :  0.03119277395308017730712890625000
gun_shot 		 :  0.07807990163564682006835937500000
jackhammer 		 :  0.02646535448729991912841796875000
siren 		 :  0.01510414294898509979248046875000
street_music 		 :  0.37499770522117614746093750000000
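
For a low-confidence prediction like this one, ranking the classes by probability shows how far down the true label sits. The sketch below does this in plain Python with the probabilities printed above, rounded to four decimals.

```python
classes = ["air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
           "engine_idling", "gun_shot", "jackhammer", "siren", "street_music"]
# Probabilities from the car horn output above, rounded to four decimals
probs = [0.0348, 0.0318, 0.1743, 0.1438, 0.0894, 0.0312, 0.0781, 0.0265, 0.0151, 0.3750]

# Sort (class, probability) pairs from most to least likely
ranked = sorted(zip(classes, probs), key=lambda pair: pair[1], reverse=True)
for name, p in ranked[:3]:
    print(f"{name}: {p:.4f}")
# street_music: 0.3750
# children_playing: 0.1743
# dog_bark: 0.1438

# Position of the true label in the ranking (1 = top prediction)
true_rank = [name for name, _ in ranked].index("car_horn") + 1
print(true_rank)  # 7
```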


In [22]:
filename = './children.wav' # sound of children playing
# ipd.Audio(filename)
print_prediction(filename)

The predicted class is: children_playing 

air_conditioner 		 :  0.00021424313308671116828918457031
car_horn 		 :  0.00022006606741342693567276000977
children_playing 		 :  0.38865002989768981933593750000000
dog_bark 		 :  0.33827415108680725097656250000000
drilling 		 :  0.04806090518832206726074218750000
engine_idling 		 :  0.00154160428792238235473632812500
gun_shot 		 :  0.19597542285919189453125000000000
jackhammer 		 :  0.00002670191861398052424192428589
siren 		 :  0.01512144505977630615234375000000
street_music 		 :  0.01191546581685543060302734375000


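Apart from the last UrbanSound8K clip (the car horn), which was misclassified as street music,  
the predictions match the true labels.

The clip of children playing taken from YouTube was also classified correctly, although with a probability of only about 0.39.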