# **Motion Classification Based on Smartphone Sensor Data**
# Step 2: Basic Modeling


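## 0. Mission 3

* Data preprocessing
    * Perform whatever preprocessing is needed: dummy-variable encoding, data splitting, checking and handling NaN values, scaling, and so on.
* Build classification models with a variety of deep learning architectures
    * Build at least 4 models
    * Compare models by the average performance over at least 5 repeated runs per model
    * Create a separate DataFrame that stores each model's performance, and use it for the comparison
* Optional: the following items are not required. Work on them as time allows.
    * Select the top N variables, build a model on them, and compare performance
        * A model does not always need every variable.
        * Select the N variables with the highest importance, build a model, and compare its performance with the other models.
        * One way to choose N is to add variables one at a time, training and validating a model at each step, and look for a suitable cutoff.
* Performance guide
    * Accuracy : 0.90 ~ 0.99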
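
The optional top-N procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's own method: it takes a random forest's `feature_importances_` as the importance ranking, uses a logistic regression as a cheap stand-in for the deep learning model, and runs on synthetic data from `make_classification`. The helper name `accuracy_by_top_n` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def accuracy_by_top_n(x_train, y_train, x_val, y_val, max_n):
    """Rank features by importance, then score a model on the top 1..max_n features."""
    ranker = RandomForestClassifier(n_estimators=50, random_state=0)
    ranker.fit(x_train, y_train)
    # Feature indices sorted from most to least important.
    order = np.argsort(ranker.feature_importances_)[::-1]

    scores = []
    for n in range(1, max_n + 1):
        cols = order[:n]
        clf = LogisticRegression(max_iter=1000)  # stand-in model (assumption)
        clf.fit(x_train[:, cols], y_train)
        scores.append(clf.score(x_val[:, cols], y_val))
    return order, scores


# Synthetic stand-in data (assumption): 20 features, 5 of them informative.
x, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           n_redundant=0, n_classes=3, random_state=0)
x_tr, x_va, y_tr, y_va = train_test_split(x, y, test_size=0.3, random_state=0)

order, scores = accuracy_by_top_n(x_tr, y_tr, x_va, y_va, max_n=10)
best_n = int(np.argmax(scores)) + 1  # smallest N that reaches the highest validation accuracy
print(best_n, scores[best_n - 1])
```

Plotting `scores` against N usually makes the point of diminishing returns easier to see than the single best value.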

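## 1. Environment Setup

* Detailed requirements
    - Path setup : Google Colab
        * Create a project3 folder directly under the top level of Google Drive,
        * and copy the data files into it.
    - The code already imports the libraries that are needed by default.
        * Add any other libraries you judge necessary.


### (1) Path Setup

* Connect Google Drive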

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
path = '/content/drive/MyDrive/project3/'

### (2) Loading Libraries

* Load libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import joblib
from tqdm import tqdm

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
from sklearn.metrics import *

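from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout, Input, BatchNormalization
from keras.backend import clear_session
from keras.optimizers import Adam
# EarlyStopping and ReduceLROnPlateau are used by the model cells below
from keras.callbacks import EarlyStopping, ReduceLROnPlateau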

In [None]:
# Plot training and validation loss per epoch (learning curve)
def dl_history_plot(history):
    plt.figure(figsize=(10,6))
    plt.plot(history['loss'], label='train_err', marker = '.')
    plt.plot(history['val_loss'], label='val_err', marker = '.')

    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend()
    plt.grid()
    plt.show()

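### (3) Loading the Data

* Provided datasets
    * data01_train.csv : for training and validation
    * data01_test.csv : for testing
    * features.csv : feature names organized as a hierarchy

* Detailed requirements
    * Drop a column : the 'subject' column in data01_train.csv and data01_test.csv is not needed, so remove it.

#### 1) Loading the Data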

In [None]:
traindata = pd.read_csv(path + 'data01_train.csv')
traindata = traindata.drop(columns='subject')
traindata.head()
traindata.shape

(5881, 562)

In [None]:
testdata = pd.read_csv(path + 'data01_test.csv')
testdata = testdata.drop(columns='subject')
testdata.head()
testdata.shape

(1471, 562)

In [None]:
featuresdata = pd.read_csv(path + 'features.csv')
featuresdata.head()

Unnamed: 0,sensor,agg,axis,feature_name
0,tBodyAcc,mean(),X,tBodyAcc-mean()-X
1,tBodyAcc,mean(),Y,tBodyAcc-mean()-Y
2,tBodyAcc,mean(),Z,tBodyAcc-mean()-Z
3,tBodyAcc,std(),X,tBodyAcc-std()-X
4,tBodyAcc,std(),Y,tBodyAcc-std()-Y


#### 2) Basic Information

In [None]:
traindata.shape

(5881, 562)

In [None]:
# Columns that contain missing values
missing_values = traindata.isna().sum()
missing_values = missing_values[missing_values > 0]
print(missing_values)

Series([], dtype: int64)


In [None]:
# Feature dtypes
categorical_columns = traindata.select_dtypes(include=['object']).columns
numerical_columns = traindata.select_dtypes(include=['number']).columns
print("Categorical columns:", len(categorical_columns))
print("Numerical columns:", len(numerical_columns))

Categorical columns: 1
Numerical columns: 561


In [None]:
# Check the distribution of the target variable
target=traindata['Activity']
target.value_counts()

Activity,count
LAYING,1115
STANDING,1087
SITTING,1032
WALKING,998
WALKING_UPSTAIRS,858
WALKING_DOWNSTAIRS,791


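## **2. Data Preprocessing**

* Perform whatever preprocessing is needed: dummy-variable encoding, data splitting, checking and handling NaN values, scaling, and so on.


### (1) Data Split 1 : x, y

* Detailed requirements
    - Split the data into x and y.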

In [None]:
target = 'Activity'
x = traindata.drop(columns=target)
y = traindata.loc[:, target]

### (2) Scaling


* Detailed requirements
    - Run this step so that algorithms that require scaled inputs can be used.
    - Use either min-max scaling or standard scaling.
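
The two options differ only in the formula applied to each column. The sketch below compares them on a toy single-feature array; the array values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

toy = np.array([[1.0], [2.0], [3.0], [6.0]])  # hypothetical single feature

minmax = MinMaxScaler().fit_transform(toy)      # (x - min) / (max - min)
standard = StandardScaler().fit_transform(toy)  # (x - mean) / std

print(minmax.ravel())
print(standard.ravel())
```

Min-max scaling maps the column onto [0, 1], while standard scaling centers it at 0 with unit variance, so the latter is less affected by a single extreme value stretching the range.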

In [None]:
scaler = MinMaxScaler()
x = scaler.fit_transform(x)

### (3) Preprocessing y
* integer encoding : LabelEncoder
* (if needed) one-hot encoding
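
As a small sketch of both encodings: `LabelEncoder` numbers the classes in sorted order, which is also how the class indices 0 to 5 in the confusion matrices map back to activity names. The label array below is a hypothetical sample that uses the six activity names from the dataset.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = np.array(['WALKING', 'LAYING', 'SITTING', 'STANDING',
                   'WALKING_UPSTAIRS', 'WALKING_DOWNSTAIRS', 'LAYING'])

encoder = LabelEncoder()
y_int = encoder.fit_transform(labels)             # integer codes, classes in sorted order
y_onehot = np.eye(len(encoder.classes_))[y_int]   # one row per sample, one column per class

# Mapping from integer code back to the activity name.
class_map = {i: str(name) for i, name in enumerate(encoder.classes_)}
print(class_map)
```

Integer labels pair with `sparse_categorical_crossentropy`, while one-hot labels pair with `categorical_crossentropy`.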

In [None]:
int_encoder = LabelEncoder()
y = int_encoder.fit_transform(y)
y

### (4) Data Split 2 : train, validation

* Detailed requirements
    - train : val = 8 : 2 or 7 : 3
    - Set the random_state option so that results are reproducible and can be compared with other models.
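
The effect of `random_state` can be checked directly: two calls with the same seed return identical splits. The sketch below uses hypothetical toy arrays and additionally passes `stratify`, which is an optional extra (an assumption here, not a requirement above) that keeps class proportions equal in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

x_demo = np.arange(40).reshape(20, 2)   # 20 toy samples, 2 features
y_demo = np.array([0, 1] * 10)          # two balanced toy classes

split_a = train_test_split(x_demo, y_demo, test_size=0.3,
                           random_state=20, stratify=y_demo)
split_b = train_test_split(x_demo, y_demo, test_size=0.3,
                           random_state=20, stratify=y_demo)

# Each split is (x_train, x_val, y_train, y_val); compare them piece by piece.
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same, split_a[0].shape, split_a[1].shape)
```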

In [None]:
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.3, random_state=20)

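## **3. Basic Modeling**



* Detailed requirements
    - Model 1 : baseline model
        * Build the model without any hidden layer
    - Model 2 : a complex model
        * Add at least 5 hidden layers
    - Models 3 to n : tuned models
        * Adjust the learning rate, the number of epochs, and so on
        * Add regularization techniques to model 2 to prevent overfitting
        * Build a model that maximizes accuracy
    - For each model, record the average performance over at least 5 repeated runs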
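
One way to keep the repeated-run results in a single comparison table is sketched below. The helper name `summarize_runs` is hypothetical, and the accuracy values are placeholders for illustration only; they are not results from this notebook.

```python
import pandas as pd


def summarize_runs(run_scores):
    """Build a comparison table from {model name: [accuracy of each run]}."""
    rows = [{'model': name,
             'runs': len(scores),
             'mean_accuracy': sum(scores) / len(scores),
             'min_accuracy': min(scores),
             'max_accuracy': max(scores)}
            for name, scores in run_scores.items()]
    table = pd.DataFrame(rows).set_index('model')
    return table.sort_values('mean_accuracy', ascending=False)


# Placeholder numbers for illustration only.
run_scores = {
    'baseline': [0.90, 0.92, 0.91, 0.93, 0.89],
    'deep':     [0.96, 0.97, 0.95, 0.98, 0.94],
}
result = summarize_runs(run_scores)
print(result)
```

In practice each list would be filled by rebuilding, training, and evaluating the same architecture five times, since reusing one trained model would only measure the same weights repeatedly.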

### Best Model per Team Member
- Each team member designs 5 models
- Each model is trained and tested 5 times
- Each team member lists below the one model with the highest average accuracy

### Base Model

In [None]:
n = x_train.shape[1]

# Model design: softmax output layer only, no hidden layer
model = Sequential( [Input(shape = (n,)),
                     Dense( 6, activation = 'softmax')] )
model.summary()


# Test Data Predict Result
# =======================================================
# 1. confusion_matrix
# [[290   0   0   0   2   0]
#  [  2 241  10   0   0   1]
#  [  0   50 226   1   0   3]
#  [  0   0   0 210   12   6]
#  [  0   0   0   0 195   0]
#  [  0   0   0   9   17 198]]


# 2. classification_report
#               precision    recall  f1-score   support

#            0        0.98      0.99      0.98       292
#            1       0.83      0.95      0.88       254
#            2       0.96      0.79      0.86       287
#            3       1.00      0.92      0.96       228
#            4       0.86      1.00      0.92       195
#            5       0.95      0.92      0.94       215

#     accuracy                           0.92      1471
#    macro avg       0.93      0.93      0.92      1471
# weighted avg       0.93      0.92      0.92      1471



# 3. accuracy_score
# 0.9245
# =======================================================
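
The parameter counts that `model.summary()` reports for stacks of `Dense` layers can be reproduced by hand: each layer has `inputs * units` weights plus `units` biases. The helper below is a hypothetical illustration of that arithmetic, using the 561 numeric feature columns found earlier.

```python
def dense_param_count(n_inputs, layer_units):
    """Total weights and biases of a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total


n_features = 561  # number of numeric feature columns in this dataset

baseline_params = dense_param_count(n_features, [6])                   # softmax-only baseline
stacked_params = dense_param_count(n_features, [512, 128, 64, 32, 6])  # a 512-128-64-32-6 stack
print(baseline_params, stacked_params)
```

The gap between the two counts gives a rough sense of how much more capacity, and risk of overfitting, the deeper models carry compared with the baseline.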

### (1) Model 1

In [None]:
# Model design
model9 = Sequential([Input(shape=(n, )),
                     Dense(512, activation='relu'),
                     Dense(128, activation='relu'),
                     Dense(64, activation='relu'),
                     Dense(32, activation='relu'),
                     Dense(6, activation='softmax')])
model9.summary()
# Define the callback
es = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=10)
# Compile the model
model9.compile(optimizer=Adam(learning_rate=0.0003), loss='sparse_categorical_crossentropy')
# Train the model
hist9 = model9.fit(x_train, y_train, epochs=200, validation_split=0.2, callbacks=[es]).history

# Test Data Predict Result
# =======================================================
# 1. confusion_matrix
# [[292   0   0   0   0   0]
#  [  0 219  35   0   0   0]
#  [  0   4 283   0   0   0]
#  [  0   0   0 228   0   0]
#  [  0   0   0   1 194   0]
#  [  0   0   0   9   0 206]]


# 2. classification_report
#               precision    recall  f1-score   support

#            0       1.00      1.00      1.00       292
#            1       0.98      0.86      0.92       254
#            2       0.89      0.99      0.94       287
#            3       0.96      1.00      0.98       228
#            4       1.00      0.99      1.00       195
#            5       1.00      0.96      0.98       215

#     accuracy                           0.97      1471
#    macro avg       0.97      0.97      0.97      1471
# weighted avg       0.97      0.97      0.97      1471



# 3. accuracy_score
# 0.9666893269884432
# =======================================================
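
The role of `patience` and `min_delta` in early stopping can be illustrated with a simplified, framework-free sketch. This is not Keras's implementation; the function name `early_stop_epoch` and the loss values are hypothetical.

```python
def early_stop_epoch(val_losses, patience, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:   # only a drop larger than min_delta counts as improvement
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:      # too many epochs in a row without improvement
                return epoch
    return None


# Hypothetical validation losses: improvement stalls after epoch 2.
losses = [0.90, 0.60, 0.50, 0.51, 0.50, 0.52, 0.49995, 0.53]
stop_at = early_stop_epoch(losses, patience=3, min_delta=0.001)
print(stop_at)
```

Note that the tiny drop to 0.49995 would not reset the counter even with a longer patience, because it is smaller than `min_delta`.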

### (2) Model 2

In [None]:
nfeatures = x_train.shape[1]

# Model design
model3 = Sequential( [Input(shape = (nfeatures,)),
                      Dense(128, activation = 'relu'),
                      Dense(64, activation = 'relu'),
                      Dense(32, activation = 'relu'),
                      Dense(16, activation = 'relu'),
                      Dense(6, activation = 'softmax')] )
# Compile the model
model3.compile(optimizer=Adam(learning_rate=0.001), loss= 'sparse_categorical_crossentropy')
# Train the model
history3 = model3.fit(x_train, y_train, epochs = 100, validation_split=0.2).history


# Test Data Predict Result
# =======================================================
# 1. confusion_matrix
# [[291   0   0   0   1   0]
#  [  0 239  15   0   0   0]
#  [  0  13 274   0   0   0]
#  [  0   0   0 223   4   1]
#  [  0   0   0   0 195   0]
#  [  0   0   0   1   2 212]]

# 2. Classification_report
#               precision    recall  f1-score   support

#            0       1.00      1.00      1.00       292
#            1       0.95      0.94      0.94       254
#            2       0.95      0.95      0.95       287
#            3       1.00      0.98      0.99       228
#            4       0.97      1.00      0.98       195
#            5       1.00      0.99      0.99       215

#     accuracy                           0.97      1471
#    macro avg       0.98      0.98      0.98      1471
# weighted avg       0.97      0.97      0.97      1471


# 3. accuracy_score: 0.9748470428280082
# =======================================================

### (3) Model 3

In [None]:
# Model design
model9 = Sequential([Input(shape=(nfeatures,)),
                     Dense(512, activation='relu'),
                     Dense(512, activation='relu'),
                     Dense(256, activation='relu'),
                     Dense(128, activation='relu'),
                     Dense(64, activation='relu'),
                     Dropout(0.3),
                     Dense(128, activation='relu'),
                     Dense(256, activation='relu'),
                     Dense(512, activation='relu'),
                     Dense(512, activation='relu'),
                     Dense(6, activation='softmax')])



model9.summary()
# Compile the model
model9.compile(optimizer=Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

"""


****Best Model****
**test Result**

1. confusion_matrix
[[231   0   0   0   0   0]
 [  0 198   2   0   0   0]
 [  0   7 219   0   0   0]
 [  0   0   0 196   0   2]
 [  0   0   0   0 143   2]
 [  0   0   0   0   2 175]]

2. classification_report
      precision    recall  f1-score   support

  0       1.00      1.00      1.00       231
  1       0.97      0.99      0.98       200
  2       0.99      0.97      0.98       226
  3       1.00      0.99      0.99       198
  4       0.99      0.99      0.99       145
  5       0.98      0.99      0.98       177
    accuracy                           0.99      1177
   macro avg       0.99      0.99      0.99      1177
weighted avg       0.99      0.99      0.99      1177

3. accuracy_score
0.9872557349192863
 """

### (4) Model 4

In [None]:
clear_session()


# Model design
model18 = Sequential([Input(shape=(nfeatures, )),
                     Dense(512, activation='relu'),
                     Dense(512, activation='relu'),
                     Dense(512, activation='relu'),
                     Dense(512, activation='relu'),
                     Dropout(0.4),
                     Dense(256, activation='relu'),
                     Dense(256, activation='relu'),
                     Dense(256, activation='relu'),
                     Dropout(0.3),
                     Dense(128, activation='relu'),
                     Dense(128, activation='relu'),
                     Dropout(0.2),
                     Dense(6, activation='softmax')])

# Compile the model
model18.compile(optimizer=Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy')

# Train the model
hist=model18.fit(x_train, y_train, epochs=100, validation_split=.2).history

# Test Data Predict Result
# =======================================================
# 1. confusion_matrix
#[[291   1   0   0   0   0]
# [  0 234  20   0   0   0]
# [  0   7 280   0   0   0]
# [  0   0   0 228   0   0]
# [  0   0   0   0 194   1]
# [  0   0   0   4   1 210]]


# 2. classification_report
#               precision    recall  f1-score   support

#            0       1.00      1.00      1.00       292
#            1       0.97      0.92      0.94       254
#            2       0.93      0.98      0.95       287
#            3       0.98      1.00      0.99       228
#            4       0.99      0.99      0.99       195
#            5       1.00      0.98      0.99       215

#    accuracy                           0.98      1177
#   macro avg       0.98      0.98      0.98      1177
#weighted avg       0.98      0.98      0.98      1177


# 3. accuracy_score
# 0.9913280054412514
# =======================================================

### (5) Model 5

In [None]:
import random

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf


# Fix the random seeds for reproducibility
seed=56713
np.random.seed(seed)
tf.random.set_seed(seed)
random.seed(seed)

# Split the columns into one DataFrame per sensor
data1=traindata.copy()
df_dict={}
for f in featuresdata['sensor'].unique():
  columns_=[]
  for col in data1.columns:
    if '-' in col and f == col.split('-')[0]:
      columns_.append(col)
    elif f == 'angle' and f in col:
      columns_.append(col)
  p_data=data1[columns_]
  data1.drop(columns_,axis=1,inplace=True)
  df_dict[f] = p_data

# Scale each sensor group, then split into train and test sets
scaled_data = {}
scaler_dict={}
for sensor in featuresdata['sensor'].unique():
    scaler = MinMaxScaler()
    scaled_data[sensor] = scaler.fit_transform(df_dict[sensor])
    scaler_dict[sensor] = scaler
X = [scaled_data[sensor] for sensor in df_dict.keys()]
Y = pd.get_dummies(traindata['Activity'])
X_indices = range(len(Y))
train_indices, test_indices, Y_train, Y_test = train_test_split(X_indices, Y, test_size=0.2, stratify=Y, random_state=seed)
X_train_list = []
X_test_list = []
for sensor_data in X:
    X_train_list.append(sensor_data[train_indices])
    X_test_list.append(sensor_data[test_indices])


# Model design: one input branch per sensor group
sensors = list(featuresdata['sensor'].unique())
he_dense = dict(activation='relu', kernel_initializer='he_normal')

# One Input per sensor, sized to that sensor's number of columns
inputs = [tf.keras.layers.Input(shape=(df_dict[s].shape[1],)) for s in sensors]
# First hidden layer (32 units) on each sensor branch
x1 = [tf.keras.layers.Dense(32, **he_dense)(inp) for inp in inputs]
# Second hidden layer (3 units) stacked on each first layer
x2 = [tf.keras.layers.Dense(3, **he_dense)(h) for h in x1]

# Concatenate the outputs of both levels of every branch
concatenated = tf.keras.layers.Concatenate()(x1 + x2)

x3_1 = tf.keras.layers.Dense(32, activation='relu', kernel_initializer='he_normal')(concatenated)
x3_2 = tf.keras.layers.Dense(6, activation='softmax')(x3_1)

model = tf.keras.Model(inputs=inputs, outputs=[x3_2])


# Define the callbacks
es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, mode='min', verbose=1, restore_best_weights=True)
rd = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.8, patience=5, mode='min', verbose=1, min_lr=0.000001)
cl = [es, rd]

# Compile the model
optimizers = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizers, loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
hist = model.fit(X_train_list, Y_train, validation_split=0.2, epochs=1000, verbose=1, callbacks=cl)


"""
****Best Model****
**test Result**

1. confusion_matrix
[[292   0   0   0   0   0]
 [  0 242  12   0   0   0]
 [  0  10 277   0   0   0]
 [  0   0   0 227   0   1]
 [  0   0   0   0 194   1]
 [  0   0   0   1   0 214]]

2. classification_report
    precision    recall  f1-score   support

0       1.00      1.00      1.00       292
1       0.96      0.95      0.96       254
2       0.96      0.97      0.96       287
3       1.00      1.00      1.00       228
4       1.00      0.99      1.00       195
5       0.99      1.00      0.99       215

    accuracy                           0.98      1471
   macro avg       0.98      0.98      0.98      1471
weighted avg       0.98      0.98      0.98      1471

3. accuracy_score
0.9830047586675731
"""

### (6) Model 6

In [None]:
# Model design: adds BatchNormalization
# Input width is nfeatures (x_train.shape[1]), matching the data passed to fit() below
model_test = Sequential([Input(shape = (nfeatures, )),
                     Dense(128, activation='relu'),
                     BatchNormalization(),
                     Dropout(0.4),
                     Dense(128, activation='relu'),
                     BatchNormalization(),
                     Dropout(0.4),
                     Dense(64, activation='relu'),
                     BatchNormalization(),
                     Dropout(0.3),
                     Dense(64, activation='relu'),
                     BatchNormalization(),
                     Dropout(0.3),
                     Dense(32, activation='relu'),
                     BatchNormalization(),
                     Dropout(0.2),
                     Dense(6, activation='softmax')])
model_test.summary()

# Define the callbacks. ReduceLROnPlateau lowers the learning rate when val_loss stops improving
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=0.00001)
early_stopping = EarlyStopping(monitor='val_loss', patience=10, min_delta=0.0001, restore_best_weights=True)
# Compile the model
model_test.compile(optimizer=Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy')
# Train the model
history_test = model_test.fit(x_train, y_train, epochs=100, validation_split=0.2, callbacks=[early_stopping, reduce_lr]).history

# Test Data Predict Result
# =======================================================
# 1. confusion_matrix
# [[222   0   0   0   0   0]
# [  0 188  10   0   0   0]
# [  0  17 218   0   0   0]
# [  0   0   0 192   0   0]
# [  0   0   0   2 154   0]
# [  0   0   0   2   0 172]]
# 2. classification_report
#           precision    recall  f1-score   support

#           0       1.00      1.00      1.00       222
#           1       0.92      0.95      0.93       198
#           2       0.96      0.93      0.94       235
#           3       0.98      1.00      0.99       192
#           4       1.00      0.99      0.99       156
#           5       1.00      0.99      0.99       174
#     accuracy                           0.97      1177
#    macro avg       0.98      0.98      0.98      1177
# weighted avg       0.97      0.97      0.97      1177

# 3. accuracy_score
# 0.9694915254237289
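
To get a feel for the `factor` and `min_lr` settings passed to `ReduceLROnPlateau`, the sketch below counts how many reductions it takes for the learning rate to hit its floor. Each reduction multiplies the rate by `factor` and clips it at `min_lr`. The helper name `reductions_until_floor` is hypothetical, and the sketch ignores when the reductions are triggered.

```python
def reductions_until_floor(initial_lr, factor, min_lr):
    """Count how many multiplicative reductions it takes to reach min_lr."""
    lr = initial_lr
    steps = 0
    while lr > min_lr:
        lr = max(lr * factor, min_lr)  # the rate is never lowered below the floor
        steps += 1
    return steps


steps_model5 = reductions_until_floor(1e-3, 0.8, 1e-6)  # settings used for model 5
steps_model6 = reductions_until_floor(1e-4, 0.5, 1e-5)  # settings used for model 6
print(steps_model5, steps_model6)
```

Since every reduction also requires `patience` consecutive epochs without improvement, a gentle factor such as 0.8 lowers the rate much more gradually than 0.5.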

## 4. Performance Comparison

* Detailed requirements
    - Preprocess the test data
    - Measure each model's performance on the test data
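
A common pattern for test-set preprocessing is to fit the scaler and encoder on the training data only and then call `transform` on the test data, so both sets share the same scale and class order. The sketch below shows this on hypothetical toy arrays; the variable names are illustrative and not taken from the notebook.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Toy stand-ins (assumption) for the training and test features and labels.
x_train_demo = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
x_test_demo = np.array([[2.5, 40.0]])
y_train_demo = np.array(['SITTING', 'LAYING', 'WALKING'])
y_test_demo = np.array(['WALKING'])

demo_scaler = MinMaxScaler().fit(x_train_demo)     # learn min/max from training data only
demo_encoder = LabelEncoder().fit(y_train_demo)    # learn the class order from training labels

x_test_scaled = demo_scaler.transform(x_test_demo)  # reuse the training min/max
y_test_int = demo_encoder.transform(y_test_demo)    # reuse the training class order

print(x_test_scaled, y_test_int)
```

Test values can fall outside [0, 1] under this scheme, as the second feature does here; that is expected and simply reflects a value beyond the training range.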

In [None]:
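# Preprocess the test data

# Split into x and y
x = testdata.drop('Activity', axis=1)
y = testdata['Activity']

# Preprocess x and y
# Note: both transformers are refit on the test set here. Reusing the encoder and
# scaler fitted on the training data is generally preferable, so that the test
# features are scaled exactly as the model saw them during training.
y = LabelEncoder().fit_transform(y)
scaler = MinMaxScaler()
x = scaler.fit_transform(x)

# Predict with the model and measure performance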
test_pred = model18.predict(x)
test_pred=test_pred.argmax(axis=1)
print(confusion_matrix(y, test_pred))
print(classification_report(y, test_pred))
print(accuracy_score(y, test_pred))


# Average test accuracy over 5 runs of each model

# Base Model: 0.9245
# model 1: 0.9619306594153637
# model 2: 0.9748470428280082
# model 3: 0.9601694915254237
# model 4: 0.984065613
# model 5: 0.9832503944653477
# model 6: 0.9661016949152542
