# 5. Processing time-series data with RNN-family networks
## 5.1 Loading and preprocessing the data

In [1]:
import pandas as pd
import numpy as np
import pickle
import os
from sklearn.preprocessing import MinMaxScaler

train_x, train_y = np.array([]), np.array([])
train_x_seq, train_y_seq = np.array([[]]), np.array([])

want_para = ['WAFWTK', 'FCWP']
scaler = MinMaxScaler()

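time_seq = 3  # window length: number of consecutive time steps per sample

# Only the first two entries of ./DB are read here (order depends on os.listdir)
for file in os.listdir('./DB')[0:2]: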
    if '.csv' in file:
        csv_db = pd.read_csv(f'./DB/{file}', index_col=0)
        # 1. Convert the CSV file to a NumPy array
        get_xdb = csv_db[want_para].to_numpy()

        # 2. Labeling
        get_ydb = csv_db.loc[:, 'Normal_0'].to_numpy()
        accident_nub = {
            '12': 1, # LOCA
            '13': 2, # SGTR
            '15': 1, # PZR PORV [LOCA]
            '17': 1, # Feedwater line leak [LOCA]
            '18': 3, # Steam Line Rupture MSLB
            '52': 3, # Steam Line Rupture MSLB (non-isolable)
        }
        get_mal_nub = file.split(',')[0][1:] # '(12, 000000, 10)' -> 12
        get_y = np.where(get_ydb != 0, accident_nub[get_mal_nub], get_ydb)

        # 3. Accumulate the data
        train_x = get_xdb if train_x.shape[0] == 0 else np.concatenate((train_x, get_xdb), axis=0)
        train_y = np.append(train_y, get_y, axis=0)

        if time_seq != 1:
            for i in range(len(get_xdb) - time_seq - 1):
                # Input window: time_seq consecutive rows, shape (1, time_seq, n_features)
                x__ = np.array([get_xdb[i:i + time_seq]])
                train_x_seq = x__ if train_x_seq.shape[1] == 0 else np.concatenate((train_x_seq, x__), axis=0)

                # Label: taken at index i + time_seq + 1, two steps after the last row of the window
                y__ = np.array([get_y[i + time_seq + 1]])
                train_y_seq = y__ if train_y_seq.shape[0] == 0 else np.concatenate((train_y_seq, y__), axis=0)

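        # 4. Update the min-max scaler with this file's rows only
        #    (train_x is cumulative, so fitting on it would re-count earlier files)
        scaler.partial_fit(get_xdb)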

        if time_seq != 1:
            print(f'Read {file} \t train_x shape : {np.shape(train_x_seq)} train_y shape : {np.shape(train_y_seq)}')
        else:
            print(f'Read {file} \t train_x shape : {np.shape(train_x)} train_y shape : {np.shape(train_y)}')

# 5. Min-max scale the whole dataset, one window at a time
train_x_seq = np.array([scaler.transform(_) for _ in train_x_seq])

# 6. Save
save_data_info = {
    'scaler': scaler,
    'want_para': want_para,
    'time_seq': time_seq,
    'train_x': train_x_seq,
    'train_y': train_y_seq,
}

with open('db_info.pkl', 'wb') as f:
    pickle.dump(save_data_info, f)

Read (52, 120700, 75).csv 	 train_x shape : (117, 3, 2) train_y shape : (117,)
Read (12, 100010, 50).csv 	 train_x shape : (234, 3, 2) train_y shape : (234,)
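
The per-row `np.concatenate` calls in the loop above copy the whole accumulated array on every step. As a minimal sketch of an alternative, assuming NumPy 1.20 or later, the same windows and labels can be built in one shot with `sliding_window_view`. The helper name `make_windows` is hypothetical and not part of the notebook; it reproduces the loop's indexing (window `x[i:i + time_seq]`, label `y[i + time_seq + 1]`).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(x, y, time_seq):
    """Build (n_windows, time_seq, n_features) inputs and matching labels.

    Mirrors the notebook loop: window i covers rows i .. i + time_seq - 1
    and its label is y[i + time_seq + 1].
    """
    n_windows = len(x) - time_seq - 1
    if n_windows <= 0:
        return np.empty((0, time_seq, x.shape[1])), np.empty((0,))
    # sliding_window_view puts the window axis last: (n - T + 1, n_features, T)
    windows = sliding_window_view(x, window_shape=time_seq, axis=0)
    # Reorder to (n_windows, T, n_features) and keep only windows that have a label
    windows = windows.transpose(0, 2, 1)[:n_windows]
    labels = y[time_seq + 1:]
    return windows.copy(), labels.copy()


# Toy data standing in for one CSV file: 10 time steps, 2 features
x = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)
xs, ys = make_windows(x, y, time_seq=3)
print(xs.shape, ys.shape)
```

With 121 rows and `time_seq = 3` this yields 117 windows, which matches the shape printed for the first file above.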


# 4. Loading the training data and training the network
## 4.1 Loading the training data

In [2]:
with open('db_info.pkl', 'rb') as f:
    save_data_info = pickle.load(f)
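
The loaded dictionary also holds the fitted `scaler`, so raw windows collected later can be scaled the same way as the training data. The sketch below shows, in plain NumPy, what a `MinMaxScaler` with the default `feature_range=(0, 1)` computes per feature. The function name `min_max_scale_windows` and the sample values are hypothetical; in the notebook the minima and maxima would come from `scaler.data_min_` and `scaler.data_max_`.

```python
import numpy as np


def min_max_scale_windows(windows, data_min, data_max):
    """Scale (n_windows, time_seq, n_features) windows feature-wise to [0, 1]."""
    span = data_max - data_min
    # Guard against constant features, where max == min
    span = np.where(span == 0, 1.0, span)
    # Broadcasting applies the per-feature min and span along the last axis
    return (windows - data_min) / span


# Hypothetical raw window with shape (1, 3, 2): one window, 3 steps, 2 features
raw = np.array([[[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]])
data_min = np.array([0.0, 10.0])   # stand-in for scaler.data_min_
data_max = np.array([10.0, 30.0])  # stand-in for scaler.data_max_
scaled = min_max_scale_windows(raw, data_min, data_max)
print(scaled)
```

Because the scaling is applied along the last axis, no per-window loop is needed.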

## 4.2 Building and training the network

In [4]:
import tensorflow.keras as k

model = k.Sequential([
    k.layers.RNN(k.layers.SimpleRNNCell(32), input_shape=(save_data_info['time_seq'], len(save_data_info['want_para']))),
    k.layers.Flatten(),
    k.layers.Dense(128, activation='relu'),
    k.layers.Dense(4, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
print(model.summary())
model.fit(save_data_info['train_x'], save_data_info['train_y'], epochs=5)


Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
rnn (RNN)                    (None, 32)                1120      
_________________________________________________________________
flatten (Flatten)            (None, 32)                0         
_________________________________________________________________
dense (Dense)                (None, 128)               4224      
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 516       
=================================================================
Total params: 5,860
Trainable params: 5,860
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x21f85604eb0>
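
The parameter counts in the summary can be checked by hand. A simple RNN cell with `units` hidden units and `input_dim` input features has an input kernel, a recurrent kernel, and a bias; a dense layer has a kernel and a bias. The helper names below are hypothetical, and the input dimension of 2 assumes the two features in `want_para`.

```python
def simple_rnn_params(input_dim, units):
    """Input kernel + recurrent kernel + bias of a simple RNN cell."""
    return input_dim * units + units * units + units


def dense_params(input_dim, units):
    """Kernel + bias of a fully connected layer."""
    return input_dim * units + units


rnn = simple_rnn_params(input_dim=2, units=32)
dense = dense_params(input_dim=32, units=128)
dense_1 = dense_params(input_dim=128, units=4)
total = rnn + dense + dense_1
print(rnn, dense, dense_1, total)  # 1120 4224 516 5860
```

The `Flatten` layer contributes no parameters, since the RNN layer already returns a flat `(None, 32)` output.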

## 4.3 Saving the network

In [None]:
model.save_weights('model.h5')

## 4.4 Loading the network

In [None]:
model.load_weights('model.h5')
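
Once the weights are loaded, `model.predict` returns one row of four softmax probabilities per input window. The sketch below turns such rows into readable labels. The array `probs` is a made-up stand-in for real model output, `decode_predictions` is a hypothetical helper, and the name `'Normal'` for class 0 is an assumption based on the `Normal_0` column; classes 1 to 3 follow the comments in the `accident_nub` mapping.

```python
import numpy as np

# Class ids as assigned during labeling (class 0 name is an assumption)
CLASS_NAMES = {0: 'Normal', 1: 'LOCA', 2: 'SGTR', 3: 'MSLB'}


def decode_predictions(probs):
    """Map softmax rows of shape (n_samples, 4) to class names."""
    class_ids = np.argmax(probs, axis=1)
    return [CLASS_NAMES[int(i)] for i in class_ids]


# Stand-in for model.predict(...) output on two windows
probs = np.array([
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.20, 0.60, 0.10],
])
names = decode_predictions(probs)
print(names)  # ['Normal', 'SGTR']
```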