# AI Training Data Utilization Competition for Machinery and Facilities - Model Source Code by Team 시계열정맨

# **Data preprocessing**

## normalize train data size
### - split train data into windows of size (60,1)

## data augmentation
### - shift by a varying amount (<60) and split into (60,1)

## final train data size
### - kimm : 180000
### - vib  : 140000
### - cur  : 130000*3
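
The preprocessing code itself is not part of this notebook. As a rough sketch of what the two steps above describe, assuming a 1-D NumPy signal and a non-overlapping split (the helper names `split_windows` and `augment_by_shift` and the shift values are hypothetical):

```python
import numpy as np

WINDOW = 60  # window length taken from the notes above


def split_windows(signal, window=WINDOW):
    """Cut a 1-D signal into non-overlapping windows of shape (window, 1)."""
    n = len(signal) // window              # the incomplete tail is dropped
    return signal[:n * window].reshape(n, window, 1)


def augment_by_shift(signal, shifts, window=WINDOW):
    """Re-split the signal after dropping the first s samples, for each shift s < window."""
    parts = [split_windows(signal[s:], window) for s in shifts if 0 < s < window]
    return np.concatenate(parts, axis=0)


signal = np.arange(300, dtype=float)       # toy signal with 300 samples
base = split_windows(signal)               # 5 windows
extra = augment_by_shift(signal, shifts=[15, 30, 45])  # 4 windows per shift
```

Each shift yields windows whose boundaries differ from the base split, which is one plausible way the dataset sizes listed above could have been reached.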

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
cd /content/drive/MyDrive/기계데이터

/content/drive/MyDrive/기계데이터


## **Model training**
#### kimm and vib share one model; cur uses a multi-input model with the same layer stack per input


### fcn(kimm&vib) model

In [None]:
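import os
import sys
import time

import numpy as np
import pandas as pd
import sklearn
import sklearn.preprocessing  # used below as sklearn.preprocessing.OneHotEncoder
import sklearn.utils          # used below as sklearn.utils.shuffle
import tensorflow as tf
import tensorflow.keras as keras

# same Keras as `keras` above; layers.concatenate is used in the current model
from tensorflow.keras import layers

from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split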

In [None]:
def create_directory(directory_path):
    if os.path.exists(directory_path):
        return None
    else:
        try:
            os.makedirs(directory_path)
        except:
            return None
        return directory_path

In [None]:
def read_dataset(root_dir, dataset_name):
  
    datasets_dict = {}

    root_dir_dataset = root_dir
    df = pd.read_csv(root_dir_dataset + '/' + dataset_name + '.csv')
    df = df.drop(['Unnamed: 0'], axis = 'columns')
    df = sklearn.utils.shuffle(df)

    y = df.values[:, -1]

    x = df.iloc[:,:-1]

    x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.1, shuffle =True, random_state = 1004) 

    x_train.columns = range(x_train.shape[1])
    x_test.columns = range(x_test.shape[1])

    x_train = x_train.values
    x_test = x_test.values

    # normalization
    std_ = x_train.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_train = (x_train - x_train.mean(axis=1, keepdims=True)) / std_

    std_ = x_test.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_test = (x_test - x_test.mean(axis=1, keepdims=True)) / std_

    datasets_dict[dataset_name] = (x_train.copy(), y_train.copy(), x_test.copy(),
                                    y_test.copy())
    return datasets_dict

In [None]:
def save_logs(output_directory, hist, y_pred, y_true, duration, lr=True, y_true_val=None, y_pred_val=None):

    hist_df = pd.DataFrame(hist.history)
    hist_df.to_csv(output_directory + 'history.csv', index=False)

    df_metrics = calculate_metrics(y_true, y_pred, duration, y_true_val, y_pred_val)
    df_metrics.to_csv(output_directory + 'df_metrics.csv', index=False)

    index_best_model = hist_df['loss'].idxmin()
    row_best_model = hist_df.loc[index_best_model]

    df_best_model = pd.DataFrame(data=np.zeros((1, 6), dtype=float), index=[0],
                                 columns=['best_model_train_loss', 'best_model_val_loss', 'best_model_train_acc',
                                          'best_model_val_acc', 'best_model_learning_rate', 'best_model_nb_epoch'])

    df_best_model['best_model_train_loss'] = row_best_model['loss']
    df_best_model['best_model_val_loss'] = row_best_model['val_loss']
    df_best_model['best_model_train_acc'] = row_best_model['accuracy']
    df_best_model['best_model_val_acc'] = row_best_model['val_accuracy']
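    if lr:
        # assumption: newer Keras releases log the rate as 'learning_rate' instead of 'lr'
        lr_key = 'lr' if 'lr' in row_best_model else 'learning_rate'
        df_best_model['best_model_learning_rate'] = row_best_model[lr_key]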
    df_best_model['best_model_nb_epoch'] = index_best_model

    df_best_model.to_csv(output_directory + 'df_best_model.csv', index=False)

    plot_epochs_metric(hist, output_directory + 'epochs_loss.png')    # plot losses

    return df_metrics

In [None]:
def calculate_metrics(y_true, y_pred, duration, y_true_val=None, y_pred_val=None):

    res = pd.DataFrame(data=np.zeros((1, 4), dtype=float), index=[0],
                       columns=['precision', 'accuracy', 'recall', 'duration'])
    res['precision'] = precision_score(y_true, y_pred, average='macro')
    res['accuracy'] = accuracy_score(y_true, y_pred)

    if y_true_val is not None:
        res['accuracy_val'] = accuracy_score(y_true_val, y_pred_val)

    res['recall'] = recall_score(y_true, y_pred, average='macro')
    res['duration'] = duration
    return res

In [None]:
class Classifier_FCN:

	def __init__(self, output_directory, input_shape, nb_classes, verbose=False,build=True):
		self.output_directory = output_directory
		if build == True:
			self.model = self.build_model(input_shape, nb_classes)
			if(verbose==True):
				self.model.summary()
			self.verbose = verbose
			self.model.save_weights(self.output_directory+'model_init.h5')
		return

	def build_model(self, input_shape, nb_classes):
		input_layer = keras.layers.Input(input_shape)

		conv1 = keras.layers.Conv1D(filters=128, kernel_size=8, padding='same')(input_layer)
		conv1 = keras.layers.BatchNormalization()(conv1)
		conv1 = keras.layers.Activation(activation='relu')(conv1)

		conv2 = keras.layers.Conv1D(filters=256, kernel_size=5, padding='same')(conv1)
		conv2 = keras.layers.BatchNormalization()(conv2)
		conv2 = keras.layers.Activation('relu')(conv2)

		conv3 = keras.layers.Conv1D(128, kernel_size=3,padding='same')(conv2)
		conv3 = keras.layers.BatchNormalization()(conv3)
		conv3 = keras.layers.Activation('relu')(conv3)

		gap_layer = keras.layers.GlobalAveragePooling1D()(conv3)

		output_layer = keras.layers.Dense(nb_classes, activation='softmax')(gap_layer)

		model = keras.models.Model(inputs=input_layer, outputs=output_layer)

		model.compile(loss='categorical_crossentropy', optimizer = keras.optimizers.Adam(), 
			metrics=['accuracy'])

		reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.5, patience=50, 
			min_lr=0.0001)

		file_path = self.output_directory+'best_model.h5'

		model_checkpoint = keras.callbacks.ModelCheckpoint(filepath=file_path, monitor='loss', 
			save_best_only=True)

		self.callbacks = [reduce_lr,model_checkpoint]

		return model 

	def model_fit(self, x_train, y_train, x_val, y_val,y_true):
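		# the original tested the function object itself, which is always truthy
		if not tf.config.list_physical_devices('GPU'):
			print('error: no GPU available')
			exit()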

		batch_size = 8
		nb_epochs = 100

		mini_batch_size = int(min(x_train.shape[0]/10, batch_size))

		start_time = time.time() 

		hist = self.model.fit(x_train, y_train, batch_size=mini_batch_size, epochs=nb_epochs,
			verbose=self.verbose, validation_data=(x_val,y_val), callbacks=self.callbacks)
		
		duration = time.time() - start_time

		self.model.save(self.output_directory+'last_model.h5')

		model = keras.models.load_model(self.output_directory+'best_model.h5')

		y_pred = model.predict(x_val)

		y_pred = np.argmax(y_pred , axis=1)

		save_logs(self.output_directory, hist, y_pred, y_true, duration)

		keras.backend.clear_session()
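
The effective batch size in `model_fit` is the smaller of `batch_size` and one tenth of the training set. A worked example of that arithmetic, assuming the kimm size quoted in the preprocessing notes and a 10% hold-out (the resulting step count is only illustrative):

```python
import math

n_rows = 180_000                 # kimm size quoted in the preprocessing notes
n_test = n_rows // 10            # assumption: 10% held out by train_test_split
n_train = n_rows - n_test
batch_size = 8

mini_batch_size = int(min(n_train / 10, batch_size))
steps_per_epoch = math.ceil(n_train / mini_batch_size)
```

For datasets of this size the `min` never triggers, so the batch size is simply 8 and one epoch takes 20250 steps under these assumptions.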

In [None]:
def fit_classifier():
    x_train = datasets_dict[dataset_name][0]
    y_train = datasets_dict[dataset_name][1]
    x_test = datasets_dict[dataset_name][2]
    y_test = datasets_dict[dataset_name][3]

    nb_classes = len(np.unique(np.concatenate((y_train, y_test), axis=0)))

    # label encoding
    enc = sklearn.preprocessing.OneHotEncoder(categories='auto')
    enc.fit(np.concatenate((y_train, y_test), axis=0).reshape(-1, 1))
    y_train = enc.transform(y_train.reshape(-1, 1)).toarray()
    y_test = enc.transform(y_test.reshape(-1, 1)).toarray()

    y_true = np.argmax(y_test, axis=1)

    if len(x_train.shape) == 2:  # if univariate
        # add a dimension to make it multivariate with one dimension 
        x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], 1))
        x_test = x_test.reshape((x_test.shape[0], x_test.shape[1], 1))

    input_shape = x_train.shape[1:]

    classifier = Classifier_FCN(output_directory, input_shape, nb_classes, verbose = True)

    classifier.model_fit(x_train, y_train, x_test, y_test, y_true)
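
The label encoding step in `fit_classifier` can be pictured with plain NumPy. This sketch assumes integer class labels and mirrors what `OneHotEncoder` followed by `np.argmax` produces; it is an illustration, not a replacement for the scikit-learn call:

```python
import numpy as np

y = np.array([2, 0, 1, 2, 4])                      # toy labels; class 3 is absent
classes, idx = np.unique(y, return_inverse=True)   # sorted classes, position per sample
one_hot = np.eye(len(classes))[idx]                # shape (n_samples, nb_classes)
y_true = np.argmax(one_hot, axis=1)                # back to class positions

nb_classes = len(classes)
```

Note that `y_true` holds positions in the sorted class list rather than the raw labels: in the toy data, label 4 becomes position 3 because label 3 never occurs. The two coincide only when every class 0..K-1 is present.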

In [None]:
# main - kimm

root_dir = 'preprocessing_data' # data folder name 
    
dataset_name = 'kimm_df' # data name

output_directory = 'models/kimm/'

test_dir_df_metrics = output_directory + 'df_metrics.csv'

print('Method: ', dataset_name)

create_directory(output_directory)
datasets_dict = read_dataset(root_dir, dataset_name)

fit_classifier()

print('DONE')

# the creation of this directory marks the run as finished
create_directory(output_directory + '/DONE')

In [None]:
# main - vibration

root_dir = 'preprocessing_data' # data folder name 
    
dataset_name = 'vibration_new' # data name

output_directory = 'models/vibration/'

test_dir_df_metrics = output_directory + 'df_metrics.csv'

print('Method: ', dataset_name)

create_directory(output_directory)
datasets_dict = read_dataset(root_dir, dataset_name)

fit_classifier()

print('DONE')

# the creation of this directory marks the run as finished
create_directory(output_directory + '/DONE')

### fcn(curr) model

In [None]:
def read_datasetR(root_dir, dataset_Rname):
    datasets_dictR = {}

    root_dir_dataset = root_dir
    df = pd.read_csv(root_dir_dataset + dataset_Rname + '.csv')
    # fixed seed so that the R, T, and S files end up in the same row order
    df = sklearn.utils.shuffle(df, random_state=1004)

    # y_test = df_test.values[:, -1]
    y = df.values[:, -1]
    x = df.iloc[:,:-1]
    x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.1, shuffle =True, random_state = 1004) 

    x_train.columns = range(x_train.shape[1])
    x_test.columns = range(x_test.shape[1])

    x_train = x_train.values
    x_test = x_test.values

    # znorm
    std_ = x_train.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_train = (x_train - x_train.mean(axis=1, keepdims=True)) / std_

    std_ = x_test.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_test = (x_test - x_test.mean(axis=1, keepdims=True)) / std_

    datasets_dictR[dataset_Rname] = (x_train.copy(), y_train.copy(), x_test.copy(),
                                    y_test.copy())

    return datasets_dictR


def read_datasetT(root_dir, dataset_Tname):
    datasets_dictT = {}

    root_dir_dataset = root_dir
    df = pd.read_csv(root_dir_dataset + dataset_Tname + '.csv')
    # fixed seed so that the R, T, and S files end up in the same row order
    df = sklearn.utils.shuffle(df, random_state=1004)

    # y_test = df_test.values[:, -1]
    y = df.values[:, -1]
    x = df.iloc[:,:-1]
    x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.1, shuffle =True, random_state = 1004) 

    x_train.columns = range(x_train.shape[1])
    x_test.columns = range(x_test.shape[1])

    x_train = x_train.values
    x_test = x_test.values

    # znorm
    std_ = x_train.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_train = (x_train - x_train.mean(axis=1, keepdims=True)) / std_

    std_ = x_test.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_test = (x_test - x_test.mean(axis=1, keepdims=True)) / std_

    datasets_dictT[dataset_Tname] = (x_train.copy(), y_train.copy(), x_test.copy(),
                                    y_test.copy())

    return datasets_dictT


def read_datasetS(root_dir, dataset_Sname):
    datasets_dictS = {}

    root_dir_dataset = root_dir
    df = pd.read_csv(root_dir_dataset + dataset_Sname + '.csv')
    # fixed seed so that the R, T, and S files end up in the same row order
    df = sklearn.utils.shuffle(df, random_state=1004)

    y = df.values[:, -1]
    x = df.iloc[:,:-1]
    x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.1, shuffle =True, random_state = 1004) 

    x_train.columns = range(x_train.shape[1])
    x_test.columns = range(x_test.shape[1])

    x_train = x_train.values
    x_test = x_test.values

    # znorm
    std_ = x_train.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_train = (x_train - x_train.mean(axis=1, keepdims=True)) / std_

    std_ = x_test.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    x_test = (x_test - x_test.mean(axis=1, keepdims=True)) / std_

    datasets_dictS[dataset_Sname] = (x_train.copy(), y_train.copy(), x_test.copy(),
                                    y_test.copy())

    return datasets_dictS
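
Because the current model receives R, T, and S as parallel inputs with a single label vector, row i of each array has to describe the same sample after shuffling. A small sketch of one way to guarantee that, assuming the three arrays start with the same number of rows in the same order (the array names and the seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1004)          # illustrative seed
n = 6
r = np.arange(n, dtype=float).reshape(n, 1)   # toy R-phase rows, value == sample id
t = r + 100                                # toy T-phase rows
s = r + 200                                # toy S-phase rows
y = np.arange(n) % 2                       # toy labels

perm = rng.permutation(n)                  # one permutation shared by all arrays
r, t, s, y = r[perm], t[perm], s[perm], y[perm]

# every row still refers to the same sample id across the three phases
aligned = bool(np.all(t - r == 100) and np.all(s - r == 200))
```

Shuffling each file with its own unseeded call would not preserve this property, since each call draws a different permutation.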

In [None]:
class Classifier_FCN_CUR:
  
  def __init__(self, output_directory, input_shape1, input_shape2,  input_shape3, nb_classes, verbose=True,build=True):
    self.output_directory = output_directory
    if build == True:
      self.model = self.build_cur_model(input_shape1,input_shape2, input_shape3, nb_classes)
      if(verbose==True):
        self.model.summary()
      self.verbose = verbose
      self.model.save_weights(self.output_directory+'model_init.h5')
    return
    
  def build_cur_model(self, input_shape1, input_shape2, input_shape3, nb_classes):
    # R
    input_layerR = keras.layers.Input(input_shape1)
    conv1R = keras.layers.Conv1D(filters=128, kernel_size=8, padding='same')(input_layerR)
    conv1R = keras.layers.BatchNormalization()(conv1R)
    conv1R = keras.layers.Activation(activation='relu')(conv1R)
    conv2R = keras.layers.Conv1D(filters=256, kernel_size=5, padding='same')(conv1R)
    conv2R = keras.layers.BatchNormalization()(conv2R)
    conv2R = keras.layers.Activation('relu')(conv2R)
    conv3R = keras.layers.Conv1D(128, kernel_size=3,padding='same')(conv2R)
    conv3R = keras.layers.BatchNormalization()(conv3R)
    conv3R = keras.layers.Activation('relu')(conv3R)
    gap_layerR = keras.layers.GlobalAveragePooling1D()(conv3R)
    
    model1 = gap_layerR

    #T
    input_layerT = keras.layers.Input(input_shape2)
    conv1T = keras.layers.Conv1D(filters=128, kernel_size=8, padding='same')(input_layerT)
    conv1T = keras.layers.BatchNormalization()(conv1T)
    conv1T = keras.layers.Activation(activation='relu')(conv1T)
    conv2T = keras.layers.Conv1D(filters=256, kernel_size=5, padding='same')(conv1T)
    conv2T = keras.layers.BatchNormalization()(conv2T)
    conv2T = keras.layers.Activation('relu')(conv2T)
    conv3T = keras.layers.Conv1D(128, kernel_size=3,padding='same')(conv2T)
    conv3T = keras.layers.BatchNormalization()(conv3T)
    conv3T = keras.layers.Activation('relu')(conv3T)
    gap_layerT = keras.layers.GlobalAveragePooling1D()(conv3T)
    model2 = gap_layerT
    
    # S
    input_layerS = keras.layers.Input(input_shape3)
    conv1S = keras.layers.Conv1D(filters=128, kernel_size=8, padding='same')(input_layerS)
    conv1S = keras.layers.BatchNormalization()(conv1S)
    conv1S = keras.layers.Activation(activation='relu')(conv1S)
    conv2S = keras.layers.Conv1D(filters=256, kernel_size=5, padding='same')(conv1S)
    conv2S = keras.layers.BatchNormalization()(conv2S)
    conv2S = keras.layers.Activation('relu')(conv2S)
    conv3S = keras.layers.Conv1D(128, kernel_size=3,padding='same')(conv2S)
    conv3S = keras.layers.BatchNormalization()(conv3S)
    conv3S = keras.layers.Activation('relu')(conv3S)
    gap_layerS = keras.layers.GlobalAveragePooling1D()(conv3S)
    model3 = gap_layerS

    model = layers.concatenate([model1, model2, model3])
    output_layer = keras.layers.Dense(nb_classes, activation='softmax')(model)
    model = keras.models.Model(inputs=[input_layerR, input_layerT, input_layerS], outputs=output_layer)  # end of model definition

    model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(),
                  metrics=['accuracy'])
      
    reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.5, patience=50, min_lr=0.00001)
    file_path = self.output_directory+'best_model.h5'
    
    model_checkpoint = keras.callbacks.ModelCheckpoint(filepath=file_path, monitor='loss', save_best_only=True)
    self.callbacks = [reduce_lr,model_checkpoint]
    
    return model

  def model_fit(self, Rx_train,Tx_train,Sx_train, y_train, Rx_test, Tx_test, Sx_test, y_test, y_true):
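    # the original tested the function object itself, which is always truthy
    if not tf.config.list_physical_devices('GPU'):
      print('error: no GPU available')
      exit()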

    batch_size = 64
    nb_epochs = 100
    mini_batch_size = int(min(Rx_train.shape[0]/10, batch_size))
    start_time = time.time() 

    hist = self.model.fit([Rx_train, Tx_train, Sx_train], y_train, batch_size=mini_batch_size, epochs=nb_epochs,
                          verbose=self.verbose, validation_data=([Rx_test, Tx_test, Sx_test], y_test),
                          callbacks=self.callbacks)
      
    duration = time.time() - start_time
    
    self.model.save(self.output_directory+'last_model.h5')
    
    model = keras.models.load_model(self.output_directory+'best_model.h5')
    
    y_pred = model.predict([Rx_test,Tx_test,Sx_test])

    y_pred = np.argmax(y_pred , axis=1)
    
    save_logs(self.output_directory, hist, y_pred, y_true, duration)
    
    keras.backend.clear_session()

In [None]:
def fit_classifier_curr():

    Rx_train = datasets_dictR[dataset_Rname][0]
    y_train = datasets_dictR[dataset_Rname][1]
    Rx_test = datasets_dictR[dataset_Rname][2]
    y_test = datasets_dictR[dataset_Rname][3]

    Tx_train = datasets_dictT[dataset_Tname][0]
    Ty_train = datasets_dictT[dataset_Tname][1]
    Tx_test = datasets_dictT[dataset_Tname][2]
    Ty_test = datasets_dictT[dataset_Tname][3]

    Sx_train = datasets_dictS[dataset_Sname][0]
    Sy_train = datasets_dictS[dataset_Sname][1]
    Sx_test = datasets_dictS[dataset_Sname][2]
    Sy_test = datasets_dictS[dataset_Sname][3]


    nb_classes = len(np.unique(np.concatenate((y_train, y_test), axis=0)))
    enc = sklearn.preprocessing.OneHotEncoder(categories='auto')
    enc.fit(np.concatenate((y_train, y_test), axis=0).reshape(-1, 1))
    y_train = enc.transform(y_train.reshape(-1, 1)).toarray()
    y_test = enc.transform(y_test.reshape(-1, 1)).toarray()

    y_true = np.argmax(y_test, axis=1)

    if len(Rx_train.shape) == 2: 
        Rx_train = Rx_train.reshape((Rx_train.shape[0], Rx_train.shape[1], 1))
        Rx_test = Rx_test.reshape((Rx_test.shape[0], Rx_test.shape[1], 1))

    if len(Tx_train.shape) == 2:  
        Tx_train = Tx_train.reshape((Tx_train.shape[0], Tx_train.shape[1], 1))
        Tx_test = Tx_test.reshape((Tx_test.shape[0], Tx_test.shape[1], 1))

    if len(Sx_train.shape) == 2:  
        Sx_train = Sx_train.reshape((Sx_train.shape[0], Sx_train.shape[1], 1))
        Sx_test = Sx_test.reshape((Sx_test.shape[0], Sx_test.shape[1], 1))

    input_shape1 = Rx_train.shape[1:]
    input_shape2 = Tx_train.shape[1:]
    input_shape3 = Sx_train.shape[1:]

    print('input shapes (R, T, S): ', input_shape1, input_shape2, input_shape3)

    classifier = Classifier_FCN_CUR(output_directory, input_shape1, input_shape2,  input_shape3, nb_classes, verbose=True)
    classifier.model_fit(Rx_train,Tx_train,Sx_train, y_train, Rx_test, Tx_test, Sx_test, y_test, y_true)


In [None]:
# main 
root_dir = 'preprocessing_data/' # data folder name 

# data name
dataset_Rname = 'curR'
dataset_Tname = 'curT'
dataset_Sname = 'curS'

output_directory = 'models/current/'

test_dir_df_metrics = output_directory + 'df_metrics.csv'

print('Method: ', dataset_Rname)

create_directory(output_directory)
datasets_dictR = read_datasetR(root_dir, dataset_Rname)
datasets_dictT = read_datasetT(root_dir, dataset_Tname)
datasets_dictS = read_datasetS(root_dir, dataset_Sname)

fit_classifier_curr()

print('DONE')

# the creation of this directory marks the run as finished
create_directory(output_directory + '/DONE')

## **Test - split into kimm/vib/curr**

In [None]:
submission = pd.read_csv('dataSet_Submission_Re/Submission_Commit.csv')

In [None]:
submission = submission.drop(index= 100)

In [None]:
submission_file = submission.reset_index()

In [None]:
submission = submission_file.drop(['index'], axis = 'columns')

In [None]:
Kimm_index = []
Vib_index = []
Curr_index = []
for i in range(len(submission)):
    if submission['Category'][i] == 'Kimm':
        Kimm_index.append(i)
    elif submission['Category'][i] == 'Vibration':
        Vib_index.append(i)
    elif submission['Category'][i] == 'Current':
        Curr_index.append(i)
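
The same three index lists can also be obtained without an explicit loop. A sketch, assuming `Category` holds exactly the strings `Kimm`, `Vibration`, and `Current` as in the loop above (the toy frame and the `index_by_cat` name are made up for illustration):

```python
import pandas as pd

toy = pd.DataFrame({'Category': ['Kimm', 'Vibration', 'Current', 'Kimm']})

# positions of each category; relies on a 0..n-1 index, as after reset_index
index_by_cat = {
    cat: toy.index[toy['Category'] == cat].tolist()
    for cat in ['Kimm', 'Vibration', 'Current']
}
```

The lists are in ascending row order, which matches the order produced by the loop and therefore the order of the per-category predictions later on.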

In [None]:
folder_name = 'dataSet_Submission_Re'
file_list = os.listdir(folder_name)
file_list.sort()
print(file_list)
file_list = file_list[:-1]
print(len(file_list))

['001.csv', '002.csv', '003.csv', '004.csv', '005.csv', '006.csv', '007.csv', '008.csv', '009.csv', '010.csv', '011.csv', '012.csv', '013.csv', '014.csv', '015.csv', '016.csv', '017.csv', '018.csv', '019.csv', '020.csv', '021.csv', '022.csv', '023.csv', '024.csv', '025.csv', '026.csv', '027.csv', '028.csv', '029.csv', '030.csv', '031.csv', '032.csv', '033.csv', '034.csv', '035.csv', '036.csv', '037.csv', '038.csv', '039.csv', '040.csv', '041.csv', '042.csv', '043.csv', '044.csv', '045.csv', '046.csv', '047.csv', '048.csv', '049.csv', '050.csv', '051.csv', '052.csv', '053.csv', '054.csv', '055.csv', '056.csv', '057.csv', '058.csv', '059.csv', '060.csv', '061.csv', '062.csv', '063.csv', '064.csv', '065.csv', '066.csv', '067.csv', '068.csv', '069.csv', '070.csv', '071.csv', '072.csv', '073.csv', '074.csv', '075.csv', '076.csv', '077.csv', '078.csv', '079.csv', '080.csv', '081.csv', '082.csv', '083.csv', '084.csv', '085.csv', '086.csv', '087.csv', '088.csv', '089.csv', '090.csv', '091.csv'

### kimm

In [None]:
for i, index in enumerate(Kimm_index):
    print(file_list[index])
    one_signal = pd.read_csv(os.path.join(folder_name, file_list[index]))
    one_signal = pd.DataFrame(one_signal['value'].to_numpy().reshape(1,-1))
    if i == 0:
        kimm_df = pd.DataFrame(one_signal)
    else:
        kimm_df = pd.concat([kimm_df, one_signal])


001.csv
002.csv
007.csv
009.csv
011.csv
016.csv
018.csv
020.csv
023.csv
024.csv
026.csv
029.csv
041.csv
043.csv
044.csv
049.csv
053.csv
054.csv
057.csv
062.csv
069.csv
072.csv
074.csv
077.csv
079.csv
084.csv
085.csv
086.csv
099.csv
100.csv
106.csv
109.csv
120.csv
122.csv
123.csv
125.csv
128.csv
130.csv
132.csv
133.csv
134.csv
141.csv
144.csv
149.csv


In [None]:
df_kimm = kimm_df.reset_index()
df_kimm = df_kimm.drop(['index'], axis = 'columns')

In [None]:
df_kimm.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
0,0.07247,0.148405,-0.005984,0.080193,-0.012642,0.109416,0.034238,0.012545,0.079226,-0.159403,-0.046621,0.060151,-0.003577,0.083524,-0.007164,0.021305,0.044288,0.039392,0.006948,0.025837,0.016643,-0.008716,0.076095,0.049042,0.084887,0.021137,0.029714,0.002027,0.070583,0.03109,0.012484,0.026629,0.015141,0.04827,0.078368,0.081856,0.039785,0.094646,0.012825,0.054586,0.056137,0.068157,0.013187,0.018519,0.036048,0.039433,0.042249,0.007657,0.042256,-0.002996,0.058905,0.001762,0.077766,0.025617,0.052918,0.026266,0.076022,0.050075,0.04564,0.032859
1,-0.12828,0.125385,-0.02854,-0.027849,-0.010175,-0.029405,0.086704,0.008864,0.143977,-0.034374,0.114979,0.04289,0.116378,0.030064,0.04356,-0.033613,0.002899,0.028841,0.033855,0.062885,0.023564,0.053296,0.012006,0.084109,0.065108,0.074852,0.02149,-0.00485,0.038795,0.042328,0.045866,0.013715,0.048431,-0.000233,0.064802,0.041573,0.071847,0.024251,0.039011,0.01811,0.066123,0.060172,0.028283,0.046933,0.026959,0.071874,0.031991,0.056067,0.028114,0.04973,0.024438,0.040598,0.017703,0.051307,0.023619,0.041275,0.036324,0.054426,0.046997,0.052025
2,-0.000147,-0.000178,-0.000166,-0.000146,-0.000137,-0.000167,-0.000153,-0.00014,-0.000145,-0.000122,-9.8e-05,-0.000149,-0.000126,-0.000116,-8.5e-05,-0.000112,-0.000102,-0.000131,-9.8e-05,-0.000127,-0.000128,-0.000126,-0.000106,-0.00012,-0.000112,-0.00011,-0.000103,-0.000103,-0.000103,-7.3e-05,-0.000103,-8.5e-05,-0.000112,-0.000108,-7.5e-05,-8.9e-05,-0.000104,-8.1e-05,-0.000101,-0.00013,-0.000125,-0.000131,-0.000142,-0.000138,-0.00016,-0.000144,-0.000161,-0.000214,-0.000202,-0.000171,-0.000216,-0.000175,-0.000178,-0.000183,-0.000186,-0.000173,-0.000208,-0.000193,-0.000254,-0.00022
3,0.003663,0.015076,0.006914,0.001995,-0.004502,0.010738,0.013627,-0.006417,0.001377,0.00147,3.9e-05,-0.007124,-0.005212,0.005553,-0.004801,-0.009535,0.008419,0.002777,-0.01057,-0.00135,0.005058,-0.001047,0.002331,0.00083,0.003319,-0.001502,-0.001088,0.016614,-0.00358,-0.004602,0.009506,0.003333,0.003651,-0.008404,0.003147,0.011109,-0.010006,0.003519,0.01286,0.001178,-0.005656,0.01375,0.000125,-0.002524,0.001447,0.006092,-0.003864,-0.000242,-0.004087,0.003548,-0.023124,-0.004702,6.1e-05,-0.003876,-0.00591,-0.005753,-0.005322,-0.007519,-0.007264,-0.000643
4,7e-06,2e-05,4.4e-05,4.7e-05,3.1e-05,4.2e-05,5.1e-05,3.9e-05,3.9e-05,5.4e-05,4.1e-05,1.4e-05,5.1e-05,3.5e-05,3.6e-05,3.3e-05,1.5e-05,1.2e-05,5.2e-05,5.8e-05,6.2e-05,5.3e-05,5.3e-05,3.2e-05,8.7e-05,4.8e-05,6.6e-05,5.4e-05,6.9e-05,6.4e-05,4.3e-05,4.6e-05,8.3e-05,7.3e-05,6.4e-05,4.4e-05,4.6e-05,5.9e-05,3.2e-05,6.2e-05,5.4e-05,3.5e-05,4.5e-05,4.2e-05,4e-05,5.4e-05,6.2e-05,2.2e-05,4.4e-05,1.5e-05,4.3e-05,3.4e-05,3.9e-05,2.6e-05,5.1e-05,6.2e-05,4e-06,5.2e-05,3.1e-05,3e-05


In [None]:
print(df_kimm.shape)

(44, 60)


In [None]:
df_kimm.to_csv('kimm_test.csv')
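
The loop-and-concat pattern used to build `kimm_df` can also be written as a single concatenation. A sketch under the assumption that every signal has the same length (60 here); the toy signals stand in for the `value` column of each file:

```python
import numpy as np
import pandas as pd

signals = [np.arange(60) * 0.1, np.ones(60), np.zeros(60)]   # toy 60-sample signals

# one row per signal, concatenated once instead of growing a frame in the loop
rows = [pd.DataFrame(sig.reshape(1, -1)) for sig in signals]
stacked = pd.concat(rows, ignore_index=True)   # index is already 0..n-1
```

With `ignore_index=True` the separate `reset_index` and `drop(['index'])` steps are not needed.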

### vibration

In [None]:
for i, index in enumerate(Vib_index):
    print(file_list[index])
    one_signal = pd.read_csv(os.path.join(folder_name, file_list[index]))
    one_signal = pd.DataFrame(one_signal['value'].to_numpy().reshape(1,-1))
    if i == 0:
        vib_df = pd.DataFrame(one_signal)
    else:
        vib_df = pd.concat([vib_df, one_signal])


003.csv
005.csv
008.csv
010.csv
012.csv
014.csv
015.csv
019.csv
021.csv
025.csv
027.csv
028.csv
030.csv
032.csv
033.csv
036.csv
038.csv
039.csv
045.csv
046.csv
048.csv
051.csv
052.csv
056.csv
058.csv
059.csv
063.csv
064.csv
066.csv
070.csv
071.csv
073.csv
082.csv
083.csv
088.csv
090.csv
091.csv
092.csv
094.csv
096.csv
102.csv
104.csv
108.csv
110.csv
113.csv
114.csv
117.csv
119.csv
127.csv
129.csv
131.csv
135.csv
137.csv
140.csv
143.csv
148.csv


In [None]:
df_vib = vib_df.reset_index()
df_vib = df_vib.drop(['index'], axis = 'columns')

In [None]:
df_vib.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
0,0.001162,0.001894,0.002627,0.00104,-0.000669,0.000429,0.000796,0.000307,0.001772,-0.000547,0.001406,0.002871,0.000185,0.000918,0.001406,0.001162,6.3e-05,-0.000547,0.000796,0.001162,0.002749,0.002871,0.003359,0.001894,0.000796,0.002627,0.000429,0.000429,-0.000303,0.000429,0.001772,0.001528,0.001406,0.000674,-0.001157,0.000307,0.000918,6.3e-05,-0.000425,0.002016,-0.000547,-0.000547,0.000674,0.002016,6.3e-05,-0.001035,-0.002378,-5.9e-05,0.000551,-0.002012,0.000551,0.002505,0.001406,-0.001402,-0.000181,0.000918,-0.001402,-0.002378,-0.000669,-0.001157
1,0.002066,0.002676,0.000235,0.000113,0.000845,0.002676,0.002188,-0.000742,-0.001352,0.000235,0.000235,-9e-06,0.000235,-0.000253,-0.000375,-0.000131,-0.000742,0.000601,0.002676,0.001456,-0.00184,-0.001718,-0.001352,-0.002084,-0.001474,-0.002695,-0.000742,-0.001352,-0.000375,-0.000375,0.000235,-0.000619,-0.00123,-0.002451,-0.003183,-0.000864,0.001822,0.002188,0.000845,-0.000375,-0.000375,0.001944,0.001822,-0.001474,-0.001474,0.001456,0.001089,-0.001718,-0.001474,-9e-06,0.000235,-0.000253,-0.00123,-0.001596,-0.002328,-0.000619,-0.001474,-0.001962,-9e-06,0.001212
2,-6.2e-05,0.001281,0.003478,0.003234,0.000305,0.000549,0.000427,0.001037,0.000793,0.001281,0.000793,0.000305,-0.00116,-0.00055,0.000182,-0.000428,0.000182,-0.000184,0.001159,0.000549,0.000549,0.002746,0.000671,-0.000916,0.001159,0.000915,0.001403,0.001891,0.001037,0.000915,0.000182,-6.2e-05,-0.000428,-0.000428,-0.000794,-0.00055,-0.001404,0.002136,0.000915,-0.002381,-0.000794,0.001159,0.000305,0.000549,0.002136,0.001403,-0.000306,0.000549,-0.000794,-0.000428,0.001525,-6.2e-05,-0.001282,0.000182,6e-05,-0.000794,0.000182,0.000305,0.001403,-0.000428
3,-0.000765,-0.000765,-0.000643,-0.000887,-0.001376,-0.00162,-0.001742,-0.001376,-0.000765,-0.000765,-0.000277,0.000822,0.00131,0.000577,0.000333,0.001066,0.000944,8.9e-05,-0.000155,0.000944,0.00131,0.000333,0.000577,0.000455,-0.001132,-0.001986,-0.001498,-0.001132,-0.000521,0.000211,0.000211,0.000211,0.001066,0.001798,0.00131,0.000333,0.000211,0.0007,0.000455,-0.000521,-0.000277,0.0007,0.000822,-0.000399,-0.001254,-0.000521,0.000455,0.000822,8.9e-05,-0.000521,-0.000155,-0.000155,-0.001376,-0.001742,-0.000521,-3.3e-05,-0.000643,-0.001009,-0.000643,8.9e-05
4,0.001327,0.004379,0.002304,-0.0038,-0.00087,0.001938,0.001938,0.001083,0.003036,0.001938,0.002426,0.001449,-0.00087,-0.00148,-0.000138,-1.6e-05,0.000351,0.000595,0.000839,-0.001969,-0.000504,-0.000626,0.000351,0.000107,-0.00087,-0.00148,-0.001358,-0.001236,-0.001358,0.002304,0.001083,-0.000504,0.000839,0.001205,-0.00087,-1.6e-05,0.000717,0.000961,0.001449,-1.6e-05,-0.001236,-0.002335,-0.001725,0.000107,-1.6e-05,0.001327,0.001693,0.002182,0.001815,-0.001602,-0.001725,0.001327,0.000839,-0.002579,-0.000748,0.000839,-0.002335,-0.000748,0.001327,0.002182


In [None]:
print(df_vib.shape)

(56, 60)


In [None]:
df_vib.to_csv('vib_test.csv')

### current

In [None]:
for i, index in enumerate(Curr_index):
    print(file_list[index])
    if index == 141:
        # file_list[141] is 142.csv: only the R column is read; T and S are filled with zeros
        print('error => zero padding')
        one_signal = pd.read_csv(os.path.join(folder_name, file_list[index]))
        r_signal = pd.DataFrame(one_signal.iloc[:,1].to_numpy().reshape(1,-1))
        t_signal = pd.DataFrame(np.zeros(r_signal.shape))
        s_signal = pd.DataFrame(np.zeros(r_signal.shape))
    else:
        one_signal = pd.read_csv(os.path.join(folder_name, file_list[index]))
        r_signal = pd.DataFrame(one_signal.iloc[:,1].to_numpy().reshape(1,-1))
        t_signal = pd.DataFrame(one_signal.iloc[:,2].to_numpy().reshape(1,-1))
        s_signal = pd.DataFrame(one_signal.iloc[:,3].to_numpy().reshape(1,-1))
    if i == 0:
        curr_r_df = pd.DataFrame(r_signal)
        curr_t_df = pd.DataFrame(t_signal)
        curr_s_df = pd.DataFrame(s_signal)
    else:
        curr_r_df = pd.concat([curr_r_df, r_signal])
        curr_t_df = pd.concat([curr_t_df, t_signal])
        curr_s_df = pd.concat([curr_s_df, s_signal])

004.csv
006.csv
013.csv
017.csv
022.csv
031.csv
034.csv
035.csv
037.csv
040.csv
042.csv
047.csv
050.csv
055.csv
060.csv
061.csv
065.csv
067.csv
068.csv
075.csv
076.csv
078.csv
080.csv
081.csv
087.csv
089.csv
093.csv
095.csv
097.csv
098.csv
101.csv
103.csv
105.csv
107.csv
111.csv
112.csv
115.csv
116.csv
118.csv
121.csv
124.csv
126.csv
136.csv
138.csv
139.csv
142.csv
error => zero padding
145.csv
146.csv
147.csv
150.csv


In [None]:
df_curr_r = curr_r_df.reset_index()
df_curr_r = df_curr_r.drop(['index'], axis = 'columns')

In [None]:
df_curr_t = curr_t_df.reset_index()
df_curr_t = df_curr_t.drop(['index'], axis = 'columns')

In [None]:
df_curr_s = curr_s_df.reset_index()
df_curr_s = df_curr_s.drop(['index'], axis = 'columns')

In [None]:
print(df_curr_r.shape)
print(df_curr_t.shape)
print(df_curr_s.shape)

(50, 60)
(50, 60)
(50, 60)


In [None]:
df_curr_r.to_csv('curr_R_test.csv')
df_curr_t.to_csv('curr_T_test.csv')
df_curr_s.to_csv('curr_S_test.csv')

## **모델 적용** 

In [None]:
#normalization
def z_norm(df):
  data = df.to_numpy()
  std_ = data.std(axis=1, keepdims=True)
  std_[std_ == 0] = 1.0
  norm_data = (data - data.mean(axis=1, keepdims=True)) / std_
  return norm_data
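
A small worked example of the row-wise normalization, written against a plain NumPy array rather than a DataFrame (the name `z_norm_rows` is only used for this illustration). It shows that a constant row, whose standard deviation is zero, comes out as all zeros instead of dividing by zero:

```python
import numpy as np


def z_norm_rows(data):
    """Row-wise z-normalization; rows with zero std are only mean-centered."""
    std_ = data.std(axis=1, keepdims=True)
    std_[std_ == 0] = 1.0
    return (data - data.mean(axis=1, keepdims=True)) / std_


x = np.array([[1.0, 2.0, 3.0],
              [5.0, 5.0, 5.0]])    # second row is constant
z = z_norm_rows(x)
```

Each test signal is normalized with its own mean and standard deviation, the same way the training rows are normalized in `read_dataset`.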

### kimm

In [None]:
kimm_model = keras.models.load_model('models/kimm/best_model.h5',
                                custom_objects=None, compile=True)

In [None]:
kimm_sample = tf.convert_to_tensor(z_norm(df_kimm), dtype=tf.float64)

In [None]:
y_pred = kimm_model.predict(kimm_sample)
y_kimm_pred = np.argmax(y_pred,axis=1)
print(y_kimm_pred)

[0 0 1 0 1 4 4 0 0 1 1 0 0 2 0 1 3 2 0 4 4 1 2 0 4 0 1 4 3 2 4 0 0 0 0 2 0
 0 0 0 0 0 0 0]


In [None]:
len(y_kimm_pred)

44

### vibration

In [None]:
vib_model = keras.models.load_model('models/vibration/best_model.h5',
                                custom_objects=None, compile=True)

In [None]:
vib_sample = tf.convert_to_tensor(z_norm(df_vib), dtype=tf.float64)

In [None]:
y_pred = vib_model.predict(vib_sample)
y_vib_pred = np.argmax(y_pred,axis=1)
print(y_vib_pred)

[4 1 1 0 2 3 1 1 2 4 4 3 4 4 1 0 3 0 1 0 1 3 3 4 3 4 4 2 0 0 4 1 2 4 2 2 1
 3 2 4 4 0 0 0 4 0 0 0 2 0 0 2 2 4 1 3]


In [None]:
len(y_vib_pred)

56

### current

In [None]:
curr_model = keras.models.load_model('models/current/best_model.h5',
                                custom_objects=None, compile=True)

In [None]:
cur_R_sample = tf.convert_to_tensor(z_norm(df_curr_r), dtype=tf.float64)
cur_T_sample = tf.convert_to_tensor(z_norm(df_curr_t), dtype=tf.float64)
cur_S_sample = tf.convert_to_tensor(z_norm(df_curr_s), dtype=tf.float64)

In [None]:
y_pred = curr_model.predict([cur_R_sample,cur_T_sample, cur_S_sample])
y_curr_pred = np.argmax(y_pred,axis=1)
print(y_curr_pred)

[3 0 0 3 1 0 1 3 2 3 2 0 3 3 4 2 4 4 4 4 4 1 2 1 2 4 4 1 0 1 0 0 0 0 0 0 0
 4 0 0 0 4 0 0 0 0 0 0 1 4]


In [None]:
len(y_curr_pred)

50

## make Submission_commit.csv

In [None]:
# .loc writes into the frame itself and avoids chained-assignment warnings
for i, index in enumerate(Kimm_index):
  submission.loc[index, 'Label'] = y_kimm_pred[i]
for i, index in enumerate(Vib_index):
  submission.loc[index, 'Label'] = y_vib_pred[i]
for i, index in enumerate(Curr_index):
  submission.loc[index, 'Label'] = y_curr_pred[i]



In [None]:
label_names = {0: 'normal', 1: 'bearing', 2: 'belt', 3: 'misalignment', 4: 'unbalance'}
# object dtype lets the column hold numeric codes and class names side by side
submission['Label'] = submission['Label'].astype(object)
for i in range(len(submission)):
  label = submission.loc[i, 'Label']
  if label in label_names:
    submission.loc[i, 'Label'] = label_names[label]
  else:
    print('index: ' + str(i))  # the original printed the stale loop variable `index`
    print('error')


In [None]:
submission

Unnamed: 0,File_name,Category,Motor,Label
0,1,Kimm,15.0,normal
1,2,Kimm,15.0,normal
2,3,Vibration,7.5,unbalance
3,4,Current,5.5,misalignment
4,5,Vibration,11.0,bearing
...,...,...,...,...
145,146,Current,5.5,normal
146,147,Current,55.0,bearing
147,148,Vibration,15.0,misalignment
148,149,Kimm,18.5,normal


In [None]:
submission.isnull().sum()

File_name    0
Category     0
Motor        0
Label        0
dtype: int64

In [None]:
submission.to_csv('Submission_Commit.csv', index = False)
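
The conversion from class codes to class names can also be done in one vectorized step, with the null check doubling as a test for unexpected codes. A sketch on a toy frame, assuming the same five code-to-name pairs used in this notebook:

```python
import pandas as pd

label_names = {0: 'normal', 1: 'bearing', 2: 'belt', 3: 'misalignment', 4: 'unbalance'}

toy = pd.DataFrame({'Label': [0, 4, 2, 7]})      # 7 is deliberately invalid
toy['Label'] = toy['Label'].map(label_names)     # unknown codes become NaN

n_unmapped = int(toy['Label'].isnull().sum())
```

A non-zero `n_unmapped` would indicate a prediction outside the five known classes before the file is written.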