# Power Quality Classification using a Multilayer Perceptron (Dataset 2)

This notebook develops a multilayer perceptron (MLP) that classifies a power signal into its power quality condition. Each signal in the dataset belongs to one of 6 classes (power quality conditions) and is described by 256 data points. The signals are provided in the time domain.

In [None]:
#importing the required libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import tensorflow as tf
import datetime
from scipy.fft import fft,fftfreq
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras.optimizers import Adam

In [None]:
#loading the dataset using pandas
x_train = pd.read_csv("../input/power-quality-distribution-dataset-2/VoltageL1Train.csv")
y_train = pd.read_csv("../input/power-quality-distribution-dataset-2/outputTrain.csv")
x_test = pd.read_csv("../input/power-quality-distribution-dataset-2/VoltageL1Test.csv")
y_test = pd.read_csv("../input/power-quality-distribution-dataset-2/outputTest.csv")

In [None]:
print("x_train",x_train.shape)
print("y_train",y_train.shape)
print("x_test",x_test.shape)
print("y_test",y_test.shape)

## Data Preprocessing

This section contains all the preprocessing steps applied to the data.

### Data cleaning

In [None]:
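#dropna() removes every row that contains an NA value; features and labels are filtered
#separately, so the shapes printed in the next cell should be checked to confirm they still match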
x_train.dropna(axis=0,inplace=True)
y_train.dropna(axis=0,inplace=True)
x_test.dropna(axis=0,inplace=True)
y_test.dropna(axis=0,inplace=True)
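
Dropping NA rows from the features and the labels independently can leave the two frames misaligned if the missing values sit in different rows. A minimal sketch of one way to keep them aligned is shown here; `drop_na_aligned` and the toy frames are hypothetical and not part of the notebook, and the helper assumes both frames have the same length and row order.

```python
import numpy as np
import pandas as pd

def drop_na_aligned(features, labels):
    """Drop every row that has a missing value in either frame (hypothetical helper)."""
    # Assumes both frames have the same number of rows in the same order.
    features = features.reset_index(drop=True)
    labels = labels.reset_index(drop=True)
    keep = features.notna().all(axis=1) & labels.notna().all(axis=1)
    return features[keep], labels[keep]

# Toy data: row 1 has a missing feature, row 2 has a missing label.
x_demo = pd.DataFrame({"Col1": [0.1, np.nan, 0.3, 0.4], "Col2": [1.0, 2.0, 3.0, 4.0]})
y_demo = pd.DataFrame({"output": [1, 2, np.nan, 3]})

x_clean, y_clean = drop_na_aligned(x_demo, y_demo)
print(x_clean.shape, y_clean.shape)
```

Both returned frames keep the same row indices, so each signal stays paired with its own label.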

In [None]:
#shape of the data frame after dropping the rows containing NA values
print("x_train",x_train.shape)
print("y_train",y_train.shape)
print("x_test",x_test.shape)
print("y_test",y_test.shape)

In [None]:
#building the list of column names: Col1, Col2, ..., one per data point
header = ["Col"+str(i) for i in range(1,x_train.shape[1]+1)]

In [None]:
#assigning the column names to the feature dataframes
x_train.columns = header
x_test.columns = header

In [None]:
#naming the single label column of the output dataframes
header = ["output"]
y_train.columns = header
y_test.columns = header

In [None]:
x_train.head()

In [None]:
x_test.head()

In [None]:
y_train.head()

In [None]:
y_test.head()

In [None]:
#splitting the original training set 70%/30% into a training set and a validation set
from sklearn.model_selection import train_test_split
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.30, random_state=42)

In [None]:
# get_dummies is used here to one-hot encode the 'output' column of the y_* dataframes
y_train_hot = pd.get_dummies(y_train['output'])
y_test_hot = pd.get_dummies(y_test['output'])
y_val_hot = pd.get_dummies(y_val['output'])

In [None]:
y_train_hot.head()
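
Because `pd.get_dummies` is called separately on each split, a split that happens to lack one class would produce fewer columns than the others. The sketch below shows one possible safeguard: reindexing against a fixed class list. The label values `1` to `6` and the helper `one_hot` are assumptions made for illustration; the real label values should be read from `y_train['output'].unique()`.

```python
import pandas as pd

# Assumed label values, for illustration only.
classes = [1, 2, 3, 4, 5, 6]

def one_hot(labels, classes):
    """One-hot encode labels with a fixed column order (hypothetical helper)."""
    encoded = pd.get_dummies(labels)
    # Add all-zero columns for classes that do not occur in this split.
    return encoded.reindex(columns=classes, fill_value=0).astype(int)

# Toy split in which classes 2, 4 and 5 never occur.
demo_labels = pd.Series([1, 3, 3, 6])
encoded = one_hot(demo_labels, classes)
print(encoded)
```

With this approach every split has the same six columns in the same order, so the encoded labels line up with the six softmax outputs.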

### Data transformation

The data transformation steps employed here are as follows:

1. Fourier transform
2. Normalization
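
To illustrate the first step, here is a small sketch of the magnitude spectrum of a synthetic signal. It uses `numpy.fft` to stay self-contained (the notebook itself uses `scipy.fft`), and the 5-cycle sine tone is a made-up test signal, not data from the dataset.

```python
import numpy as np

n_points = 256                       # data points per signal, as in the dataset description
t = np.arange(n_points) / n_points   # one normalised time window (assumption)
signal = np.sin(2 * np.pi * 5 * t)   # hypothetical test tone: 5 cycles per window

# Full magnitude spectrum, the quantity the notebook computes for each row.
magnitude = np.abs(np.fft.fft(signal))

# For real-valued signals the spectrum is symmetric, so the one-sided
# transform carries the same information in n_points // 2 + 1 bins.
one_sided = np.abs(np.fft.rfft(signal))

peak_bin = int(np.argmax(one_sided))
print(peak_bin, one_sided.shape)
```

The energy of the tone concentrates in a single frequency bin. Since the second half of the full spectrum mirrors the first, a one-sided spectrum could roughly halve the input size of the network, although the notebook keeps all 256 values.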

In [None]:
x_train = x_train.to_numpy()
x_test = x_test.to_numpy()
x_val = x_val.to_numpy()

In [None]:
#here we are overwriting each array with the magnitude spectrum of its signals,
#obtained by applying the Fourier transform along each row
x_train = np.abs(fft(x_train, axis=1))
x_test = np.abs(fft(x_test, axis=1))
x_val = np.abs(fft(x_val, axis=1))

In [None]:
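#here we are performing normalization: the scaler is fitted on the training set only,
#and the same training statistics are then applied to the test and validation sets
transform = StandardScaler()
x_train_tr = transform.fit_transform(x_train)
x_test_tr = transform.transform(x_test)
x_val_tr = transform.transform(x_val)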
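
The idea behind fitting the scaler on the training data only can be sketched with plain NumPy. The random arrays below are stand-ins for the real spectra, and the manual computation is only meant to mirror what `StandardScaler` does (per-feature mean and population standard deviation).

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data: 100 training rows and 20 test rows with 4 features each.
train_demo = rng.normal(loc=10.0, scale=2.0, size=(100, 4))
test_demo = rng.normal(loc=10.0, scale=2.0, size=(20, 4))

# Statistics come from the training set only.
mean = train_demo.mean(axis=0)
std = train_demo.std(axis=0)   # population std (ddof=0), as StandardScaler uses

train_scaled = (train_demo - mean) / std
test_scaled = (test_demo - mean) / std   # reuse the training statistics

print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
```

Reusing the training statistics keeps all splits on the same scale and avoids leaking information from the test set into preprocessing.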

In [None]:
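#log-scaling the FFT magnitudes; note that this overwrites the standardized arrays from the
#previous cell, so the model below is trained on the log-magnitude spectra
x_train_tr = np.log(x_train)
x_test_tr = np.log(x_test)
x_val_tr = np.log(x_val)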
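
One caveat with a plain logarithm is that any magnitude of exactly zero maps to negative infinity, which would break training. A possible alternative, shown here as a sketch on made-up values rather than as the notebook's method, is `np.log1p`, which computes `log(1 + x)`.

```python
import numpy as np

# Made-up magnitudes, including an exact zero.
spectrum_demo = np.array([[0.0, 1.0, 127.0]])

with np.errstate(divide="ignore"):
    plain_log = np.log(spectrum_demo)   # first entry becomes -inf

safe_log = np.log1p(spectrum_demo)      # log(1 + x) stays finite at zero

print(plain_log)
print(safe_log)
```

Whether this matters depends on whether the real spectra contain exact zeros; checking `np.isfinite(x_train_tr).all()` after the transformation would reveal it.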

In [None]:
#final dimensions of the data
print("Training",x_train_tr.shape)
print(y_train_hot.shape)
print("Validation",x_val_tr.shape)
print(y_val_hot.shape)
print("Test",x_test_tr.shape)
print(y_test_hot.shape)
#despite its name, sampling_rate holds the number of input features (data points per signal)
sampling_rate = x_train_tr.shape[1]

## Model creation and training

In [None]:
def model_training(no_of_classes,sampling_rate):
    #builds and compiles the MLP; the training itself happens later in model.fit()
    #sampling_rate is the number of input features (data points per signal)
    model = Sequential()

    model.add(Dense(64, input_shape=(sampling_rate,), activation = 'relu'))
    model.add(Dense(32, activation = 'relu'))
    #model.add(Dropout(0.6))
    model.add(Dense(16, activation = 'relu'))
    #model.add(Dropout(0.6))
    #softmax output: one probability per power quality class
    model.add(Dense(no_of_classes, activation = 'softmax'))

    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    return model

In [None]:

#writing the training logs to a timestamped directory for TensorBoard
log_dir = "logs2/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

#building the model for the 6 classes and training it for 30 epochs
model = model_training(6,sampling_rate)
history = model.fit(x_train_tr, y_train_hot, batch_size=64, epochs=30, validation_data=(x_val_tr, y_val_hot), callbacks=[tensorboard_callback])


In [None]:
%load_ext tensorboard
%tensorboard --logdir logs2/fit

In [None]:
model.summary()
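
The parameter counts reported by the summary can be cross-checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below assumes 256 input features, as stated in the dataset description; `dense_param_counts` is a hypothetical helper.

```python
def dense_param_counts(layer_sizes):
    """Return the parameter count of each dense layer (weights + biases)."""
    return [n_in * n_out + n_out
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# Input size (assumed 256) followed by the four Dense layers of the model.
layer_sizes = [256, 64, 32, 16, 6]

per_layer = dense_param_counts(layer_sizes)
total = sum(per_layer)
print(per_layer, total)
```

If the input dimension differs from 256, only the first layer's count changes.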

## Model evaluation

In [None]:
#summary of the validation and training accuracy across all epochs
print("min val:",min(history.history['val_accuracy']))
print("avg val:",np.mean(history.history['val_accuracy']))
print("max val:",max(history.history['val_accuracy']))
print()
print("min train:",min(history.history['accuracy']))
print("avg train:",np.mean(history.history['accuracy']))
print("max train:",max(history.history['accuracy']))
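
Beyond the minimum, mean, and maximum, it can be useful to know at which epoch the validation accuracy peaked and how far the training accuracy was ahead at that point. The sketch below uses a made-up history dictionary with the same keys as `history.history`; the numbers are illustrative, not results from this notebook.

```python
import numpy as np

# Made-up accuracy curves, for illustration only.
demo_history = {
    "accuracy":     [0.60, 0.75, 0.85, 0.90],
    "val_accuracy": [0.58, 0.72, 0.80, 0.78],
}

# Epoch (0-based) with the highest validation accuracy.
best_epoch = int(np.argmax(demo_history["val_accuracy"]))

# Train/validation gap at that epoch; a large gap hints at overfitting.
gap = demo_history["accuracy"][best_epoch] - demo_history["val_accuracy"][best_epoch]

print(best_epoch, round(gap, 3))
```

A validation accuracy that falls after its peak while training accuracy keeps rising is a common sign that training ran for too many epochs.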

In [None]:
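#model.evaluate returns the loss followed by the compiled metrics (here: accuracy)
test_loss, test_acc = model.evaluate(x_test_tr,y_test_hot)
print("Test loss is {}".format(test_loss))
print("Test accuracy is {}".format(test_acc))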

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sn

In [None]:
array = confusion_matrix(y_test_hot.to_numpy().argmax(axis=1), model.predict(x_test_tr).argmax(axis=1))

In [None]:
array

In [None]:
class_names = ["Type-1","Type-2","Type-3","Type-4","Type-5","Type-6"]
#rows are the true classes, columns are the predicted classes
to_cm = pd.DataFrame(array, index = class_names, columns = class_names)
plt.figure(figsize = (13,9))
#fmt="d" prints the counts as plain integers instead of scientific notation
sn.heatmap(to_cm, annot=True, fmt="d")
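
Per-class precision and recall can be read directly off a confusion matrix, which helps to spot classes that the overall accuracy hides. The sketch below uses a made-up 3-class matrix in the scikit-learn convention (rows are true classes, columns are predicted classes); it is not the matrix produced by this notebook.

```python
import numpy as np

# Made-up confusion matrix: rows = true class, columns = predicted class.
cm_demo = np.array([
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
])

correct = np.diag(cm_demo)
recall = correct / cm_demo.sum(axis=1)      # fraction of each true class that was found
precision = correct / cm_demo.sum(axis=0)   # fraction of each prediction that was right
accuracy = correct.sum() / cm_demo.sum()

print(recall, precision, accuracy)
```

The same three lines should work on the 6x6 `array` computed above, provided every class appears at least once among both the true labels and the predictions (otherwise a division by zero occurs).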