# one-year-industrial-component-degradation

Dataset:  https://www.kaggle.com/inIT-OWL/one-year-industrial-component-degradation

### Note:
**I'm new to Kaggle kernels. This notebook was created locally in Jupyter and then adapted for the Kaggle environment, so some steps were left out, such as how all the data was combined into one dataframe. The original notebook is in my GitHub repo:**

https://github.com/Iuryck/Machine_Degradation

        Imports

In [None]:
import matplotlib.pyplot as plt
import os
import pandas as pd
import seaborn as sns
import numpy as np
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.svm import OneClassSVM
from numpy.random import seed
from keras.layers import Input, Dropout, Dense
from keras.models import Model, Sequential, load_model
from keras import regularizers
from keras.models import model_from_json
from scipy.special import softmax

# Context
##### This dataset contains the machine data of a degrading component recorded over a total of 12 months. It was initiated in the European research and innovation project IMPROVE.

# Content
###### The Vega shrink-wrapper from OCME is deployed in large production lines in the food and beverage industry. The machine groups loose bottles or cans into set package sizes, wraps them in plastic film and then heat-shrinks the plastic film to combine them into a package. The plastic film is fed into the machine from large spools and is then cut to the length needed to wrap the film around a pack of goods. The cutting assembly is an important component of the machine to meet the high availability target. Therefore, the blade needs to be set up and maintained properly. Furthermore, the blade cannot be inspected visually during operation because it is enclosed in a metal housing and rotates at high speed. Monitoring the cutting blade's degradation will increase the machine's reliability and reduce unexpected downtime caused by failed cuts.



#### The 519 files in the dataset are of the format MM-DDTHHMMSS_NUM_modeX.csv, where MM is the month ranging from 1-12 (not calendar month), DD is the day of the month, HHMMSS is the start time of day of recording, NUM is the sample number and X is a mode ranging from 1-8. Each file is a ~8 second sample with a time resolution of 4ms that totals 2048 time-samples for every file.
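
**The naming scheme above can be split into its fields with a regular expression. This is only a sketch: the example file name is made up to match the described pattern, and the field names (`month`, `day`, `hour`, `sample_Number`, `mode`) are assumed to correspond to the columns of the compiled CSV used later.**

```python
import re

# Pattern for names such as "01-04T184148_000_mode1.csv" (hypothetical example name).
FILENAME_RE = re.compile(
    r"^(?P<month>\d{1,2})-(?P<day>\d{1,2})T(?P<time>\d{6})_(?P<num>\d+)_mode(?P<mode>\d)\.csv$"
)

def parse_sample_name(name):
    """Split a sample file name into its documented fields."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    fields = match.groupdict()
    return {
        "month": int(fields["month"]),
        "day": int(fields["day"]),
        "hour": fields["time"],  # HHMMSS kept as text to preserve leading zeros
        "sample_Number": int(fields["num"]),
        "mode": f"mode{fields['mode']}",
    }

# 2048 time-samples at 4 ms each give the stated ~8 second window.
SAMPLE_SECONDS = 2048 * 0.004

info = parse_sample_name("01-04T184148_000_mode1.csv")
print(info, SAMPLE_SECONDS)
```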

#  DATA

### --pCut::Motor_Torque -> Torque in Nm

### --pCut::CTRL_Position_controller::Lag_error -> Represents the instantaneous position error between the set-point from the path generator and the real current encoder position of the motor

### --pCut::CTRL_Position_controller::Actual_position -> Cutting blade position in mm

### --pCut::CTRL_Position_controller::Actual_speed -> Speed of the cutting blade

### --pSvolFilm::CTRL_Position_controller::Actual_position -> Plastic film unwinder position in mm

### --pSvolFilm::CTRL_Position_controller::Actual_speed -> Speed of the plastic film unwinder

### --pSvolFilm::CTRL_Position_controller::Lag_error -> Represents the instantaneous position error between the set-point from the path generator and the real current encoder position of the motor

### --pSpintor::VAX_speed -> VAX measurement of performance


**First things first, we have info on the samples that aren't in our sample csv files. Also, all our samples are separated in different files. So I compiled it all together to get a big dataframe with everything we need as the One_year_compiled.csv.**

In [None]:
main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')
main_df.describe()

**Let's look for correlations in the data. As you will see, the only features that look correlated are the Motor Torque, Blade Speed and Blade Lag Error, and also the VAX speed and wrapper speed. This excludes, of course, month and sample_number, and the self-correlations on the diagonal.**

In [None]:
#Heatmap of correlations from -1 to 1, using the numeric columns only
sns.heatmap(main_df.select_dtypes('number').corr(), vmin= -1, vmax = 1)

**Let's drop some columns and flip the sign of the motor torque column by multiplying it by -1, just to make the heatmap easier to read.**

In [None]:
main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')
#Dropping columns that are not needed for this purpose (axis=1 means columns)
main_df = main_df.drop(['day', 'hour', 'sample_Number', 'month', 'timestamp'], axis=1)
#Flipping the sign of the column values
main_df['pCut::Motor_Torque'] = main_df['pCut::Motor_Torque'] *-1
#Heatmap of the numeric columns only
sns.heatmap(main_df.select_dtypes('number').corr(), vmin= -1, vmax = 1)


## Treating non-numerical data

**In our dataset, the machine modes may influence the data patterns, so we need to transform that data from strings to numerical classes. This function does just that.**

In [None]:
def handle_non_numeric(df):
    # Names of all columns in the dataframe
    columns = df.columns.values
    
    for column in columns:
        
        # Dictionary with each numerical value for each text
        text_digit_vals = {}
        
        # Receives text to convert to a number
        def convert_to_int (val):
            
            # Returns respective numerical value for class
            return text_digit_vals[val]
        
        # If values in columns are not float or int
        if df[column].dtype !=np.int64 and df[column].dtype != np.float64:
            
            # Gets values from current column
            column_contents = df[column].values.tolist()
            
            # Gets unique values from current column
            unique_elements = set(column_contents)
            
            # Classification starts at 0
            x=0
            
            for unique in unique_elements:
                
                # Adds the class value for the text in dictionary, if it's not there
                if unique not in text_digit_vals:
                    text_digit_vals[unique] = x
                    x+=1
            
            # Maps the numerical values to the text values in columns 
            df[column] = list(map(convert_to_int, df[column]))
    
    return df
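
**As a side note, the function above iterates over a `set`, and for strings the iteration order of a set can change between Python runs, so the same mode may get a different number each time. A minimal alternative sketch using pandas categorical codes is shown here; `encode_text_columns` is a hypothetical name, not part of the original notebook. Categories are sorted, so the mapping is stable.**

```python
import pandas as pd

def encode_text_columns(df):
    """Return a copy of df with every non-numeric column replaced by integer codes."""
    encoded = df.copy()
    for column in encoded.columns:
        if not pd.api.types.is_numeric_dtype(encoded[column]):
            # Categories are sorted, so the same text always gets the same code.
            encoded[column] = encoded[column].astype("category").cat.codes
    return encoded

toy = pd.DataFrame({"mode": ["mode2", "mode1", "mode2"], "torque": [0.1, 0.2, 0.3]})
print(encode_text_columns(toy))
```

**Unlike `handle_non_numeric`, this sketch leaves the input dataframe untouched and returns a copy.**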

# Approach

**A few algorithms will be tested to see if we can get some info on the machine. We will use OneClass SVM and KMeans with 1 cluster to try clustering. After that we will try an Autoencoder to try to reproduce the data based on the machine's healthy condition.**

**In all 3 cases we will grab a slice of the first rows and consider it as the Healthy State of the machine, then feed it to the algorithms. After that we will give the algorithms the entirety of the dataset and see how they perform on the rest of the data. Deviations, low scores, and high losses will be considered as anomalies to be studied.**

# OneClass SVM approach

**OneClass SVM is used for outlier detection. It learns a boundary around the "normal" class from the training data and labels everything outside it as an outlier. We will use the SVM to try to find outliers and anomalies.**

In [None]:
#Grabbing the entire dataset
main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')
#Dropping columns with unwanted/irrelevant info for the algorithm
main_df = main_df.drop(['day', 'hour', 'sample_Number', 'month', 'timestamp'], axis=1)
#Transforming modes into classified data
main_df = handle_non_numeric(main_df)

#Passing our dataframe as our features
X = main_df

#Defining preprocessor for the data
scaler = preprocessing.MinMaxScaler()
#Preprocessing
X = pd.DataFrame(scaler.fit_transform(X), 
                              columns=X.columns, 
                              index=X.index)


#Standardizing to zero mean and unit variance
X = preprocessing.scale(X)
#Taking the first 200,000 rows of the feature data as training data
X_train = X[:200000]


#Creating and fitting a OneClass SVM
ocsvm = OneClassSVM(nu=0.25, gamma=0.05)
ocsvm.fit(X_train)
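
**To make the outputs of the fitted model concrete, here is a small sketch on synthetic data (not the machine data). It reuses the same `nu` and `gamma` values; the random "healthy" cloud and the two probe points are assumptions made purely for illustration.**

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
healthy = rng.normal(size=(300, 2))          # stand-in for the healthy slice
probe = np.array([[0.0, 0.0], [8.0, 8.0]])   # one typical point, one far away

toy_svm = OneClassSVM(nu=0.25, gamma=0.05).fit(healthy)
labels = toy_svm.predict(probe)              # +1 = inlier, -1 = outlier
scores = toy_svm.score_samples(probe)        # lower score = more anomalous
print(labels, scores)
```

**`nu` roughly controls the share of training rows allowed to fall outside the boundary, so with `nu=0.25` a sizeable part of the healthy slice itself is expected to be labeled -1.**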

**Predicting and classifying the dataset into anomalies and non-anomalies, then passing the labels to a dataframe.**

In [None]:


df=main_df.copy()
df['anomaly'] = pd.Series(ocsvm.predict(X))



**Saving Dataframe**

In [None]:
#Saving Dataframe.
df.to_csv('Labled_df.csv')

**Reading into dataframe**

In [None]:
#Reading into dataframe
df = pd.read_csv('../input/created/Labled_df.csv', index_col=0)
df.head()

**Visualizing anomalies**

In [None]:
#Getting labeled groups
scat_1 = df.groupby('anomaly').get_group(1)
scat_0 = df.groupby('anomaly').get_group(-1)

# Plot size
plt.subplots(figsize=(15,7))

# Plot group 1 -labeled, color green, point size 1
plt.plot(scat_1.index,scat_1['pCut::Motor_Torque'], 'g.', markersize=1)

# Plot group -1 -labeled, color red, point size 1
plt.plot(scat_0.index, scat_0['pCut::Motor_Torque'],'r.', markersize=1)


**Visualizing scores for the whole dataset**

In [None]:
#Creating a dataframe for the score of each data sample
score = pd.DataFrame()
#Returning scores for the dataset
score['score'] = ocsvm.score_samples(X)

#Plot size
plt.subplots(figsize=(15,7))
#Plotting
score['score'].plot()
#Saving score dataframe
score.to_csv('SVM_Score.csv')

**Inverted score moving mean**

In [None]:
fig, ax = plt.subplots(figsize=(15,7))


((score['score'].rolling(20000).mean())*-1).plot(ax=ax)

**Scatter plot to see the score through the noise**

In [None]:
plt.subplots(figsize=(15,7))
plt.plot(score.index, score['score'],'r.', markersize=1)

# KMeans approach

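**The KMeans approach follows the same idea as the OC-SVM: fit on the healthy slice, then score the whole dataset. With a single cluster, the score is the distance of each sample from the cluster centre.**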

In [None]:
#------ Preparing features for training and future prediction -----
main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')
main_df = main_df.drop(['day', 'hour', 'sample_Number', 'month', 'timestamp'], axis=1)
main_df = handle_non_numeric(main_df)
X = main_df

scaler = preprocessing.MinMaxScaler()

X = pd.DataFrame(scaler.fit_transform(X), 
                              columns=X.columns, 
                              index=X.index)



X = preprocessing.scale(X)
#-------------------------------------------------------------------


#Percentage of the data that will be considered healthy condition
train_percentage = 0.15
#Integer value for the slice that will be considered healthy condition
train_size = int(len(main_df.index)*train_percentage)
#Grabbing slice for training data
X_train = X[:train_size]


#Defining KMeans with 1 cluster
kmeans = KMeans(n_clusters=1)
#Fitting the algorithm
kmeans.fit(X_train)

#Dataframe with the distance of each data sample from the cluster centre
k_anomaly = pd.DataFrame(kmeans.transform(X))

#Saving cluster distance into csv file
k_anomaly.to_csv('KM_Distance.csv')

#Plot
plt.subplots(figsize=(15,7))

plt.plot(k_anomaly.index, k_anomaly[0], 'g', markersize=1)
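
**With only one cluster, the fitted centre is simply the mean of the training rows, and the transformed value is the Euclidean distance to that mean. The sketch below shows the same computation in plain NumPy on synthetic data; `X_demo` and `healthy_demo` are stand-in names, not variables from this notebook.**

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 3))    # stand-in for the scaled feature matrix
healthy_demo = X_demo[:150]            # first 15% treated as the healthy slice

# With a single cluster, the fitted centre is the mean of the training rows.
centre = healthy_demo.mean(axis=0)

# Euclidean distance of every row to that centre.
distance = np.linalg.norm(X_demo - centre, axis=1)
print(distance[:5])
```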

# AutoEncoder approach

**Autoencoders are neural nets that map the data into higher- and lower-dimensional representations and then try to recreate the original data. The idea is that the autoencoder learns the relations between the features and, from those, reconstructs the data it was given.**

**We will feed the algorithm the healthy state of the machine. It will then try to rebuild the rest of the data as if it were healthy, so a high reconstruction loss (the difference between the predicted and the real machine data) will be considered an "unhealthy" state.**

In [None]:
#------------------------- Preparing data for training --------------------------- 
main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')
main_df = main_df.drop(['day', 'hour', 'sample_Number', 'month', 'timestamp'], axis=1)
main_df = handle_non_numeric(main_df)
X = main_df

scaler = preprocessing.MinMaxScaler()

X = pd.DataFrame(scaler.fit_transform(X), 
                              columns=X.columns, 
                              index=X.index)



X = preprocessing.scale(X)


train_percentage = 0.15
train_size = int(len(main_df.index)*train_percentage)

X_train = X[:train_size]
#----------------------------------------------------------------------------------



#Seed for random batch validation and training
seed(10)


#ELU activation function
act_func = 'elu'

# Sequential model (the input shape is set on the first hidden layer)
model=Sequential()

# First hidden layer, connected to input vector X. 
model.add(Dense(50,activation=act_func,
                kernel_initializer='glorot_uniform',
                kernel_regularizer=regularizers.l2(0.0),
                input_shape=(X_train.shape[1],)
               )
         )
# Second hidden layer
model.add(Dense(10,activation=act_func,
                kernel_initializer='glorot_uniform'))
# Third hidden layer
model.add(Dense(50,activation=act_func,
                kernel_initializer='glorot_uniform'))

# Output layer, same width as the input
model.add(Dense(X_train.shape[1],
                kernel_initializer='glorot_uniform'))

# Loss function and Optimizer choice
model.compile(loss='mse',optimizer='adam')

# Train model for 50 epochs, batch size of 200 
NUM_EPOCHS=50
BATCH_SIZE=200

#Grabbing validation and training loss over epochs
history=model.fit(np.array(X_train),np.array(X_train),
                  batch_size=BATCH_SIZE, 
                  epochs=NUM_EPOCHS,
                  validation_split=0.1,
                  verbose = 1)


**Plotting Validation loss and Training loss over the epochs**

In [None]:
plt.subplots(figsize=(15,7))

plt.plot(history.history['loss'],'b',label='Training loss')
plt.plot(history.history['val_loss'],'r',label='Validation loss')
plt.legend(loc='upper right')
plt.xlabel('Epochs')
plt.ylabel('Loss, [mse]')

plt.show()

**Now we will feed the algorithm the same training data and make it try to reconstruct it. We will then look at the distribution of the loss over the training data. Further on, we will use this distribution to determine some thresholds.**

In [None]:
#Reconstructing train data
X_pred = model.predict(np.array(X_train))

#Creating dataframe for reconstructed data
X_pred = pd.DataFrame(X_pred,columns=main_df.columns)
X_pred.index = pd.DataFrame(X_train).index

#Dataframe to get the difference of predicted data and real data. 
scored = pd.DataFrame(index=pd.DataFrame(X_train).index)
#Mean absolute error across the columns of each row
scored['Loss_mae'] = np.mean(np.abs(X_pred-X_train), axis = 1)

#plot
plt.subplots(figsize=(15,7))
sns.distplot(scored['Loss_mae'],
             bins = 15, 
             kde= True,
            color = 'blue');
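
**For reference, this is what the `Loss_mae` computation does on a tiny hand-made example: it averages the absolute reconstruction error across the columns, giving one loss value per row. The two arrays are made-up numbers, not model output.**

```python
import numpy as np

actual = np.array([[0.0, 1.0, 2.0],
                   [1.0, 1.0, 1.0]])
reconstructed = np.array([[0.0, 1.5, 2.0],
                          [4.0, 1.0, 1.0]])

# Mean over axis=1 averages across the columns, giving one loss per row.
loss_mae = np.mean(np.abs(reconstructed - actual), axis=1)
print(loss_mae)
```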





**Now we do the same thing with all our data, to see how the loss evolves over time.**

In [None]:

#Reconstructing full data
X_pred = model.predict(np.array(X))
X_pred = pd.DataFrame(X_pred,columns=main_df.columns)
X_pred.index = pd.DataFrame(X).index

#Mean absolute error across the columns of each row, stored in a dataframe
scored = pd.DataFrame(index=pd.DataFrame(X).index)
scored['Loss_mae'] = np.mean(np.abs(X_pred-X), axis = 1)

#Plot size
plt.subplots(figsize=(15,7))


#Saving dataframe
scored.to_csv('AutoEncoder_loss.csv')

#Plot
plt.plot(scored['Loss_mae'],'b',label='Prediction Loss')

plt.legend(loc='upper right')
plt.xlabel('Sample')
plt.ylabel('Loss, [mae]')

# Results Analysis

### Scatter plot for each algorithm scoring, to see through noise.

In [None]:
#Plot size
plt.subplots(figsize=(15,7))
#Reading loss csv file
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')
#Plot
plt.plot(enc_loss.index,enc_loss['Loss_mae'], 'g.', markersize=1,label="AutoEncoder Loss")
#Labels and legends
plt.legend(loc='upper right')
plt.xlabel('Sample')
#Show plot
plt.show()

plt.subplots(figsize=(15,7))
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
plt.plot(k_anomaly.index,k_anomaly['0'], 'g.', markersize=1,label="KM cluster Distance")
plt.legend(loc='upper right')
plt.xlabel('Sample')
plt.show()

plt.subplots(figsize=(15,7))
score = pd.read_csv('../input/created/SVM_Score.csv')
plt.plot(score.index,score['score'], 'g.', markersize=1,label="OCSVM score")
plt.legend(loc='upper right')
plt.xlabel('Sample')
plt.show()


**Plotting each algorithm scoring together, with OCSVM flipped over 0.**

In [None]:
#Plot size
plt.subplots(figsize=(15,7))

#Reading each scoring csv file
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

#Scaling data for visualization
k_distance = k_anomaly/k_anomaly.max()
svm_score = (score/score.max())*-1

plt.plot(enc_loss.index,enc_loss['Loss_mae'], label="AutoEncoder Loss")
plt.plot(svm_score.index, svm_score['score'],label="OCSVM score")
plt.plot(k_distance.index,k_distance['0'], label="Kmeans Euclidean Dist")



plt.gca().legend(('AutoEncoder Loss','OCSVM score * -1','Kmeans Euclidean Dist'))


**Looking for correlation between the algorithms**

In [None]:
#Reading score files
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

# Dataframe to see correlation
corr = pd.DataFrame()

#Passing score data to corr dataframe 
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']

#Seeing correlation
corr.corr()

**Scatter plot with moving mean**

In [None]:
#---- Reading data and passing it to dataframe again ----- 

k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']
#---------------------------------------------------------


#Plot size
plt.subplots(figsize=(15,7))

#Scatter plot of SVM score
plt.plot(corr.index, corr['SVM_score'], 'g.', markersize=1, label = 'OCSVM_score')
#Plotting moving mean of 1000 data points
plt.plot(corr.index, corr['SVM_score'].rolling(1000).mean(), 'r', markersize=1, label = 'Moving Mean')
#Legend
plt.legend(loc='upper right')
#Show
plt.show()


plt.subplots(figsize=(15,7))
  
plt.plot(corr.index, corr['KM_cluster_distance'], 'g.', markersize=1, label = 'KM_cluster_distance')
plt.plot(corr.index, corr['KM_cluster_distance'].rolling(1000).mean(), 'r', markersize=1, label = 'Moving Mean')

plt.legend(loc='upper right')
plt.show()





plt.subplots(figsize=(15,7))
  
plt.plot(corr.index, corr['AutoEnc_loss'], 'g.', markersize=1, label = 'AutoEnc_loss')
plt.plot(corr.index, corr['AutoEnc_loss'].rolling(1000).mean(), 'r', markersize=1, label = 'Moving Mean')


plt.legend(loc='upper right')
plt.show()



**Loss Distribution over training data**

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']



#Plot size
plt.subplots(figsize=(10,7))
#Hist plot of first 160,000 rows, 15 bins
sns.distplot(corr['SVM_score'].head(160000), bins=15)
#Show
plt.show()

plt.subplots(figsize=(10,7))
sns.distplot(corr['KM_cluster_distance'].head(160000),bins=15)
plt.show()

plt.subplots(figsize=(10,7))
sns.distplot(corr['AutoEnc_loss'].head(160000),bins=15)
plt.show()

**Loss distribution over entire dataset**

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']





plt.subplots(figsize=(10,7))
sns.distplot(corr['SVM_score'], bins=15)
plt.show()

plt.subplots(figsize=(10,7))
sns.distplot(corr['KM_cluster_distance'],bins=15)
plt.show()

plt.subplots(figsize=(10,7))
sns.distplot(corr['AutoEnc_loss'],bins=15)
plt.show()

**Now we will use the info on the Training Loss distribution to determine some thresholds for the graphs. Moving means will also be plotted.**

**_Upper threshold = Highest values of training loss distribution_**

**_Lower threshold = Lowest values of training loss distribution_**

**_Highest density = Mode of training loss distribution_**
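
**The three definitions above can also be computed instead of read off the plots. The following is a sketch under assumptions: `thresholds_from_training` is a hypothetical helper, the mode is approximated by the centre of the fullest histogram bin, and the demo scores are made-up numbers.**

```python
import numpy as np

def thresholds_from_training(train_scores, bins=15):
    """Return (lower, upper, highest_density) for a 1-D array of training scores."""
    train_scores = np.asarray(train_scores, dtype=float)
    counts, edges = np.histogram(train_scores, bins=bins)
    peak = np.argmax(counts)
    # Centre of the fullest bin approximates the mode of the distribution.
    highest_density = 0.5 * (edges[peak] + edges[peak + 1])
    return train_scores.min(), train_scores.max(), highest_density

demo_scores = np.array([0.01, 0.02, 0.02, 0.02, 0.03, 0.05, 0.10])
lower, upper, densest = thresholds_from_training(demo_scores, bins=3)
print(lower, upper, densest)
```

**In this notebook, `train_scores` would be the training slice of a score column, for example `corr['AutoEnc_loss'].head(160000)`.**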

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']



#Creating an array for the thresholds to be plotted over the entire dataset
lower_threshold = np.full((corr['SVM_score'].size, 1), 0)
upper_threshold = np.full((corr['SVM_score'].size, 1), 18000)
high_density_threshold = np.full((corr['SVM_score'].size, 1), 13250)

#Plot size
plt.subplots(figsize=(15,7))

#Score Plot
plt.plot(corr.index, corr['SVM_score'], 'k', markersize=1, label = 'OCSVM_score')
#Moving mean plot
plt.plot(corr.index, corr['SVM_score'].rolling(100).mean(), 'r', markersize=1, label = 'Moving Mean')
#Threshold plots
plt.plot(corr.index, lower_threshold, label='Lower Threshold')
plt.plot(corr.index, upper_threshold, label = 'Upper Threshold')
plt.plot(corr.index, high_density_threshold, label = 'Highest Density')
plt.legend(loc='upper right')
#Show
plt.show()


lower_threshold = np.full((corr['KM_cluster_distance'].size, 1), 1.2)
upper_threshold = np.full((corr['KM_cluster_distance'].size, 1), 17.5)
high_density_threshold = np.full((corr['KM_cluster_distance'].size, 1), 2.5)

plt.subplots(figsize=(15,7))
  
plt.plot(corr.index, corr['KM_cluster_distance'], 'k', markersize=1, label = 'KM_cluster_distance')
plt.plot(corr.index, corr['KM_cluster_distance'].rolling(100).mean(), 'r', markersize=1, label = 'Moving Mean')
plt.plot(corr.index, lower_threshold, label='Lower Threshold')
plt.plot(corr.index, upper_threshold, label = 'Upper Threshold')
plt.plot(corr.index, high_density_threshold, label = 'Highest Density')
plt.legend(loc='upper right')
plt.show()



lower_threshold = np.full((corr['AutoEnc_loss'].size, 1), 0)
upper_threshold = np.full((corr['AutoEnc_loss'].size, 1), 0.1)
high_density_threshold = np.full((corr['AutoEnc_loss'].size, 1), 0.05)

plt.subplots(figsize=(15,7))
  
plt.plot(corr.index, corr['AutoEnc_loss'], 'k', markersize=1, label = 'AutoEnc_loss')
plt.plot(corr.index, corr['AutoEnc_loss'].rolling(100).mean(), 'r', markersize=1, label = 'Moving Mean')
plt.plot(corr.index, lower_threshold, label='Lower Threshold')
plt.plot(corr.index, upper_threshold, label = 'Upper Threshold')
plt.plot(corr.index, high_density_threshold, label = 'Highest Density')
plt.legend(loc='upper right')
plt.show()


**Scatter plot of the algorithm scores vs each other**

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']




plt.subplots(figsize=(15,7))
  
plt.plot(corr['KM_cluster_distance'],corr['SVM_score'],'b.',markersize=1 )
plt.xlabel('KM')
plt.ylabel('SVM')
plt.show()


plt.subplots(figsize=(15,7))
  
plt.plot(corr['AutoEnc_loss'],corr['SVM_score'],'b.' ,markersize=1 )
plt.xlabel('Encoder')
plt.ylabel('SVM')
plt.show()

plt.subplots(figsize=(15,7))
  
plt.plot(corr['AutoEnc_loss'],corr['KM_cluster_distance'],'b.' ,markersize=1 )
plt.xlabel('Encoder')
plt.ylabel('KM')
plt.show()

# Continuing with Autoencoder

**Though there is some visual similarity between the anomalies found by the algorithms, the clustering algorithms give us a lot of noise and not much to work with. The autoencoder, on the other hand, shows an almost certain run-to-failure point. We can't conclude with absolute certainty that components were changed after the highest loss peak, but it is quite possible.**

**Furthermore, we will now analyze the loss of the encoder by month, with the thresholds.**  

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']

main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')



#Passing encoder loss to main dataframe, to make it easier to separate by month
main_df['AutoEnc_loss'] = corr['AutoEnc_loss']

#Getting list of months
months = main_df['month'].dropna().unique()

#Looping through every month
for month in months:
    #Grabbing the slice of the dataframe for each month 
    month_df = main_df.groupby('month').get_group(month)
    
    
    # Array Thresholds
    upper_threshold = np.full((month_df['AutoEnc_loss'].size, 1), 0.1)
    high_density_threshold = np.full((month_df['AutoEnc_loss'].size, 1), 0.05)

    #Plot
    plt.subplots(figsize=(15,7))
    plt.plot(month_df.index, month_df['AutoEnc_loss'], label=f'AutoEnc_loss month_{month}')
    plt.plot(month_df.index, upper_threshold, label = 'Upper Threshold')
    plt.plot(month_df.index, high_density_threshold, label = 'Highest Density')
    plt.legend(loc='upper right')
    plt.ylim(0,1.3)
    
    plt.show()
    
    

**Now we will see Loss distribution by month**

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']

main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')


main_df['AutoEnc_loss'] = corr['AutoEnc_loss']





months = main_df['month'].dropna().unique()

for month in months:
    month_df = main_df.groupby('month').get_group(month)
    
    
    
    plt.subplots(figsize=(15,7))
    sns.distplot((month_df['AutoEnc_loss']), bins=15).set_title(f'Month {month} Loss Distribution')
    #X axis limits
    plt.xlim([-1.2,1.2])
    plt.show()

    
    
    
    

**An interesting way to see the anomaly is the tail of the distribution. As we could see in the plots by month, month 4 has the highest anomaly peak and the most anomalies. The influence of that on the loss distribution is also observable. Maybe by looking at the kurtosis of each month we can get more info.**

**Kurtosis tells us about the shape of the distribution, in particular its tails. The pandas `kurtosis()` method returns excess kurtosis, which is 0 for a normal distribution. High kurtosis means most data points sit close together with only a few rare extreme values (in our case, many data points close to 0 means the machine is in good condition). Low kurtosis means the data points are spread out more evenly, so the distribution is broad, without a sharp peak standing out from its tails.**

In [None]:
k_anomaly = pd.read_csv('../input/created/KM_Distance.csv')
score = pd.read_csv('../input/created/SVM_Score.csv')
enc_loss = pd.read_csv('../input/created/AutoEncoder_loss.csv')

corr = pd.DataFrame()
corr['SVM_score'] = score['score']
corr['KM_cluster_distance'] = k_anomaly['0']
corr['AutoEnc_loss'] = enc_loss['Loss_mae']

main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')


main_df['AutoEnc_loss'] = corr['AutoEnc_loss']


months = main_df['month'].dropna().unique()

for month in months:
    month_df = main_df.groupby('month').get_group(month)
    kurt = (month_df['AutoEnc_loss']).kurtosis()
    print(f'Month {month} kurtosis = {kurt}')
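
**To help read these numbers: `Series.kurtosis()` in pandas reports excess kurtosis (Fisher's definition, where a normal distribution scores 0). The sketch below uses two made-up series to show the two extremes: values packed near zero with rare spikes, and values spread evenly.**

```python
import pandas as pd

# Mostly zero with two rare extremes: sharp peak, heavy tails.
rare_spikes = pd.Series([0.0] * 98 + [5.0, -5.0])
# Evenly spread values: broad shape, no extremes relative to the spread.
evenly_spread = pd.Series(range(100), dtype=float)

kurt_spikes = rare_spikes.kurtosis()   # large and positive
kurt_spread = evenly_spread.kurtosis() # negative
print(kurt_spikes, kurt_spread)
```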

**So months with low kurtosis are the months with more anomalies, which can tell us a little about the condition of the machine.**

**We won't go further with this distribution analysis approach, since I personally don't know much about distribution analysis. _But it may be a very interesting way to analyze the data_.**

# Sensor detection

**Now that we know where the machine has a problem, we will try to find which component/sensor is causing this disturbance in the autoencoder. For that we will train it again and get its predictions and losses for each column, to see which one contributes the most to the total loss.**

In [None]:
main_df = pd.read_csv('../input/datasetsone-year-compiledcsv/One_year_compiled.csv')
main_df = main_df.drop(['day', 'hour', 'sample_Number', 'month', 'timestamp'], axis=1)
main_df = handle_non_numeric(main_df)
X = main_df

scaler = preprocessing.MinMaxScaler()

X = pd.DataFrame(scaler.fit_transform(X), 
                              columns=X.columns, 
                              index=X.index)



X = preprocessing.scale(X)


train_percentage = 0.15
train_size = int(len(main_df.index)*train_percentage)

X_train = X[:train_size]

seed(10)

act_func = 'elu'

# Sequential model (the input shape is set on the first hidden layer)
model=Sequential()
# First hidden layer, connected to input vector X. 
model.add(Dense(50,activation=act_func,
                kernel_initializer='glorot_uniform',
                kernel_regularizer=regularizers.l2(0.0),
                input_shape=(X_train.shape[1],)
               )
         )

model.add(Dense(10,activation=act_func,
                kernel_initializer='glorot_uniform'))

model.add(Dense(50,activation=act_func,
                kernel_initializer='glorot_uniform'))

model.add(Dense(X_train.shape[1],
                kernel_initializer='glorot_uniform'))

model.compile(loss='mse',optimizer='adam')

# Train model for 50 epochs, batch size of 200
NUM_EPOCHS=50
BATCH_SIZE=200

history=model.fit(np.array(X_train),np.array(X_train),
                  batch_size=BATCH_SIZE, 
                  epochs=NUM_EPOCHS,
                  validation_split=0.1,
                  verbose = 1)

In [None]:
#Predicting and passing prediction to dataframe
X_pred = model.predict(np.array(X))
X_pred = pd.DataFrame(X_pred,columns=main_df.columns)
X_pred.index = pd.DataFrame(main_df).index

#Passing X from an array to a dataframe
X = pd.DataFrame(X,columns=main_df.columns)
X.index = pd.DataFrame(main_df).index

#Dataframe where all the loss per columns will go
loss_df = pd.DataFrame()

#Dropping mode as it can't logically contribute to degradation
main_df.drop('mode',axis=1, inplace=True)

#Iterating through columns
for column in main_df.columns:
    #Getting the loss of the prediction for that column
    loss_df[f'{column}'] = (X_pred[f'{column}'] - X[f'{column}']).abs()
     
    #Plotting the loss
    plt.subplots(figsize=(15,7))
    plt.plot(loss_df.index, loss_df[f'{column}'], label=f'{column} loss')
    plt.legend(loc='upper right')
    
    plt.show()

#Saving loss Dataframe
loss_df.to_csv('AutoEncoder_loss_p_column.csv')

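**Now we will apply the softmax function to each row so that the column losses in each row sum to 1, which we read as the relative contribution of each column to the total loss. Note that softmax weights the losses exponentially, so the values are not exact percentages of the row total.**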

In [None]:
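loss_p_column = pd.read_csv('../input/created/AutoEncoder_loss_p_column.csv', index_col=0)
#Wrapping the result so it stays a dataframe even if softmax returns a plain array
sftmax_df = pd.DataFrame(softmax(loss_p_column.to_numpy(), axis=1),
                         columns=loss_p_column.columns, index=loss_p_column.index)
sftmax_df.describe()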
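
**A small sketch of what softmax does to one row of losses, next to a plain share of the row total. Both sum to 1, but they are not the same numbers. The row values are made up, and the plain share is shown only for comparison; it is not what this notebook uses.**

```python
import numpy as np

row_loss = np.array([0.1, 0.2, 0.7])   # toy per-column losses for one row

# Softmax: exponentiate, then normalise. Sums to 1, but is not proportional.
exp_loss = np.exp(row_loss - row_loss.max())
softmax_weights = exp_loss / exp_loss.sum()

# Plain share: each loss divided by the row total. Also sums to 1.
plain_share = row_loss / row_loss.sum()

print(softmax_weights, plain_share)
```

**Softmax compresses the differences between small losses, so the largest contributor looks less dominant than it would as a plain share.**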

**Plotting the percentage of each columns contribution to total loss**

In [None]:
for column in sftmax_df.columns:
    

    plt.subplots(figsize=(15,7))
    plt.plot(sftmax_df.index, sftmax_df[f'{column}'], label=f'{column} loss')
    plt.legend(loc='upper right')
    
    plt.show()

**Now we can plot a stack plot to better visualize the contribution of each column to the total loss. As you will see, the blade's position contributes a lot to the total loss at the peak we saw. We will take a closer look at that slice.**

In [None]:
plt.subplots(figsize=(15,7))

#Labels for stackbar plot
df_label = ['Torque', 'Cut lag','Cut speed','Cut position','Film position','Film speed','Film lag','VAX']

#Stackbar plot
plt.stackplot(sftmax_df.index, sftmax_df['pCut::Motor_Torque'],
             sftmax_df['pCut::CTRL_Position_controller::Lag_error'],
             sftmax_df['pCut::CTRL_Position_controller::Actual_speed'],
              sftmax_df['pCut::CTRL_Position_controller::Actual_position'],
             sftmax_df['pSvolFilm::CTRL_Position_controller::Actual_position'],
             sftmax_df['pSvolFilm::CTRL_Position_controller::Actual_speed'],
             sftmax_df['pSvolFilm::CTRL_Position_controller::Lag_error'],
             sftmax_df['pSpintor::VAX_speed'],
             labels = df_label)

plt.legend(loc='upper center', ncol=8)

plt.ylim(0,1)

In [None]:
plt.subplots(figsize=(15,7))

df_label = ['Torque', 'Cut lag','Cut speed','Cut position','Film position','Film speed','Film lag','VAX']

#Grabbing the slice where the larger anomaly is
sftmax_df = sftmax_df[400000:600000]

plt.stackplot(sftmax_df.index, sftmax_df['pCut::Motor_Torque'],
             sftmax_df['pCut::CTRL_Position_controller::Lag_error'],
             sftmax_df['pCut::CTRL_Position_controller::Actual_speed'],
              sftmax_df['pCut::CTRL_Position_controller::Actual_position'],
             sftmax_df['pSvolFilm::CTRL_Position_controller::Actual_position'],
             sftmax_df['pSvolFilm::CTRL_Position_controller::Actual_speed'],
             sftmax_df['pSvolFilm::CTRL_Position_controller::Lag_error'],
             sftmax_df['pSpintor::VAX_speed'],
             labels = df_label)

plt.legend(loc='upper center', ncol=8)

plt.ylim(0,1)

**Looks like the Lag Error for the blade also accounts for a big slice of the total loss. A possible explanation is that the blade is worn and, because of that, is starting to deviate from the path the machine tries to trace for it when cutting the film.**

**Now we will look at the distribution of each column's contribution to the total loss, just to get a better grasp of which sensors are giving higher loss.**

In [None]:
for column in sftmax_df.columns:
    

    plt.subplots(figsize=(15,7))
    
    sns.distplot(( sftmax_df[f'{column}']), bins=15).set_title(f'{column} Contribution Distribution')
    plt.xlim(0,1)
    
    plt.show()

# Conclusion 

**With all the info gathered we could tell where and when the machine suffered massive degradation. We could also tell which of the measurements contribute most to the loss over the machine's whole year, which tells us these components may need more attention. Percentile thresholds, distribution anomaly analysis, SVMs and many other methods can be used to detect component wear over time with the info given here.**

**A problem with this method is that the data must be preprocessed and scaled as a whole before being given to the algorithm. A way to overcome this is to take this dataset, or a slice of it, and combine it with a new slice of data for system health analysis. For each new slice of data that needs analyzing, we remove the previously added slice and keep this dataset intact.**
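
**The workaround described above could look roughly like this. It is a sketch under assumptions: `scale_new_slice` is a hypothetical helper, standardisation is done by hand in NumPy with the population standard deviation, and the arrays are toy numbers.**

```python
import numpy as np

def scale_new_slice(reference, new_slice):
    """Standardise new_slice together with the untouched reference rows."""
    combined = np.vstack([reference, new_slice])
    mean = combined.mean(axis=0)
    std = combined.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero on constant columns
    scaled = (combined - mean) / std
    # Only the rows belonging to the new slice are returned for scoring.
    return scaled[len(reference):]

reference = np.array([[0.0, 10.0], [2.0, 10.0], [4.0, 10.0]])
incoming = np.array([[6.0, 10.0]])
scaled_incoming = scale_new_slice(reference, incoming)
print(scaled_incoming)
```

**The reference array is never modified, so the next incoming slice is scaled against the same intact dataset.**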

# References

**-- https://www.sciencedirect.com/science/article/pii/S221282711830307X**

**-- https://towardsdatascience.com/machine-learning-for-anomaly-detection-and-condition-monitoring-d4614e7de770**

