# 9.6 Assignment: Artificial Neural Networks

The following ANN uses the Steel Plates Faults Dataset, which has the following attributes:

DEPENDENT VARIABLES (7 types of steel plates faults)
- Pastry

- Z_scratch

- K_scratch

- Stains

- Dirtiness

- Bumps

- Other_Faults

INDEPENDENT VARIABLES (all numeric except the 2 noted below)
- X_Minimum	

- X_Maximum	

- Y_Minimum

- Y_Maximum

- Pixels_Areas

- X_Perimeter

- Y_Perimeter

- Sum_of_Luminosity

- Minimum_of_Luminosity

- Maximum_of_Luminosity

- Length_of_Conveyer

- TypeOfSteel_A300 (categorical)

- TypeOfSteel_A400 (categorical)

- Steel_Plate_Thickness

- Edges_Index

- Empty_Index

- Square_Index

- Outside_X_Index

- Edges_X_Index

- Edges_Y_Index

- Outside_Global_Index

- LogOfAreas

- Log_X_Index

- Log_Y_Index

- Orientation_Index

- Luminosity_Index

- SigmoidOfAreas

### TABLE OF CONTENTS:
[Data processing](#Data-processing)

[ANN](#ANN)

[Conclusions](#Conclusions)

### Objectives
- build a neural network to see how well it can predict the type of faults in steel plates from numerical attributes only

In [1]:
#libraries
import pandas as pd
import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Data processing

In [22]:
#load data
col_names = ['x_min', 'x_max', 'y_min', 'y_max', 'pixels_areas', 'x_perim', 'y_perim', 'sum_lumin', 'min_lumin',
            'max_lumin', 'len_conveyer', 'type_a300', 'type_a400', 'thickness', 'edges_idx', 'empty_idx',
            'square_idx', 'outside_x_idx', 'edges_x_idx', 'edges_y_idx', 'outside_global_idx', 
            'log_areas', 'log_x_idx', 'log_y_idx', 'orientation_idx', 'lumin_idx', 'sigmoid_areas', 
             'pastry', 'z_scratch', 'k_scratch', 'stains', 'dirtiness', 'bumps', 'other_faults',]

data = pd.read_csv('Assignment 9 dataset-Faults.nna', sep = '\t', names = col_names)
data.head()

Unnamed: 0,x_min,x_max,y_min,y_max,pixels_areas,x_perim,y_perim,sum_lumin,min_lumin,max_lumin,...,orientation_idx,lumin_idx,sigmoid_areas,pastry,z_scratch,k_scratch,stains,dirtiness,bumps,other_faults
0,42,50,270900,270944,267,17,44,24220,76,108,...,0.8182,-0.2913,0.5822,1,0,0,0,0,0,0
1,645,651,2538079,2538108,108,10,30,11397,84,123,...,0.7931,-0.1756,0.2984,1,0,0,0,0,0,0
2,829,835,1553913,1553931,71,8,19,7972,99,125,...,0.6667,-0.1228,0.215,1,0,0,0,0,0,0
3,853,860,369370,369415,176,13,45,18996,99,126,...,0.8444,-0.1568,0.5212,1,0,0,0,0,0,0
4,1289,1306,498078,498335,2409,60,260,246930,37,126,...,0.9338,-0.1992,1.0,1,0,0,0,0,0,0


In [3]:
#check df shape
data.shape

(1941, 34)

In [4]:
#check for nulls
data.isnull().any()

pastry                False
z_scratch             False
k_scratch             False
stains                False
dirtiness             False
bumps                 False
other_faults          False
x_min                 False
x_max                 False
y_min                 False
y_max                 False
pixels_areas          False
x_perim               False
y_perim               False
sum_lumin             False
min_lumin             False
max_lumin             False
len_conveyer          False
type_a300             False
type_a400             False
thickness             False
edges_idx             False
empty_idx             False
square_idx            False
outside_x_idx         False
edges_x_idx           False
edges_y_idx           False
outside_global_idx    False
log_areas             False
log_x_idx             False
log_y_idx             False
orientation_idx       False
lumin_idx             False
sigmoid_areas         False
dtype: bool

In [5]:
#find empty cells
def find_empty(df, col_list):
    empty_cells = []
    for column in col_list:
        for index, value in df[column].items():
            if value == ' ':
                empty_cells.append((index, column, value))
    return empty_cells

find_empty(data, col_names)

[]
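The loop above visits every cell and only matches a single space. As a rough sketch of a vectorized alternative, the hypothetical helper `count_blank_cells` below (not part of the original notebook) counts cells that are empty or whitespace-only in each column. It is demonstrated on a small made-up frame rather than the real data.

```python
import pandas as pd

def count_blank_cells(df: pd.DataFrame) -> pd.Series:
    """Return, per column, how many cells are empty or whitespace-only strings."""
    as_text = df.astype(str)  # numbers become their string form, e.g. "1"
    blank = as_text.apply(lambda col: col.str.strip().eq(""))
    return blank.sum()

# made-up demo frame: column "b" holds one single-space cell and one empty cell
demo = pd.DataFrame({"a": [1, 2, 3], "b": ["x", " ", ""], "c": [0.5, 1.5, 2.5]})
blank_counts = count_blank_cells(demo)
print(blank_counts[blank_counts > 0])
```

On the notebook's data, the equivalent call would be `count_blank_cells(data)`, and an all-zero result would agree with the empty list returned above.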

In [6]:
data.describe()

Unnamed: 0,pastry,z_scratch,k_scratch,stains,dirtiness,bumps,other_faults,x_min,x_max,y_min,...,outside_x_idx,edges_x_idx,edges_y_idx,outside_global_idx,log_areas,log_x_idx,log_y_idx,orientation_idx,lumin_idx,sigmoid_areas
count,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,...,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0,1941.0
mean,571.136012,617.964451,1650685.0,1650739.0,1893.878413,111.855229,82.965997,206312.1,84.548686,130.193715,...,0.083288,-0.131305,0.58542,0.081401,0.097888,0.201443,0.037094,0.028336,0.20711,0.346728
std,520.690671,497.62741,1774578.0,1774590.0,5168.45956,301.209187,426.482879,512293.6,32.134276,18.690992,...,0.500868,0.148767,0.339452,0.273521,0.297239,0.401181,0.189042,0.165973,0.405339,0.476051
min,0.0,4.0,6712.0,6724.0,2.0,2.0,1.0,250.0,0.0,37.0,...,-0.991,-0.9989,0.119,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,51.0,192.0,471253.0,471281.0,84.0,15.0,13.0,9522.0,63.0,124.0,...,-0.3333,-0.195,0.2482,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,435.0,467.0,1204128.0,1204136.0,174.0,26.0,25.0,19202.0,90.0,127.0,...,0.0952,-0.133,0.5063,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,1053.0,1072.0,2183073.0,2183084.0,822.0,84.0,83.0,83011.0,106.0,140.0,...,0.5116,-0.0666,0.9998,0.0,0.0,0.0,0.0,0.0,0.0,1.0
max,1705.0,1713.0,12987660.0,12987690.0,152655.0,10449.0,18152.0,11591410.0,203.0,253.0,...,0.9917,0.6421,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [18]:
#initialize standard scaler
scaler = StandardScaler()

In [25]:
#drop categorical attributes
data = data.drop(columns = ['type_a300', 'type_a400'])

#set input and output features
X = data.iloc[:, :25]
X = scaler.fit_transform(X)
#last seven columns (the fault types) will be the output
y = data.iloc[:, 25:]
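The cell above fits the scaler on all rows before the train/test split, so the test rows contribute to the mean and standard deviation used for scaling. A common alternative is to split first and fit the scaler on the training rows only. The sketch below illustrates that ordering on synthetic data. The names `X_demo`, `y_demo` and `demo_scaler` are illustrative and not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# synthetic stand-ins for the 25 numeric features and the 7 binary label columns
X_demo = rng.normal(loc=5.0, scale=3.0, size=(200, 25))
y_demo = rng.integers(0, 2, size=(200, 7))

# split first ...
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3, random_state=2)

# ... then learn the scaling statistics from the training rows only
demo_scaler = StandardScaler()
X_tr_scaled = demo_scaler.fit_transform(X_tr)
X_te_scaled = demo_scaler.transform(X_te)  # reuses the training mean and std
```

With this ordering the test set stays unseen until evaluation. Whether it changes the results noticeably on this dataset has not been checked here.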

In [26]:
#examine y
y

Unnamed: 0,pastry,z_scratch,k_scratch,stains,dirtiness,bumps,other_faults
0,1,0,0,0,0,0,0
1,1,0,0,0,0,0,0
2,1,0,0,0,0,0,0
3,1,0,0,0,0,0,0
4,1,0,0,0,0,0,0
...,...,...,...,...,...,...,...
1936,0,0,0,0,0,0,1
1937,0,0,0,0,0,0,1
1938,0,0,0,0,0,0,1
1939,0,0,0,0,0,0,1
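The rows shown above each carry a single 1, which suggests the labels are one-hot encoded, but the output alone does not prove it for every row. A minimal sketch of how this could be checked, using a small made-up frame (`y_demo`) in place of the real `y`:

```python
import pandas as pd

fault_cols = ["pastry", "z_scratch", "k_scratch", "stains",
              "dirtiness", "bumps", "other_faults"]

# made-up stand-in for y: three plates, one fault each
y_demo = pd.DataFrame(
    [[1, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 1]],
    columns=fault_cols,
)

labels_per_row = y_demo.sum(axis=1)             # number of faults flagged per plate
is_one_hot = bool((labels_per_row == 1).all())  # True if every plate has exactly one fault
class_names = y_demo.idxmax(axis=1)             # fault name per plate (meaningful when one-hot)
print(is_one_hot)
print(class_names.value_counts())
```

Running the same three lines on `y` would also show how balanced the seven fault types are, which matters when interpreting accuracy later.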


# ANN

In [27]:
#convert df features to numpy arrays
X_numpy = X.values if isinstance(X, pd.DataFrame) else X
y_numpy = y.values if isinstance(y, pd.DataFrame) else y 

In [28]:
#split data
X_train, X_test, y_train, y_test = train_test_split(X_numpy, y_numpy, test_size = 0.3, random_state = 2)

In [29]:
#convert split arrays to tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)

y_train_tensor = torch.tensor(y_train, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32)

In [30]:
#initialize model w/ 2 linear layers: a hidden layer of 7 neurons with rectified linear unit (ReLU) activation for speed,
#and a 7-unit output layer with a sigmoid so that each fault type gets its own probability; 25 input variables
neural_network = nn.Sequential(
                 nn.Linear(25, 7), 
                 nn.ReLU(), 
                 nn.Linear(7,7),  
                 nn.Sigmoid())

neural_network

Sequential(
  (0): Linear(in_features=25, out_features=7, bias=True)
  (1): ReLU()
  (2): Linear(in_features=7, out_features=7, bias=True)
  (3): Sigmoid()
)
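To get a feel for how small this network is, the trainable parameters can be counted. The sketch below rebuilds the same architecture under the illustrative name `demo_net` and sums the sizes of its weight and bias tensors.

```python
import torch.nn as nn

# same architecture as above, rebuilt under a separate name for illustration
demo_net = nn.Sequential(
    nn.Linear(25, 7),
    nn.ReLU(),
    nn.Linear(7, 7),
    nn.Sigmoid(),
)

# first layer: 25*7 weights + 7 biases; second layer: 7*7 weights + 7 biases
n_params = sum(p.numel() for p in demo_net.parameters() if p.requires_grad)
print(n_params)  # 182 + 56 = 238
```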

In [31]:
#define loss function as binary cross entropy (BCELoss)
loss_criterion = nn.BCELoss()
optimizer = optim.SGD(neural_network.parameters(), lr=0.01)
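For reference, binary cross-entropy averages $-[y \log p + (1 - y)\log(1 - p)]$ over all output entries. It is never negative when the targets lie in $[0, 1]$, and it is not capped at 1. The sketch below computes it by hand on made-up probabilities and compares the result with `nn.BCELoss`.

```python
import torch
import torch.nn as nn

# made-up predicted probabilities and 0/1 targets (2 samples, 3 labels)
probs = torch.tensor([[0.9, 0.2, 0.1],
                      [0.3, 0.8, 0.4]])
targets = torch.tensor([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0]])

# mean over all entries of -[y*log(p) + (1-y)*log(1-p)]
manual_bce = -(targets * torch.log(probs)
               + (1 - targets) * torch.log(1 - probs)).mean()
library_bce = nn.BCELoss()(probs, targets)

print(manual_bce.item(), library_bce.item())
```

A prediction of 0.5 for every entry gives a loss of about 0.693 ($\ln 2$), which is a useful reference point when reading the training log.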

In [34]:
#set number of epochs
n_epochs = 1000
#set batch size (number of data points per batch)
batch_s = 100

#train model
for epoch in range(n_epochs):
    neural_network.train()
    for i in range(0, len(X_train_tensor), batch_s):
        #separate x, y into batches, make preds, and calculate loss
        X_batch = X_train_tensor[i:i+batch_s]
        y_batch = y_train_tensor[i:i+batch_s]
        
        y_pred = neural_network(X_batch)
        
        loss = loss_criterion(y_pred, y_batch)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f'epoch: {epoch+1}, loss was: {loss.item()}')

epoch: 1, loss was: 0.4454731047153473
epoch: 2, loss was: 0.44378459453582764
epoch: 3, loss was: 0.44211891293525696
epoch: 4, loss was: 0.44047263264656067
epoch: 5, loss was: 0.4388485252857208
epoch: 6, loss was: 0.43724486231803894
epoch: 7, loss was: 0.4356614053249359
epoch: 8, loss was: 0.4340986907482147
epoch: 9, loss was: 0.4325554370880127
epoch: 10, loss was: 0.43103185296058655
epoch: 11, loss was: 0.4295293390750885
epoch: 12, loss was: 0.42804640531539917
epoch: 13, loss was: 0.42658179998397827
epoch: 14, loss was: 0.4251358211040497
epoch: 15, loss was: 0.42371082305908203
epoch: 16, loss was: 0.4223022758960724
epoch: 17, loss was: 0.4209120571613312
epoch: 18, loss was: 0.41954129934310913
epoch: 19, loss was: 0.41818967461586
epoch: 20, loss was: 0.4168568551540375
epoch: 21, loss was: 0.41554248332977295
epoch: 22, loss was: 0.41424688696861267
epoch: 23, loss was: 0.41296935081481934
epoch: 24, loss was: 0.41170814633369446
epoch: 25, loss was: 0.410463392734527

epoch: 245, loss was: 0.29427799582481384
epoch: 246, loss was: 0.29407748579978943
epoch: 247, loss was: 0.2938784658908844
epoch: 248, loss was: 0.29368114471435547
epoch: 249, loss was: 0.29348570108413696
epoch: 250, loss was: 0.2932918667793274
epoch: 251, loss was: 0.2930999994277954
epoch: 252, loss was: 0.2929101288318634
epoch: 253, loss was: 0.2927214801311493
epoch: 254, loss was: 0.29253461956977844
epoch: 255, loss was: 0.2923491597175598
epoch: 256, loss was: 0.2921653389930725
epoch: 257, loss was: 0.2919827103614807
epoch: 258, loss was: 0.29180192947387695
epoch: 259, loss was: 0.2916226089000702
epoch: 260, loss was: 0.29144465923309326
epoch: 261, loss was: 0.2912682890892029
epoch: 262, loss was: 0.2910933792591095
epoch: 263, loss was: 0.2909198999404907
epoch: 264, loss was: 0.29074785113334656
epoch: 265, loss was: 0.2905772626399994
epoch: 266, loss was: 0.29040807485580444
epoch: 267, loss was: 0.2902400493621826
epoch: 268, loss was: 0.29007330536842346
...

epoch: 545, loss was: 0.26459556818008423
epoch: 546, loss was: 0.2645314037799835
epoch: 547, loss was: 0.26446732878685
epoch: 548, loss was: 0.26440325379371643
epoch: 549, loss was: 0.2643395960330963
epoch: 550, loss was: 0.26427584886550903
epoch: 551, loss was: 0.2642119228839874
epoch: 552, loss was: 0.2641480565071106
epoch: 553, loss was: 0.26408472657203674
epoch: 554, loss was: 0.2640236020088196
epoch: 555, loss was: 0.26396268606185913
epoch: 556, loss was: 0.2639019191265106
epoch: 557, loss was: 0.2638412117958069
epoch: 558, loss was: 0.26378053426742554
epoch: 559, loss was: 0.2637196183204651
epoch: 560, loss was: 0.2636588215827942
epoch: 561, loss was: 0.2635982036590576
epoch: 562, loss was: 0.2635377049446106
epoch: 563, loss was: 0.26347723603248596
epoch: 564, loss was: 0.2634163796901703
epoch: 565, loss was: 0.2633554935455322
epoch: 566, loss was: 0.26329460740089417
epoch: 567, loss was: 0.26323387026786804
epoch: 568, loss was: 0.2631731927394867
...

epoch: 779, loss was: 0.25266793370246887
epoch: 780, loss was: 0.25262773036956787
epoch: 781, loss was: 0.2525878846645355
epoch: 782, loss was: 0.25254806876182556
epoch: 783, loss was: 0.2525083124637604
epoch: 784, loss was: 0.25246864557266235
epoch: 785, loss was: 0.2524288594722748
epoch: 786, loss was: 0.25238925218582153
epoch: 787, loss was: 0.25234994292259216
epoch: 788, loss was: 0.25231072306632996
epoch: 789, loss was: 0.2522715926170349
epoch: 790, loss was: 0.25223270058631897
epoch: 791, loss was: 0.25219377875328064
epoch: 792, loss was: 0.2521548271179199
epoch: 793, loss was: 0.25211647152900696
epoch: 794, loss was: 0.25207817554473877
epoch: 795, loss was: 0.25203990936279297
epoch: 796, loss was: 0.2520018219947815
epoch: 797, loss was: 0.2519637942314148
epoch: 798, loss was: 0.2519257664680481
epoch: 799, loss was: 0.25188764929771423
epoch: 800, loss was: 0.2518499493598938
epoch: 801, loss was: 0.25181230902671814
epoch: 802, loss was: 0.25177472829818726
...
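The training loop above slices the batches in the same order every epoch and prints the loss of the last batch only. One possible variation, sketched here on synthetic data, uses `DataLoader` to reshuffle the batches each epoch and reports the mean loss over the whole epoch. All names (`X_demo`, `y_demo`, `demo_net`, `epoch_losses`) are illustrative, and the five epochs are only there to keep the example short.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# synthetic stand-ins: 300 samples, 25 features, one-hot labels over 7 classes
X_demo = torch.randn(300, 25)
y_demo = torch.zeros(300, 7)
y_demo[torch.arange(300), torch.randint(0, 7, (300,))] = 1.0

loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=100, shuffle=True)

demo_net = nn.Sequential(nn.Linear(25, 7), nn.ReLU(), nn.Linear(7, 7), nn.Sigmoid())
criterion = nn.BCELoss()
opt = optim.SGD(demo_net.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(5):
    demo_net.train()
    running, seen = 0.0, 0
    for X_batch, y_batch in loader:  # batches are reshuffled every epoch
        opt.zero_grad()
        loss = criterion(demo_net(X_batch), y_batch)
        loss.backward()
        opt.step()
        running += loss.item() * len(X_batch)  # weight each batch by its size
        seen += len(X_batch)
    epoch_losses.append(running / seen)        # mean loss over the full epoch
    print(f'epoch: {epoch+1}, mean loss was: {epoch_losses[-1]}')
```

The epoch mean is less sensitive to which samples happen to fall in the final batch, so it tends to give a smoother curve than the last-batch loss.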

In [33]:
#first round, after training with n_epochs = 100
#evaluate model
neural_network.eval()
#compare probability predictions to y_test values
with torch.no_grad():
    test_out = neural_network(X_test_tensor)
    test_loss = loss_criterion(test_out, y_test_tensor)

print(f'test loss: {test_loss.item()}')

test loss: 0.4478584825992584


In [37]:
#second round, after changing n_epochs to 1000
#evaluate model
neural_network.eval()
#compare probability predictions to y_test values
with torch.no_grad():
    test_output = neural_network(X_test_tensor)
    test_loss = loss_criterion(test_output, y_test_tensor)

print(f'test loss: {test_loss.item()}')

test loss: 0.2420000433921814


In [40]:
#evaluate accuracy of second model (element-wise, over all 7 label columns)
accuracy = (test_output.round() == y_test_tensor).float().mean()
accuracy

tensor(0.8961)
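The accuracy above compares every rounded output with every label entry, so the many zeros in the label matrix count as correct predictions. The sketch below uses made-up labels and probabilities to contrast that element-wise figure with two per-plate measures: accuracy of the most probable class (argmax) and exact-match accuracy. It also shows the element-wise score of predicting all zeros when labels are one-hot.

```python
import torch

# made-up one-hot labels for 4 plates and 7 fault types
y_true = torch.tensor([[1., 0., 0., 0., 0., 0., 0.],
                       [0., 1., 0., 0., 0., 0., 0.],
                       [0., 0., 0., 0., 0., 0., 1.],
                       [0., 0., 1., 0., 0., 0., 0.]])

# made-up sigmoid outputs: rows 0 and 3 are confidently right,
# row 1 is right but below 0.5, row 2 picks the wrong fault
probs = torch.tensor([[0.8, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
                      [0.1, 0.4, 0.3, 0.1, 0.1, 0.1, 0.1],
                      [0.1, 0.1, 0.1, 0.1, 0.1, 0.6, 0.2],
                      [0.1, 0.1, 0.7, 0.1, 0.1, 0.1, 0.1]])

# element-wise score of an "always zero" predictor: 6 of 7 entries per row
elementwise_baseline = (torch.zeros_like(y_true) == y_true).float().mean()

elementwise_acc = (probs.round() == y_true).float().mean()                   # as in the cell above
argmax_acc = (probs.argmax(dim=1) == y_true.argmax(dim=1)).float().mean()    # most probable class
exact_match_acc = (probs.round() == y_true).all(dim=1).float().mean()        # whole row must match

print(elementwise_baseline.item(), elementwise_acc.item(),
      argmax_acc.item(), exact_match_acc.item())
```

In this toy case the element-wise accuracy is 25/28, while only 3 of 4 plates get the right fault by argmax and 2 of 4 rows match exactly. Applying the same comparisons to `test_output` and `y_test_tensor` would show how the reported figure translates into per-plate performance.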

# Conclusions

Initially, I had mislabeled the input and output variables in the data, even though the outputs were already in binary format. Because the standardization was then applied to the wrong columns, the targets passed to the loss function were no longer 0/1 values, and the loss values of the neural network came out negative. This is unusual and indicates an error in the data processing or in the build of the model itself: binary cross-entropy is non-negative (though not bounded above by 1) whenever the targets lie between 0 and 1. I chose this loss function because the network outputs, for each plate, a probability for each of the seven fault types.

After fixing the variable assignment and keeping the standardization of the numerical independent features, the BCELoss outputs were non-negative, as expected. I initially ran my model with 100 epochs, and the model's steady improvement was evident across the epoch and batch iterations: the training loss decreased from 0.7 in the first epoch to 0.45 (the test loss was also 0.45). Although this loss value indicated that the model performed better than random guessing, I decided to run the code again with 1,000 epochs to compare the improvement after many more iterations of learning. The training of the model improved significantly: the last training loss was 0.2448, and the test loss came out to 0.2420, slightly lower than the loss in the last training round.

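Since the loss value can be a bit hard to interpret, I also computed the accuracy by comparing the rounded outputs to the original labels and found the model to have an accuracy of about 90%. This figure is computed element-wise over all seven label columns, so it should be read with care: if each plate carries exactly one fault, as in the rows shown above, predicting all zeros would already match 6 of 7 entries (about 86%). It is still an encouraging outcome considering that no extensive fine-tuning was done up to this point. Since the 1,000 epochs did not present much of a computational problem, I could imagine the neural network improving further with more training/learning, though reaching close to 100% accuracy would need to be confirmed on held-out data.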