A neural network is a system built to mimic the human brain's neurons by training with data. The general steps for building a Neural Network include: collecting and preparing the data, defining the model used for the network, training the network by epochs and checking the accuracy on a test set.

To check the performance of a NN, we can use a performance test to get the loss score and/or accuracy score. This is the case because of we need to make sure the Network is good enough for data prediction by comparing it to already existing data.

In [414]:
#pip install tensorflow

In [415]:
#pip install keras

In [416]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential 
from tensorflow.keras.layers import Dense
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

In [417]:
df = pd.read_csv("arrhythmia.data")
df.head()

Unnamed: 0,75,0,190,80,91,193,371,174,121,-16,...,0.0.38,9.0,-0.9,0.0.39,0.0.40,0.9.2,2.9.1,23.3,49.4,8
0,56,1,165,64,81,174,401,149,39,25,...,0.0,8.5,0.0,0.0,0.0,0.2,2.1,20.4,38.8,6
1,54,0,172,95,138,163,386,185,102,96,...,0.0,9.5,-2.4,0.0,0.0,0.3,3.4,12.3,49.0,10
2,55,0,175,94,100,202,380,179,143,28,...,0.0,12.2,-2.2,0.0,0.0,0.4,2.6,34.6,61.6,1
3,75,0,190,80,88,181,360,177,103,-16,...,0.0,13.1,-3.6,0.0,0.0,-0.1,3.9,25.4,62.8,7
4,13,0,169,51,100,167,321,174,91,107,...,-0.6,12.2,-2.8,0.0,0.0,0.9,2.2,13.5,31.1,14


In [418]:
null_data_sum = df.isnull().sum().sum()
object_columns = df.select_dtypes(include=['object']).columns

null_data_sum 
object_columns

Index(['13', '64', '-2', '?', '63'], dtype='object')

In [419]:
df.replace('?',np.nan,inplace=True)

In [420]:
df[object_columns] = df[object_columns].apply(pd.to_numeric)

In [421]:
df.drop('?',axis=1)

Unnamed: 0,75,0,190,80,91,193,371,174,121,-16,...,0.0.38,9.0,-0.9,0.0.39,0.0.40,0.9.2,2.9.1,23.3,49.4,8
0,56,1,165,64,81,174,401,149,39,25,...,0.0,8.5,0.0,0.0,0.0,0.2,2.1,20.4,38.8,6
1,54,0,172,95,138,163,386,185,102,96,...,0.0,9.5,-2.4,0.0,0.0,0.3,3.4,12.3,49.0,10
2,55,0,175,94,100,202,380,179,143,28,...,0.0,12.2,-2.2,0.0,0.0,0.4,2.6,34.6,61.6,1
3,75,0,190,80,88,181,360,177,103,-16,...,0.0,13.1,-3.6,0.0,0.0,-0.1,3.9,25.4,62.8,7
4,13,0,169,51,100,167,321,174,91,107,...,-0.6,12.2,-2.8,0.0,0.0,0.9,2.2,13.5,31.1,14
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
446,53,1,160,70,80,199,382,154,117,-37,...,0.0,4.3,-5.0,0.0,0.0,0.7,0.6,-4.4,-0.5,1
447,37,0,190,85,100,137,361,201,73,86,...,0.0,15.6,-1.6,0.0,0.0,0.4,2.4,38.0,62.4,10
448,36,0,166,68,108,176,365,194,116,-85,...,0.0,16.3,-28.6,0.0,0.0,1.5,1.0,-44.2,-33.2,2
449,32,1,155,55,93,106,386,218,63,54,...,-0.4,12.0,-0.7,0.0,0.0,0.5,2.4,25.0,46.6,1


In [422]:
df.fillna(df.median(numeric_only=True), inplace=True)

na_after_cleaning = df.isna().sum().sum()
na_after_cleaning

np.int64(0)

In [423]:
X = df.iloc[:, :-1].values
y = df.iloc[:, :-1].values

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=42)

In [424]:
X_train_tensor = torch.FloatTensor(X_train)
X_test_tensor = torch.FloatTensor(X_test)

y_test_tensor = torch.LongTensor(y_test)
y_train_tensor = torch.LongTensor(y_train)


In [425]:
if len(y_train.shape) > 1:
    y_train_tensor = torch.argmax(torch.from_numpy(y_train), dim=1)

In [426]:
train_df = TensorDataset(X_train_tensor, y_train_tensor)
train_loader = DataLoader(train_df, batch_size=32, shuffle=True)

In [427]:
class ANN_Model(nn.Module):
    def __init__(self, input_size=X_train.shape[1], hidden_size=20, output_size=len(np.unique(y_train))):
        super(ANN_Model, self).__init__() 
        self.layer_1_connection = nn.Linear(input_size, hidden_size)
        self.layer_2_connection = nn.ReLU()
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = x.view(x.size(0),-1)
        x = F.relu(self.layer_1_connection(x))
        x = F.relu(self.layer_2_connection(x))
        x = self.out(x)
        return x

In [428]:
torch.manual_seed(42)
ann = ANN_Model(input_size, hidden_size, output_size)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(ann.parameters(), lr = 0.01)

In [429]:
print(f"X_train_tensor shape: {X_train_tensor.shape}")
print(f"fc1 input size: {ann.layer_1_connection}")
print(f"fc1 output size: {ann.out}")

X_train_tensor shape: torch.Size([315, 279])
fc1 input size: Linear(in_features=279, out_features=64, bias=True)
fc1 output size: Linear(in_features=64, out_features=1650, bias=True)


In [None]:
final_loss = []
n_epochs = 500
for epoch in range(n_epochs):
    y_pred = ann.forward(X_train_tensor)
    loss = loss_function(y_pred, y_train_tensor)
    final_loss.append(loss) 

    if epoch % 10 == 1:
        print(f'Epoch number: {epoch} with loss {loss}')

    optimizer.zero_grad() 
    loss.backward() 
    optimizer.step()

Epoch number: 1 with loss 0.4062550365924835
Epoch number: 11 with loss 0.3516708016395569
Epoch number: 21 with loss 0.8409609794616699
Epoch number: 31 with loss 0.6896562576293945
Epoch number: 41 with loss 0.33448296785354614
Epoch number: 51 with loss 0.05656949058175087
Epoch number: 61 with loss 0.003020011354237795
Epoch number: 71 with loss 1.506164295506096e-07
Epoch number: 81 with loss 1.8976976207341067e-05
Epoch number: 91 with loss 1.1777136023738421e-05
Epoch number: 101 with loss 1.0548417776590213e-05
Epoch number: 111 with loss 5.4623151299892925e-06
Epoch number: 121 with loss 2.1701578134525334e-06
Epoch number: 131 with loss 1.1684709306791774e-06
Epoch number: 141 with loss 8.056321689764445e-07
Epoch number: 151 with loss 6.368754839058965e-07
Epoch number: 161 with loss 5.396287861003657e-07
Epoch number: 171 with loss 4.737871108773106e-07
Epoch number: 181 with loss 4.245943614478165e-07
Epoch number: 191 with loss 3.8599657159466005e-07
Epoch number: 201 wit

In [449]:

with torch.no_grad():
    y_test_pred = ann(X_test_tensor)

if len(y_test_tensor.shape) > 1:
    y_test_tensor = torch.argmax(y_test_tensor, dim=1)

test_accuracy = (y_test_pred_class == y_test_tensor).float().mean().item()
print(test_accuracy)

0.9926470518112183


After comparing the loss and accuracy scores of this neural network vs the one I created for Quiz 12, I can say this one had a better loss score, being the last number 9.0 on this and accuracy of 0.99, which was way higher than the network created with the diabetes df, even after changin it to adam optimizer. I think this one performed better because of how I handled the data and because it was mostly numeric to begin with.