1) What is a neural network? What are the general steps required to build a neural network? 

A neural network is a type of data modeling that mimics human thought processes in order to identify and understand patterns in data.  It is especially good at accounting for interactions between input variables.  The basic steps of a neural network start with dataset variables, which are fed through one or more hidden layers, ultimately yielding your output or outputs.  The model can be fine-tuned through forward and backward propagation.

2) Generally, how do you check the performance of a neural network? Why? 

3) Create a neural network using keras to predict the outcome of either of these datasets: 

Cardiac Arrhythmia: https://archive.ics.uci.edu/ml/datasets/Arrhythmia 

Abalone age: https://archive.ics.uci.edu/ml/datasets/Abalone

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import cross_validate
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import AdaBoostClassifier
from imblearn.combine import SMOTEENN
from imblearn.under_sampling import EditedNearestNeighbours

arr_data = pd.read_csv("arrhythmia.csv", header=None)
arr_df = pd.DataFrame(arr_data)

arr_df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,270,271,272,273,274,275,276,277,278,279
0,75,0,190,80,91,193,371,174,121,-16,...,0.0,9.0,-0.9,0.0,0.0,0.9,2.9,23.3,49.4,8
1,56,1,165,64,81,174,401,149,39,25,...,0.0,8.5,0.0,0.0,0.0,0.2,2.1,20.4,38.8,6
2,54,0,172,95,138,163,386,185,102,96,...,0.0,9.5,-2.4,0.0,0.0,0.3,3.4,12.3,49.0,10
3,55,0,175,94,100,202,380,179,143,28,...,0.0,12.2,-2.2,0.0,0.0,0.4,2.6,34.6,61.6,1
4,75,0,190,80,88,181,360,177,103,-16,...,0.0,13.1,-3.6,0.0,0.0,-0.1,3.9,25.4,62.8,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
447,53,1,160,70,80,199,382,154,117,-37,...,0.0,4.3,-5.0,0.0,0.0,0.7,0.6,-4.4,-0.5,1
448,37,0,190,85,100,137,361,201,73,86,...,0.0,15.6,-1.6,0.0,0.0,0.4,2.4,38.0,62.4,10
449,36,0,166,68,108,176,365,194,116,-85,...,0.0,16.3,-28.6,0.0,0.0,1.5,1.0,-44.2,-33.2,2
450,32,1,155,55,93,106,386,218,63,54,...,-0.4,12.0,-0.7,0.0,0.0,0.5,2.4,25.0,46.6,1


In [2]:
#give columns headings to make them easier to manipulate
arr_df.columns = ['N'+str(x) for x in range(0,280)]
arr_df

Unnamed: 0,N0,N1,N2,N3,N4,N5,N6,N7,N8,N9,...,N270,N271,N272,N273,N274,N275,N276,N277,N278,N279
0,75,0,190,80,91,193,371,174,121,-16,...,0.0,9.0,-0.9,0.0,0.0,0.9,2.9,23.3,49.4,8
1,56,1,165,64,81,174,401,149,39,25,...,0.0,8.5,0.0,0.0,0.0,0.2,2.1,20.4,38.8,6
2,54,0,172,95,138,163,386,185,102,96,...,0.0,9.5,-2.4,0.0,0.0,0.3,3.4,12.3,49.0,10
3,55,0,175,94,100,202,380,179,143,28,...,0.0,12.2,-2.2,0.0,0.0,0.4,2.6,34.6,61.6,1
4,75,0,190,80,88,181,360,177,103,-16,...,0.0,13.1,-3.6,0.0,0.0,-0.1,3.9,25.4,62.8,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
447,53,1,160,70,80,199,382,154,117,-37,...,0.0,4.3,-5.0,0.0,0.0,0.7,0.6,-4.4,-0.5,1
448,37,0,190,85,100,137,361,201,73,86,...,0.0,15.6,-1.6,0.0,0.0,0.4,2.4,38.0,62.4,10
449,36,0,166,68,108,176,365,194,116,-85,...,0.0,16.3,-28.6,0.0,0.0,1.5,1.0,-44.2,-33.2,2
450,32,1,155,55,93,106,386,218,63,54,...,-0.4,12.0,-0.7,0.0,0.0,0.5,2.4,25.0,46.6,1


In [3]:
#last column is the arrhythmia classification (aka y_actual), so we'll need that
arr_class_df = arr_df['N279']
arr_class_df

0       8
1       6
2      10
3       1
4       7
       ..
447     1
448    10
449     2
450     1
451     1
Name: N279, Length: 452, dtype: int64

In [4]:
#drop columns with extraneous/duplicate info
#probably not best practice, but all the columns after the first dozen or so are too subject-specific for me to get
#so let's drop them
arr_df_2 = arr_df.iloc[0:451, 0:13]
arr_df_2

Unnamed: 0,N0,N1,N2,N3,N4,N5,N6,N7,N8,N9,N10,N11,N12
0,75,0,190,80,91,193,371,174,121,-16,13,64,-2
1,56,1,165,64,81,174,401,149,39,25,37,-17,31
2,54,0,172,95,138,163,386,185,102,96,34,70,66
3,55,0,175,94,100,202,380,179,143,28,11,-5,20
4,75,0,190,80,88,181,360,177,103,-16,13,61,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...
446,20,1,157,57,81,151,363,166,80,43,42,72,42
447,53,1,160,70,80,199,382,154,117,-37,4,40,-27
448,37,0,190,85,100,137,361,201,73,86,66,52,79
449,36,0,166,68,108,176,365,194,116,-85,-19,-61,-70


In [5]:
#concatenate 1st dozen-ish columns with N279 (aka arrhythmia class, aka y_pred)
arr_df_3 = pd.concat([arr_df_2, arr_class_df], axis=1)
arr_df_3

Unnamed: 0,N0,N1,N2,N3,N4,N5,N6,N7,N8,N9,N10,N11,N12,N279
0,75.0,0.0,190.0,80.0,91.0,193.0,371.0,174.0,121.0,-16.0,13,64,-2,8
1,56.0,1.0,165.0,64.0,81.0,174.0,401.0,149.0,39.0,25.0,37,-17,31,6
2,54.0,0.0,172.0,95.0,138.0,163.0,386.0,185.0,102.0,96.0,34,70,66,10
3,55.0,0.0,175.0,94.0,100.0,202.0,380.0,179.0,143.0,28.0,11,-5,20,1
4,75.0,0.0,190.0,80.0,88.0,181.0,360.0,177.0,103.0,-16.0,13,61,3,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
447,53.0,1.0,160.0,70.0,80.0,199.0,382.0,154.0,117.0,-37.0,4,40,-27,1
448,37.0,0.0,190.0,85.0,100.0,137.0,361.0,201.0,73.0,86.0,66,52,79,10
449,36.0,0.0,166.0,68.0,108.0,176.0,365.0,194.0,116.0,-85.0,-19,-61,-70,2
450,32.0,1.0,155.0,55.0,93.0,106.0,386.0,218.0,63.0,54.0,29,-22,43,1


In [6]:
#not sure where that NaN row came from??  anyway let's drop it
arr_df_4 = arr_df_3.iloc[0:451, 0:19]
arr_df_4

Unnamed: 0,N0,N1,N2,N3,N4,N5,N6,N7,N8,N9,N10,N11,N12,N279
0,75.0,0.0,190.0,80.0,91.0,193.0,371.0,174.0,121.0,-16.0,13,64,-2,8
1,56.0,1.0,165.0,64.0,81.0,174.0,401.0,149.0,39.0,25.0,37,-17,31,6
2,54.0,0.0,172.0,95.0,138.0,163.0,386.0,185.0,102.0,96.0,34,70,66,10
3,55.0,0.0,175.0,94.0,100.0,202.0,380.0,179.0,143.0,28.0,11,-5,20,1
4,75.0,0.0,190.0,80.0,88.0,181.0,360.0,177.0,103.0,-16.0,13,61,3,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
446,20.0,1.0,157.0,57.0,81.0,151.0,363.0,166.0,80.0,43.0,42,72,42,1
447,53.0,1.0,160.0,70.0,80.0,199.0,382.0,154.0,117.0,-37.0,4,40,-27,1
448,37.0,0.0,190.0,85.0,100.0,137.0,361.0,201.0,73.0,86.0,66,52,79,10
449,36.0,0.0,166.0,68.0,108.0,176.0,365.0,194.0,116.0,-85.0,-19,-61,-70,2


In [7]:
#replace ?s with zeroes
arr_df_5 = arr_df_4.replace("?", 0)
arr_df_5

Unnamed: 0,N0,N1,N2,N3,N4,N5,N6,N7,N8,N9,N10,N11,N12,N279
0,75.0,0.0,190.0,80.0,91.0,193.0,371.0,174.0,121.0,-16.0,13,64,-2,8
1,56.0,1.0,165.0,64.0,81.0,174.0,401.0,149.0,39.0,25.0,37,-17,31,6
2,54.0,0.0,172.0,95.0,138.0,163.0,386.0,185.0,102.0,96.0,34,70,66,10
3,55.0,0.0,175.0,94.0,100.0,202.0,380.0,179.0,143.0,28.0,11,-5,20,1
4,75.0,0.0,190.0,80.0,88.0,181.0,360.0,177.0,103.0,-16.0,13,61,3,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
446,20.0,1.0,157.0,57.0,81.0,151.0,363.0,166.0,80.0,43.0,42,72,42,1
447,53.0,1.0,160.0,70.0,80.0,199.0,382.0,154.0,117.0,-37.0,4,40,-27,1
448,37.0,0.0,190.0,85.0,100.0,137.0,361.0,201.0,73.0,86.0,66,52,79,10
449,36.0,0.0,166.0,68.0,108.0,176.0,365.0,194.0,116.0,-85.0,-19,-61,-70,2


In [65]:
X = arr_df_5.drop('N279', axis=1)
X = np.asarray(X).astype(np.float32)
y = arr_df_5['N279']

predictors = X
target = y

In [66]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

n_cols = predictors.shape[1]

model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(1))

In [67]:
model.compile(optimizer='adam', loss='mean_squared_error')

In [68]:
model.fit(predictors, target, validation_split=0.3, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f7ee2aa3640>

4) Write another algorithm to predict the same result as the previous question using either KNN or logistic regression.

In [69]:
X = arr_df_5.drop('N279', axis=1)
X = np.asarray(X).astype(np.float32)
y = arr_df_5['N279']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=18)

In [70]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)

In [71]:
from sklearn.linear_model import LogisticRegression

regression = LogisticRegression()
regression.fit(X_train, y_train)

LogisticRegression()

In [72]:
regression.fit(X_train, y_train)
y_pred = regression.predict(X_test)
regression.score(X_test, y_test)

0.5398230088495575

In [76]:
from sklearn.metrics import mean_squared_error as MSE
regression_mse = MSE(y_test, y_pred)
print(regression_mse)

25.18141592920354


5) Create a neural network using pytorch to predict the same result as question 3. 

In [20]:
import torch
import torch.nn as nn
import torch.nn.functional as F 

In [26]:
X = arr_df_5.drop('N279', axis=1)
X = np.asarray(X).astype(np.float32)
y = arr_df_5['N279']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=18)

In [28]:
#create tensors
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)

y_train = torch.LongTensor(y_train.to_numpy())
y_test = torch.LongTensor(y_test)

print(X_train)

tensor([[ 46.,   0., 168.,  ...,  14.,  57.,   0.],
        [ 32.,   1., 158.,  ...,  36.,  72.,  57.],
        [ 81.,   1., 165.,  ...,  52.,  51.,   7.],
        ...,
        [ 69.,   1., 154.,  ...,  57.,  60.,  27.],
        [ 64.,   1., 156.,  ...,  14.,  51.,  12.],
        [ 12.,   1., 165.,  ...,  14.,   0.,  10.]])


In [51]:
class ANN_Model(nn.Module):
    def __init__(self, input_features = 13, hidden1 = 25, hidden2 = 25, out_features = 1):
        super().__init__()
        self.layer_1_connection = nn.Linear(input_features, hidden1)
        self.layer_2_connection = nn.Linear(hidden1, hidden2)
        self.out = nn.Linear(hidden2, out_features)
    
    #how we predict output
    def forward(self, x):
        #apply activation functions
        x = F.relu(self.layer_1_connection(x))
        x = F.relu(self.layer_2_connection(x))
        x = self.out(x)
        return x

In [30]:
#set random seed for reproducibility
torch.manual_seed(18)

<torch._C.Generator at 0x7f7effbc92d0>

In [55]:
#instantiate model
model = ANN_Model()

#loss function
loss_function = nn.MSELoss()

#optimizer
optimizer = torch.optim.Adam(model.parameters(), lr = 0.01)

y_test = y_test.type(torch.LongTensor)
y_pred = y_pred.type(torch.LongTensor)
y_train = y_train.type(torch.LongTensor)

In [62]:
final_loss = []
n_epochs = 500
for epoch in range(n_epochs):
    y_pred = model.forward(X_train)
    loss = loss_function(y_pred.float(), y_train.float())
    final_loss.append(loss)
    
    if epoch % 10 == 1:
        print(f'Epoch number: {epoch} with loss: {loss.item()}')
    
    optimizer.zero_grad() #zero the gradient before running backwards propagation
    loss.backward() #for backward propagation 
    optimizer.step() #performs one optimization step each epoch

Epoch number: 1 with loss: 24.74974250793457
Epoch number: 11 with loss: 20.744943618774414
Epoch number: 21 with loss: 20.261167526245117
Epoch number: 31 with loss: 19.697378158569336
Epoch number: 41 with loss: 19.47505760192871
Epoch number: 51 with loss: 19.384180068969727
Epoch number: 61 with loss: 19.365341186523438
Epoch number: 71 with loss: 19.348392486572266
Epoch number: 81 with loss: 19.33385467529297
Epoch number: 91 with loss: 19.324216842651367
Epoch number: 101 with loss: 19.31501007080078
Epoch number: 111 with loss: 19.307214736938477
Epoch number: 121 with loss: 19.299856185913086
Epoch number: 131 with loss: 19.293474197387695
Epoch number: 141 with loss: 19.287490844726562
Epoch number: 151 with loss: 19.282142639160156
Epoch number: 161 with loss: 19.27786636352539
Epoch number: 171 with loss: 19.273799896240234
Epoch number: 181 with loss: 19.270244598388672
Epoch number: 191 with loss: 19.26707649230957
Epoch number: 201 with loss: 19.264219284057617
Epoch num

6) Compare the performance of the neural networks to the other model you created. Which performed better? Why do you think that is?

Keras final loss = 34.0549
Logistic Regression loss = 25.18
Pytorch neural network final loss = 19.23

The pytorch model performed the best of the three with the smallest loss.  I believe the forward/backward propogation steps gave it an edge over the other 2 models.