## One Neuron to Another

* Heavily based on https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6

### Introduction
In this notebook, I build a simple three-layer neural network (input, hidden, output), based on the article linked above. Because the output layer is a single sigmoid unit, the NN is suited to binary classification problems. With this in mind, I chose two datasets: one multiclass dataset whose labels are collapsed to a single class versus the rest, and one that is a pure binary classification set.

In [1]:
#Import libraries
import numpy as np
import pandas as pd

### Neural Network
This neural network consists of three layers (input, hidden, output). The hidden layer is fully connected, and the number of hidden nodes is chosen by the user, since the appropriate number depends on the data. The weights between the layers are learned with feed forward and back propagation. Feed forward computes the output from the current weights. On the first pass the weights are random, since we do not yet know anything about the data beyond the input and output vectors that were passed in.

These random weights are used in the feed forward step: the sigmoid activation is applied to the dot product of the input layer and the first set of weights, and then again to the dot product of that result and the second set of weights, which gives the output.

Next, back propagation evaluates the difference between the expected values (`self.y`) and the calculated values (`self.output`), and uses it to readjust the weights so that the output better approximates the expected values.
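
The back propagation update can be written out explicitly. This is a sketch that assumes a sum-of-squares loss, which the factor `2*(self.y - self.output)` in `backprop` suggests. The symbols $X$, $W_1$, $W_2$, $h$, and $\hat{y}$ are shorthand introduced here for `self.input`, `self.weights1`, `self.weights2`, `self.layer1`, and `self.output`; $\odot$ denotes elementwise multiplication.

$$
\begin{aligned}
h &= \sigma(X W_1), \qquad \hat{y} = \sigma(h W_2), \qquad L = \sum_i (y_i - \hat{y}_i)^2 \\
\delta_2 &= 2\,(y - \hat{y}) \odot \hat{y} \odot (1 - \hat{y}) \\
-\frac{\partial L}{\partial W_2} &= h^{\top} \delta_2 \\
-\frac{\partial L}{\partial W_1} &= X^{\top} \left[ (\delta_2 W_2^{\top}) \odot h \odot (1 - h) \right]
\end{aligned}
$$

Adding a learning-rate multiple of these negative gradients to the weights moves them downhill on $L$.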

A single pass of feed forward and back propagation does not fully learn the data, so multiple iterations are required. Depending on the amount of data, it is also quite possible to overtrain or undertrain. A learning rate can be passed in to control how far the weights move on each iteration. Adaptive learning rates exist, but for this project I chose a fixed value based on the accuracy I saw in the NN output.

Lastly, I added a function for evaluating the trained neural network. It takes new input that the user wants to classify, runs a single feed forward pass with the learned weights, and does not update them.

In [2]:
#Define activation functions
def sigmoid(x):
    return 1.0/(1+ np.exp(-x))

def sigmoid_derivative(x):
    #x is expected to already be a sigmoid output, i.e. x = sigmoid(z)
    return x * (1.0 - x)

#Define neural network
class NeuralNetwork:
    def __init__(self,x,y,nodeNumbers,learningrate):
        self.input      = x
        self.weights1   = np.random.rand(self.input.shape[1],nodeNumbers) 
        self.weights2   = np.random.rand(nodeNumbers,1)                 
        self.y          = y
        self.output     = np.zeros(self.y.shape)
        self.learningRate= learningrate

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
    
    def backprop(self):
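        # application of the chain rule to find the derivative of the loss function with respect to weights2 and weights1
        # note: learningRate scales d_weights2 once but d_weights1 twice, so weights1 effectively steps at learningRate squared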
        d_weights2 = np.dot(self.layer1.T, (2*(self.y - self.output) * (self.learningRate*sigmoid_derivative(self.output))))
        d_weights1 = np.dot(self.input.T,  (np.dot(2*(self.y - self.output) * (self.learningRate*sigmoid_derivative(self.output)), self.weights2.T) * (self.learningRate*sigmoid_derivative(self.layer1))))

        # update the weights with the derivative (slope) of the loss function
        self.weights1 += d_weights1
        self.weights2 += d_weights2
    
    def evalNN(self, newInput):
        layer1 = sigmoid(np.dot(newInput, self.weights1))
        output2 = sigmoid(np.dot(layer1, self.weights2))
        return output2
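
To see why repeated iterations are needed, the sketch below trains the same kind of two-weight-matrix network on a tiny hypothetical dataset (the OR function with a constant third column acting as a bias) and records the sum-of-squares loss on every pass. The dataset, the seed, the learning rate of 0.1, and the variable names are assumptions made for illustration, and the learning rate is applied once per update here, which differs slightly from `backprop` above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Seeded generator: an assumption added so the run is repeatable
rng = np.random.default_rng(0)

# Hypothetical toy data: OR of the first two columns, third column is a constant
X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
y = np.array([[0.], [1.], [1.], [1.]])

w1 = rng.random((3, 4))   # input -> hidden (4 hidden nodes)
w2 = rng.random((4, 1))   # hidden -> output
lr = 0.1

losses = []
for _ in range(2000):
    # feed forward
    hidden = sigmoid(X @ w1)
    out = sigmoid(hidden @ w2)
    losses.append(float(np.sum((y - out) ** 2)))

    # back propagation (chain rule), computed before either weight matrix changes
    delta_out = 2 * (y - out) * out * (1 - out)
    delta_hidden = (delta_out @ w2.T) * hidden * (1 - hidden)
    w2 += lr * (hidden.T @ delta_out)
    w1 += lr * (X.T @ delta_hidden)

print("first loss:", losses[0])
print("last loss: ", losses[-1])
```

The loss after the final pass should be well below the loss after the first, which is the behaviour the training loops later in this notebook rely on.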
        

### Parsing DataFrame
To train a neural network on randomized data, I shuffle the DataFrame and then pass the respective X and Y values to a parser that splits them into training, testing, and validation sets. Assuming X and Y were shuffled beforehand, we can simply slice the DataFrame by index.

In [3]:
#Define parsing frames into training, testing, and validation
def parser(X,Y):
    #split data into training (80%), testing (10%), and validation (10%)
    #since the data is always shuffled beforehand, we can pick consecutive rows instead of randomly sampling
    trainingSetIndex= int(0.80*len(Y))#first 80%
    testingSetIndex= int(0.10*len(Y)+trainingSetIndex) #next 10%
    validationSetIndex= int(0.10*len(Y)+testingSetIndex)#last 10%
    print("Training Index: " + str(trainingSetIndex))
    print("Testing Index: " + str(testingSetIndex))
    print("Validation Index: " + str(validationSetIndex))

    #Split dataset
    training=X.iloc[0:trainingSetIndex,:]
    trainingY=Y[0:trainingSetIndex]
    testing=X.iloc[trainingSetIndex:testingSetIndex,:]
    testingY=Y[trainingSetIndex:testingSetIndex]
    validation=X.iloc[testingSetIndex:len(X),:]
    validationY=Y[testingSetIndex:len(X)]
    print("-------------------------")
    print("Training: ")
    print(training[:5])
    print(trainingY[:5])
    print("-------------------------")
    print("Testing: ")
    print(testing[:5])
    print(testingY[:5])
    print("-------------------------")
    print("Validation: ")
    print(validation[:5])
    print(validationY[:5])
    
    return training,trainingY,testing,testingY,validation,validationY
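
As a compact illustration of the shuffle-then-slice idea, the sketch below applies the same index arithmetic to a hypothetical 150-row frame. The frame, its column names, and `random_state=0` are assumptions added here so the example is repeatable; the notebook itself shuffles without a fixed seed.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 150 rows, the same length as the Iris data
frame = pd.DataFrame({"feature": np.arange(150), "label": np.arange(150) % 2})

# random_state is an assumption added here so the shuffle is repeatable
shuffled = frame.sample(frac=1, random_state=0).reset_index(drop=True)

train_end = int(0.80 * len(shuffled))              # first 80%
test_end = int(0.10 * len(shuffled) + train_end)   # next 10%

training = shuffled.iloc[:train_end]
testing = shuffled.iloc[train_end:test_end]
validation = shuffled.iloc[test_end:]              # remaining rows

print(len(training), len(testing), len(validation))
```

Taking the validation set as "everything after the testing slice" means no rows are lost when the row count is not a multiple of ten.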
   

### Checking Accuracy
To assess the neural network, I created a function that compares the NN output with the expected values and calculates the percentage of correct predictions. Because the NN outputs floating-point numbers, the output is rounded first: 0 means one class and 1 means the other.

In [4]:
#Check accuracy of neural network
def checkAccuracy(NNoutput,Y):
    #round output vector
    checker=np.round(NNoutput)
    #print(checker[:5])
    #subtract the actual values from the rounded output vector
    checker=np.subtract(checker,Y)
    #count how many values are not 0
    count=np.count_nonzero(checker)
    #convert to percentage guessed correctly
    percentage=((len(Y)-count)/len(Y))*100
    return percentage
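
A small worked example may make the rounding step concrete. The function below mirrors `checkAccuracy` under a hypothetical name, and the four outputs and labels are made up for illustration. The last lines show a shape pitfall worth keeping in mind: if the labels are one-dimensional while the output has shape `(n, 1)`, NumPy broadcasting produces an `(n, n)` array and the count is no longer meaningful.

```python
import numpy as np

def check_accuracy(nn_output, y):
    """Percentage of rounded outputs that match the 0/1 labels."""
    predicted = np.round(nn_output)
    wrong = np.count_nonzero(predicted - y)
    return (len(y) - wrong) / len(y) * 100

# Hypothetical network outputs and labels, both shaped (4, 1)
outputs = np.array([[0.21], [0.93], [0.48], [0.51]])
labels = np.array([[0], [1], [1], [1]])

# Rounded outputs are 0, 1, 0, 1, so only the third row is wrong
print(check_accuracy(outputs, labels))  # 75.0

# Pitfall: a 1-D label vector broadcasts against the (4, 1) output
mismatched = np.round(outputs) - labels.ravel()
print(mismatched.shape)  # (4, 4)
```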


### Dataset 1 : Iris Dataset
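The Iris dataset contains sepal and petal measurements for three species (Iris-setosa, Iris-versicolor, Iris-virginica). Here the labels are collapsed to a binary target: Iris-virginica versus the other two species.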

In [5]:
#Read in dataset 1, drop the Id column and any nulls, then shuffle the rows
iris=pd.read_csv('../data/external/iris-species/Iris.csv')
iris=iris.drop(columns=['Id'])
iris=iris.dropna()
irisnew=iris.sample(frac=1).reset_index(drop=True)

print(irisnew.head())
print(iris.Species.unique())

   SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm          Species
0            6.6           2.9            4.6           1.3  Iris-versicolor
1            6.1           2.8            4.0           1.3  Iris-versicolor
2            6.3           2.3            4.4           1.3  Iris-versicolor
3            4.6           3.1            1.5           0.2      Iris-setosa
4            5.0           2.0            3.5           1.0  Iris-versicolor
['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']


In [6]:
#Fit a one-hot encoder on the petal features (the encoder is only fit here; X is passed on as raw values)
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(handle_unknown='ignore')
#X = irisnew.iloc[:,[0,1,2,3]]
X = irisnew.iloc[:,[2,3]]
print(X.head())
enc.fit(X)

#Map the species to binary labels (Iris-virginica = 1, all others = 0)
outputDictionary={}
for index,entry in enumerate(iris.Species.unique()):
    if entry=='Iris-virginica':
        outputDictionary[entry]=1
    else:
        outputDictionary[entry]=0
Y0=irisnew.iloc[:,4]
print("Output")
print(Y0[:5])
Y=[]
for value in Y0:
    if value=='Iris-virginica':
        Y.append([1])
    else:
        Y.append([0])
print("Converted Output")
Y=np.asarray(Y)
print(Y[:5])

print("Output Mapping")
print(outputDictionary)

   PetalLengthCm  PetalWidthCm
0            4.6           1.3
1            4.0           1.3
2            4.4           1.3
3            1.5           0.2
4            3.5           1.0
Output
0    Iris-versicolor
1    Iris-versicolor
2    Iris-versicolor
3        Iris-setosa
4    Iris-versicolor
Name: Species, dtype: object
Converted Output
[[0]
 [0]
 [0]
 [0]
 [0]]
Output Mapping
{'Iris-setosa': 0, 'Iris-versicolor': 0, 'Iris-virginica': 1}
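
The same binary labels can also be built without an explicit loop. The sketch below is one possible vectorized version; the four-element `species` series is a hypothetical stand-in for the `Species` column of the shuffled frame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Species column of the shuffled Iris frame
species = pd.Series(
    ["Iris-versicolor", "Iris-virginica", "Iris-setosa", "Iris-virginica"],
    name="Species",
)

# True where the species is Iris-virginica, cast to 0/1, reshaped to (n, 1)
Y = (species == "Iris-virginica").astype(int).to_numpy().reshape(-1, 1)

print(Y.ravel())
print(Y.shape)
```

The final `reshape(-1, 1)` keeps the labels as a column vector, the same shape the network output has.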


In [7]:
#split data into training (80%), testing (10%), and validation (10%)
#since the data is always shuffled beforehand, we can pick consecutive rows instead of randomly sampling
training,trainingY,testing,testingY,validation,validationY=parser(X,Y)

Training Index: 120
Testing Index: 135
Validation Index: 150
-------------------------
Training: 
   PetalLengthCm  PetalWidthCm
0            4.6           1.3
1            4.0           1.3
2            4.4           1.3
3            1.5           0.2
4            3.5           1.0
[[0]
 [0]
 [0]
 [0]
 [0]]
-------------------------
Testing: 
     PetalLengthCm  PetalWidthCm
120            1.5           0.1
121            4.8           1.8
122            1.4           0.3
123            5.1           1.5
124            4.4           1.4
[[0]
 [0]
 [0]
 [1]
 [0]]
-------------------------
Validation: 
     PetalLengthCm  PetalWidthCm
135            4.4           1.2
136            4.3           1.3
137            5.6           2.1
138            6.7           2.2
139            4.1           1.3
[[0]
 [0]
 [1]
 [1]
 [0]]


In [8]:
nn = NeuralNetwork(training,trainingY,4,0.025)

iterations=1200
for i in range(iterations):
    nn.feedforward()
    nn.backprop()
    
#Print neural network output
print(nn.output[:5])
#Compare actual output to learning
print("Training: "+str(checkAccuracy(nn.output,trainingY))+"%")

#Compare with testing
output=nn.evalNN(testing)
print("Testing: "+str(checkAccuracy(output, testingY))+"%")

#Compare with validation
output=nn.evalNN(validation)
print("Validation: "+str(checkAccuracy(output, validationY))+"%")

[[0.2058228 ]
 [0.25998109]
 [0.2229524 ]
 [0.01250293]
 [0.09207396]]
Training: 95.83333333333334%
Testing: 86.66666666666667%
Validation: 100.0%


### Dataset 2 : Statlog German

In [9]:
#Dataset 2
#Source: http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/
credit=pd.read_csv('../data/external/german/german.data',header=None,sep=' ')
credit.columns=["ExistingChecking","MonthDuration","CreditHist","Purpose","CreditAmt","SavingsAcct","PresentEmploy","InstallmentRate","PersonalStatus","Other","PresentResidence","Property","Age","OtherInstallPlan","Housing","ExistingCredits","Job","PeopleLiable","Telephone","Foreign","label"]

credit=credit.dropna() #remove nulls
credit=credit.replace({'label': {1: 4, 2: 5}}) #remap labels in two steps (1 -> 4 -> 0, 2 -> 5 -> 1) so they end up as 0 and 1
credit=credit.replace({'label': {4: 0, 5: 1}})
creditnew=credit.sample(frac=1).reset_index(drop=True) #shuffle


print("Shape: ")
print(creditnew.shape)
print("-----------------------------------")
print("DataFrame:")
print(creditnew.head())


Shape: 
(1000, 21)
-----------------------------------
DataFrame:
  ExistingChecking  MonthDuration CreditHist Purpose  CreditAmt SavingsAcct  \
0              A14             30        A34     A43       5954         A61   
1              A11             12        A32     A43       1680         A63   
2              A14             24        A34     A41       2346         A61   
3              A14             21        A34     A42       2288         A61   
4              A14              6        A34     A40       6761         A61   

  PresentEmploy  InstallmentRate PersonalStatus Other  ...   Property Age  \
0           A74                3            A93  A102  ...       A123  38   
1           A75                3            A94  A101  ...       A121  35   
2           A74                4            A93  A101  ...       A123  35   
3           A72                4            A92  A101  ...       A122  23   
4           A74                1            A93  A101  ...       A124  45 

In [10]:
#Fit a one-hot encoder on the credit data (X still includes the label column, and the encoder is only fit, not applied)
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(handle_unknown='ignore')
X = creditnew
enc.fit(X)

#Set Y values

Y=creditnew.iloc[:,-1]
print("Output")
print(Y[:5])

Output
0    0
1    0
2    0
3    0
4    0
Name: label, dtype: int64
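
The cell above fits the encoder but does not transform the data, and `Y` is a one-dimensional Series. One way the credit features could be turned into numeric input for the network is sketched below, using `pd.get_dummies` rather than `OneHotEncoder`. The four-row `toy` frame is hypothetical, only two of the real columns are shown, and the division that rescales `MonthDuration` is an assumption added here because large raw values would saturate the sigmoid.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the credit frame: one categorical column,
# one numeric column, and the 0/1 label
toy = pd.DataFrame({
    "ExistingChecking": ["A14", "A11", "A14", "A12"],
    "MonthDuration": [30, 12, 24, 21],
    "label": [0, 0, 1, 0],
})

# Keep the label out of the features
features = toy.drop(columns=["label"])

# One 0/1 column per category level; numeric columns pass through unchanged
encoded = pd.get_dummies(features, columns=["ExistingChecking"], dtype=float)

# Assumption: rescale the numeric column to (0, 1]
encoded["MonthDuration"] = encoded["MonthDuration"] / encoded["MonthDuration"].max()

# Column vector of labels, matching the shape of the network output
Y = toy["label"].to_numpy().reshape(-1, 1)

print(encoded.columns.tolist())
print(encoded.shape, Y.shape)
```

The resulting `encoded` frame and `Y` array could then be passed to `parser` in the same way as the Iris data.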


In [11]:
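#Note: parser() has not been called on the credit data, so training, testing, and validation here are still the Iris splits from In [7]
nn = NeuralNetwork(training,trainingY,4,0.025)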

iterations=1200
for i in range(iterations):
    nn.feedforward()
    nn.backprop()
    
#Print neural network output
print(nn.output[:5])
#Compare actual output to learning
print("Training: "+str(checkAccuracy(nn.output,trainingY))+"%")

#Compare with testing
output=nn.evalNN(testing)
print("Testing: "+str(checkAccuracy(output, testingY))+"%")

#Compare with validation
output=nn.evalNN(validation)
print("Validation: "+str(checkAccuracy(output, validationY))+"%")

[[0.20783627]
 [0.26768125]
 [0.22683163]
 [0.01457052]
 [0.09738911]]
Training: 95.83333333333334%
Testing: 86.66666666666667%
Validation: 100.0%
