## Introduction 

In this notebook, we build a small neural network from scratch using NumPy and train it on the XOR classification problem.

### The XOR classification problem

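#### What is an XOR Gate?

An XOR (exclusive OR) gate takes two binary inputs and outputs 1 when exactly one of them is 1, and 0 otherwise.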

#### The Truth Table
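
As an illustration, the truth table for a two-input XOR gate can be generated with NumPy's `np.logical_xor`. This is a minimal sketch; the names `inputs` and `xor_out` are introduced here and are not used elsewhere in the notebook.

```python
import numpy as np

# All four combinations of two binary inputs.
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# XOR is 1 exactly when the two inputs differ.
xor_out = np.logical_xor(inputs[:, 0], inputs[:, 1]).astype(int)

for (a, b), out in zip(inputs, xor_out):
    print(f"{a} XOR {b} = {out}")
```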

#### The Problem

In [1]:
import numpy as np

#### Data

In [2]:
X = np.array([[1, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0]])

In [3]:
y = np.array([[0], [1], [0], [1]])
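
A quick sanity check of the training data can help before building the network. The sketch below repeats the arrays from the cells above, prints their shapes, and places each input row next to its target. The names `table` and `matches_third_column` are introduced here for illustration. The last check shows that, for these four rows, the target equals 1 minus the third input column, which is worth keeping in mind when comparing this dataset with the two-input XOR truth table.

```python
import numpy as np

X = np.array([[1, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0]])
y = np.array([[0], [1], [0], [1]])

# One row per training example: 4 examples, 3 input features, 1 target each.
print(X.shape, y.shape)

# Show each input row next to its target (last column).
table = np.hstack([X, y])
print(table)

# For these four rows the target equals 1 minus the third input column.
matches_third_column = bool(np.all(y[:, 0] == 1 - X[:, 2]))
print(matches_third_column)
```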

#### Activation (Sigmoid)

In [4]:
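# Sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid, written in terms of the sigmoid's output:
# x must already be sigmoid(z), not the raw pre-activation z.
def derivatives_sigmoid(x):
    return x * (1 - x)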
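
Because `derivatives_sigmoid` expects the already-activated value, it is easy to misuse. The following sketch compares it against a central finite-difference estimate of the sigmoid's slope. The test points `x` and the step `eps` are arbitrary choices made for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivatives_sigmoid(s):
    # s is assumed to be sigmoid(z), not z itself.
    return s * (1 - s)

x = np.linspace(-3, 3, 7)
eps = 1e-6

# Central finite-difference estimate of d(sigmoid)/dx.
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

# Analytic derivative, evaluated on the sigmoid's output.
analytic = derivatives_sigmoid(sigmoid(x))

print(np.allclose(numeric, analytic, atol=1e-6))
```

Passing the raw input `x` instead of `sigmoid(x)` would give a different (and incorrect) slope.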

#### Architecture

In [5]:
# Variable initialization
epoch = 5000  # Number of training iterations (train() below takes its own epoch argument)
inputlayer_neurons = X.shape[1]  # Number of input features
hiddenlayer_neurons = 4  # Number of neurons in the hidden layer
output_neurons = 1  # Number of neurons in the output layer

#### Training

In [11]:
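def train(epoch, step, lr=0.1):
    """Train for `epoch` iterations with learning rate `lr`, printing the error every `step` iterations."""

    ## Weights of the hidden layer, initialized uniformly in [0, 1)
    wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
    ## print("wh shape", wh.shape)
    ## print("initial hidden layer weights", wh)

    ## Weights of the output layer, initialized uniformly in [0, 1)
    wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))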
    ## print("wout shape", wout.shape)
    ## print("initial output layer weights", wout)
    
    for i in range(epoch):

        ## Forward Pass
        hidden_layer_input=np.dot(X,wh)
        hiddenlayer_activations = sigmoid(hidden_layer_input)
        
        output_layer_input=np.dot(hiddenlayer_activations,wout)
        output = sigmoid(output_layer_input)

        ## Backpropagation
        E = y-output
        
        #print("y", y)
        #print("output", output)
        #print("E", E)
        
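        slope_output_layer = derivatives_sigmoid(output)
        delta_out = E * slope_output_layer

        slope_hidden_layer = derivatives_sigmoid(hiddenlayer_activations)
        ## Propagate the error back through wout before wout is updated
        Error_at_hidden_layer = delta_out.dot(wout.T)
        delta_hiddenlayer = Error_at_hidden_layer * slope_hidden_layer

        ## Gradient-descent updates, applied after both deltas are computed
        wout += hiddenlayer_activations.T.dot(delta_out) * lr
        wh += X.T.dot(delta_hiddenlayer) * lr

        if i % step == 0:
            ## Reports the mean absolute error propagated to the hidden layer;
            ## np.mean(np.abs(E)) would give the error at the output instead.
            print("Error:", np.mean(np.abs(Error_at_hidden_layer)))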

    print("===============================================")
    print("Model:")
    print("===============================================")
    print("hidden layer weights", wh)
    print("output layer weights", wout)
    print("===============================================")
    
    print("output")
    print("===============================================")
    print(output)
    print("===============================================")
    ## Save the trained weights (passing a filename lets NumPy open and close the file)
    np.save("HiddenLayerWeights.npy", wh)
    np.save("OutputLayerWeights.npy", wout)
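
For reference, the training loop is based on the usual gradient-descent updates for a two-layer sigmoid network with squared-error loss $L = \tfrac{1}{2}\sum (y - \hat{y})^2$. The symbols below are notation introduced for this sketch and are not names from the notebook: $W_h$ is `wh`, $W_{out}$ is `wout`, $H$ is `hiddenlayer_activations`, $\hat{y}$ is `output`, $\eta$ is `lr`, and $\odot$ is element-wise multiplication. The sketch assumes the textbook form, in which the error is propagated back through $W_{out}$ before $W_{out}$ is updated.

```latex
\begin{aligned}
H &= \sigma(X W_h), & \hat{y} &= \sigma(H W_{out}), \\
E &= y - \hat{y}, & \delta_{out} &= E \odot \hat{y} \odot (1 - \hat{y}), \\
\delta_h &= \left(\delta_{out} W_{out}^{\top}\right) \odot H \odot (1 - H), \\
W_{out} &\leftarrow W_{out} + \eta\, H^{\top} \delta_{out}, & W_h &\leftarrow W_h + \eta\, X^{\top} \delta_h .
\end{aligned}
```

Here $H^{\top}\delta_{out}$ and $X^{\top}\delta_h$ are the negative gradients of $L$ with respect to $W_{out}$ and $W_h$, which is why the weights are incremented rather than decremented.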

In [12]:
a = np.array([[1,1,1], [2,3,1]])
print(a.shape)
b = np.array([[1,1], [1,2], [1,3]])
print(b.shape)
np.dot(a,b)

(2, 3)
(3, 2)


array([[ 3,  6],
       [ 6, 11]])
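
For 2-D arrays, the `@` operator (`np.matmul`) gives the same result as `np.dot`. The sketch below reuses the arrays `a` and `b` from the cell above; `product` is a name introduced here.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 3, 1]])
b = np.array([[1, 1], [1, 2], [1, 3]])

# A (2, 3) matrix times a (3, 2) matrix yields a (2, 2) matrix.
product = a @ b
print(product.shape)

# For 2-D inputs, @ and np.dot agree.
print(np.array_equal(product, np.dot(a, b)))
```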

In [13]:
train(50000, 50000)

Error: 0.03387358020917815
Model:
hidden layer weights [[ 1.93698339 -0.44981753  0.37765154 -0.75526182]
 [ 1.83558848 -0.53715841  0.50201906 -0.72512734]
 [-4.63092804  2.46651515 -1.76917984  2.96655418]]
output layer weights [[ 6.78661194]
 [-2.79117827]
 [ 2.06506143]
 [-3.65760879]]
output
[[0.00725579]
 [0.99272828]
 [0.00740039]
 [0.99261355]]


#### Predict

In [9]:
wh = np.load("HiddenLayerWeights.npy")
wout = np.load("OutputLayerWeights.npy")

print(wh)
print(wout)

[[-1.44338564  1.46861384  0.91939304 -0.19118869]
 [-1.32663223  1.36373852  0.97329631 -0.50852512]
 [ 3.71370757 -3.70695875 -2.65093602  2.22453266]]
[[-4.56546558]
 [ 5.30870698]
 [ 3.44093213]
 [-2.12177877]]


In [10]:
newX = np.array([[0, 1, 1], [0, 1, 0]])

# Forward pass with the loaded weights
hidden_layer_input = np.dot(newX, wh)
hiddenlayer_activations = sigmoid(hidden_layer_input)
output_layer_input = np.dot(hiddenlayer_activations, wout)
output = sigmoid(output_layer_input)
print(output)

[[0.00687502]
 [0.9931099 ]]
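
The forward pass above can be wrapped in a small helper that also converts the sigmoid outputs to class labels. This is a sketch only: `predict` and its `threshold` parameter are hypothetical names introduced here, and the 0.5 cutoff is an assumption rather than something the notebook specifies. The weights are copied from the values printed by the load cell above so that the example runs on its own.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def predict(inputs, wh, wout, threshold=0.5):
    """Hypothetical helper: forward pass followed by thresholding."""
    hidden = sigmoid(np.dot(inputs, wh))
    probs = sigmoid(np.dot(hidden, wout))
    labels = (probs >= threshold).astype(int)
    return probs, labels

# Weights copied from the printed output of the load cell.
wh = np.array([[-1.44338564, 1.46861384, 0.91939304, -0.19118869],
               [-1.32663223, 1.36373852, 0.97329631, -0.50852512],
               [3.71370757, -3.70695875, -2.65093602, 2.22453266]])
wout = np.array([[-4.56546558], [5.30870698], [3.44093213], [-2.12177877]])

probs, labels = predict(np.array([[0, 1, 1], [0, 1, 0]]), wh, wout)
print(probs)
print(labels)
```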


References:

1. [Neural network](https://www.analyticsvidhya.com/blog/2017/05/neural-network-from-scratch-in-python-and-r/)
2. [Build a neural network in 4 minutes](https://www.youtube.com/watch?v=h3l4qz76JhQ)
3. [Gradient Descent](https://www.analyticsvidhya.com/blog/2017/03/introduction-to-gradient-descent-algorithm-along-its-variants/)