# Importing Libraries

In [2]:
import numpy as np
import pandas as pd

# Building data

The dataset we will train our neural network on is the logical AND gate.

The output is 1 only if both inputs, `x1` and `x2`, are 1; if either input is 0, the output is 0.

In [3]:
AND = pd.DataFrame({'x1': (0,0,1,1), 'x2': (0,1,0,1), 'y': (0,0,0,1)})
AND

|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 0 |
| 1 | 0  | 1  | 0 |
| 2 | 1  | 0  | 0 |
| 3 | 1  | 1  | 1 |
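As a quick sanity check on the truth table, the `y` column can be recomputed from `x1` and `x2`. This is only a sketch; the names `expected` and `matches` are introduced here for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

AND = pd.DataFrame({'x1': (0, 0, 1, 1), 'x2': (0, 1, 0, 1), 'y': (0, 0, 0, 1)})

# Recompute the target from the inputs with NumPy's element-wise logical AND.
expected = np.logical_and(AND['x1'], AND['x2']).astype(int)

# True when every row of 'y' agrees with the recomputed value.
matches = bool((expected == AND['y']).all())
print(matches)
```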


# Building the network

We start by initializing random weights, which will be updated during training. We need three weights: one for each of the two inputs and one for the bias term.

In [4]:
w = np.random.randn(3)*1e-4

Let's see what the weights look like.

In [6]:
w

array([ 3.78054957e-05, -1.68848826e-04,  7.56758901e-05])
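Because the weights are random, every run of the notebook starts from different values. If reproducible runs are wanted, one option is a seeded generator. The sketch below assumes NumPy's `default_rng` API; the seed value 42 is arbitrary and not taken from the original notebook.

```python
import numpy as np

# Assumption: any fixed seed works; 42 is an arbitrary choice.
rng = np.random.default_rng(seed=42)

# Three small weights on the same scale as np.random.randn(3)*1e-4.
w = rng.standard_normal(3) * 1e-4
print(w)
```

Re-creating the generator with the same seed yields the same three weights.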

Here we define our own activation function.

It takes the data (`inputs`) and the weights (`weights`) and computes the dot product between them.

The activation function produces the network's final output: wherever the dot product is greater than zero the output is 1; otherwise it is 0.

For example, if the dot product between the data and the weights is:
[4.5464565476, -1.35435346, 2.3423453, -5.3425435345]

then the output will be:
[1, 0, 1, 0]
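The example above can be reproduced directly with `np.where`. This sketch applies the thresholding rule to the dot products listed above; `dot_products` and `outputs` are illustrative names.

```python
import numpy as np

# The dot products from the example above.
dot_products = np.array([4.5464565476, -1.35435346, 2.3423453, -5.3425435345])

# Values greater than zero become 1, everything else becomes 0.
outputs = np.where(dot_products > 0, 1, 0)
print(outputs)  # [1 0 1 0]
```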




In [7]:
g = lambda inputs, weights: np.where(np.dot(inputs, weights)>0, 1, 0)

Now we define a training function that runs for the number of iterations provided by the user and adjusts the weights until they fit the input data.

In [9]:
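def train(data, targets, weights, lr, n_iterations):

    # Add the bias term: append a column of -1 to the inputs
    data = np.c_[data, -np.ones((len(data), 1))]

    for n in range(n_iterations):

        # Thresholded prediction for every row
        activations = g(data, weights)
        # Move the weights against the error (updates `weights` in place)
        weights -= lr*np.dot(np.transpose(data), activations - targets)

    return weights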

First, we add the bias term to our data by appending a new column filled with -1.

Then we run the update for the given number of iterations.

In each iteration we take the result of the activation function and assign it to `activations`.

We then subtract the actual answers (`targets`) from `activations`. This difference can be treated as the error term, `(y_hat - y)`.

We update the weights by subtracting the learning rate times the dot product between the transposed data and the error (`activations - targets`).

The weights stop changing once we reach the desired output: `activations` then equals `targets`, their difference is zero, and the update only subtracts zero from the current weights.
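To make the update rule concrete, here is one update step on the AND data worked through by hand. As an assumption for readability, the weights start at zero rather than at small random values, and the names `error` and `update_direction` are illustrative.

```python
import numpy as np

# AND inputs with the bias column of -1 already appended, as train() does.
data = np.array([[0, 0, -1],
                 [0, 1, -1],
                 [1, 0, -1],
                 [1, 1, -1]], dtype=float)
targets = np.array([0, 0, 0, 1])

weights = np.zeros(3)  # assumption: zeros keep the arithmetic easy to follow
lr = 0.25

activations = np.where(data @ weights > 0, 1, 0)  # [0 0 0 0]
error = activations - targets                      # [ 0  0  0 -1]
update_direction = data.T @ error                  # [-1. -1.  1.]
weights = weights - lr * update_direction          # [ 0.25  0.25 -0.25]
print(weights)
```

Only the last row is misclassified, so only that row contributes to the update: both input weights grow and the bias weight shrinks.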

# Checking Performance

We pass the data (the independent variables) and the target (the dependent variable) to our network and get back the weights it adjusted during training.

We run it for 10 iterations with a learning rate of 0.25.

In [10]:
data = AND[['x1','x2']]
target = AND['y']

w = train(data, target, w, 0.25, 10)

Check the result we get when we apply those weights to our data.

In [16]:
g(np.c_[data, -np.ones((len(data), 1))], w)

array([0, 0, 0, 1])

We get the desired output of the AND gate.

# Visualise the training process

Let's see how the weights are updated at every iteration, and at which iteration they stop changing when we change the learning rate.

In [34]:
w = np.random.randn(3)*1e-4

In [35]:
def train(data, targets, weights, lr, n_iterations):

    # Adding the bias term
    data = np.c_[data, -np.ones((len(data), 1))]
    print('Output and Weights at particular iteration: ')
    for n in range(n_iterations):

        activations = g(data, weights)
        weights -= lr*np.dot(np.transpose(data), activations - targets)
        print(n)
        print("Output: ", activations, "Weights: ", weights)
        print()
        
    return(weights)

In [36]:
w = train(data, target, w, 0.25, 10)

Output and Weights at particular iteration: 
0
Output:  [1 1 1 1] Weights:  [-0.25005838 -0.2499776   0.74991194]

1
Output:  [0 0 0 0] Weights:  [-5.83810722e-05  2.23991734e-05  4.99911943e-01]

2
Output:  [0 0 0 0] Weights:  [0.24994162 0.2500224  0.24991194]

3
Output:  [0 1 1 1] Weights:  [-5.83810722e-05  2.23991734e-05  7.49911943e-01]

4
Output:  [0 0 0 0] Weights:  [0.24994162 0.2500224  0.49991194]

5
Output:  [0 0 0 1] Weights:  [0.24994162 0.2500224  0.49991194]

6
Output:  [0 0 0 1] Weights:  [0.24994162 0.2500224  0.49991194]

7
Output:  [0 0 0 1] Weights:  [0.24994162 0.2500224  0.49991194]

8
Output:  [0 0 0 1] Weights:  [0.24994162 0.2500224  0.49991194]

9
Output:  [0 0 0 1] Weights:  [0.24994162 0.2500224  0.49991194]



As we can see, the weights stop updating from iteration 5 onward, because that is when we reach the desired output and the error becomes 0.

Let's try a learning rate on a different scale. We initialise random weights again; otherwise training would continue from the current ones.
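Instead of reading the convergence point off the printed log, the loop can stop as soon as the error is zero. The sketch below is a hypothetical variant of `train` (the name `train_until_converged` is not from the original notebook) and, as an assumption, starts from zero weights so the run is deterministic.

```python
import numpy as np

def train_until_converged(data, targets, weights, lr, max_iterations):
    """Hypothetical variant of train() that stops at the first error-free pass."""
    data = np.c_[data, -np.ones((len(data), 1))]  # append the bias column of -1
    weights = np.array(weights, dtype=float)       # copy, so the caller's array is untouched
    for n in range(max_iterations):
        activations = np.where(data @ weights > 0, 1, 0)
        errors = activations - targets
        if not errors.any():
            return weights, n                      # iteration index with zero error
        weights -= lr * (data.T @ errors)
    return weights, None                           # no error-free pass within the budget

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

final_w, stopped_at = train_until_converged(X, y, np.zeros(3), 0.25, 10)
print(stopped_at, final_w)
```

With zero starting weights the loop stops at iteration index 3; random starting weights, as used in the notebook, can shift that index.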

In [37]:
w = np.random.randn(3)*1e-4
w = train(data, target, w, 10, 10)

We got the desired output at the 7th iteration with a learning rate of 10.

Let's experiment with a very small learning rate.

In [39]:
w = np.random.randn(3)*1e-4
w = train(data, target, w, 0.00001, 10)

Output and Weights at particular iteration: 
0
Output:  [0 0 1 1] Weights:  [ 2.18614589e-04 -6.90725304e-05  1.43275103e-04]

1
Output:  [0 0 1 1] Weights:  [ 2.08614589e-04 -6.90725304e-05  1.53275103e-04]

2
Output:  [0 0 1 0] Weights:  [ 2.08614589e-04 -5.90725304e-05  1.53275103e-04]

3
Output:  [0 0 1 0] Weights:  [ 2.08614589e-04 -4.90725304e-05  1.53275103e-04]

4
Output:  [0 0 1 1] Weights:  [ 1.98614589e-04 -4.90725304e-05  1.63275103e-04]

5
Output:  [0 0 1 0] Weights:  [ 1.98614589e-04 -3.90725304e-05  1.63275103e-04]

6
Output:  [0 0 1 0] Weights:  [ 1.98614589e-04 -2.90725304e-05  1.63275103e-04]

7
Output:  [0 0 1 1] Weights:  [ 1.88614589e-04 -2.90725304e-05  1.73275103e-04]

8
Output:  [0 0 1 0] Weights:  [ 1.88614589e-04 -1.90725304e-05  1.73275103e-04]

9
Output:  [0 0 1 0] Weights:  [ 1.88614589e-04 -9.07253042e-06  1.73275103e-04]



Even after 10 iterations it still has not produced the desired output. The learning rate is so small that each weight update is tiny, so the weights, and therefore the error, barely change from one iteration to the next.
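One way to explore this further is to give the loop a much larger iteration budget and record when the first error-free pass occurs. The helper `iterations_to_converge`, the seed, and the budget of 10,000 iterations are assumptions made for this sketch. The count it reports depends on how large the initial weights are relative to the learning rate, so it will differ from the run shown above.

```python
import numpy as np

def iterations_to_converge(X, y, weights, lr, max_iterations):
    """Hypothetical helper: index of the first error-free pass, or None."""
    X = np.c_[X, -np.ones((len(X), 1))]  # bias column of -1
    w = np.array(weights, dtype=float)    # work on a copy of the starting weights
    for n in range(max_iterations):
        errors = np.where(X @ w > 0, 1, 0) - y
        if not errors.any():
            return n
        w -= lr * (X.T @ errors)
    return None

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

rng = np.random.default_rng(0)            # assumption: arbitrary seed
w0 = rng.standard_normal(3) * 1e-4

# Same starting weights, two learning rates.
for lr in (0.25, 1e-5):
    print(lr, iterations_to_converge(X, y, w0, lr, 10_000))
```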

# OR Gate

Let's train our network on a different dataset. We will use the OR gate and follow the same steps we did for AND.

In [41]:
OR = pd.DataFrame({'x1': (0,0,1,1), 'x2': (0,1,0,1), 'y': (0,1,1,1)})
OR

|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 0 |
| 1 | 0  | 1  | 1 |
| 2 | 1  | 0  | 1 |
| 3 | 1  | 1  | 1 |


In [42]:
w = np.random.randn(3)*1e-4

In [45]:
def train(data, targets, weights, lr, n_iterations):

    # Adding the bias term
    data = np.c_[data, -np.ones((len(data), 1))]
    for n in range(n_iterations):

        activations = g(data, weights)
        weights -= lr*np.dot(np.transpose(data), activations - targets)
        
    return(weights)

In [46]:
inputs = OR[['x1','x2']]
target = OR['y']

w = train(inputs, target, w, 0.25, 20)

In [66]:
g(np.c_[inputs, -np.ones((len(inputs), 1))], w)

array([0, 0, 0, 0])

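The desired output for the OR gate is `[0, 1, 1, 1]`, which the output shown above does not match. The cell numbers jump from 46 to 66, so `w` may have been changed by cells run in between; the check should be rerun immediately after the training cell to confirm the result.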
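To avoid depending on notebook execution order, training and checking can be done in a single cell. The sketch below repeats the same update rule for the OR gate; as an assumption it starts from zero weights so the result is deterministic, and `X`, `targets`, and `predictions` are illustrative names.

```python
import numpy as np
import pandas as pd

OR = pd.DataFrame({'x1': (0, 0, 1, 1), 'x2': (0, 1, 0, 1), 'y': (0, 1, 1, 1)})

# Inputs with the bias column of -1 appended, and the targets as a NumPy array.
X = np.c_[OR[['x1', 'x2']].to_numpy(), -np.ones((len(OR), 1))]
targets = OR['y'].to_numpy()

weights = np.zeros(3)  # assumption: zero start instead of random weights
for _ in range(20):
    activations = np.where(X @ weights > 0, 1, 0)
    weights -= 0.25 * (X.T @ (activations - targets))

predictions = np.where(X @ weights > 0, 1, 0)
print(predictions, np.array_equal(predictions, targets))
```

With these settings the predictions match the OR targets `[0, 1, 1, 1]`.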