<a href="https://colab.research.google.com/github/iriyagupta/GENAI-BA-CPlus/blob/main/NN_Example_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Example 1 Code**

In [58]:
import numpy as np
X = np.array([  [0,0,1], #Since the first value is 0, the output value is 0
                [0,1,1], #Since the first value is 0, the output value is 0
                [1,0,1], #Since the first value is 1, the output value is 1
                [1,1,1]  #Since the first value is 1, the output value is 1
                ])
y = np.array([[0,0,1,1]]).T #Output array

def sigmoid(x,deriv=False):
    if deriv:
        return sigmoid(x)*(1-sigmoid(x))
    return 1/(1+np.exp(-x))

def run_net(X,y,activation_function=sigmoid,passes=10):
    np.random.seed(1) #seed the random numbers
    syn0 = 2*np.random.random((3,1)) - 1    #Random initial weights in [-1,1)
    for i in range(0,passes):
        level_0 = X  #Input to the nn
        level_1 = activation_function(np.dot(level_0,syn0)) #Output of the nn (probabilities)
        level_1_error = y - level_1 #error (note: y is 1/0; level_1 is in (0,1))
        #Get the derivative of the sigmoid (the change) and multiply by the error
        #(note: level_1 is already a sigmoid output, so deriv=True applies sigmoid to it once more)
        level_1_delta = level_1_error * activation_function(level_1,True)
        syn0 += np.dot(level_0.T,level_1_delta) #Update the weights (level_0 * deltas)
    return syn0

**Training and testing inputs and outputs for Example 2**
* We'll see how well our Example 1 network does on Example 2

In [59]:
import numpy as np
#TRAINING INPUT
X = np.array([
    [0,0,1],
    [0,1,1],
    [1,0,1],
    [1,1,1],
    [1,1,0],
    [0,1,0],
    [1,0,0],
    [1,0,0]])

#TRAINING OUTPUT
y = np.array([[0],[1],[1],[1],[1],[0],[0],[0]])

#TESTING INPUT
test_X = np.array([[1,1,1],[0,1,1],[1,0,0],[0,0,1]])

#TESTING OUTPUT
test_y = np.array([1,1,0,0])


In [60]:
#Train the network
final_weights = run_net(X,y,sigmoid,10000)

#Predict using the testing data
probabilities = sigmoid(np.dot(test_X,final_weights))
predictions = (probabilities > 0.5).reshape(len(test_y))
accuracy = 1-(sum(abs(predictions-test_y)))/len(test_y)
print(accuracy)
print(test_y,predictions)

0.75
[1 1 0 0] [ True  True False  True]


**Not so good this time**

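* The pattern here is non-linear: at a 0.5 cutoff, no single set of three weights, as used in Example 1, reproduces every training label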
* Nonlinearities can be captured by adding layers to the network
* We can try adding a hidden layer to our network
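
One way to probe the first bullet is to brute-force a bias-free single layer like the Example 1 network. The sketch below is an illustration rather than part of the original notebook: it assumes a prediction of 1 whenever the weighted sum is positive (the same as a sigmoid output above 0.5), and the grid of candidate weights is an arbitrary choice.

```python
import itertools
import numpy as np

# Training data from Example 2
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1],
              [1, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]])
y = np.array([0, 1, 1, 1, 1, 0, 0, 0])

# Hypothetical search grid: every weight triple with entries in [-2, 2]
grid = np.linspace(-2, 2, 21)

best_accuracy = 0.0
for w in itertools.product(grid, repeat=3):
    # sigmoid(z) > 0.5 exactly when z > 0, so a sign test is enough
    predictions = (X @ np.array(w) > 0).astype(int)
    best_accuracy = max(best_accuracy, float(np.mean(predictions == y)))

print(best_accuracy)
```

No weight triple can score 1.0 here: rows `[1,0,0]` and `[0,1,0]` need non-positive first and second weights, while row `[1,1,0]` needs their sum to be positive.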


**Three layer network**
* Input layer: 3 nodes
* Hidden layer: 4 nodes (the structure of the hidden layer is our choice)
* Output: 1 node
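
The layer sizes above fix the shapes of the two weight matrices. A minimal sketch, assuming 8 training rows and using zero-filled placeholder arrays (the values do not matter, only the shapes):

```python
import numpy as np

n_samples, n_inputs, n_hidden, n_outputs = 8, 3, 4, 1

X = np.zeros((n_samples, n_inputs))      # placeholder inputs
syn0 = np.zeros((n_inputs, n_hidden))    # input -> hidden weights
syn1 = np.zeros((n_hidden, n_outputs))   # hidden -> output weights

hidden = X @ syn0        # one row per sample, one column per hidden node
output = hidden @ syn1   # one row per sample, one output column
print(hidden.shape, output.shape)
```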


**Initialize**
* The network has two sets of weights
  1. Set 1 between the input and the hidden layer
  2. Set 2 between the hidden and the output layer
* Randomly assign weights at each level

In [47]:
syn0 = 2*np.random.random((3,4)) - 1
syn1 = 2*np.random.random((4,1)) - 1


**Feed forward network**
* Calculate the node outputs at the hidden layer level
* These become the inputs to the next layer (could be a hidden layer or, as in our case, the output layer)
* In this way we can construct our **feed forward network**
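
Because each layer's output is simply the next layer's input, the same step can be repeated for any number of layers. The sketch below assumes a hypothetical helper `feed_forward` and an extra 4-node hidden layer purely for illustration. Neither appears in the notebook.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def feed_forward(inputs, weights):
    """Return the activations of every layer, starting with the inputs."""
    activations = [inputs]
    for w in weights:
        # The previous layer's output is the input to this layer
        activations.append(sigmoid(activations[-1] @ w))
    return activations

rng = np.random.default_rng(0)
weights = [2 * rng.random((3, 4)) - 1,   # input -> hidden 1
           2 * rng.random((4, 4)) - 1,   # hidden 1 -> hidden 2 (illustrative)
           2 * rng.random((4, 1)) - 1]   # hidden 2 -> output
X = np.array([[0, 0, 1], [1, 1, 1]])
layers = feed_forward(X, weights)
print([a.shape for a in layers])
```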


In [48]:
level_0 = X
level_1 = sigmoid(np.dot(level_0,syn0)) #Hidden layer values
level_2 = sigmoid(np.dot(level_1,syn1)) #Output layer values (probabilities)
level_2

array([[0.5053376 ],
       [0.50687089],
       [0.51655106],
       [0.51587768],
       [0.46972561],
       [0.45029194],
       [0.46710261],
       [0.46710261]])

**Backpropagation**
* First calculate the error at the output layer (we know the true values)


In [49]:
level_2_error = y - level_2
level_2_error

array([[-0.5053376 ],
       [ 0.49312911],
       [ 0.48344894],
       [ 0.48412232],
       [ 0.53027439],
       [-0.45029194],
       [-0.46710261],
       [-0.46710261]])

* Then, calculate the change in weights using the derivative
* This step is similar to our neural network with no hidden layer
* level_2_delta is our change factor for the weights between the hidden layer and the output layer
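
The sigmoid's derivative has a closed form: writing `a = sigmoid(z)`, the slope at `z` is `a*(1-a)`. The following is a small numerical check of that identity, using an arbitrary set of sample points and a finite-difference step chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-4, 4, 9)
a = sigmoid(z)

# Closed form, written in terms of the sigmoid output a
closed_form = a * (1 - a)

# Central finite difference as an independent check
h = 1e-5
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

print(np.allclose(closed_form, numeric, atol=1e-8))
```

The slope is largest (0.25) at `z = 0`, where the output is 0.5, so uncertain predictions receive the biggest weight changes.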

In [50]:
level_2_delta = level_2_error*sigmoid(level_2,deriv=True)
level_2_delta

array([[-0.11860027],
       [ 0.11569104],
       [ 0.11314541],
       [ 0.11332227],
       [ 0.12551678],
       [-0.10705403],
       [-0.11063065],
       [-0.11063065]])

**Next, propagate the deltas back toward the input layer**
* This step is tricky. We don't know the true values of the hidden layer nodes, so we estimate their error by sending level_2_delta back through the weights syn1
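
To see what sending the deltas backward does, it can help to look at shapes. This sketch uses random stand-in values (not the notebook's actual numbers) for the output deltas and the hidden-to-output weights.

```python
import numpy as np

rng = np.random.default_rng(0)
level_2_delta = rng.normal(size=(8, 1))  # stand-in output deltas, one per sample
syn1 = rng.normal(size=(4, 1))           # stand-in hidden -> output weights

# Each hidden node receives the output delta scaled by its own weight
level_1_error = level_2_delta.dot(syn1.T)
print(level_1_error.shape)

# Column j is level_2_delta times the weight of hidden node j
column_matches = np.allclose(level_1_error[:, 2],
                             level_2_delta[:, 0] * syn1[2, 0])
print(column_matches)
```

A hidden node with a large weight to the output therefore takes a larger share of the blame for the output error.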

In [51]:
level_1_error = level_2_delta.dot(syn1.T)
level_1_delta = level_1_error * sigmoid(level_1,deriv=True)

**Calculate the new weights**
* Now that we have the deltas, we can calculate the new weights

In [52]:
syn1 += level_1.T.dot(level_2_delta)
syn0 += level_0.T.dot(level_1_delta)

**Putting it all together**

In [63]:
def sigmoid(x,deriv=False):
    if deriv:
        return x*(1-x) #x must already be a sigmoid output
    return 1/(1+np.exp(-x))

def run_net(X,y,activation_function=sigmoid,passes=10):
    np.random.seed(1)
    syn0 = 2*np.random.random((3,4)) - 1
    syn1 = 2*np.random.random((4,1)) - 1

    for i in range(passes):
        level_0 = X
        level_1 = activation_function(np.dot(level_0,syn0))
        level_2 = activation_function(np.dot(level_1,syn1))

        level_2_error = y - level_2

        level_2_delta = level_2_error*activation_function(level_2,deriv=True)

        level_1_error = level_2_delta.dot(syn1.T)

        level_1_delta = level_1_error * activation_function(level_1,deriv=True)

        syn1 += level_1.T.dot(level_2_delta)
        syn0 += level_0.T.dot(level_1_delta)
    return syn0,syn1

In [64]:
syn0,syn1 = run_net(X,y,activation_function=sigmoid,passes=100)
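
To watch how training progresses, the loop can record the mean absolute error on each pass. The sketch below is self-contained and assumes a hypothetical variant named `run_net_with_history`. It repeats the same update steps and only adds the bookkeeping.

```python
import numpy as np

def sigmoid(x, deriv=False):
    if deriv:
        return x * (1 - x)  # x must already be a sigmoid output
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1],
              [1, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]])
y = np.array([[0], [1], [1], [1], [1], [0], [0], [0]])

def run_net_with_history(X, y, passes=100):
    """Hypothetical variant of run_net that also records the training error."""
    np.random.seed(1)
    syn0 = 2 * np.random.random((3, 4)) - 1
    syn1 = 2 * np.random.random((4, 1)) - 1
    history = []
    for _ in range(passes):
        level_1 = sigmoid(X.dot(syn0))
        level_2 = sigmoid(level_1.dot(syn1))
        level_2_error = y - level_2
        history.append(float(np.mean(np.abs(level_2_error))))
        level_2_delta = level_2_error * sigmoid(level_2, deriv=True)
        level_1_delta = level_2_delta.dot(syn1.T) * sigmoid(level_1, deriv=True)
        syn1 += level_1.T.dot(level_2_delta)
        syn0 += X.T.dot(level_1_delta)
    return syn0, syn1, history

syn0, syn1, history = run_net_with_history(X, y, passes=100)
print(history[0], history[-1])  # error on the first and last pass
```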


**Applying the net to test cases**


In [None]:
test_X

array([[1, 1, 1],
       [0, 1, 1],
       [1, 0, 0],
       [0, 0, 1]])

In [65]:
level_0 = test_X
level_1 = sigmoid(np.dot(level_0,syn0))
level_2 = sigmoid(np.dot(level_1,syn1))
predictions = (level_2 > 0.5).reshape(len(test_y))
accuracy = 1-(sum(abs(predictions-test_y)))/len(test_y)
print(accuracy)
print(test_y,predictions)

1.0
[1 1 0 0] [ True  True False False]
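
The accuracy calculation used in both test runs can be wrapped in a small helper. This is a sketch with a hypothetical function name `accuracy`, checked against the two prediction vectors printed in this notebook.

```python
import numpy as np

def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    predictions = np.asarray(predictions).astype(int).ravel()
    targets = np.asarray(targets).ravel()
    return float(np.mean(predictions == targets))

test_y = np.array([1, 1, 0, 0])
single_layer = accuracy([True, True, False, True], test_y)
three_layer = accuracy([True, True, False, False], test_y)
print(single_layer)  # 0.75
print(three_layer)   # 1.0
```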


**In Summary**
* By adding hidden layers, the net can capture non-linear patterns that a single layer of weights misses
* However, a more complex network needs more computing power, because both the feed forward and backpropagation steps multiply increasingly large matrices
* But because computing power has become cheap and, thanks to GPUs, more accessible, neural networks are transforming AI
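
As a rough illustration of the cost point, the number of scalar multiplications in one feed forward pass can be counted from the layer sizes. The helper name and the larger 64-node layers below are hypothetical, chosen only for comparison with this notebook's 3-4-1 network.

```python
def forward_multiplications(n_samples, layer_sizes):
    """Count scalar multiplications in one feed forward pass (no bias terms)."""
    return sum(n_samples * n_in * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

small = forward_multiplications(8, [3, 4, 1])        # 8*3*4 + 8*4*1
large = forward_multiplications(8, [3, 64, 64, 1])   # 8*3*64 + 8*64*64 + 8*64*1
print(small)  # 128
print(large)  # 34816
```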
