# Neural Networks Part 2

Referenced Textbook: https://cobweb.cs.uga.edu/~jam/scalation_guide/comp_data_science.pdf

Specifically Chapter 10 - Section 4 (Starting on page 319)

## 2 Layer Neural Nets

So, now that we understand what a perceptron is, let's complicate things a little by expanding to 2 layers. Shown below is the basic 2-layer neural network (2L NN) structure:

![](../pics/2l_nn/2l_nn.png)



From here, let's turn this into a model equation.

We need to start by taking the dot product of each of the b vectors with the x vector. Then we apply some function (the activation function in this case) and add our error term. This is shown more concretely below:

![](../pics/2l_nn/mod_eqn1.png)

Now, instead of looking at this at the vector level, let's condense the above expression down to the matrix level.

![](../pics/2l_nn/mod_eqn2.png)


At this point, things are very similar to what we did with the perceptron. The only real difference is that instead of working with vectors, we are now working with matrices.

Predicted Value Matrix:

![](../pics/2l_nn/pred.png)

Negative Error Matrix:

![](../pics/2l_nn/neg_err.png)

Delta Matrix:

![](../pics/2l_nn/delta.png)   

You will notice a strange symbol: a dot surrounded by a circle. This denotes what is called the Hadamard product. The Hadamard product is just a fancy way of saying element-wise multiplication.

Element-wise multiplication for matrices looks like the following:

![](../pics/2l_nn/had_prod.png) 
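
To make the difference from ordinary matrix multiplication concrete, here is a minimal NumPy sketch. The arrays `A` and `C` are made-up values for illustration only and are not part of the textbook example.

```python
import numpy as np

# Two small 2x2 arrays, chosen only for illustration
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
C = np.array([[10.0, 20.0],
              [30.0, 40.0]])

# Hadamard product: multiply entries in matching positions
hadamard = np.multiply(A, C)

# Ordinary matrix product (rows times columns), for contrast
matmul = A @ C

print(hadamard)
print(matmul)
```

Note that for `np.matrix` objects the `*` operator means matrix multiplication, not element-wise multiplication, which is why `np.multiply` is the safer way to write a Hadamard product when matrices are involved.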

Parameter (Weights) Update Equation:

![](../pics/2l_nn/update.png)
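
For reference, the whole chain can be written out in one place. This is a sketch reconstructed from the code in this notebook rather than copied from the images, so the symbol names follow the code variables (`U`, `Y_hat`, `E`, `Delta`), and the learning rate symbol $\eta$ is assumed to be the "eta" mentioned in the code comments.

```latex
\begin{aligned}
U &= X B \\
\hat{Y} &= f(U) \\
E &= \hat{Y} - Y \\
\Delta &= f'(U) \odot E \\
B &\leftarrow B - \eta \, X^{\top} \Delta
\end{aligned}
```

For the sigmoid activation, the derivative can be written in terms of the predictions themselves: $f'(U) = \hat{Y} \odot (1 - \hat{Y})$.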

## Code Implementation

### Imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import activations

### Creating Inputs

In [None]:
# 9 data points; columns: one (bias), x1, x2, y

xy = np.matrix (
[[1.0, 0.0, 0.0, 0.5],
[1.0, 0.0, 0.5, 0.3],
[1.0, 0.0, 1.0, 0.2],
[1.0, 0.5, 0.0, 0.8],
[1.0, 0.5, 0.5, 0.5],
[1.0, 0.5, 1.0, 0.3],
[1.0, 1.0, 0.0, 1.0],
[1.0, 1.0, 0.5, 0.8],
[1.0, 1.0, 1.0, 0.5]]
)

In [None]:
# Taking first 3 columns of xy as matrix
X = xy[:,0:3]

In [None]:
# Taking last column of xy as array
y = np.array(xy[:, 3])

# Squaring y and saving as array
ysq = np.array(y ** 2)

In [None]:
# Concatenating y and ysq as columns to form the Y matrix
Y = np.concatenate((y, ysq), axis=1)
Y

In [None]:
# Initializing B matrix
B = np.matrix (
[[0.1, 0.1],
[0.2, 0.1],
[0.1, 0.1]]
)

### Testing NN Parts

In [None]:
# Pre-activation matrix
# U = np.dot(X, B)
U = X.dot(B)
U

In [None]:
# Predicted value matrix
Y_hat = activations.sigmoid(U)
Y_hat

In [None]:
# Negative error matrix
E = Y_hat - Y
E

In [None]:
# f'(U) * E -> Hadamard product (element-wise multiplication)
# f'(U) depends on the activation function, so it will be different for e.g. tanh

# sigmoid -> y_hat(1-y_hat) 

# Correction matrix
Delta = np.multiply(np.multiply(Y_hat,(1-Y_hat)),E)
Delta
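
As the comment in the cell above says, the derivative term changes with the activation function. Below is a minimal sketch of the same correction matrix for tanh, using the identity tanh'(u) = 1 - tanh(u)^2. The function name `delta_tanh` and the demo arrays are hypothetical and only for illustration. Plain NumPy is used here instead of `tensorflow.keras.activations`.

```python
import numpy as np

def delta_tanh(U, Y):
    """Correction matrix f'(U) * E (Hadamard product) for a tanh activation."""
    Y_hat = np.tanh(U)          # predicted values
    E = Y_hat - Y               # negative error matrix
    # tanh'(u) = 1 - tanh(u)^2, written in terms of the predictions
    return np.multiply(1.0 - np.square(Y_hat), E)

# Made-up pre-activations and targets, for illustration only
U_demo = np.array([[0.0, 0.5],
                   [1.0, -0.5]])
Y_demo = np.array([[0.2, 0.1],
                   [0.8, 0.3]])

print(delta_tanh(U_demo, Y_demo))
```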

In [None]:
# Gradients matrix
G = np.transpose(X) * Delta
G
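
One way to gain confidence in the gradient formula is to compare it against a finite-difference approximation. The sketch below assumes the loss being minimized is half the sum of squared errors, which is the loss whose gradient with respect to `B` equals `X^T Delta` for a sigmoid activation. The helper names (`half_sse`, `analytic_gradient`, `numeric_gradient`) and the random demo data are hypothetical, and plain NumPy arrays are used instead of `np.matrix`.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def half_sse(X, Y, B):
    """Assumed loss: one half of the sum of squared errors."""
    E = sigmoid(X @ B) - Y
    return 0.5 * np.sum(E ** 2)

def analytic_gradient(X, Y, B):
    """Gradient as computed in the notebook: X^T (f'(U) * E)."""
    Y_hat = sigmoid(X @ B)
    E = Y_hat - Y
    Delta = Y_hat * (1.0 - Y_hat) * E   # element-wise on ndarrays
    return X.T @ Delta

def numeric_gradient(X, Y, B, h=1e-6):
    """Central finite differences, one parameter at a time."""
    G = np.zeros_like(B)
    for idx in np.ndindex(*B.shape):
        step = np.zeros_like(B)
        step[idx] = h
        G[idx] = (half_sse(X, Y, B + step) - half_sse(X, Y, B - step)) / (2 * h)
    return G

# Random demo data: a column of ones plus two inputs, and two outputs
rng = np.random.default_rng(0)
X_demo = np.hstack([np.ones((5, 1)), rng.random((5, 2))])
Y_demo = rng.random((5, 2))
B_demo = np.full((3, 2), 0.1)

max_diff = np.max(np.abs(analytic_gradient(X_demo, Y_demo, B_demo)
                         - numeric_gradient(X_demo, Y_demo, B_demo)))
print(max_diff)  # should be very close to zero
```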

In [None]:
# Eta is the learning rate
eta = 1

# Updated values matrix
B = B - eta * G 
B

### Putting 2L NN In Loop

In [None]:
# Setting B as a new variable so it will not be overwritten in the loop
new_params = B

In [None]:
# Lists to hold values for plotting
x_list = []
sse_list = []
rsq_list = []

In [None]:
# Loop version for calculating SST

sst = 0
for column in Y.T:
    # print(np.mean(column))
    for each in column:
        sst = sst + (each - np.mean(column)) ** 2
        
sst

In [None]:
# Condensed version of SST

sst = np.sum(np.subtract(Y, np.mean(Y, axis=0))**2)

sst

In [None]:
# 25 epochs for 2L NN

for i in range(0, 25):
    
    # Pre-activation matrix
    U = np.dot(X, new_params)
    
    # Predicted value matrix
    Y_hat = activations.sigmoid(U)
    
    # Negative error matrix
    E = Y_hat - Y
    
    # Correction matrix
    Delta = np.multiply(np.multiply(Y_hat,(1-Y_hat)),E)
    
    # Gradients matrix
    G = np.transpose(X) * Delta

    # Eta (learning rate) preliminarily set to 1
    eta = 1
    
    # Updated values matrix
    new_params = new_params - eta * G 
    
    # Sum of squared errors
    sse = (np.linalg.norm(E)) ** 2
    # sse = np.sum(E ** 2)

    x_list.append(i)
    sse_list.append(sse)
    rsq_list.append(1-sse/sst)
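
The loop above can also be wrapped into a reusable function, which makes it easier to try different learning rates or epoch counts. This is a minimal sketch, not the textbook's implementation: the function name `train_2l` and its parameters are hypothetical, the sigmoid is written in plain NumPy instead of using `tensorflow.keras.activations`, and training starts from the initial `B` values rather than from the once-updated `B`.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_2l(X, Y, B0, eta=1.0, epochs=25):
    """Gradient descent for a 2-layer net with sigmoid activation.

    Returns the final parameters plus the SSE and R^2 recorded at each epoch.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    B = np.array(B0, dtype=float)            # copy so B0 is left untouched
    sst = np.sum((Y - Y.mean(axis=0)) ** 2)  # total sum of squares
    sse_hist, rsq_hist = [], []
    for _ in range(epochs):
        Y_hat = sigmoid(X @ B)               # predicted value matrix
        E = Y_hat - Y                        # negative error matrix
        sse = np.sum(E ** 2)
        sse_hist.append(sse)
        rsq_hist.append(1.0 - sse / sst)
        Delta = Y_hat * (1.0 - Y_hat) * E    # correction matrix
        B = B - eta * (X.T @ Delta)          # parameter update
    return B, sse_hist, rsq_hist

# Same data as in the notebook: columns are one, x1, x2, y
xy = np.array([[1.0, 0.0, 0.0, 0.5], [1.0, 0.0, 0.5, 0.3], [1.0, 0.0, 1.0, 0.2],
               [1.0, 0.5, 0.0, 0.8], [1.0, 0.5, 0.5, 0.5], [1.0, 0.5, 1.0, 0.3],
               [1.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.5, 0.8], [1.0, 1.0, 1.0, 0.5]])
X_demo = xy[:, 0:3]
Y_demo = np.column_stack((xy[:, 3], xy[:, 3] ** 2))
B0_demo = np.array([[0.1, 0.1], [0.2, 0.1], [0.1, 0.1]])

B_final, sse_hist, rsq_hist = train_2l(X_demo, Y_demo, B0_demo, eta=1.0, epochs=25)
print(B_final)
print(sse_hist[0], sse_hist[-1])
```

Using plain `ndarray` objects with `@` and `*` avoids the ambiguity of `*` on `np.matrix`, where it means matrix multiplication.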


In [None]:
# Plotting our findings

plt.figure(figsize=(20, 10))

plt.subplot(1, 2, 1)
plt.plot(x_list, sse_list)
plt.title('SSE vs Iterations')
plt.xlabel('Iterations')
plt.ylabel('SSE')

plt.subplot(1, 2, 2)
plt.plot(x_list, rsq_list)
plt.title("R^2 vs Iterations")
plt.xlabel('Iterations')
plt.ylabel('R^2')

plt.show()