
In [25]:
import warnings
warnings.simplefilter('ignore')
import numpy as np
import matplotlib.pyplot as plt

In [26]:
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

Provided below are the following:

- Three weight matrices `W_1`, `W_2` and `W_3` holding the weights of each layer. The convention is that entry $W_{i,j}$ gives the weight from neuron $i$ in the previous (left) layer to neuron $j$ in the next (right) layer.
- A vector `x_in` representing a single input, and a matrix `x_mat_in` representing 7 different inputs, one per row.
- Two functions, `soft_max_vec` and `soft_max_mat`, which apply the softmax function to a single vector and row-wise to a matrix, respectively.

The goals for this exercise are:

1. For the input `x_in`, calculate the inputs and outputs of each layer, assuming sigmoid activations for the two middle layers and a softmax output for the final layer.
2. Write a function that does the entire neural network calculation for a single input.
3. Write a function that does the entire neural network calculation for a matrix of inputs, where each row is a single input.
4. Test your functions on `x_in` and `x_mat_in`.

This illustrates what happens in a neural network during a single forward pass. Roughly speaking, after the forward pass it remains to compare the network's output with the known true values, compute the gradient of the loss function, adjust the weight matrices `W_1`, `W_2` and `W_3` accordingly, and iterate. Ideally, each iteration yields better weight matrices and a smaller loss.
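
To make the weight convention concrete: if `W[i, j]` holds the weight from neuron `i` on the left to neuron `j` on the right, then the input to neuron `j` is the sum over `i` of `x[i] * W[i, j]`, which is the row-vector product of `x` with `W`. The sketch below checks this on a small made-up layer; `x_demo` and `W_demo` are illustrative names and values, not part of the exercise.

```python
import numpy as np

# Hypothetical layer with 2 inputs and 3 outputs (values chosen only for illustration).
x_demo = np.array([1.0, 2.0])
W_demo = np.array([[1.0, 0.0, -1.0],
                   [0.5, 2.0, 3.0]])

# Explicit sum over incoming neurons i, for each outgoing neuron j.
z_loop = np.array([sum(x_demo[i] * W_demo[i, j] for i in range(2))
                   for j in range(3)])

# The same computation written as a single matrix product.
z_dot = np.dot(x_demo, W_demo)

print(z_loop)  # [2. 4. 5.]
print(z_dot)   # [2. 4. 5.]
```

With this convention the input always multiplies the weight matrix from the left, so a batch of inputs stacked as rows can reuse the same expression unchanged.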

In [27]:
W_1 = np.array([[2,-1,1,4],[-1,2,-3,1],[3,-2,-1,5]])
W_2 = np.array([[3,1,-2,1],[-2,4,1,-4],[-1,-3,2,-5],[3,1,1,1]])
W_3 = np.array([[-1,3,-2],[1,-1,-3],[3,-2,2],[1,2,1]])
x_in = np.array([.5,.8,.2])

#toy dataset
x_mat_in = np.array([[.5,.8,.2],[.1,.9,.6],[.2,.2,.3],
                     [.6,.1,.9],[.5,.5,.4],[.9,.1,.9],[.1,.8,.7]])

def soft_max_vec(vec):
    return np.exp(vec)/(np.sum(np.exp(vec)))

def soft_max_mat(mat):
    return np.exp(mat)/(np.sum(np.exp(mat),axis=1).reshape(-1,1))

print('the matrix W_1')
print(W_1)
print('-'*30)
print('vector input x_in')
print(x_in)
print('-'*30)
print('matrix input x_mat_in -- starts with the vector `x_in`')
print(x_mat_in)

the matrix W_1
[[ 2 -1  1  4]
 [-1  2 -3  1]
 [ 3 -2 -1  5]]
------------------------------
vector input x_in
[0.5 0.8 0.2]
------------------------------
matrix input x_mat_in -- starts with the vector `x_in`
[[0.5 0.8 0.2]
 [0.1 0.9 0.6]
 [0.2 0.2 0.3]
 [0.6 0.1 0.9]
 [0.5 0.5 0.4]
 [0.9 0.1 0.9]
 [0.1 0.8 0.7]]
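
The two softmax helpers above exponentiate the raw scores directly. That is fine for the small values in this exercise, but `np.exp` can overflow for large scores. A common variant subtracts the maximum score before exponentiating, which does not change the result mathematically. The following is a minimal sketch of that variant; the names `soft_max_vec_stable` and `soft_max_mat_stable` are hypothetical and do not appear in the original notebook.

```python
import numpy as np

def soft_max_vec_stable(vec):
    # Shifting by the maximum leaves the softmax unchanged but keeps exp() bounded.
    e = np.exp(vec - np.max(vec))
    return e / np.sum(e)

def soft_max_mat_stable(mat):
    # Same idea, applied to each row independently.
    e = np.exp(mat - np.max(mat, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

# Illustrative scores; the second row would overflow a direct np.exp call.
scores = np.array([[1.0, 2.0, 3.0],
                   [1000.0, 1000.0, 1000.0]])
probs = soft_max_mat_stable(scores)
print(probs.sum(axis=1))  # each row sums to 1, up to rounding
```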


In [28]:
# Inputs to the first hidden layer, one row per input (row 0 corresponds to x_in)
z_2 = np.dot(x_mat_in,W_1)
z_2

array([[ 0.8,  0.7, -2.1,  3.8],
       [ 1.1,  0.5, -3.2,  4.3],
       [ 1.1, -0.4, -0.7,  2.5],
       [ 3.8, -2.2, -0.6,  7. ],
       [ 1.7, -0.3, -1.4,  4.5],
       [ 4.4, -2.5, -0.3,  8.2],
       [ 1.5,  0.1, -3. ,  4.7]])
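
Each entry of `z_2` is the dot product of one input row with one column of `W_1`. As a check by hand, the top-left entry combines the first input `[0.5, 0.8, 0.2]` with the first column of `W_1`, `[2, -1, 3]`, giving 0.5 * 2 + 0.8 * (-1) + 0.2 * 3 = 0.8. The short sketch below repeats that arithmetic in code; it redefines the two arrays so that it runs on its own.

```python
import numpy as np

W_1 = np.array([[2, -1, 1, 4], [-1, 2, -3, 1], [3, -2, -1, 5]])
x_in = np.array([.5, .8, .2])

# Hand-written sum for the first neuron of the next layer.
by_hand = 0.5 * 2 + 0.8 * (-1) + 0.2 * 3

# First component of the full matrix product.
via_dot = np.dot(x_in, W_1)[0]

print(np.isclose(by_hand, via_dot))  # True
```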

In [29]:
a_2 = sigmoid(z_2)  # outputs of the first hidden layer
a_2

array([[0.68997448, 0.66818777, 0.10909682, 0.97811873],
       [0.75026011, 0.62245933, 0.03916572, 0.98661308],
       [0.75026011, 0.40131234, 0.33181223, 0.92414182],
       [0.97811873, 0.09975049, 0.35434369, 0.99908895],
       [0.84553473, 0.42555748, 0.19781611, 0.98901306],
       [0.98787157, 0.07585818, 0.42555748, 0.99972542],
       [0.81757448, 0.52497919, 0.04742587, 0.9909867 ]])

In [30]:
z_3 = np.dot(a_2,W_2)  # inputs to the second hidden layer
z_3

array([[ 3.55880727,  4.01355384,  0.48455118, -1.55014198],
       [ 3.92653518,  4.10921334,  0.18688365, -0.94879275],
       [ 3.88876887,  2.2842146 ,  0.4885584 , -1.58990857],
       [ 5.37777836,  1.31317855, -0.14871063, -0.19351275],
       [ 4.4547123 ,  2.94332939,  0.11913329, -0.8567627 ],
       [ 5.38551712,  1.01435726, -0.04904456, -0.44362315],
       [ 4.32829928,  3.76620031, -0.02433132, -0.52848494]])

In [31]:
a_3 = sigmoid(z_3)  # outputs of the second hidden layer
a_3

array([[0.97231549, 0.98225163, 0.61882199, 0.17506576],
       [0.98066919, 0.9838446 , 0.54658541, 0.27912767],
       [0.9799401 , 0.90756123, 0.61976677, 0.16939676],
       [0.99540316, 0.78804456, 0.46289071, 0.45177222],
       [0.98850989, 0.94994727, 0.52974815, 0.29801615],
       [0.99543843, 0.73387201, 0.48774132, 0.39087798],
       [0.98698175, 0.97738352, 0.49391747, 0.37087032]])

In [32]:
z_4 = np.dot(a_3,W_3)  # inputs to the output layer
z_4

array([[ 2.04146788,  1.04718238, -3.47867612],
       [ 1.92205929,  1.42324752, -3.54057369],
       [ 1.9563182 ,  1.13151906, -3.27363361],
       [ 1.63308573,  2.17592794, -2.97738636],
       [ 1.84869797,  1.55211843, -3.46934914],
       [ 1.59253551,  2.05871663, -2.82613226],
       [ 1.8430245 ,  1.73746743, -3.5474088 ]])

In [33]:
# z_4 has one row per input, so softmax must be applied row-wise
y_out = soft_max_mat(z_4)
y_out

array([[0.72780576, 0.26927918, 0.00291506],
       [0.62054212, 0.37682531, 0.00263257],
       [0.69267581, 0.30361576, 0.00370844],
       [0.36618794, 0.63016955, 0.00364252],
       [0.57199769, 0.4251982 , 0.00280411],
       [0.38373781, 0.61163804, 0.00462415],
       [0.52510443, 0.4725011 , 0.00239447]])
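
A softmax output should be a probability distribution for each input, so a useful sanity check is that every row sums to 1. The sketch below uses a made-up logits matrix to contrast `soft_max_vec`, which normalizes over every entry of whatever array it receives, with `soft_max_mat`, which normalizes row by row. The `logits` values are illustrative only.

```python
import numpy as np

def soft_max_vec(vec):
    return np.exp(vec) / np.sum(np.exp(vec))

def soft_max_mat(mat):
    return np.exp(mat) / np.sum(np.exp(mat), axis=1).reshape(-1, 1)

# Made-up logits for two inputs with two classes each.
logits = np.array([[0.0, 0.0],
                   [0.0, 0.0]])

whole = soft_max_vec(logits)    # normalizes over all 4 entries at once
rowwise = soft_max_mat(logits)  # normalizes each row separately

print(whole.sum(axis=1))    # [0.5 0.5]
print(rowwise.sum(axis=1))  # [1. 1.]
```

Only the row-wise version yields one valid distribution per input when the argument is a matrix.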

A shorter alternative is to compose all three layers in a single expression:

In [43]:
def comp_vec(x):
    return soft_max_vec(sigmoid(sigmoid(np.dot(x,W_1)).dot(W_2)).dot(W_3))

def comp_mat(x):
    return soft_max_mat(sigmoid(sigmoid(np.dot(x,W_1)).dot(W_2)).dot(W_3))

In [41]:
comp_vec(x_in)

array([0.72780576, 0.26927918, 0.00291506])

In [44]:
comp_mat(x_mat_in)

array([[0.72780576, 0.26927918, 0.00291506],
       [0.62054212, 0.37682531, 0.00263257],
       [0.69267581, 0.30361576, 0.00370844],
       [0.36618794, 0.63016955, 0.00364252],
       [0.57199769, 0.4251982 , 0.00280411],
       [0.38373781, 0.61163804, 0.00462415],
       [0.52510443, 0.4725011 , 0.00239447]])
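
Since `x_mat_in` starts with `x_in`, the first row of `comp_mat(x_mat_in)` should match `comp_vec(x_in)`, as the two outputs above suggest. The sketch below redefines everything so that it runs on its own, verifies that agreement, and reads off the most probable class for each input with `np.argmax`. The variables `out_vec`, `out_mat` and `predicted` are additions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_max_vec(vec):
    return np.exp(vec) / np.sum(np.exp(vec))

def soft_max_mat(mat):
    return np.exp(mat) / np.sum(np.exp(mat), axis=1).reshape(-1, 1)

W_1 = np.array([[2, -1, 1, 4], [-1, 2, -3, 1], [3, -2, -1, 5]])
W_2 = np.array([[3, 1, -2, 1], [-2, 4, 1, -4], [-1, -3, 2, -5], [3, 1, 1, 1]])
W_3 = np.array([[-1, 3, -2], [1, -1, -3], [3, -2, 2], [1, 2, 1]])
x_in = np.array([.5, .8, .2])
x_mat_in = np.array([[.5, .8, .2], [.1, .9, .6], [.2, .2, .3],
                     [.6, .1, .9], [.5, .5, .4], [.9, .1, .9], [.1, .8, .7]])

def comp_vec(x):
    return soft_max_vec(sigmoid(sigmoid(np.dot(x, W_1)).dot(W_2)).dot(W_3))

def comp_mat(x):
    return soft_max_mat(sigmoid(sigmoid(np.dot(x, W_1)).dot(W_2)).dot(W_3))

out_vec = comp_vec(x_in)
out_mat = comp_mat(x_mat_in)

# The single-input and batched versions agree on the shared input.
print(np.allclose(out_mat[0], out_vec))       # True
# Every row of the batched output is a probability distribution.
print(np.allclose(out_mat.sum(axis=1), 1.0))  # True

# Index of the most probable class for each of the 7 inputs.
predicted = np.argmax(out_mat, axis=1)
print(predicted)  # [0 0 0 1 0 1 0]
```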