<a href="https://colab.research.google.com/github/hsarfraz/Tiny-Machine-Learning/blob/main/0_5_weights_and_bias_in_neural_network.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

In this notebook, I revisit the single-layer neural network that I created in notebook 0.3.

I will add an additional layer to this neural network (making it a two-layer network) and an additional neuron to the first layer.

In [None]:
### IMPORTING LIBRARIES ###
import sys
import numpy as np
import tensorflow as tf
from tensorflow import keras

### MAKING SURE THAT USER HAS TENSORFLOW 2 AND PYTHON 3 ###

# This script requires TensorFlow 2 and Python 3.
if tf.__version__.split('.')[0] != '2':
    raise Exception((f"The script is developed and tested for tensorflow 2. "
                     f"Current version: {tf.__version__}"))

if sys.version_info.major < 3:
    raise Exception((f"The script is developed and tested for Python 3. "
                     f"Current version: {sys.version_info.major}"))

# Defining Key Concepts

Before I write the code for the single-layer and two-layer neural networks, I am going to define some essential concepts and terms that help in understanding how a neural net works.

## Neurons

Neurons are the basic units of an artificial neural network (ANN), also called a simulated neural network (SNN). Each neuron is connected to some or all of the neurons in the next layer. When a value is passed from one neuron to the next, it is multiplied by the weight of that connection, and the receiving neuron adds its bias.

## Weights

Weights control the strength of the connection (the signal) between two neurons. In other words, a weight decides how much influence an input has on the neuron's output.

## Bias

A bias is an additional constant term associated with a single neuron in a hidden or output layer. It is often drawn as an extra input that is fixed at 1 (this value can be changed) and whose connection weight is learned during training. Biases are not influenced by any earlier layers (they have no connections to the neurons in the previous layers). Their purpose is to let a neuron produce a non-zero output even when all of its inputs are zero.

Biases are added to each individual neuron. I have included an illustration that shows how biases are added to each neuron in a 3-layered neural network:

![illustration of how biases are added to each neuron in a neural net](images/0.5_bias_in_each_layer.jpg)
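
As a rough sketch of how weights and a bias combine inside one neuron, the snippet below computes a weighted sum of the inputs and then adds the bias. The function name `neuron_output` and all of the numbers are assumptions made up for illustration; they do not come from the networks trained in this notebook.

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (no activation function)."""
    return float(np.dot(inputs, weights) + bias)

# Hypothetical values, chosen only for illustration.
weights = np.array([0.5, -2.0])
bias = 1.5

# 0.5 * 4.0 + (-2.0) * 0.5 + 1.5 = 2.5
print(neuron_output(np.array([4.0, 0.5]), weights, bias))

# With all-zero inputs the weighted sum vanishes and only the bias remains: 1.5
print(neuron_output(np.array([0.0, 0.0]), weights, bias))
```

The second call shows why the bias matters: without it, a neuron whose inputs are all zero could only ever output zero.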

## Linear Transformation

Every neuron performs a linear transformation of its input using weights and biases. The linear transformation is the equation of a straight line in slope-intercept form:
$$
y = (\text{weight} \times x) + \text{bias}
$$

A linear transformation should not be the only operation in each neuron. The composition of two linear functions is itself a linear function, so a network built only from linear transformations behaves like a single linear layer, no matter how many layers it has. Such a network cannot learn complex tasks unless something else (such as an activation function) is added to each neuron.
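
To make this concrete, consider two neurons chained one after the other without activation functions. Writing $w_1, w_2$ for their weights and $b_1, b_2$ for their biases (symbols introduced here only for illustration), the combined output is:

$$
y = w_2 (w_1 x + b_1) + b_2 = (w_2 w_1)\,x + (w_2 b_1 + b_2)
$$

The result is again a straight line, with slope $w_2 w_1$ and intercept $w_2 b_1 + b_2$, so the second neuron adds nothing that a single neuron could not already represent.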

## Activation Functions

**NOTE:** The neural networks created in this notebook do not have activation functions, which means that only the linear transformation is used in each neuron. I will be utilizing activation functions in future notebooks but wanted to define the concept here.

Activation functions are an additional step in each layer and run after the linear transformation of each neuron. An activation function decides whether a neuron should be activated ("fired"), in other words, whether the neuron's signal is important enough to be sent to the next layer of the neural network.

There are many types of activation functions, including:

*  Binary Step Activation Functions
*  Linear Activation Functions
*  Sigmoid Activation Functions
*  ReLU Activation Functions
*  Softmax Activation Functions

The image below illustrates how activation functions work. As you can see, the primary role of the activation function is to transform the summed weighted input from the neurons in the previous layer into an output value that can be fed into the next hidden layer or be used as the neural network's final output.

![illustration of activation functions](images/0.5_activation_function.jpg)
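
The activation functions listed above can be sketched in a few lines of NumPy. These are simplified textbook definitions written for illustration, not the implementations Keras uses internally; the threshold of the binary step (fire when the input is at least 0) is one common convention and is an assumption here.

```python
import numpy as np

def binary_step(z):
    # Fires (1.0) when the input is at least 0, otherwise stays off (0.0).
    return np.where(z >= 0, 1.0, 0.0)

def linear(z):
    # Passes the input through unchanged.
    return z

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Keeps positive values and replaces negative values with 0.
    return np.maximum(0.0, z)

def softmax(z):
    # Turns a vector into positive values that sum to 1.
    # Subtracting the maximum first avoids overflow in exp().
    exps = np.exp(z - np.max(z))
    return exps / np.sum(exps)

# Hypothetical summed weighted inputs for three neurons.
z = np.array([-2.0, 0.0, 3.0])
for fn in (binary_step, linear, sigmoid, relu, softmax):
    print(f"{fn.__name__:12s}", fn(z))
```

Apart from `linear`, each of these functions bends the straight-line output of the linear transformation, which is what lets a multi-layer network represent more than a single line.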


# Retraining single-layer network

I am re-training the original single-layer network that was created in notebook 0.3 and will display the model's prediction when x = 10. I will also display the learned weights of the single-layer network.

I am also re-sharing the illustration of the single-layer neural network from notebook 0.3 to show how the neural network works.

![illustration of single layer neural network in code](images/0.3_Neural_Network_illustration.jpg)

In [None]:
my_layer = keras.layers.Dense(units=1, input_shape=[1])
model = tf.keras.Sequential([my_layer])
model.compile(optimizer='sgd', loss='mean_squared_error')

xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

model.fit(xs, ys, epochs=500, verbose=0)

<keras.src.callbacks.History at 0x7b17180dfd60>

In [None]:
# Predict y for x = 10 (passed as a NumPy array).
print(model.predict(np.array([10.0])))

[[18.97694]]


In [None]:
print(my_layer.get_weights())

[array([[1.996658]], dtype=float32), array([-0.98963875], dtype=float32)]
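
The prediction can be checked by hand from the two printed values: the first is the neuron's weight and the second is its bias. The sketch below copies those numbers from the output above (a retrained model will give slightly different values) and applies the slope-intercept formula.

```python
# Values copied from the printed get_weights() output above.
weight = 1.996658
bias = -0.98963875

x = 10.0
prediction = (weight * x) + bias

# Close to the model's own prediction of 18.97694.
print(prediction)
```

The training data follow y = 2x - 1 exactly, so the learned weight is close to 2, the learned bias is close to -1, and the prediction is close to 19.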


#### Next, let's train a 2-layer network and see what its prediction and weights are.

In [None]:
my_layer_1 = keras.layers.Dense(units=2, input_shape=[1])
my_layer_2 = keras.layers.Dense(units=1)
model = tf.keras.Sequential([my_layer_1, my_layer_2])
model.compile(optimizer='sgd', loss='mean_squared_error')

xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

model.fit(xs, ys, epochs=500, verbose=0)

<keras.src.callbacks.History at 0x7b170b699120>

In [None]:
print(model.predict(np.array([10.0])))

[[19.]]


In [None]:
print(my_layer_1.get_weights())
print(my_layer_2.get_weights())

[array([[1.5996603 , 0.04464365]], dtype=float32), array([-0.43972898,  0.17022227], dtype=float32)]
[array([[ 1.2641675],
       [-0.4981434]], dtype=float32), array([-0.35931277], dtype=float32)]
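
These weights look nothing like 2 and -1, yet the network still predicts 19. Because neither layer has an activation function, the two layers collapse into a single straight line. The sketch below copies the printed values above (a retrained model will print different numbers) and computes the slope and intercept of that line; the variable names are my own.

```python
import numpy as np

# Values copied from the printed get_weights() output above.
W1 = np.array([[1.5996603, 0.04464365]])    # layer 1 weights, shape (1, 2)
b1 = np.array([-0.43972898, 0.17022227])    # layer 1 biases, shape (2,)
W2 = np.array([[1.2641675], [-0.4981434]])  # layer 2 weights, shape (2, 1)
b2 = np.array([-0.35931277])                # layer 2 bias, shape (1,)

# (x @ W1 + b1) @ W2 + b2  ==  x @ (W1 @ W2) + (b1 @ W2 + b2)
effective_slope = (W1 @ W2).item()
effective_intercept = (b1 @ W2 + b2).item()

# Approximately 2 and -1, the same line the single-layer network learned.
print(effective_slope, effective_intercept)
```

Many different combinations of weights and biases produce the same overall line, which is why the individual values differ between training runs.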


#### Finally, we can manually compute the output for our 2-layer network to better understand how it works.

In [None]:
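value_to_predict = 10.0

# get_weights() returns [weights, biases] for each Dense layer.
layer1_weights, layer1_biases = my_layer_1.get_weights()  # shapes (1, 2) and (2,)
layer2_weights, layer2_biases = my_layer_2.get_weights()  # shapes (2, 1) and (1,)

# Layer 1: one weight and one bias per neuron.
layer1_w1 = layer1_weights[0][0]
layer1_w2 = layer1_weights[0][1]
layer1_b1 = layer1_biases[0]
layer1_b2 = layer1_biases[1]

# Layer 2: one weight per layer-1 neuron, plus a single bias.
# Each weight is a one-element array, so the final output prints as an array.
layer2_w1 = layer2_weights[0]
layer2_w2 = layer2_weights[1]
layer2_b = layer2_biases[0]

# Each layer-1 neuron computes (weight * input) + bias.
neuron1_output = (layer1_w1 * value_to_predict) + layer1_b1
neuron2_output = (layer1_w2 * value_to_predict) + layer1_b2

# The output neuron sums its weighted inputs and adds its bias.
neuron3_output = (layer2_w1 * neuron1_output) + (layer2_w2 * neuron2_output) + layer2_b

print(neuron1_output)
print(neuron2_output)
print(neuron3_output)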

2.293332114815712
11.552650034427643
[18.999996]
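
The same calculation can be written in matrix form so that it handles every training input at once. The helper `dense_forward` below is hypothetical (it is not part of Keras), and the weights are the ones printed by `get_weights()` earlier in this notebook; a retrained model will have different values.

```python
import numpy as np

def dense_forward(x, layers):
    """Apply activation-free dense layers to a batch x of shape (n, features)."""
    out = x
    for weights, biases in layers:
        out = out @ weights + biases
    return out

# One (weights, biases) pair per layer, in the layout get_weights() returns.
layers = [
    (np.array([[1.5996603, 0.04464365]]), np.array([-0.43972898, 0.17022227])),
    (np.array([[1.2641675], [-0.4981434]]), np.array([-0.35931277])),
]

# The six training inputs as a column, one row per example.
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
preds = dense_forward(xs, layers)

# Should be very close to the training targets -3, -1, 1, 3, 5, 7.
print(preds.ravel())
```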
