# 📌 Implementation of a Neural Network

##### Coding a neural network from scratch is a great way to understand how it works. In this notebook, I implement a neural network from scratch using basic Python libraries such as **NumPy** and **Matplotlib**, including the **forward pass**, the **backward pass**, and **gradient descent**.

In [2]:
# Imports
import numpy as np
import sys
import matplotlib
import matplotlib.pyplot as plt

def versions():
    print("Python version: {}".format(sys.version))
    print("Numpy version: {}".format(np.__version__))
    print("Matplotlib version: {}".format(matplotlib.__version__))
    
versions()

Python version: 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)]
Numpy version: 1.21.5
Matplotlib version: 3.5.2


## A Single Neuron

##### A neuron, the basic unit of a **hidden layer** in a neural network, is a mathematical function that takes some inputs, computes their weighted sum plus a bias, and produces an output.

In [3]:
inputs = [1.0, 2.5, 3.7]
weights = [4.5 , 2.1, 8.7]

bias = 2.7      # There is only one bias for a neuron.

output = inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + bias

output = round(output, 3)       # Round to 3 decimal places for readability
print(f"Output: {output}")

Output: 44.64


##### We could define a function for the neuron.

In [4]:
# Function of a single neuron with any input size

def neuron(inputs, weights, bias):
    output = 0
    for i in range(len(inputs)):
        output += inputs[i]*weights[i]
    output += bias
    
    return round(output, 3)

neuron(
    [1.0, 2.5, 3.7],
    [4.5 , 2.1, 8.7],
    4.7
)

46.64
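
##### The loop in `neuron` computes a dot product of the inputs and the weights. As a minimal sketch, the same neuron can be written with `np.dot`. The name `neuron_dot` is introduced here for illustration and is not part of the original notebook; its result should agree with the loop version above.

```python
import numpy as np

def neuron_dot(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias, computed with a dot product.
    return round(float(np.dot(inputs, weights)) + bias, 3)

result = neuron_dot([1.0, 2.5, 3.7], [4.5, 2.1, 8.7], 4.7)
print(result)
```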

## A Basic Layer

##### A layer is a **collection of neurons** that all receive the same inputs. Each neuron has its own weights and its own bias, so a layer is defined by its number of neurons and its number of inputs.

In [5]:
def layer(inputs, weights, biases):     # A layer is a collection of neurons
    output = []
    for i in range(len(biases)):
        output.append(neuron(inputs, weights[i], biases[i]))
        
    return output  # Returns a list of outputs from each neuron in the layer

layer(
    [1.0, 2.5, 3.7],
    [[4.5 , 2.1, 8.7], [1.2, 3.4, 5.6], [7.8, 9.1, 2.3]],   # 3 neurons with 3 weights each
    [4.7, 2.1, 8.9]
)

[46.64, 32.52, 47.96]

##### This layer function can be improved with an extra parameter that describes the layer in more detail, such as the number of neurons.

In [6]:
def layer(neuron_count, inputs, weights, biases):
    output = []
    for i in range(neuron_count):
        try:
            output.append(neuron(inputs, weights[i], biases[i]))
        except IndexError:
            print("weights and biases must contain one entry per neuron, and each neuron needs one weight per input!")
            break
    return output  # Returns a list of outputs

layer(
    4,  # Number of neurons
    [1.0, 2.5, 3.7, 5.0],   # 4 inputs
    [[4.5, 2.1, 8.7, 0.5], [1.2, 3.4, 5.6, 1.3], [7.8, 9.1, 2.3, 5.3], [8.8, 9.9, 1.1, 3.3]],   # 4 neurons with 4 weights each
    [4.7, 2.1, 8.9, 1.2]   # 4 neurons with 1 bias each
)

[49.14, 39.02, 74.46, 55.32]

##### The same layer can be defined with the NumPy library, replacing the loops with a single dot product.

In [7]:
def layer_numpy(inputs, weights, biases):
    return np.dot(weights, inputs) + biases

layer_numpy(
    [1.0, 2.5, 3.7],
    [[4.5 , -2.1, 8.7], [-1.2, 3.4, 5.6], [7.8, 9.1, -2.3],],   # 3 neurons with 3 weights each
    [4.7, 2.1, 8.9]
)

array([36.14, 30.12, 30.94])

## Batch of Inputs

##### Multiple samples can be passed to the layer at once. Such a collection of samples is called a **batch**. The **batch size** affects how the model trains and generalizes.

In [8]:
sample = [1.0, 2.0, 3.0, 2.5]   # Single sample with 4 features.

samples = [
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],  # 3 samples with 4 features each.
    [-1.5, 2.7, 3.3, -0.8]
]

weights = [
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],   # 3 neurons with 4 weights each.
    [-0.26, -0.27, 0.17, 0.87]
]

biases = [2.0, 3.0, 0.5]    # 3 neurons with 1 bias each.

print("Shape of samples: {}\nShape of weights: {}\nShape of biases: {}"
      .format(np.shape(samples), np.shape(weights), np.shape(biases)))

Shape of samples: (3, 4)
Shape of weights: (3, 4)
Shape of biases: (3,)


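##### To take the dot product of the batch and the weights, the inner dimensions must match. The samples have shape (3, 4) and the weights also have shape (3, 4), so we transpose the weights to shape (4, 3).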

In [9]:
weights = np.array(weights).T   # Transposing the weights matrix.
print("Shape of samples: {}\nShape of weights: {}\nShape of biases: {}"
      .format(np.shape(samples), np.shape(weights), np.shape(biases)))

Shape of samples: (3, 4)
Shape of weights: (4, 3)
Shape of biases: (3,)
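
##### A quick way to see why the transpose is needed is to try the product without it. The sketch below uses hypothetical names (`demo_samples`, `demo_weights`) so that it does not overwrite the notebook's own variables.

```python
import numpy as np

demo_samples = np.array([
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8],
])                                      # shape (3, 4): 3 samples, 4 features
demo_weights = np.array([
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87],
])                                      # shape (3, 4): 3 neurons, 4 weights

try:
    np.dot(demo_samples, demo_weights)  # (3, 4) x (3, 4): inner dimensions 4 and 3 differ
    mismatch = False
except ValueError:
    mismatch = True                     # NumPy rejects the misaligned shapes

product = np.dot(demo_samples, demo_weights.T)   # (3, 4) x (4, 3) -> (3, 3)
print(mismatch, product.shape)
```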


In [10]:
output_l1 = np.dot(samples, weights) + biases   # Output of the first layer.
print(f"{output_l1}\nShape of the first layer's output: {np.shape(output_l1)}")

[[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
Shape of the first layer's output: (3, 3)


### Adding Another Layer

In [11]:
# Second Layer
weights_2 = [
    [0.1, -0.14, 0.5],
    [-0.5, 0.12, -0.33],   # 3 neurons with 3 weights each.
    [-0.44, 0.73, -0.13]
]

biases_2 = [-1.0, 2.0, -0.5]    # 3 neurons with 1 bias each.

In [12]:
weights_2 = np.array(weights_2).T   # Transposing the weights matrix.

output_l2 = np.dot(output_l1, weights_2) + biases_2 # Output of the second layer.
print(f"{output_l2}\nShape of the second layer's output: {np.shape(output_l2)}")

[[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]
Shape of the second layer's output: (3, 3)


##### Layers can be defined as an object with a **forward** method. This method takes in a batch of inputs and returns a batch of outputs.

In [13]:
np.random.seed(42)

X = [
    [1, 2, 3, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]
]

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)  # Small random weights with shape (n_inputs, n_neurons), so no transpose is needed in forward().
        self.biases = np.zeros((1, n_neurons))  # Biases with the shape of (1, n_neurons)
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases

In [14]:
l1 = Layer_Dense(4, 5)
l2 = Layer_Dense(5, 2)

l1.forward(X)

l2.forward(l1.output)

In [15]:
print(f"Output of the first layer: \n{l1.output}\n\nOutput of the second layer: \n{l2.output}")

Output of the first layer: 
[[-0.27675317 -0.09091057  0.36940631 -0.74258198 -0.7854546 ]
 [-0.08384138  0.6059603   0.55190831  0.07959199  0.11448039]
 [-0.24566894  0.37446278  0.16476186 -0.91395312 -0.27462437]]

Output of the second layer: 
[[ 0.07136211  0.01831101]
 [-0.05427832 -0.0786683 ]
 [ 0.07924331 -0.07230373]]


## Implementing the Activation Function

### Implementing the ReLU (Rectified Linear Unit) Activation Function

In [16]:
inputs = [0, 2, -1, 3.3, -2.7, 1.1, 2.2, -100]   # Input data
output = []

for i in inputs:    # ReLU activation function
    if i > 0:
        output.append(i)
    else:
        output.append(0)
print(f"Output of the ReLU activation function: {output}")

Output of the ReLU activation function: [0, 2, 0, 3.3, 0, 1.1, 2.2, 0]


In [17]:
output = []

for i in inputs:
    output.append(max(0, i))    # ReLU activation function using max() function

print(f"Output of the ReLU activation function: {output}")

Output of the ReLU activation function: [0, 2, 0, 3.3, 0, 1.1, 2.2, 0]


In [18]:
class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)
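
##### As a small usage sketch, the class can be applied to a whole batch at once, since `np.maximum` works element-wise. The class is repeated so the example runs on its own; `demo_batch` is a made-up array, not data from the notebook.

```python
import numpy as np

class Activation_ReLU:                  # same definition as in the cell above
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)

# Hypothetical batch of layer outputs: 2 samples, 3 neurons.
demo_batch = np.array([[ 1.5, -0.3,  0.0],
                       [-2.0,  4.2, -0.1]])

relu = Activation_ReLU()
relu.forward(demo_batch)                # negative entries are clipped to 0
print(relu.output)
```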

### Implementing the Softmax Activation Function

In [19]:
layer_outputs = [[4.8, 1.21, 2.385], [8.9, -1.81, 0.2], [1.41, 1.051, 0.026]]
exp_values = []

for i in layer_outputs:
    exp_values.append(np.exp(i))
    
print("Exponential values: {}".format(exp_values))

Exponential values: [array([121.51041752,   3.35348465,  10.85906266]), array([7.33197354e+03, 1.63654137e-01, 1.22140276e+00]), array([4.0959554 , 2.8605102 , 1.02634095])]


#### Normalization

In [20]:
norm_values = []
for row in exp_values:
    norm_values.append(row / sum(row))   # Each sample is normalized by its own sum.
    
print("Normalized values: \n{}".format(np.round(norm_values, 3)))

Normalized values: 
[[0.895 0.025 0.08 ]
 [1.    0.    0.   ]
 [0.513 0.358 0.129]]


#### Softmax consists of two steps: exponentiation, followed by normalization of each sample's values by their sum.

In [21]:
exp_values = np.exp(layer_outputs)
normalized_exp = exp_values / np.sum(exp_values, axis=1, keepdims=True)
print("Softmax normalized values: \n{}".format(normalized_exp.round(3)))    # Rounded values after softmax activation.

Softmax normalized values: 
[[0.895 0.025 0.08 ]
 [1.    0.    0.   ]
 [0.513 0.358 0.129]]


In [24]:
def softmax(inputs):
    exp_values = np.exp(inputs)
    return exp_values / np.sum(exp_values, axis = 1, keepdims = True)

In [25]:
layer_outputs = [
    [4.8, 1.21, 2.385], 
    [8.9, -1.81, 0.2], 
    [1.41, 1.051, 0.026]
]
layer_outputs_softmax = softmax(layer_outputs)
print("Layer outputs after softmax activation: \n{}".format(layer_outputs_softmax.round(3)))

Layer outputs after softmax activation: 
[[0.895 0.025 0.08 ]
 [1.    0.    0.   ]
 [0.513 0.358 0.129]]
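
##### Two properties of the softmax output are easy to check: every row sums to 1, and the largest input in a row receives the largest probability. The sketch below repeats the `softmax` function so it runs on its own; `demo_scores`, `row_sums`, and `predicted` are names introduced here for illustration.

```python
import numpy as np

def softmax(inputs):                     # same function as defined above
    exp_values = np.exp(inputs)
    return exp_values / np.sum(exp_values, axis=1, keepdims=True)

demo_scores = [
    [4.8, 1.21, 2.385],
    [8.9, -1.81, 0.2],
    [1.41, 1.051, 0.026],
]
probs = softmax(demo_scores)

row_sums = probs.sum(axis=1)             # one total per sample
predicted = np.argmax(probs, axis=1)     # index of the most probable class per sample
print(row_sums, predicted)
```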


#### To prevent overflow from large exponents, we can subtract each row's maximum value from its inputs before exponentiating. The output of the softmax function is the same with and without this subtraction, because the common factor cancels out during normalization.

In [26]:
class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)   # Small random weights with shape (n_inputs, n_neurons), so no transpose is needed in forward().
        self.biases = np.zeros((1, n_neurons))      # Biases with the shape of (1, n_neurons)
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases

class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)
        
class Activation_Softmax:
    def forward(self, inputs):
        exp_values = np.exp(inputs - np.max(inputs, axis = 1, keepdims = True))     # Subtracting the maximum value from the inputs to avoid overflow.
        probabilities = exp_values / np.sum(exp_values, axis = 1, keepdims = True)      # Normalizing the values.
        self.output = probabilities
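
##### The effect of subtracting the maximum can be checked numerically. This is a minimal sketch, assuming float64 inputs; `softmax_naive`, `softmax_stable`, and the test inputs are hypothetical names and values introduced here, not part of the original notebook.

```python
import numpy as np

def softmax_naive(x):
    exp_values = np.exp(x)
    return exp_values / np.sum(exp_values, axis=1, keepdims=True)

def softmax_stable(x):
    shifted = x - np.max(x, axis=1, keepdims=True)   # largest entry in each row becomes 0
    exp_values = np.exp(shifted)
    return exp_values / np.sum(exp_values, axis=1, keepdims=True)

small = np.array([[1.0, 2.0, 3.0]])
large = small + 1000.0                               # large enough for exp() to overflow float64

# On moderate inputs both versions agree.
same_on_small = np.allclose(softmax_naive(small), softmax_stable(small))

with np.errstate(over="ignore", invalid="ignore"):   # silence the expected overflow warnings
    naive_large = softmax_naive(large)               # exp overflows to inf, and inf / inf is nan

stable_large = softmax_stable(large)                 # finite, same probabilities as for `small`
print(same_on_small, naive_large, stable_large)
```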