- Artificial neural networks are inspired by the operational structure of brains.

- Each nerve cell in the brain is known as a _neuron_. Neurons in the brain are networked to one another via connections known as _synapses_. Electricity passes through synapses to power these networks of neurons, also known as _neural networks_. (This is a simplification.)

- The kind of network covered here is trained by supervised machine learning.

Below we will look at a _feed-forward_ network with _backpropagation_, the same type we will later be developing. _Feed-forward_ just means a signal (input value) passes through the network in one direction (forward). _Backpropagation_ means we calculate the error in the network's output and distribute corrections for that error back through the network, with the largest corrections going to the neurons most responsible for it.

## Neurons

- A _neuron_ holds a vector of weights. A vector of inputs (floats) is passed to the neuron, which computes the dot product of the inputs and its weights. An _activation function_ then transforms that dot product into the neuron's final output.

- An activation function is almost always non-linear so that neural networks can help solve non-linear problems.

- If there were no activation functions, the entire neural network would just be a series of linear transformations, which collapses into a single linear transformation.
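As a minimal sketch of the points above, the block below computes one neuron's output and then shows that two chained neurons without an activation function behave like a single linear one. The helper `neuron_output` and all weight and input values are made up for illustration; sigmoid is used only as one possible activation function.

```python
from math import exp
from typing import List


def dot_product(xs: List[float], ys: List[float]) -> float:
    # Sum of the pairwise products of the two vectors.
    return sum(x * y for x, y in zip(xs, ys))


def sigmoid(x: float) -> float:
    # One common non-linear activation; squashes any real number into (0, 1).
    return 1.0 / (1.0 + exp(-x))


def neuron_output(weights: List[float], inputs: List[float]) -> float:
    # Hypothetical helper: weighted sum first, then the activation function.
    return sigmoid(dot_product(weights, inputs))


weights = [0.5, -0.25]  # illustrative values, not trained weights
print(dot_product(weights, [2.0, 4.0]))    # 0.0 (pre-activation value)
print(neuron_output(weights, [2.0, 4.0]))  # 0.5 (sigmoid of 0.0)

# Without an activation function, feeding one neuron into another is still
# a single linear map: scaling the result equals scaling the weights.
second_weight = 3.0
chained = second_weight * dot_product(weights, [4.0, 2.0])
collapsed = dot_product([second_weight * w for w in weights], [4.0, 2.0])
print(chained, collapsed)  # 4.5 4.5
```

The last two values match because the second neuron only rescales the first one's weights, which is why a non-linear activation is needed for layers to add expressive power.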

![image.png](attachment:image.png)

## Layers

- Each layer consists of a certain number of neurons. The neurons from each layer send their outputs to be used as inputs to the neurons in the next layer. Every neuron is connected to every neuron in the next layer.

- First layer: _input layer_
- Last layer: _output layer_
- All layers in between the input and output layers: _hidden layers_

![image.png](attachment:image.png)

- The inputs to the input layer could represent the intensity of the pixels of an image, and the outputs of the output layer could represent the probability of the image belonging to each of the possible classifications.
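The layer structure described above can be sketched as a simple forward pass. This is only an illustration under some assumptions: a layer is represented as a list of weight vectors (the `Layer` alias is hypothetical), the input layer is treated as the raw input vector, sigmoid is the activation function, and the weights are arbitrary.

```python
from math import exp
from typing import List

Layer = List[List[float]]  # hypothetical alias: one weight vector per neuron


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + exp(-x))


def layer_outputs(layer: Layer, inputs: List[float]) -> List[float]:
    # Fully connected: every neuron in the layer sees the whole input vector.
    return [sigmoid(sum(w * x for w, x in zip(weights, inputs)))
            for weights in layer]


def feed_forward(layers: List[Layer], inputs: List[float]) -> List[float]:
    # The outputs of one layer become the inputs of the next layer.
    signal = inputs
    for layer in layers:
        signal = layer_outputs(layer, signal)
    return signal


# 2 inputs -> hidden layer of 3 neurons -> output layer of 2 neurons.
hidden: Layer = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]]
output: Layer = [[0.7, -0.8, 0.9], [-0.1, 0.2, -0.3]]
result = feed_forward([hidden, output], [1.0, 0.5])
print(result)  # two values, each strictly between 0 and 1
```

Each neuron in the output layer has three weights because it receives one input from each of the three hidden neurons.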

## Backpropagation

- Backpropagation finds the error in a neural network's output and uses it to adjust the weights of the neurons. The neurons most responsible for the error are adjusted the most.

- The whole process of calculating errors and adjusting weights is known as _training_.

- We must provide the correct outputs for the inputs during the training phase.

Steps:
1) Calculate the error of each output neuron.

2) Apply the derivative of the output neuron's activation function to the value the neuron computed before its activation function was applied (its pre-activation output). This result is multiplied by the neuron's error to find its _delta_. This involves the use of partial derivatives.

3) Calculate the delta for every neuron in the hidden layers, working backward from the output layer. The deltas of one layer are used to calculate the deltas of the preceding layer.

4) Modify each weight of each neuron: multiply the last input that the weight received by the neuron's delta and a small number called the _learning rate_, then add the result to the existing weight to form the new weight. This modification is known as _gradient descent_. It is difficult to determine a good learning rate for an unknown problem without trial and error.

5) Repeat steps 1) to 4) until the network is deemed well trained, for example by testing it against inputs whose correct outputs are known.
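The steps above can be sketched for a tiny network with two inputs, two hidden neurons, and one output neuron. This is a sketch under stated assumptions rather than the implementation developed later: the function `train_step`, the starting weights, and the learning rate are all hypothetical; the error is taken as expected minus actual; and each weight is updated by adding learning rate × the input that weight received × the neuron's delta.

```python
from math import exp
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + exp(-x))


def sigmoid_prime(x: float) -> float:
    # Derivative of sigmoid, written in terms of sigmoid itself.
    s = sigmoid(x)
    return s * (1.0 - s)


def dot(xs: List[float], ys: List[float]) -> float:
    return sum(x * y for x, y in zip(xs, ys))


def train_step(hidden_w: List[List[float]], output_w: List[float],
               inputs: List[float], expected: float,
               learning_rate: float) -> float:
    """Do one forward pass and one in-place weight update.

    Returns the network's output as it was before the update.
    """
    # Forward pass, keeping each neuron's pre-activation value.
    hidden_pre = [dot(w, inputs) for w in hidden_w]
    hidden_out = [sigmoid(z) for z in hidden_pre]
    output_pre = dot(output_w, hidden_out)
    output = sigmoid(output_pre)

    # Steps 1 and 2: error of the output neuron, then its delta.
    output_delta = sigmoid_prime(output_pre) * (expected - output)

    # Step 3: a hidden neuron's delta uses the delta of the next layer,
    # weighted by the connection that carried the hidden neuron's signal.
    hidden_deltas = [sigmoid_prime(z) * output_w[j] * output_delta
                     for j, z in enumerate(hidden_pre)]

    # Step 4: weight += learning_rate * (input seen by the weight) * delta.
    for j in range(len(output_w)):
        output_w[j] += learning_rate * hidden_out[j] * output_delta
    for j, w in enumerate(hidden_w):
        for i in range(len(w)):
            w[i] += learning_rate * inputs[i] * hidden_deltas[j]
    return output


hidden_w = [[0.2, -0.4], [0.7, 0.1]]  # arbitrary starting weights
output_w = [0.5, -0.3]
inputs, expected = [1.0, 0.0], 1.0

# Step 5: repeat; here a fixed number of rounds on a single training example.
outputs = [train_step(hidden_w, output_w, inputs, expected, 0.5)
           for _ in range(200)]
print(outputs[0], outputs[-1])  # the output should move toward 1.0
```

The hidden deltas are computed before any weight is changed, so they are based on the same weights that produced the error.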

![image.png](attachment:image.png)

# Code

## The Dot Product
To go in util.py

```python
from typing import List
from math import exp

# dot product of two vectors
def dot_product(xs: List[float], ys: List[float]) -> float:
    return sum(x * y for x, y in zip(xs, ys))
```
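A quick usage check, with the function restated so the block runs on its own. The second call illustrates a property of `zip` worth knowing: it stops at the shorter sequence, so mismatched vector lengths are not reported as an error.

```python
from typing import List


def dot_product(xs: List[float], ys: List[float]) -> float:
    # Restated from above so this block is self-contained.
    return sum(x * y for x, y in zip(xs, ys))


full = dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(full)  # 32.0, from 1*4 + 2*5 + 3*6

# zip ignores the unmatched third element instead of raising an error.
truncated = dot_product([1.0, 2.0, 3.0], [4.0, 5.0])
print(truncated)  # 14.0, from 1*4 + 2*5
```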

## The Activation Function

- We look for three main properties in an activation function here: it should be non-linear, its output should stay within a known range, and it should have an easily computed derivative for backpropagation.

The sigmoid function, whose output is bounded between 0 and 1, is a popular choice.

![image.png](attachment:image.png)

To go in util.py

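```python
# exp comes from the math import in the dot product cell above
def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + exp(-x))

# derivative of sigmoid, expressed in terms of sigmoid itself
def derivative_sigmoid(x: float) -> float:
    sig: float = sigmoid(x)
    return sig * (1 - sig)
```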
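One way to sanity-check the derivative formula is to compare it with a numerical slope estimate. The sketch below restates the two functions so it runs on its own; `sigmoid_prime` is a local name for the same formula, and `numerical_slope` is a hypothetical helper using a central difference.

```python
from math import exp
from typing import Callable


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + exp(-x))


def sigmoid_prime(x: float) -> float:
    # Same formula as above: sigmoid(x) * (1 - sigmoid(x)).
    sig = sigmoid(x)
    return sig * (1 - sig)


def numerical_slope(f: Callable[[float], float], x: float,
                    h: float = 1e-6) -> float:
    # Central difference approximation of the derivative of f at x.
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-2.0, 0.0, 3.0):
    # The analytic and numerical values should agree to several decimals.
    print(x, sigmoid_prime(x), numerical_slope(sigmoid, x))
```

At `x = 0.0` the analytic derivative is exactly 0.25, which is the steepest point of the sigmoid curve.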