# Biological Neural Networks

Consider how a system of biological neurons functions.

It consists of multiple interconnected neurons that send signals to one another.
A transmitted signal can either excite or inhibit the target neuron.

The magnitude of a neuron's output signal varies with the amount of excitation it receives.

The output is always non-negative, as a neuron is incapable of producing a "negative signal".

# Mathematical Representation
We aim to represent this behaviour as a mathematical function.

Suppose that we model the neuron as having a non-negative output. This mirrors the biological neuron, which either fires a signal or does not; there is no "negative" signal.

We denote the $n$ inputs as $x_i$ for $i = 1, \dots, n$.

A simple way to accumulate the signals at the neuron, where $y$ is the signal received by the neuron, is to sum them:

$$y = \sum_{i=1}^{n} x_i$$

However, this fails to capture the possibly inhibitory nature of an input signal. It also treats all input signals as having an equal effect on the output, which is not the case in a biological neuron.

To account for this behaviour, we attach a weight $w_i$ to each input signal, resulting in

$$y = \sum_{i=1}^{n} x_i w_i$$

Since we allow the weights to be negative, certain input signals can be treated as inhibitory to the neuron.
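The weighted sum can be sketched in a few lines of Python. This is a minimal illustration only; the function name `weighted_sum` and the sample values are assumptions chosen for the example, not part of the model itself.

```python
def weighted_sum(inputs, weights):
    """Return y = sum_i x_i * w_i for paired inputs and weights."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


# Illustrative values (assumed): the third weight is negative,
# so the third input acts as an inhibitory signal.
x = [1.0, 0.5, 2.0]
w = [0.5, 1.0, -1.0]
y = weighted_sum(x, w)
print(y)  # the inhibitory input outweighs the excitatory ones here
```

With these values the strong inhibitory input drives the sum below zero.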

Now, consider the case where there are many inhibitory signals. The resulting sum may be negative, which would violate our model's requirement that the output be non-negative.

Secondly, consider the case where there are many excitatory input signals. The sum may then become unboundedly large, which does not model real-life neurons, as they have a limit to how much output they can produce.

Thus, to model these traits more closely, we incorporate a "squash" function that confines the output to values between 0 and 1. Examples of squash functions include the step function and the sigmoid function.
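The two squash functions named above might be written as follows. This is a sketch under stated assumptions: the step function is taken to switch at 0 (returning 1 for an input of exactly 0), and the sigmoid is taken to be the logistic function $1 / (1 + e^{-v})$; the text does not fix either convention.

```python
import math


def step(v):
    """Step function (assumed convention): 1 if v >= 0, otherwise 0."""
    return 1.0 if v >= 0 else 0.0


def sigmoid(v):
    """Logistic sigmoid 1 / (1 + exp(-v)), with outputs between 0 and 1.

    The two branches avoid calling math.exp on a large positive number,
    which would raise OverflowError.
    """
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


for v in (-5.0, 0.0, 5.0):
    print(v, step(v), sigmoid(v))
```

Both functions map any real input, however negative or large, into the range from 0 to 1; the sigmoid does so smoothly, while the step function jumps.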

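We also add a bias term $b$ to the argument of the squash function $\varphi$. With a step function, for example, the neuron fires only when the weighted sum exceeds $-b$, so the bias sets the threshold for that specific neuron to fire.

$$y = \varphi \left( \sum_{i=1}^{n} x_i w_i + b \right)$$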

This concludes our mathematical model of the neuron.

For ease of notation, for a particular neuron $k$ we write:

$$u_k = \sum ^m_{j=1} w_{kj}x_j$$

$$y_k = \varphi (u_k+b_k)$$

$x_1\dots x_m$: the input signals

$w_{k1}\dots w_{km}$: the synaptic weights of neuron $k$

$u_k$: the linear combiner output due to input signals

$b_k$: the bias

$\varphi (\cdot)$: the activation function

$y_k$: the output signal

$v_k = u_k + b_k$: induced local field
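The definitions above translate almost directly into code. The following is a minimal sketch, assuming a logistic sigmoid for $\varphi$; the function name `neuron_output` and the sample numbers are hypothetical.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def neuron_output(x, w_k, b_k, phi=sigmoid):
    """Compute y_k = phi(u_k + b_k) for a single neuron k."""
    u_k = sum(w_kj * x_j for w_kj, x_j in zip(w_k, x))  # linear combiner output
    v_k = u_k + b_k                                      # induced local field
    return phi(v_k)                                      # output signal


# Illustrative values (assumed): the two weighted inputs cancel,
# so the induced local field is 0.
y_k = neuron_output(x=[1.0, 2.0], w_k=[0.5, -0.25], b_k=0.0)
print(y_k)
```

Passing a different callable as `phi` swaps the activation function without changing the rest of the computation.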


# Mathematical Simplification
Alternatively, we can formulate the model as:

$$v_k = \sum ^m _{j=0} w_{kj}x_j$$

$$y_k = \varphi(v_k)$$

Here we incorporate the bias as part of the input signal, with $x_0 = 1$ and $w_{k0} = b_k$.

This allows us to compute the bias together with the inputs in a single sum, rather than as a separate step.
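A quick check that the two formulations agree can be sketched as below. The function names and sample values are assumptions made for the illustration.

```python
def induced_field_separate(x, w, b):
    """v_k = sum_{j=1..m} w_kj * x_j + b_k, with the bias added separately."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


def induced_field_augmented(x, w, b):
    """v_k = sum_{j=0..m} w_kj * x_j, with x_0 = 1 and w_k0 = b_k."""
    x_aug = [1.0] + list(x)  # prepend the constant input x_0 = 1
    w_aug = [b] + list(w)    # prepend the bias as the weight w_k0
    return sum(w_j * x_j for w_j, x_j in zip(w_aug, x_aug))


# Illustrative values (assumed)
x = [2.0, -1.0]
w = [0.5, 0.25]
b = -0.5

v_separate = induced_field_separate(x, w, b)
v_augmented = induced_field_augmented(x, w, b)
print(v_separate, v_augmented)
```

Both functions return the same induced local field; the augmented form simply treats the bias as one more weight.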


# Network Architectures

## Layered Feedforward Networks

* Nodes are separated into subsets called layers

* There are no connections from layer $i$ to layer $j$ if $i > j$

* Intra-layer connections (connections within the layer) may exist

## Single-Layer Feedforward Networks

Here, "single layer" refers to the output layer; the input layer is not counted because no computation takes place there.

## Multi-layer Feedforward Networks

The input to each layer is the output signal of the preceding layer only.

There are one or more hidden layers between the input and output layers. These contain hidden neurons that perform useful computation before the signal reaches the output layer.
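A forward pass through such a network can be sketched as a loop over layers, where each layer consumes only the previous layer's output. This is an illustrative sketch: the sigmoid activation, the layer sizes, the weight values, and the names `layer_forward` and `network_forward` are all assumptions.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def layer_forward(inputs, weights, biases):
    """Outputs of one layer; weights[k][j] connects input j to neuron k."""
    return [
        sigmoid(sum(w_kj * x_j for w_kj, x_j in zip(w_k, inputs)) + b_k)
        for w_k, b_k in zip(weights, biases)
    ]


def network_forward(x, layers):
    """Feed x through each (weights, biases) pair in order."""
    signal = x
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Assumed example: 2 inputs -> 3 hidden neurons -> 1 output neuron
layers = [
    ([[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]], [0.0, -1.0, 0.5]),  # hidden layer
    ([[1.0, -1.0, 0.5]], [0.0]),                                 # output layer
]
output = network_forward([1.0, 2.0], layers)
print(output)
```

The input layer does not appear in `layers` because it performs no computation; it only supplies `x`.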

## Recurrent Neural Networks

As feedback is common in biological nervous systems, we can have networks in which a neuron's output is fed back as input to a previous layer or even to the same neuron.
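The simplest case, a single neuron whose output is fed back to itself, can be sketched as follows. The sketch assumes discrete time steps with a one-step delay on the feedback, a sigmoid activation, and an initial output of 0; the names `w_in`, `w_fb`, and `run_recurrent_neuron` are hypothetical.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def run_recurrent_neuron(inputs, w_in, w_fb, b, y0=0.0):
    """Return the outputs y(t) = phi(w_in * x(t) + w_fb * y(t-1) + b)."""
    y_prev = y0
    outputs = []
    for x_t in inputs:
        v_t = w_in * x_t + w_fb * y_prev + b  # feedback term uses the last output
        y_prev = sigmoid(v_t)
        outputs.append(y_prev)
    return outputs


# The same input is presented three times (assumed values).
with_feedback = run_recurrent_neuron([1.0, 1.0, 1.0], w_in=1.0, w_fb=1.0, b=0.0)
without_feedback = run_recurrent_neuron([1.0, 1.0, 1.0], w_in=1.0, w_fb=0.0, b=0.0)
print(with_feedback)
print(without_feedback)
```

With feedback, the response to an identical input changes over time because it depends on the neuron's own earlier output; without feedback, every step produces the same value.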

# Learning
A neural network acquires knowledge through a **learning process**, and this knowledge is stored in its **synaptic weights**.
## Supervised Learning
A "teacher" provides the desired output for a given input, and the neural network adjusts its weights according to the error signal (the difference between the desired and actual outputs).
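One way to make this concrete is the perceptron learning rule, used here purely as an illustration and not necessarily the rule these notes have in mind. The sketch assumes a single neuron with a step activation, a learning rate `eta`, and the logical AND function as the teacher's desired output; all names and values are assumptions.

```python
def step(v):
    """Step activation (assumed convention): 1 if v >= 0, otherwise 0."""
    return 1.0 if v >= 0 else 0.0


def predict(x, w, b):
    """Output of a single step-activated neuron."""
    return step(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)


def train(samples, n_inputs, eta=1.0, epochs=20):
    """Adjust weights and bias using the error signal e = d - y."""
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for x, d in samples:
            e = d - predict(x, w, b)  # error signal from the "teacher"
            w = [w_j + eta * e * x_j for w_j, x_j in zip(w, x)]
            b = b + eta * e
    return w, b


# The teacher supplies the desired output d for each input x (logical AND).
and_samples = [
    ([0.0, 0.0], 0.0),
    ([0.0, 1.0], 0.0),
    ([1.0, 0.0], 0.0),
    ([1.0, 1.0], 1.0),
]
w, b = train(and_samples, n_inputs=2)
for x, d in and_samples:
    print(x, d, predict(x, w, b))
```

Whenever the output matches the desired value the error is zero and the weights are left unchanged, so learning stops once every sample is classified correctly.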
## Reinforcement Learning
The network learns by taking various actions, and a learning system rewards or penalizes the network based on those actions. There is no explicit error signal.
## Unsupervised/Self-organized Learning
Weights are adjusted based purely on the input signals, without a teacher or a reward.
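As one hedged illustration, a Hebbian-style update strengthens a weight whenever its input and the neuron's output are active together. The rule is an assumption chosen for this example and is not prescribed by these notes; the linear neuron, the learning rate `eta`, and the function name `hebbian_step` are likewise hypothetical.

```python
def hebbian_step(w, x, eta=0.1):
    """Return updated weights w_j + eta * y * x_j for a linear neuron.

    No desired output or reward is used: the change depends only on the
    input x and the neuron's own response y to that input.
    """
    y = sum(w_j * x_j for w_j, x_j in zip(w, x))  # linear neuron output
    return [w_j + eta * y * x_j for w_j, x_j in zip(w, x)]


# Illustrative values (assumed)
w = [1.0, 0.0]
x = [1.0, 1.0]
w_new = hebbian_step(w, x, eta=0.5)
print(w_new)
```

Repeated application of this plain rule lets the weights grow without bound, so practical variants add some form of normalization or decay.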


# Benefits of Neural Networks

## High computational power

1. Generalization: the network can produce reasonable outputs for inputs never encountered during training
2. Massively parallel distributed structure, which allows for parallel computation

## Properties

1. Nonlinearity
2. Adaptivity (plasticity): The network can adapt its weights to changes in the environment
3. Fault tolerance: If a neuron/synaptic link is damaged, the overall response may be unaffected because the information is stored in a distributed manner.