# Biological Fundamentals

#### Leading points
- Scientific studies estimate the number of neurons in an adult human brain at roughly 86 billion, a figure often rounded up to 100 billion.
- These neurons are densely interconnected.
- Information flows between the neurons across these connections, which goes some way to explaining human capabilities such as walking, reading, typing, understanding, questioning and so on.
- These connections also govern communication, emotion, creativity etc.

This leads to defining a neural network as `a network of neurons that exchange information`. 

# Basic components of a Neuron

![neuron](https://www.simplilearn.com/ice9/free_resources_article_thumb/diagram-of-a-biological-neuron.jpg)

- Dendrites
    - receive data from other neurons.
- Cell body 
    - processes the data received. Information travels through the neuron as electrical signals. The junction where a signal passes from the Axon terminals of one neuron to the Dendrites of the next is called a `synapse`. At the `synapse`, biological chemicals (neurotransmitters) are released and reach the Dendrites, where they increase or decrease the electrical potential of the cell body.
    - The electrical potential of the cell body determines whether the neuron fires, which is what leads to decision making.
    - We can therefore say that new connections (and new learning) are formed from these potentials.
- Axon
    - transmits signals to other neurons using the Axon terminals.
- Axon terminals
    - pass the signal across the synapse to the Dendrites of other neurons.

# The Artificial Neuron

The artificial neuron mimics the biological structure. We have equivalents of the `Dendrites`, `Cell body` & `Axon terminals` in the artificial setting.

- An artificial neuron can have any number of inputs and outputs.
- The inputs are information or data from the environment.
- The outputs are the final response of the perceptron, such as a decision or prediction.

#### Example consideration

In order to predict a person's salary we might reasonably expect it to be based on two key attributes:
- age
- educational background

The perceptron receives the age as an input to the equivalent of the Dendrites, typically represented as a number. A second number represents the years of study, or some other numerical indicator of depth of education. These inputs are processed, and the output is also a number that indicates/predicts the salary of the profile based on the inputs.

There is a `black box` around the `Cell body` of a neuron because it is not that easy to interpret what happens during this process. 

**Important disclaimers**
- It is not truly known how the human brain works, but there are significant insights that inform the ideas on which all artificial neural network technology is based.
- Artificial neural networks are merely an abstraction of what is known/accepted in this field of study.
- They are nothing more than a simulation of a brain, or thought process.

We can depict the artificial neuron as follows:

![neuron](https://www.researchgate.net/profile/Mike_Riley/publication/299490278/figure/fig1/AS:626481235517442@1526376174991/Artificial-Neuron-Structure.png)

#### Key takeaways
- The number of inputs is not fixed.
- Each input is weighted; the weight dictates the importance/credence of that input.
- We then have the `sum function` and the `activation function`, which together make up the `black box` described above.
- The sum function computes $sum = \sum\limits_{i=1}^n x_i \cdot w_i$
- In an example with 4 inputs, what is passed to the sum function would be: $x_1 \cdot w_1 + x_2 \cdot w_2 + x_3 \cdot w_3 + x_4 \cdot w_4$

# The Perceptron

The Perceptron is the combination of the inputs & weights passed to the sum function and the activation function. Above we saw that the sum function has the job of taking each input, multiplying it by the associated weight for that input, and adding the result to the results of the other $input \cdot weight$ calculations.

## Example case 1
If we take a two-input example case for age and education, we could have the following attributes:
- age: input=35, weight=0.8
- education: input=25, weight=0.1

#### First simplification
Substituting the values, the sum function is now:
- $sum = (35 \cdot 0.8) + (25 \cdot 0.1)$

#### Second simplification
Evaluating each product gives:
- $sum = 28 + 2.5 = 30.5$

#### Summary
We can now apply the activation function. This indicates whether a neuron has `fired or not` (`activated or not`). In the biological neuron the synapse changes the electrical potential of the cell. The artificial neuron has no electrical signal, so in the simplest terms we represent this with a `step function` that makes a simple fork decision:
- Sum greater than or equal to 1: output 1 (neuron activated)
- Otherwise: output 0

In the sample above, writing the step function as $f$, we have:
$f(35 \cdot 0.8 + 25 \cdot 0.1) = f(30.5) = 1$

In this simple example the firing acts as a decision, and in our example analysis we can say that a:
- `1` indicates the person might receive a salary increase.
- `0` indicates not.

You can see that in this example our step/activation function is trivial; in real cases we will have a more complex set of decision forks.

## Example case 2
For example case 2 we will use the same two-input perceptron structure, re-using the age & education inputs and changing only the weights. We now have the following attributes:
- age: input=35, weight=-0.8
- education: input=25, weight=0.1

#### Simplification 1
$sum = (35 \cdot -0.8) + (25 \cdot 0.1) = -28 + 2.5 = -25.5$

#### Summary
Under the same `step function` decision fork, the neuron's negative sum maps to zero (unfired), which in our case indicates that the person may not receive a salary increase under the current scoring.

#### Conclusion

Our step function has only two possible outputs, zero and one. Every sum below the threshold maps to zero and every sum at or above it maps to one, so how far a value lies above or below the cut-off does not matter. Depending on the application we can define the step function thresholds in order to create categorised returns.
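The idea of choosing a threshold can be sketched as a step function with a configurable cut-off. The name `step_with_threshold` and its `threshold` parameter are illustrative assumptions rather than part of this lesson's code. The default of 1 matches the fork used in the two example cases.

```python
def step_with_threshold(total, threshold=1.0):
    """Return 1 when the weighted sum reaches the threshold, otherwise 0."""
    return 1 if total >= threshold else 0


# Sums taken from example case 1 (30.5) and example case 2 (-25.5)
print(step_with_threshold(30.5))                 # 1
print(step_with_threshold(-25.5))                # 0

# A hypothetical stricter cut-off puts the same sum in the other category
print(step_with_threshold(30.5, threshold=50))   # 0
```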

#### Complementary theoretical definitions of a Perceptron
- Positive weight indicates an excitatory synapse (electrical increase of the cell body, or greater likelihood of activation)
- Negative weight indicates an inhibitory synapse, lessening the chances of activation.
- Weights are considered synapses
- Weights amplify or reduce the input signal. _(see differentiation between value in `ex1 & ex2` purely based on weighting)_
- The knowledge of a neural network _is_ the weights. _(The goal of a neural net is to learn the best set of weights that fits a given dataset)_


# The Single Layer Perceptron - Version 1

In accordance with the lessons above we need to define and implement the step and sum functions.

#### Functions

In [1]:
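# sum function.
# Note: this name shadows Python's builtin sum(); it is kept
# because the cells below call it by this name.
def sum(inputs, weights):
    # check that both parameters have the same length
    if len(inputs) == len(weights):
        s = 0
        # iterate over every input, however many there are
        for i in range(len(inputs)):
            s += inputs[i] * weights[i]
        return s
    else:
        print(f"ERROR: inputs length={len(inputs)} : weights length={len(weights)}")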

In [2]:
# step function
def step_function(sum):
    if sum >= 1 : return 1
    return 0

#### Execution example 1

In [3]:
# create a list of the input scores for 
# age, education respectively 

inputs = [35,25]

# create a list of the weightings to 
# apply to each input 

weights = [0.8, 0.1]

# call the step function passing in the result
# of the sum function call that takes in the 
# lists for inputs and weights. 

step_function(sum(inputs, weights))

1

In our example here we return a `1`. This means the neuron `is fired`. In the definition of our example the employee would have qualified for a salary increase based on the decision forks implemented and the parameters passed.

#### Execution example 2 

In [4]:
# Keep same params for age, education.
inputs = [35,25]

# change the weights to a negative on age
weights = [-0.8, 0.1]

# call the step with updated weights.
step_function(sum(inputs, weights))

0

In this example we are returning a `0`. The neuron is `not fired`. In this case the employee would not qualify for a salary increase based on the decision forks and parameters passed.

# The Single Layer Perceptron - Version 2

In version 1 we use a standard Python loop and standard Python lists. For a trivial example like this one that would be fine, but neural nets will often run on high volumes of data, so performance is a key aspect to keep in mind at every stage; otherwise our solution may not be credible or usable in a production setting. We are going to rewrite the single layer example using NumPy, taking optimisations into account.

In [5]:
import numpy as np

In [6]:
def sum(inputs, weights):
    # use the np builtin on ndarrays
    # to return the dot product (sum of input * weight)
    return inputs.dot(weights)

In [7]:
def step_function(sum):
    if sum >= 1: return 1
    return 0

In [8]:
# check the positive weights v2 edition
# creating inputs and weights as np arrays
inputs = np.array([35,25])
weights = np.array([0.8, 0.1])

# execute v2 
step_function(sum(inputs, weights))

1

In [9]:
# check the negative weights v2 edition
# creating inputs and weights as np arrays
inputs = np.array([35,25])
weights = np.array([-0.8, 0.1])

# execute v2 
step_function(sum(inputs, weights))

0
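Part of the benefit of NumPy is that many samples can be scored in one call rather than one at a time. The sketch below is one way that could look, not the lesson's own code: `step_function_batch` is a hypothetical helper, the second and third sample rows are made up, and `np.where` is used to apply the same `>= 1` rule element-wise.

```python
import numpy as np


def step_function_batch(sums):
    # Apply the same ">= 1 fires" rule to every element at once
    return np.where(sums >= 1, 1, 0)


# Each row is one (age, education) sample; rows two and three are made up
samples = np.array([[35, 25],
                    [0, 5],
                    [2, 5]])
weights = np.array([0.8, 0.1])

sums = samples.dot(weights)           # one weighted sum per row
print(step_function_batch(sums))      # [1 0 1]
```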

# Updating Weights

As was stated earlier, the goal of a neural network is to determine the best set of weights to apply in order to classify some data. 

As with the age and education example above, the neural network needs to find the best weights. To make it easier we will use the `binary and` operator, whose truth table gives us the inputs and the expected class:

| input value 1 | input value 2 | Class |
| -- |--- |-------|
| 0  | 0  | 0 |
| 0  | 1  | 0 |
| 1  | 0  | 0 |
| 1  | 1  | 1 |

To process this table with our Perceptron we need to feed it each row of the table. With both weights initialised to 0, each calculation is $(input_1 \cdot weight_1) + (input_2 \cdot weight_2)$:
- $(0 \cdot 0) + (0 \cdot 0) = 0$ - This outcome matches the result table 
- $(0 \cdot 0) + (1 \cdot 0) = 0$ - This outcome matches the result table 
- $(1 \cdot 0) + (0 \cdot 0) = 0$ - This outcome matches the result table 
- $(1 \cdot 0) + (1 \cdot 0) = 0$ - This outcome does not match the result table and we have an error. 

Simple division shows that our model for this example is currently 75% correct because we achieved 3 of the 4 expected results, but the correct way to calculate a model's accuracy is to tabulate the expected and actual values along with the error:

| class | prediction | error |
| -- |--- |-------|
| 0  | 0  | 0 |
| 0  | 0  | 0 |
| 0  | 0  | 0 |
| 1  | 0  | 1 |

#### Error calculation formula
- error = correct - prediction
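The error table and the 75% figure can be reproduced with a few lines of NumPy. This is a minimal sketch under the assumptions already stated in the text (both weights at zero, step threshold of 1); the variable names are illustrative.

```python
import numpy as np

and_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
and_classes = np.array([0, 0, 0, 1])     # expected output of binary and
zero_weights = np.array([0.0, 0.0])      # initial weights, all zero

# Weighted sum per row, followed by the ">= 1" step rule
predictions = np.where(and_inputs.dot(zero_weights) >= 1, 1, 0)

errors = and_classes - predictions       # error = correct - prediction
accuracy = np.mean(predictions == and_classes)

print(errors)      # [0 0 0 1]
print(accuracy)    # 0.75
```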

#### Reducing the error
Our goal is to reduce the errors and achieve a higher accuracy for our Perceptron. We update our model by changing the weights using the formula:
- weight(n + 1) = weight(n) + (learning_rate * input * error)

The `learning rate` of a neural network is typically a small fixed value, e.g. 0.1, 0.01 or 0.001. This parameter controls how fast the network learns, that is, how much the weights change on each update.

# The Learning Rate

Keeping the same example going, we will use the sample from our previous test that was wrong.
- $(1 \cdot 0) + (1 \cdot 0) = 0$

To update the weights in line with the learning rate `(0.1)` we need to apply the formula to the weight of each input in the sample, $x_1$ and $x_2$.
- $x_1$ weight$(n+1) = 0 + (0.1 \cdot 1 \cdot 1)$
- $x_1$ weight$(n+1) = 0.1$
- $x_2$ weight$(n+1) = 0 + (0.1 \cdot 1 \cdot 1)$
- $x_2$ weight$(n+1) = 0.1$
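The same update can be written as a small function that applies the formula to every weight at once. `update_weights` is a hypothetical helper introduced for illustration; it is a direct transcription of weight(n + 1) = weight(n) + (learning_rate * input * error).

```python
import numpy as np


def update_weights(weights, sample, error, learning_rate=0.1):
    # weight(n + 1) = weight(n) + (learning_rate * input * error), per input
    return weights + learning_rate * sample * error


weights = np.array([0.0, 0.0])
sample = np.array([1, 1])     # the misclassified row of the binary and table
error = 1 - 0                 # correct - prediction

print(update_weights(weights, sample, error))    # [0.1 0.1]
```

Because the input is a factor in the update, a weight whose input is 0 is left unchanged by that sample.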

**Important note:** the weights are shared by _all_ samples, not just the samples that are in error, so the update above changes the calculation applied to all 4 samples in our dataset.



Both weights are now `0.1`. They are applied to every sample, and each new result is passed to the step function.
```python
def step_function(sum):
    if sum >= 1: return 1
    return 0
```

- $0 \cdot 0.1 + 0 \cdot 0.1$  or $0 + 0 = 0$ _(<1 = 0)_

- $0 \cdot 0.1 + 1 \cdot 0.1$  or $0 + 0.1 = 0.1$ _(<1 = 0)_

- $1 \cdot 0.1 + 0 \cdot 0.1$  or $0.1 + 0 = 0.1$ _(<1 = 0)_

- $1 \cdot 0.1 + 1 \cdot 0.1$  or $0.1 + 0.1 = 0.2$ _(<1 = 0)_


### What is an epoch?
We have incremented the weights but still get the same result, and so still have 75% accuracy. Each full pass over the dataset, together with its weight updates, is called an **epoch**.
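To make the notion of an epoch concrete, the sketch below counts how many full passes over the `binary and` table are needed before a pass produces no errors. It is an illustration under stated assumptions: `epochs_to_converge` and its `max_epochs` guard are hypothetical, the weights start at zero, and the error is the signed value correct - prediction.

```python
import numpy as np

AND_INPUTS = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
AND_CLASSES = np.array([0, 0, 0, 1])


def epochs_to_converge(learning_rate, max_epochs=100):
    """Count full passes over the table until one pass has no errors."""
    weights = np.array([0.0, 0.0])
    for epoch in range(1, max_epochs + 1):
        total_error = 0
        for sample, target in zip(AND_INPUTS, AND_CLASSES):
            prediction = 1 if sample.dot(weights) >= 1 else 0
            error = target - prediction             # correct - prediction
            total_error += abs(error)
            weights = weights + learning_rate * sample * error
        if total_error == 0:
            return epoch, weights
    return None, weights                             # no clean pass in time


print(epochs_to_converge(0.1)[0])    # 6: five passes with updates, one clean pass
print(epochs_to_converge(0.5)[0])    # 2
```

A larger learning rate reaches a working set of weights in fewer epochs here, although that is a property of this tiny table and not a general rule.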

### Jumping forward to _n_th epoch
Let's jump forward to after **5 epochs** for our example. 
- $0 \cdot 0.5 + 0 \cdot 0.5$  or $0 + 0 = 0$ _(<1 = 0)_

- $0 \cdot 0.5 + 1 \cdot 0.5$  or $0 + 0.5 = 0.5$ _(<1 = 0)_

- $1 \cdot 0.5 + 0 \cdot 0.5$  or $0.5 + 0 = 0.5$ _(<1 = 0)_

- $1 \cdot 0.5 + 1 \cdot 0.5$  or $0.5 + 0.5 = 1.0$ _(1 = 1) the desired answer_

### Updating the Error table 

| class | prediction | error |
| -- |--- |-------|
| 0  | 0  | 0 |
| 0  | 0  | 0 |
| 0  | 0  | 0 |
| 1  | 1  | 0 |


#### Summary notes

We can say that the intention of the Perceptron is to find weights that are shared between all instances in the dataset in order to correctly classify all instances (_or most instances_). Given we have a tiny dataset in this example we have found a way to get to 100% correct classification but that is not always possible and the goal is to find the best weights where perfect ones cannot be identified.  

_As an aside... It's probably worth noting the scale of the problem here, we have a simple Perceptron with two inputs, which we have admitted is tiny. In a commercial application setting the size of an input selection will be significantly larger. For example imagine we are doing image classification on an image of `800 x 600` pixels. Each pixel would be an input and this means for that one image we have `480,000` inputs alone. Now we are applying the training weight shifts across all of those._
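The scale mentioned in the aside can be checked by flattening an image of that size into a single input vector. The all-zero greyscale image below is a placeholder assumption; a colour image would carry several values per pixel.

```python
import numpy as np

# A hypothetical greyscale image: 600 rows by 800 columns, one value per pixel
image = np.zeros((600, 800))

pixel_inputs = image.reshape(-1)                    # one input per pixel
pixel_weights = np.zeros(pixel_inputs.shape[0])     # one weight per input

print(pixel_inputs.shape)     # (480000,)
print(pixel_weights.shape)    # (480000,)
```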


# Example 1

# Implementing the Learning Rate with `binary and` operator

In this section we will implement the weight adjustments driven by the learning rate.

In [10]:
import numpy as np

In [11]:
# for the inputs we will use a matrix format instead of a vector format. 
inputs = np.array([[0,0], [0,1], [1,0], [1,1]])

In [12]:
# check the shape of the inputs 
inputs.shape

(4, 2)

In [13]:
outputs = np.array([0,0,0,1])

In [14]:
# check outputs shape 
outputs.shape

(4,)

In [15]:
# define the weights. Given we have a two input perceptron
# we need a two-weight vector. 
weights = np.array([0.0, 0.0])

In [16]:
# define the learning rate, ie the parameter that dictates 
# the learning speed increments. 
learning_rate = 0.1 

In [17]:
# create the step function to determine
# whether the artificial neuron fires or not.
def step_function(sum):
    if sum >= 1: return 1
    return 0

In [18]:
# the calculate output function: weighted sum, then step
def calculate_output(instance):
    s = instance.dot(weights)
    return step_function(s)

In [19]:
calculate_output(np.array([0,0]))

0

In [20]:
# create the training steps (weight updates)
def train():
    total_error = 1
    while total_error != 0:
        total_error = 0
        for i in range(len(outputs)):
            prediction = calculate_output(inputs[i])
            error = abs(outputs[i] - prediction)
            total_error += error
            
            if error > 0:
                for j in range(len(weights)):
                    weights[j] = weights[j] + (learning_rate * inputs[i][j] * error)
                    print("Weighting updated: " + str(weights[j]))
        print('Total error: ' + str(total_error))


In [21]:
train()

Weighting updated: 0.1
Weighting updated: 0.1
Total error: 1
Weighting updated: 0.2
Weighting updated: 0.2
Total error: 1
Weighting updated: 0.30000000000000004
Weighting updated: 0.30000000000000004
Total error: 1
Weighting updated: 0.4
Weighting updated: 0.4
Total error: 1
Weighting updated: 0.5
Weighting updated: 0.5
Total error: 1
Total error: 0


In [22]:
weights

array([0.5, 0.5])

In [23]:
calculate_output(np.array([0,0]))

0

In [24]:
calculate_output(np.array([0,1]))

0

In [25]:
calculate_output(np.array([1,0]))

0

In [26]:
calculate_output(np.array([1,1]))

1

# Example 2

# Implementing the Learning Rate with `binary or` operator

In [27]:
import numpy as np

In [28]:
# for the inputs we will again use a matrix 
# format instead of a vector format. 
inputs = np.array([[0,0], [0,1], [1,0], [1,1]])

In [29]:
# we are using a binary or operator in this example
# that means the check will return a positive if
# either input of the perceptron is positive
outputs = np.array([0,1,1,1])

In [30]:
# Weights initialised as zeros once more
weights = np.array([0.0, 0.0])

In [31]:
# learning rate
learning_rate = 0.1 

In [32]:
def step_function(sum):
    if sum >= 1: return 1
    return 0

In [33]:
# the calculate output functions (same as for binary and)
def calculate_output(instance):
    s = instance.dot(weights)
    return step_function(s)

In [34]:
# create the training steps (weight updates, same as for binary and)
def train():
    total_error = 1
    while total_error != 0:
        total_error = 0
        for i in range(len(outputs)):
            prediction = calculate_output(inputs[i])
            error = abs(outputs[i] - prediction)
            total_error += error
            
            if error > 0:
                for j in range(len(weights)):
                    weights[j] = weights[j] + (learning_rate * inputs[i][j] * error)
                    print("Weighting updated: " + str(weights[j]))
        print('Total error: ' + str(total_error))

In [35]:
train()

Weighting updated: 0.0
Weighting updated: 0.1
Weighting updated: 0.1
Weighting updated: 0.1
Weighting updated: 0.2
Weighting updated: 0.2
Total error: 3
Weighting updated: 0.2
Weighting updated: 0.30000000000000004
Weighting updated: 0.30000000000000004
Weighting updated: 0.30000000000000004
Weighting updated: 0.4
Weighting updated: 0.4
Total error: 3
Weighting updated: 0.4
Weighting updated: 0.5
Weighting updated: 0.5
Weighting updated: 0.5
Total error: 2
Weighting updated: 0.5
Weighting updated: 0.6
Weighting updated: 0.6
Weighting updated: 0.6
Total error: 2
Weighting updated: 0.6
Weighting updated: 0.7
Weighting updated: 0.7
Weighting updated: 0.7
Total error: 2
Weighting updated: 0.7
Weighting updated: 0.7999999999999999
Weighting updated: 0.7999999999999999
Weighting updated: 0.7999999999999999
Total error: 2
Weighting updated: 0.7999999999999999
Weighting updated: 0.8999999999999999
Weighting updated: 0.8999999999999999
Weighting updated: 0.8999999999999999
Total error: 2
Weight

In [36]:
# check the weights, post training 
weights

array([1.1, 1.1])

In [37]:
calculate_output(np.array([0,0]))

0

In [38]:
calculate_output(np.array([0,1]))

1

In [39]:
calculate_output(np.array([1,0]))

1

In [40]:
calculate_output(np.array([1,1]))

1

### Example results
Taking our weights we can see:
- inputs (0,0) = $(0 \cdot 1.1) + (0 \cdot 1.1) = (0 + 0) = 0$
- inputs (0,1) = $(0 \cdot 1.1) + (1 \cdot 1.1) = (0 + 1.1) = (1.1 \geq 1) = 1$
- inputs (1,0) = $(1 \cdot 1.1) + (0 \cdot 1.1) = (1.1 + 0) = (1.1 \geq 1) = 1$
- inputs (1,1) = $(1 \cdot 1.1) + (1 \cdot 1.1) = (1.1 + 1.1) = (2.2 \geq 1) = 1$


#### Summary notes 
Although this is a very simple, trivial neural network, we have already managed to make some interesting classifications. We can highlight that different datasets can require different weights: comparing the `binary and` and `binary or` examples, we see weights of `0.5` and `1.1` respectively. We can also highlight that in our examples the weights for inputs 1 and 2 match. This is not typical of real-world (more complex) datasets.
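One way to see that each dataset needs its own weights is to score both learned weight vectors against both truth tables. The helper `accuracy` is a hypothetical name, and the weights are the rounded values reported in examples 1 and 2.

```python
import numpy as np

table = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
and_classes = np.array([0, 0, 0, 1])
or_classes = np.array([0, 1, 1, 1])


def accuracy(weights, classes):
    # Fraction of rows where the ">= 1" step rule matches the expected class
    predictions = np.where(table.dot(weights) >= 1, 1, 0)
    return np.mean(predictions == classes)


and_weights = np.array([0.5, 0.5])    # learned in example 1
or_weights = np.array([1.1, 1.1])     # learned in example 2

print(accuracy(and_weights, and_classes))   # 1.0
print(accuracy(and_weights, or_classes))    # 0.5
print(accuracy(or_weights, or_classes))     # 1.0
print(accuracy(or_weights, and_classes))    # 0.5
```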

# Example 3

# Implementing Learning Rates with the `binary xor` operator

In [41]:
import numpy as np 

inputs = np.array([[0,0], [0,1], [1,0], [1,1]])

# we are using a binary xor operator in this example
# that means the check will return a positive if
# either input of the perceptron is positive BUT NOT
# if both are positive
outputs = np.array([0,1,1,0])

# Weights initialised as zeros once more
weights = np.array([0.0, 0.0])

# learning rate
learning_rate = 0.1 

In [42]:
def step_function(sum):
    if sum >= 1: return 1
    return 0

In [43]:
# the calculate output functions (same as for binary and)
def calculate_output(instance):
    s = instance.dot(weights)
    return step_function(s)

In [44]:
# Note the epoch limit: reusing the training steps from examples 1 & 2
# unchanged (same weight updates as for binary and and binary or) would
# set off an infinite loop here, because the total error never reaches
# zero. Left running, the ever-growing output can exhaust memory.
def train(max_epochs=20):
    total_error = 1
    epoch = 0
    while total_error != 0 and epoch < max_epochs:
        epoch += 1
        total_error = 0
        for i in range(len(outputs)):
            prediction = calculate_output(inputs[i])
            error = abs(outputs[i] - prediction)
            total_error += error
            
            if error > 0:
                for j in range(len(weights)):
                    weights[j] = weights[j] + (learning_rate * inputs[i][j] * error)
                    print("Weighting updated: " + str(weights[j]))
        print('Total error: ' + str(total_error))

The training loop above can never reach zero error for `binary xor`. The reason is that `binary and` and `binary or` are **_linearly separable_** problems, whereas `binary xor` is not: it is a **_non-linearly separable_** problem. A problem is linearly separable when a single straight line (or, with more inputs, a plane) can divide the two classes. This is a drawback of the single layer perceptron neural network: it is capable of working on linearly separable problems only.
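The separability claim can be probed numerically by trying every weight pair on a coarse grid and checking whether any of them classifies the whole table. This is only an illustrative search, with a hypothetical helper `separable_on_grid` and an arbitrary grid from -2 to 2. For `binary xor` the outcome also follows directly from the step rule: it would need $w_1 \geq 1$, $w_2 \geq 1$ and $w_1 + w_2 < 1$ at the same time.

```python
import itertools
import numpy as np

table = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])


def separable_on_grid(classes):
    """Return True if some weight pair on the grid classifies every row."""
    grid = np.linspace(-2, 2, 41)                   # steps of 0.1
    for w1, w2 in itertools.product(grid, repeat=2):
        predictions = np.where(table.dot(np.array([w1, w2])) >= 1, 1, 0)
        if np.array_equal(predictions, classes):
            return True
    return False


print(separable_on_grid(np.array([0, 0, 0, 1])))    # binary and -> True
print(separable_on_grid(np.array([0, 1, 1, 1])))    # binary or  -> True
print(separable_on_grid(np.array([0, 1, 1, 0])))    # binary xor -> False
```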

In order to solve the non-linearly separable `binary xor` problem we need a more complex network: a `multi-layer perceptron`, that is, a perceptron with several layers. This will be the subject of the next lesson.
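As a preview of why extra layers help, the sketch below wires two hidden units and one output unit by hand so that together they compute `binary xor`. Everything here is an assumption made for illustration: the weights are hand-picked and not learned, the hidden units reuse the `binary or` and `binary and` weights found earlier, and the next lesson may build its network differently.

```python
import numpy as np


def step_function(total):
    return 1 if total >= 1 else 0


def xor_two_layer(x1, x2):
    # Hidden layer: reuse the weights found for binary or and binary and
    h_or = step_function(np.dot([x1, x2], [1.1, 1.1]))
    h_and = step_function(np.dot([x1, x2], [0.5, 0.5]))
    # Output layer: fire when "or" is on and "and" is off (weights 1 and -1)
    return step_function(np.dot([h_or, h_and], [1.0, -1.0]))


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_two_layer(x1, x2))
```

The hidden layer turns the inputs into two new features, and in that feature space a single step unit is enough to separate the classes.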

# Additional reading resources

1. ["A logical calculus of the ideas immanent in nervous activity"](https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCulloch.and.Pitts.pdf)
