## What is a Neural Network?

A neural network is a simplified computational model inspired by the way the human brain processes information. While not biologically precise, it mimics key learning mechanisms observed in neural systems. The brain consists of a vast network of neurons connected by synapses, which vary in **strength**. Stronger connections, formed through **repeated activation**, facilitate faster and more efficient signal transmission. For instance, touching a hot pan triggers a learned response via a well-established neural pathway that prompts immediate withdrawal. Similarly, neural networks in machine learning (ML) strengthen certain connections, represented by weights, through repeated exposure to training data, **favoring pathways that lead to more accurate predictions**.

This is a highly simplified explanation, but hopefully it helps you understand the basic concept.

## What is a Neuron?

Now let’s see how exactly a single neuron operates.
<center><img src="img/neural_network_2a.png" alt="Neural Network Basic Layers" width="395" height="390" /></center>
<p style="text-align: center; font-size: small;"><i><b>Figure 1.</b> How a single neuron operates</i></p>

$$ H = x_1 \cdot w_1 + x_2 \cdot w_2 + x_3 \cdot w_3 + b $$

The formula above describes what happens in a **single node** of the network:

* $x_i$ are the **input values**.
* $w_i$ are the **weights** that express the importance of each input value $x_i$ for the output.
* $b$ is the **bias**, a term that is constant with respect to the inputs. It is essentially a weight without an input term, and it gives the node an **extra degree of adjustability** that does not depend on the input values.
* $H$ is the **output value** of the node (the weighted sum of the inputs plus the bias).

Plugging example values into the formula, the output value of this node is:

$$ (0.8 \times 0.3) + (0.1 \times 0.5) + (0.3 \times 0.6) + 0.1 = 0.24 + 0.05 + 0.18 + 0.1 = 0.57$$
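
The same arithmetic can be checked with NumPy. This is only a small sketch: treating $0.8$, $0.1$, $0.3$ as the inputs and $0.3$, $0.5$, $0.6$ as the weights is an assumption (the result is identical if the roles are swapped), and the variable names are chosen here for illustration.

```python
import numpy as np

# Assumed assignment of the example numbers:
# the first factor of each product is an input, the second is a weight.
x = np.array([0.8, 0.1, 0.3])  # inputs x_1..x_3
w = np.array([0.3, 0.5, 0.6])  # weights w_1..w_3
b = 0.1                        # bias

# Weighted sum of the inputs plus the bias
H = np.dot(x, w) + b
print(round(H, 2))  # 0.57
```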

## Perceptron
A Perceptron is a basic, **single-layer** neural network used for binary classification. 

* **Structure**: Consists of a single layer of neurons (or perceptrons). 
* **Function**: Can only learn linear decision boundaries, meaning it can only classify data that can be separated by a straight line (or, with more than two inputs, a hyperplane).
* **Training**: Uses a simple learning algorithm (Perceptron learning algorithm) that adjusts weights based on errors in classification. 
* **Limitations**: Cannot solve non-linear problems like the XOR problem. 
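
To make the XOR limitation concrete, the sketch below brute-forces a coarse grid of candidate weights and biases and counts how many of them classify a gate correctly. The grid range, the $0.5$ cut-off and the helper name `count_linear_solutions` are assumptions made for this illustration. A grid search is not a proof, but it shows the pattern: some lines separate AND, none separate XOR.

```python
import itertools
import numpy as np

# The four possible input pairs of a two-input logic gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = {
    "AND": np.array([0, 0, 0, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

# Hypothetical coarse grid of candidate values for w_1, w_2 and b
grid = np.linspace(-2.0, 2.0, 21)

def count_linear_solutions(y: np.ndarray, threshold: float = 0.5) -> int:
    """Count grid points (w_1, w_2, b) whose thresholded weighted sum reproduces y."""
    count = 0
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (X @ np.array([w1, w2]) + b >= threshold).astype(int)
        if np.array_equal(pred, y):
            count += 1
    return count

solutions = {name: count_linear_solutions(y) for name, y in targets.items()}
print(solutions)
```

The count for XOR stays at zero no matter how fine the grid is. Correct XOR outputs would require $b < 0.5$, $w_1 + b \ge 0.5$, $w_2 + b \ge 0.5$ and $w_1 + w_2 + b < 0.5$; adding the two middle inequalities gives $w_1 + w_2 + b \ge 1 - b > 0.5$, which contradicts the last one.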

## Implementation
Now let's implement a basic perceptron.

In [1]:
import numpy as np

First, let's implement the weighted sum of the input values $x_i$ plus the bias.

$$ H = x_1 \cdot w_1 + x_2 \cdot w_2 + x_3 \cdot w_3 + b $$

In [2]:
def weighted_sum(X: np.ndarray, w: np.ndarray, b: float) -> float:
    return np.dot(X, w) + b
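
Because `np.dot` accepts both vectors and matrices, the same function works for one sample or a whole batch. The sketch below repeats the function so that it runs on its own; the weights, bias and inputs are hypothetical values picked only for illustration.

```python
import numpy as np

def weighted_sum(X: np.ndarray, w: np.ndarray, b: float):
    return np.dot(X, w) + b

# Hypothetical example values
w_example = np.array([0.5, -0.25])
b_example = 0.1
X_batch = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# A single sample (1-D input) yields one number
H_single = weighted_sum(X_batch[3], w_example, b_example)

# A batch (2-D input, one row per sample) yields one value per row
H_batch = weighted_sum(X_batch, w_example, b_example)

print(H_single)       # weighted sum for the sample [1, 1]
print(H_batch.shape)  # (4,)
```

Note that for a 2-D input the result is an array rather than a single `float`, so the `float` return annotation only describes the single-sample case.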

Now, let's implement a prediction function which calculates an output prediction (based on the current $w_i$ and $b$ values). We will use $0.5$ as the threshold between the positive and the negative answer.

In [3]:
def predict(X: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    H = weighted_sum(X, w, b)
    # Step activation: 1 if the weighted sum reaches the 0.5 threshold, otherwise 0
    return np.where(H >= 0.5, 1, 0)
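
To see how the step at $0.5$ behaves, the sketch below applies the same `np.where` rule to a handful of made-up weighted sums. The values in `H_values` and the constant name `THRESHOLD` are illustrative assumptions.

```python
import numpy as np

THRESHOLD = 0.5  # same cut-off as in predict

# Made-up weighted sums on both sides of the threshold
H_values = np.array([-0.3, 0.0, 0.49, 0.5, 0.9])

# Values equal to or above the threshold map to 1, everything else to 0
labels = np.where(H_values >= THRESHOLD, 1, 0)
print(labels)  # [0 0 0 1 1]
```

A value that lands exactly on the threshold counts as positive because the comparison is `>=`.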

Now, let's implement a function that calculates the error between the **actual** and the **predicted** value, scaled by the learning rate. The learning rate is usually a small number between $0.001$ and $0.1$; it keeps each update small, so the **weights** and **bias** coefficients improve iteratively in small steps during training.

In [4]:
def calculate_error(y_true: np.ndarray, y_pred: np.ndarray, learning_rate: float) -> float:
    return learning_rate * (y_true - y_pred)
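
With binary labels there are only four possible combinations of actual and predicted value, so the scaled error can take just three values. The sketch below enumerates them, assuming a learning rate of $0.1$; the dictionary `errors` is introduced here only for illustration.

```python
learning_rate = 0.1  # assumed value for this illustration

# (y_true, y_pred) -> scaled error
errors = {}
for y_true, y_pred in [(0, 0), (1, 1), (1, 0), (0, 1)]:
    errors[(y_true, y_pred)] = learning_rate * (y_true - y_pred)

for case, error in errors.items():
    print(case, error)
```

A correct prediction produces a zero error, a missed positive produces `+learning_rate`, and a false positive produces `-learning_rate`. The sign therefore tells the update step in which direction to move the weights.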

We also need functions that update the **weights** and **bias** coefficients using this scaled error.

In [5]:
def update_weights(w: np.ndarray, x: np.ndarray, error: float) -> np.ndarray:
    return w + error * x

def update_bias(b: float, error: float) -> float:
    return b + error
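
As a worked example, the sketch below applies one update by hand. It assumes a hypothetical sample `x = [1, 1]` with label $1$ that the model currently misclassifies as $0$, zero initial weights and bias, and a learning rate of $0.1$.

```python
import numpy as np

# Assumed starting point: zero weights and bias
w = np.zeros(2)
b = 0.0

# Hypothetical misclassified sample
x = np.array([1, 1])
y_true, y_pred = 1, 0
learning_rate = 0.1

# Scaled error, then the same update rules as update_weights and update_bias
error = learning_rate * (y_true - y_pred)
w_new = w + error * x
b_new = b + error

print(w_new, b_new)  # [0.1 0.1] 0.1
```

Only weights whose input is non-zero change, because the update is multiplied by `x`. The bias always moves by the full scaled error.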

Finally, we need a function that runs the perceptron training loop.

In [6]:
def train_perceptron(X: np.ndarray, Y: np.ndarray, w: np.ndarray, b: float, learning_rate: float, epochs: int) -> tuple[np.ndarray, float]:
    for epoch in range(epochs):
        total_error = 0.0
        for i in range(len(X)):
            # Predict with the current coefficients, then correct them by the scaled error
            y_pred = predict(X[i], w, b)
            error = calculate_error(Y[i], y_pred, learning_rate)
            w = update_weights(w, X[i], error)
            b = update_bias(b, error)
            # Accumulate magnitudes so positive and negative errors cannot cancel out
            total_error += abs(error)
        print(f"Epoch {epoch + 1} - Total Error: {total_error}")
    return w, b
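
A fixed number of epochs is the simplest stopping rule. As an alternative, the sketch below stops as soon as a full pass over the data produces no misclassification. It is a hypothetical variant, not the function used in this notebook: the name `train_until_converged`, the `max_epochs` cap and the OR gate data are all assumptions made for the illustration.

```python
import numpy as np

def train_until_converged(X, Y, learning_rate=0.1, max_epochs=100):
    """Hypothetical variant: stop once an epoch has no misclassified sample."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x_i, y_i in zip(X, Y):
            # Same step rule as predict: threshold the weighted sum at 0.5
            y_pred = 1 if np.dot(x_i, w) + b >= 0.5 else 0
            update = learning_rate * (y_i - y_pred)
            if update != 0:
                w = w + update * x_i
                b = b + update
                mistakes += 1
        if mistakes == 0:
            return w, b, epoch
    return w, b, max_epochs

# Assumed demo data: the OR gate, which is linearly separable
X_or = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y_or = np.array([0, 1, 1, 1])

w_or, b_or, epochs_used = train_until_converged(X_or, Y_or)
predictions = np.where(X_or @ w_or + b_or >= 0.5, 1, 0)
print(predictions, epochs_used)
```

The cap on epochs matters: on data that is not linearly separable, such as XOR, the loop would otherwise never reach an error-free pass.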

Now let's test what we've created on the AND gate problem. The gate returns $1$ if and only if both inputs are $1$.

| $x_1$ | $x_2$ | $y$ |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

In [7]:
# Training data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([0, 0, 0, 1])

# Create zero weights and bias
w = np.zeros(2)
b = 0
learning_rate = 0.1
epochs = 3

w, b = train_perceptron(X, Y, w, b, learning_rate, epochs)

print(f"Final weights: {w} and final bias: {b}")

Epoch 1 - Total Error: 0.1
Epoch 2 - Total Error: 0.1
Epoch 3 - Total Error: 0.0
Final weights: [0.2 0.2] and final bias: 0.2


We needed only 3 epochs to reach zero error. The learned weights and bias are:
$$w_1 = 0.2 $$
$$w_2 = 0.2 $$
$$b = 0.2 $$

In this case the model complies with the AND gate rule (note that our threshold value is $0.5$: the perceptron treats every $H$ below this value as $0$ and every other value as $1$):

| $x_1$ | $x_2$ | $H$ | $y$ |
| --- | ---| --- | --- |
| 0 | 0 | **0.2** | 0 |
| 0 | 1 | **0.4** | 0 |
| 1 | 0 | **0.4** | 0 |
| 1 | 1 | **0.6** | 1 |
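
The table can be reproduced in a few lines. The sketch below plugs the learned values $w_1 = w_2 = 0.2$ and $b = 0.2$ into the weighted sum for all four inputs and applies the $0.5$ threshold; the variable names are chosen here for illustration.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w = np.array([0.2, 0.2])  # learned weights from the training run above
b = 0.2                   # learned bias

# Weighted sum and thresholded prediction for every row at once
H = X @ w + b
y = np.where(H >= 0.5, 1, 0)

for (x1, x2), h, label in zip(X, H, y):
    print(x1, x2, round(float(h), 1), label)
```

Rearranging $0.2 x_1 + 0.2 x_2 + 0.2 \ge 0.5$ gives $x_1 + x_2 \ge 1.5$, a condition that only the input $(1, 1)$ satisfies.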