# Neural Networks

## A mathematical model of a neural network is a collection of neurons that are connected in layers

### A neural network is a collection of neurons that are connected in layers.
- A neural network has an input layer, hidden layers, and an output layer.
- The input layer is the first layer of the neural network.
    - The input layer has neurons that take the input to the neural network.
    - **Note**: The input to the neural network is the data fed into the network, during training and at prediction time alike. \
    It is not the value that the network is trying to predict. In the biological analogy, the inputs arrive at a neuron's \
    dendrites; the neuron aggregates them as a weighted sum (a dot product with its weights), adds the bias, \
    applies an activation function in the nucleus, and passes the output along its axon terminals to \
    other neurons.
- The hidden layers are the layers between the input and output layers.
- The output layer is the last layer of the neural network.
    - The output layer has neurons that produce the output of the neural network.
    - The output of the neural network is the prediction of the network.

---

### A neural network’s forward pass:
- The forward pass is the process of calculating the output of a neural network given an input.
- Each layer computes a weighted sum of its inputs using the weights and biases of its neurons, applies its activation function, and passes the result to the next layer.
- The output of the last layer of neurons is the prediction of the network.
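
As a rough illustration of the steps above, the sketch below pushes one input through a list of layers. The `forward` helper, the layer sizes, and the random parameters are assumptions made for this example; ReLU on the hidden layers is just one common choice of activation.

```python
import numpy as np

def relu(z):
    # ReLU activation: negative values become 0
    return np.maximum(0, z)

def forward(x, layers):
    """Run a forward pass.

    layers is a list of (weights, biases) pairs, where weights has
    shape (n_neurons, n_inputs) and biases has shape (n_neurons,).
    """
    activation = x
    for index, (weights, biases) in enumerate(layers):
        # weighted sum of the previous layer's output plus the bias
        z = weights @ activation + biases
        # hidden layers apply ReLU; the last layer returns its raw output
        is_last = index == len(layers) - 1
        activation = z if is_last else relu(z)
    return activation

# hypothetical network: 4 inputs -> 3 neurons -> 3 neurons -> 2 outputs
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(3, 4)), np.zeros(3)),
    (rng.normal(size=(3, 3)), np.zeros(3)),
    (rng.normal(size=(2, 3)), np.zeros(2)),
]
x = np.array([1.0, 2.0, 3.0, 2.5])
prediction = forward(x, layers)
print(prediction.shape)  # (2,)
```

The output has one value per neuron in the last layer, which is what the bullets above call the prediction of the network.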

---

$$
\begin{aligned}
h_{1,j} &= \max\left(0,\ \sum_{i=1}^{n_0} X_i\, w_{1,i,j} + b_{1,j}\right), \quad j = 1,\dots,n_1 \\
h_{2,j} &= \max\left(0,\ \sum_{i=1}^{n_1} h_{1,i}\, w_{2,i,j} + b_{2,j}\right), \quad j = 1,\dots,n_2 \\
z_{k} &= \sum_{i=1}^{n_2} h_{2,i}\, w_{3,i,k} + b_{3,k}, \quad k = 1,\dots,n_3 \\
L &= -\sum_{l=1}^{n_3} y_l \log\left(\frac{\exp\left(z_l\right)}{\sum_{k=1}^{n_3} \exp\left(z_k\right)}\right)
\end{aligned}
$$

---

### The Forward Pass can be represented as a series of matrix multiplications 

In [None]:
import numpy as np
from src.functions.activation import Sigmoid


# create the input data
X = np.array([[1.0, 2.0, 3.0, 2.5],
                [2.0, 5.0, -1.0, 2.0],
                [-1.5, 2.7, 3.3, -0.8]])

# create the Expected output data
y = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])

# create the weights
w1 = np.array([[0.2, 0.8, -0.5, 1.0],
                [0.5, -0.91, 0.26, -0.5],
                [-0.26, -0.27, 0.17, 0.87]])
w2 = np.array([[0.1, -0.14, 0.5],
                [-0.5, 0.12, -0.33],
                [-0.44, 0.73, -0.13]])
w3 = np.array([[-0.1, -0.14, -0.5],
                [0.5, 0.12, -0.33],
                [-0.44, 0.73, -0.13]])

# create the biases
b1 = np.array([2.0, 3.0, 0.5])
b2 = np.array([-1.0, 2.0, -0.5])
b3 = np.array([2.0, 3.0, 0.5])

# create the activation function
sigmoid = Sigmoid()

# calculate the output/loss of the neural network
loss = -np.log(  # cross-entropy loss
    np.sum(  # sum over the output neurons
        y * np.exp(  # element-wise multiplication with the exponential of the output
            np.dot(  # dot product of the output
                np.maximum(  # ReLU activation
                    0,  # ReLU
                    np.dot(  # dot product of the hidden layer
                        np.maximum(  # ReLU activation
                            0,  # ReLU
                            np.dot(  # dot product of the input layer
                                X,  # input data
                                w1.T  # transpose of the weights
                            ) + b1 # add the bias
                        ),  # ReLU
                        w2.T  # transpose of the weights
                    ) + b2  # add the bias
                ),  # ReLU
                w3.T  # transpose of the weights
            ) + b3  # add the bias
        ) /  # divide by the sum of the exponential of the output
        np.sum(  # sum over the output neurons
            np.exp(  # exponential of the output
                np.dot(  # dot product of the output
                    np.maximum(  # ReLU activation
                        0,  # ReLU
                        np.dot(  # dot product of the hidden layer
                            np.maximum(  # ReLU activation
                                0,  # ReLU
                                np.dot(  # dot product of the input layer
                                    X,  # input data
                                    w1.T  # transpose of the weights
                                ) + b1  # add the bias
                            ),  # ReLU
                            w2.T  # transpose of the weights
                        ) + b2  # add the bias
                    ),  # ReLU
                    w3.T  # transpose of the weights
                ) + b3  # add the bias
            ),  # exponential of the output
        axis=1, # sum over the output neurons
        keepdims=True  # keep the dimensions of the output
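        ),  # softmax denominator: per-sample sum over the output neurons
        axis=1  # sum over the output neurons, giving one value per sample
    )  # softmax probability assigned to the labeled classes
)  # negative log gives the cross-entropy loss per sample

# verify the shape of the loss: one value per input row
assert loss.shape == (3,), "The shape of the loss is incorrect"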

# print the loss
print(loss)
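
The nested expression above can be hard to follow, so here is a minimal sketch of the same forward pass unrolled into named steps. The `dense` helper and the `y_one_hot` labels are assumptions introduced for this example (the labels are hypothetical and have exactly one 1 per row); the inputs, weights, and biases are copied from the cell above. Subtracting the row maximum before exponentiating is a standard stability trick that leaves the softmax result unchanged.

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])
w1 = np.array([[0.2, 0.8, -0.5, 1.0],
               [0.5, -0.91, 0.26, -0.5],
               [-0.26, -0.27, 0.17, 0.87]])
w2 = np.array([[0.1, -0.14, 0.5],
               [-0.5, 0.12, -0.33],
               [-0.44, 0.73, -0.13]])
w3 = np.array([[-0.1, -0.14, -0.5],
               [0.5, 0.12, -0.33],
               [-0.44, 0.73, -0.13]])
b1 = np.array([2.0, 3.0, 0.5])
b2 = np.array([-1.0, 2.0, -0.5])
b3 = np.array([2.0, 3.0, 0.5])

# hypothetical one-hot labels, chosen only for this sketch
y_one_hot = np.array([[0, 1, 0],
                      [1, 0, 0],
                      [0, 0, 1]])

def dense(inputs, weights, biases):
    # weighted sum for every neuron in a layer, one row per sample
    return inputs @ weights.T + biases

hidden1 = np.maximum(0, dense(X, w1, b1))        # first hidden layer + ReLU
hidden2 = np.maximum(0, dense(hidden1, w2, b2))  # second hidden layer + ReLU
logits = dense(hidden2, w3, b3)                  # output layer, no activation yet

# softmax over the output neurons of each sample
shifted = logits - logits.max(axis=1, keepdims=True)
exponentials = np.exp(shifted)
probabilities = exponentials / exponentials.sum(axis=1, keepdims=True)

# cross-entropy loss: one value per sample
loss = -np.sum(y_one_hot * np.log(probabilities), axis=1)
print(loss)
```

Keeping each intermediate array in its own variable makes it easy to inspect shapes and values when something looks wrong.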

# Layer of Neurons

### A layer of neurons is a collection of neurons that all receive the same inputs and each produce one output.

#### The output of each neuron is calculated as follows:

---

$$
\begin{align*}
\text{Predictions} & = \text{Activation Function}(\text{weights} \cdot \text{inputs} + \text{bias}) \\
\text{Weighted Sum of Inputs w/ Bias} & = \sum_{i=1}^{n} w_i \cdot x_i + b \\
\text{Predictions} & = \sigma(w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n + b)
\end{align*}
$$

---

#### The Predictions are the output of the layer of neurons.
- The weights scale the inputs; every neuron in the layer has one weight per input.
- The inputs are the values fed to the layer, shared by all of its neurons.
- The bias is a constant added to a neuron's weighted sum; every neuron has its own bias.
- The activation function is applied to each neuron's weighted sum plus bias.
- The weighted sum of inputs w/ bias is the dot product of the weights and inputs plus the bias.

#### Step by step, the output of each neuron is calculated as follows:
- The weighted sum of inputs and bias is calculated.
- The activation function is applied to the weighted sum of inputs and bias.
- The result is the prediction of the neuron.
- The predictions of all the neurons in the layer are returned as a list.

#### The output of the layer of neurons is a list of predictions, one for each neuron in the layer.
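
The steps above can be sketched in plain Python as follows. The `layer_predictions` helper, the sigmoid activation, and the example numbers are assumptions chosen for illustration, with weights picked so that both weighted sums come out to exactly 0.

```python
import math

def sigmoid(z):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def layer_predictions(inputs, layer_weights, layer_biases, activation=sigmoid):
    """Return one prediction per neuron in the layer."""
    predictions = []
    for weights, bias in zip(layer_weights, layer_biases):
        # weighted sum of inputs plus bias for this neuron
        weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
        # the activation turns the weighted sum into the neuron's prediction
        predictions.append(activation(weighted_sum))
    return predictions

inputs = [1.0, 0.0, -1.0]
layer_weights = [[0.5, 0.2, 0.5],    # neuron 1
                 [1.0, -1.0, 0.0]]   # neuron 2
layer_biases = [0.0, -1.0]

print(layer_predictions(inputs, layer_weights, layer_biases))  # [0.5, 0.5]
```

The returned list has one entry per neuron, in the same order as the rows of `layer_weights`.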

### Using Dot Product

#### Calculate the weighted sum of inputs and add the bias

---

$$
\begin{align*}
\text{Weighted Sum w/ Bias} & = \sum_{i=1}^{n} w_i \cdot x_i + b \\
\text{Weighted Sum w/ Bias} & = w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n + b
\end{align*}
$$

---

```python
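import math

inputs = [1.0, 2.0, 3.0, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2.0

# calculate the weighted sum of inputs and add the bias for each neuron
# (this layer has a single neuron)
output = [
    # Neuron 1:
    inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + inputs[3]*weights[3] + bias,
]

# apply the activation function to each weighted sum
# (sigmoid is used here as an example choice of activation function)
predictions = [1 / (1 + math.exp(-z)) for z in output]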
```

In the context of binary classification using a sigmoid activation function, a prediction close to "1" typically indicates a positive class, while a prediction close to "0" indicates a negative class. Whether "1" or "0" is considered good or bad depends on the true label of the data point:

- If the true label is "1" (positive class), a prediction close to "1" is good, and a prediction close to "0" is bad.
- If the true label is "0" (negative class), a prediction close to "0" is good, and a prediction close to "1" is bad.
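
A minimal sketch of this comparison is shown below. The `predicted_class` and `is_correct` helpers and the 0.5 threshold are assumptions for illustration; other thresholds can be appropriate depending on the problem.

```python
def predicted_class(probability, threshold=0.5):
    # a sigmoid output at or above the threshold is read as the positive class
    return 1 if probability >= threshold else 0

def is_correct(probability, true_label, threshold=0.5):
    # a prediction is "good" when the predicted class matches the true label
    return predicted_class(probability, threshold) == true_label

print(is_correct(0.92, 1))  # True: close to 1 and the true label is 1
print(is_correct(0.92, 0))  # False: close to 1 but the true label is 0
print(is_correct(0.08, 0))  # True: close to 0 and the true label is 0
```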



In [28]:
# Layer of Neurons Example 
import numpy as np
from src.functions.activation import Sigmoid
from src.encoder.label import encode as encode_labels

# initialize the activation function
sigmoid = Sigmoid()

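# encode each word as a single number
# (judging by the printed output, this is the mean Unicode value of the lowercased characters, not the sum)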
input_words = np.array(['Cat', 'Dog', 'Rabbit', 'Horse'])
encoded_inputs = encode_labels(input_words)
print(f"Encoded Inputs: \n{encoded_inputs}")

# initialize seed for reproducibility
np.random.seed(100)


# Our weights define the number of neurons in the layer. In 'np.random.rand(4, 1)', the first argument is the number of neurons in the layer and the second argument is the number of inputs to each neuron, so this layer has 4 neurons with 1 input each (each word is encoded as a single number).
# initialize random weights and biases
weights = np.random.rand(4, 1)
print(f"Weights: \n{weights}")

# initialize random bias
# This layer has 4 neurons, so we need 4 biases.
# The bias is a 4x1 matrix: one bias per neuron, stored as a column so that it broadcasts across the encoded words.
bias  = np.random.rand(4, 1)
print(f"Bias: \n{bias}")

# encode the labels the same way as the inputs
labels = np.array(['Cat', 'Dog', 'Rabbit', 'Horse'])
encoded_labels = encode_labels(labels)
print(f"Encoded Labels: \n{encoded_labels}")

# get the weighted sums of the inputs and add the bias
# (4, 1) weights times (1, 4) inputs gives a (4, 4) array: one row per neuron, one column per word
outputs = np.dot(weights, encoded_inputs) + bias
print(f"Outputs: \n{outputs}")

# apply the activation function and get the predictions
# The prediction is a sort of transformation of the output of the neurons in the layer.
predictions = sigmoid(outputs)
print(f"Predictions: \n{predictions}")

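# Error: distance of each prediction from a target of 1
# (this only measures the error when the true label is 1)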
error_rate = 1 - predictions
print(f"Error Ratio: \n{error_rate}")

Encoded Inputs: 
[[104.       104.666664 104.666664 109.      ]]
Weights: 
[[0.54340494]
 [0.27836939]
 [0.42451759]
 [0.84477613]]
Bias: 
[[0.00471886]
 [0.12156912]
 [0.67074908]
 [0.82585276]]
Encoded Labels: 
[[104.       104.666664 104.666664 109.      ]]
Outputs: 
[[56.5188328  56.88110138 56.88110138 59.23585751]
 [29.07198517 29.25756405 29.25756405 30.4638321 ]
 [44.82057852 45.10358917 45.10358917 46.94316648]
 [88.68257052 89.24575246 89.24575246 92.90645118]]
Predictions: 
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
Error Ratio: 
[[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [2.36699549e-13 1.96509475e-13 1.96509475e-13 5.88418203e-14]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]]
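
The predictions above all print as 1 because the weighted sums are large (roughly 29 to 93), and the sigmoid saturates for inputs of that size. The sketch below illustrates this with a plain NumPy sigmoid, which is an assumption standing in for the project's `Sigmoid` class. The small inputs are hypothetical and only show that values near 0 keep the predictions away from 1.

```python
import numpy as np

def sigmoid(z):
    # standard logistic function (assumed to match the project's Sigmoid)
    return 1.0 / (1.0 + np.exp(-z))

# rounded weighted sums taken from the printed outputs above
large = np.array([29.07, 56.52, 88.68])
print(1 - sigmoid(large))  # tiny or exactly zero: the sigmoid has saturated

# hypothetical small inputs, where the sigmoid still responds to changes
small = np.array([-1.0, 0.0, 1.0])
print(sigmoid(small))
```

Once a prediction rounds to 1 in floating point, `1 - prediction` is exactly 0, which is why most of the error ratios above print as zero.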


In [3]:
from src.encoder.label import encode as encode_labels

# Test the label encoder
labels = ["cat", "dog", "fish", "elephant", "lion", "tiger", "bear"]
encoded_labels = encode_labels(labels)
print(f"Encoded labels: \n{encoded_labels}\n")
print(f"Data type: \n{encoded_labels.dtype}\n")
print(f"Shape: \n{encoded_labels.shape}\n")
print(f"Size: \n{encoded_labels.size}\n")
print(f"Number of dimensions: \n{encoded_labels.ndim}\n")
print(f"Item size: \n{encoded_labels.itemsize}\n")
print(f"Total bytes: \n{encoded_labels.nbytes}\n")
print(f"Strides: \n{encoded_labels.strides}\n")
print(f"Flags: \n{encoded_labels.flags}\n")
print(f"Ctypes: \n{encoded_labels.ctypes}\n")
print(f"Base: \n{encoded_labels.base}\n")
print(f"Data: \n{encoded_labels.data}\n")
print(f"Transpose: \n{encoded_labels.T}\n")
print(f"Real part: \n{encoded_labels.real}\n")
print(f"Imaginary part: \n{encoded_labels.imag}\n")
print(f"Flat: \n{encoded_labels.flat}\n")
print(f"Item: \n{encoded_labels.item}\n")
print(f"List: \n{encoded_labels.tolist()}\n")
print(f"Bytes: {encoded_labels.tobytes()}\n")

Encoded labels: 
[[104.       104.666664 106.5      106.125    108.5      107.8
  102.5     ]]

Data type: 
float32

Shape: 
(1, 7)

Size: 
7

Number of dimensions: 
2

Item size: 
4

Total bytes: 
28

Strides: 
(28, 4)

Flags: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False


Ctypes: 
<numpy._core._internal._ctypes object at 0x1088246b0>

Base: 
None

Data: 
<memory at 0x1088cd7d0>

Transpose: 
[[104.      ]
 [104.666664]
 [106.5     ]
 [106.125   ]
 [108.5     ]
 [107.8     ]
 [102.5     ]]

Real part: 
[[104.       104.666664 106.5      106.125    108.5      107.8
  102.5     ]]

Imaginary part: 
[[0. 0. 0. 0. 0. 0. 0.]]

Flat: 
<numpy.flatiter object at 0x7fa8c0f42e00>

Item: 
<built-in method item of numpy.ndarray object at 0x106e46310>

List: 
[[104.0, 104.66666412353516, 106.5, 106.125, 108.5, 107.80000305175781, 102.5]]

Bytes: b'\x00\x00\xd0BUU\xd1B\x00\x00\xd5B\x00@\xd4B\x00\x00\xd9B\x9a\x99\xd7B\x00\

In [None]:
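# Graph the error on a line plot, one trace per neuron
# (assumes `labels` and `error_rate` still hold the 4 words and the 4x4 array from the layer example)
from plotly import graph_objects as go

fig = go.Figure()
for neuron_index, neuron_error in enumerate(error_rate):
    fig.add_trace(go.Scatter(x=labels, y=neuron_error, mode='lines+markers', name=f'Neuron {neuron_index + 1}'))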
fig.update_layout(title='Error Ratio', xaxis_title='Labels', yaxis_title='Error')
fig.show()

In [None]:

inputs=[1, 2, 3, 2.5]

weights1 = [0.2, 0.8, -0.5, 1] 
weights2 = [0.5, -0.91, 0.26, -0.5] 
weights3 = [-0.26, -0.27, 0.17, 0.87]

bias1 = 2 
bias2 = 3 
bias3 = 0.5

# calculate the weighted sum of inputs and add the bias for three neurons.
outputs = np.array([
    # Neuron 1:
    inputs[0]*weights1[0] + inputs[1]*weights1[1] + inputs[2]*weights1[2] + inputs[3]*weights1[3] + bias1,
    # Neuron 2: 
    inputs[0]*weights2[0] + inputs[1]*weights2[1] + inputs[2]*weights2[2] + inputs[3]*weights2[3] + bias2,
    # Neuron 3: 
    inputs[0]*weights3[0] + inputs[1]*weights3[1] + inputs[2]*weights3[2] + inputs[3]*weights3[3] + bias3
    ])
print(outputs)

# apply the activation function and get the predictions
predictions = sigmoid(outputs)
print(predictions)

# [4.8, 1.21, 2.385]
# [0.99194602, 0.77015115, 0.9158585]

# Error ratio here is 1 minus the prediction, i.e. the distance from a target value of 1
error_ratio = 1 - predictions
print(error_ratio)
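
The three hand-written sums above can also be computed with a single dot product. The sketch below stacks the three weight lists into one matrix and the three biases into one vector; the names `weights` and `biases` are assumptions for this example, and the values are the same as in the cell above.

```python
import numpy as np

inputs = np.array([1.0, 2.0, 3.0, 2.5])

# one row of weights per neuron
weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5],
                    [-0.26, -0.27, 0.17, 0.87]])

# one bias per neuron
biases = np.array([2.0, 3.0, 0.5])

# (3, 4) weights times (4,) inputs gives one weighted sum per neuron
outputs = np.dot(weights, inputs) + biases
print(outputs)  # approximately [4.8, 1.21, 2.385]
```

Adding a neuron then only requires another row in `weights` and another entry in `biases`.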