# Neural Networks

## A mathematical model of a neural network is a collection of neurons that are connected in layers

### A neural network is a collection of neurons that are connected in layers.
- A neural network has an input layer, hidden layers, and an output layer.
- The input layer is the first layer of the neural network.
    - The input layer has neurons that take the input to the neural network.
    - **Note**: The input to the neural network is the feature data of each sample, both during \
    training and when predicting, not the target values that the network is trying to predict. \
    In the biological analogy, the inputs arrive at a neuron's dendrites; the neuron aggregates \
    them as a weighted sum (a dot product with its weights), adds the bias, applies an activation \
    function, and passes the result along its axon terminals to other neurons.
- The hidden layers are the layers between the input and output layers.
- The output layer is the last layer of the neural network.
    - The output layer has neurons that produce the output of the neural network.
    - The output of the neural network is the prediction of the network.

---

### A neural network’s forward pass:
- The forward pass is the process of calculating the output of a neural network given an input.
- The output of the neural network is calculated using the weights and biases of the neurons in the network.
- The output of the neural network is the prediction of the network.
- The prediction of the network is the output of the last layer of neurons.

---

$
\begin{align*}
L &= - \sum_{j=1}^{n_3} y_j \log \left( \frac{e^{z_j}}{\sum_{k=1}^{n_3} e^{z_k}} \right) \text{,} \\
z_j &= \sum_{q=1}^{n_2} \max \left(0, \sum_{p=1}^{n_1} \max \left(0, \sum_{i=1}^{n_0} X_i w_{1,i,p} + b_{1,p} \right) w_{2,p,q} + b_{2,q} \right) w_{3,q,j} + b_{3,j} \text{,} \\
\text{where } L &= \text{Loss Function}
\end{align*}
$

---

### The Forward Pass can be represented as a series of matrix multiplications 

In [92]:
import numpy as np
from src.functions.activation import Sigmoid
from src.utils.logger import getLogger

log = getLogger(__name__)


# create the input data
X = np.array([[1.0, 2.0, 3.0, 2.5],
                [2.0, 5.0, -1.0, 2.0],
                [-1.5, 2.7, 3.3, -0.8]])

# create the Expected output data
y = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])

# create the weights
w1 = np.array([[0.2, 0.8, -0.5, 1.0],
                [0.5, -0.91, 0.26, -0.5],
                [-0.26, -0.27, 0.17, 0.87]])
w2 = np.array([[0.1, -0.14, 0.5],
                [-0.5, 0.12, -0.33],
                [-0.44, 0.73, -0.13]])
w3 = np.array([[-0.1, -0.14, -0.5],
                [0.5, 0.12, -0.33],
                [-0.44, 0.73, -0.13]])

# create the biases
b1 = np.array([2.0, 3.0, 0.5])
b2 = np.array([-1.0, 2.0, -0.5])
b3 = np.array([2.0, 3.0, 0.5])

# create the activation function
sigmoid = Sigmoid()

# calculate the output/loss of the neural network
loss = -np.log(  # cross-entropy loss
    np.sum(  # sum over the output neurons
        y * np.exp(  # element-wise multiplication with the exponential of the output
            np.dot(  # dot product of the output
                np.maximum(  # ReLU activation
                    0,  # ReLU
                    np.dot(  # dot product of the hidden layer
                        np.maximum(  # ReLU activation
                            0,  # ReLU
                            np.dot(  # dot product of the input layer
                                X,  # input data
                                w1.T  # transpose of the weights
                            ) + b1 # add the bias
                        ),  # ReLU
                        w2.T  # transpose of the weights
                    ) + b2  # add the bias
                ),  # ReLU
                w3.T  # transpose of the weights
            ) + b3  # add the bias
        ) /  # divide by the sum of the exponential of the output
        np.sum(  # sum over the output neurons
            np.exp(  # exponential of the output
                np.dot(  # dot product of the output
                    np.maximum(  # ReLU activation
                        0,  # ReLU
                        np.dot(  # dot product of the hidden layer
                            np.maximum(  # ReLU activation
                                0,  # ReLU
                                np.dot(  # dot product of the input layer
                                    X,  # input data
                                    w1.T  # transpose of the weights
                                ) + b1  # add the bias
                            ),  # ReLU
                            w2.T  # transpose of the weights
                        ) + b2  # add the bias
                    ),  # ReLU
                    w3.T  # transpose of the weights
                ) + b3  # add the bias
            ),  # exponential of the output
        axis=1, # sum over the output neurons
        keepdims=True  # keep the dimensions of the output
        ) # sum over the output neurons
    )  # sum over the output neurons
)  # cross-entropy loss

# verify the accuracy of the loss
# assert loss.shape == (3, 3), "The shape of the loss is incorrect"

# print the loss
log.debug(loss)

[2024-09-01 05:05:49][DEBUG][root][87]:
-0.5631323300324019
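The nested expression in the cell above is hard to audit, and its result is negative, which a cross-entropy loss should never be, since each term is the negative log of a probability. One likely cause is that the outer `np.sum` adds the selected probabilities of all samples together before the log is taken, and that one row of `y` marks two classes. As a rough sketch only, the same forward pass can be written step by step with one loss value per sample. The helper names (`relu`, `softmax`, `cross_entropy`) and the one-hot targets `y_onehot` are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU."""
    return np.maximum(0, z)

def softmax(z):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    exp = np.exp(z - z.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    """Per-sample cross-entropy for one-hot targets."""
    return -np.sum(targets * np.log(probs), axis=1)

# Same inputs, weights and biases as the cell above
X_batch = np.array([[1.0, 2.0, 3.0, 2.5],
                    [2.0, 5.0, -1.0, 2.0],
                    [-1.5, 2.7, 3.3, -0.8]])
W1 = np.array([[0.2, 0.8, -0.5, 1.0],
               [0.5, -0.91, 0.26, -0.5],
               [-0.26, -0.27, 0.17, 0.87]])
W2 = np.array([[0.1, -0.14, 0.5],
               [-0.5, 0.12, -0.33],
               [-0.44, 0.73, -0.13]])
W3 = np.array([[-0.1, -0.14, -0.5],
               [0.5, 0.12, -0.33],
               [-0.44, 0.73, -0.13]])
B1 = np.array([2.0, 3.0, 0.5])
B2 = np.array([-1.0, 2.0, -0.5])
B3 = np.array([2.0, 3.0, 0.5])

# Hypothetical one-hot targets: exactly one correct class per sample
y_onehot = np.eye(3)[[1, 0, 1]]

a1 = relu(X_batch @ W1.T + B1)       # first hidden layer
a2 = relu(a1 @ W2.T + B2)            # second hidden layer
probs = softmax(a2 @ W3.T + B3)      # class probabilities, one row per sample
losses = cross_entropy(probs, y_onehot)

print("Per-sample loss:", losses)
print("Mean loss:", losses.mean())
```

Keeping each intermediate result in its own variable makes it possible to check shapes and values layer by layer, and averaging the per-sample losses gives a single number that stays non-negative.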


# Layer of Neurons

### A layer of neurons is a collection of neurons that take the same number of inputs and produce the same number of outputs.

#### The output of each neuron is calculated as follows:

---

$
\begin{align*}
\text{Predictions} &= \text{Activation Function}(\text{weights} \cdot \text{inputs} + \text{Bias}) \\
\text{Predictions} &= \sigma\Big(\underbrace{\sum_{i=1}^{n} w_i \cdot x_i + b}_{\text{Weighted Sum of Inputs w/ Bias}}\Big) \\
\text{Predictions} &= \sigma(w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n + b)
\end{align*}
$

---

#### The Predictions are the output of the layer of neurons.
- The weights are the weights of the neurons in the layer.
- The inputs are the inputs to the layer.
- The bias is the bias of the neurons in the layer.
- The activation function is the activation function of the neurons in the layer.
- The weighted sum of inputs w/ bias is the weighted sum of the inputs to the layer plus the bias.
- The weighted sum of inputs w/ bias is the dot product of the weights and inputs plus the bias.

#### The output of each neuron is calculated as follows:
- The weighted sum of inputs and bias is calculated.
- The activation function is applied to the weighted sum of inputs and bias.
- The result is the prediction of the neuron.
- The predictions of all the neurons in the layer are returned as a list.
- The output of the layer of neurons is the list of predictions.

#### The output of the layer of neurons is a list of predictions, one for each neuron in the layer.
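The steps listed above can be sketched in plain Python, one neuron at a time. This is a minimal illustration rather than the notebook's own implementation: the function names `sigmoid` and `layer_forward` are hypothetical, and the sigmoid is assumed as the activation function. The example values are the inputs, weights and biases used elsewhere in this notebook.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases, activation=sigmoid):
    """Return one prediction per neuron in the layer."""
    predictions = []
    for neuron_weights, neuron_bias in zip(weights, biases):
        # Step 1: weighted sum of inputs plus the bias
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs)) + neuron_bias
        # Step 2: apply the activation function
        predictions.append(activation(weighted_sum))
    # Step 3: the layer output is the list of predictions
    return predictions

example_inputs = [1.0, 2.0, 3.0, 2.5]
example_weights = [
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87],
]
example_biases = [2.0, 3.0, 0.5]

layer_predictions = layer_forward(example_inputs, example_weights, example_biases)
print(layer_predictions)
```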

### Using Dot Product

#### Calculate the weighted sum of inputs and add the bias

---

$
\begin{align*}
\text{Weighted Sum w/ Bias} &= \sum_{i=1}^{n} w_i \cdot x_i + b \\
\text{Weighted Sum w/ Bias} &= w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n + b
\end{align*}
$

---

```python
import math

inputs = [1.0, 2.0, 3.0, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2.0

# calculate the weighted sum of inputs and add the bias for each neuron
output = [
    # Neuron 1:
    inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + inputs[3]*weights[3] + bias,
]

# apply the activation function (sigmoid here) to each neuron's output
predictions = [1.0 / (1.0 + math.exp(-o)) for o in output]
```

In the context of binary classification using a sigmoid activation function, a prediction close to "1" typically indicates a positive class, while a prediction close to "0" indicates a negative class. Whether "1" or "0" is considered good or bad depends on the true label of the data point:

- If the true label is "1" (positive class), a prediction close to "1" is good, and a prediction close to "0" is bad.
- If the true label is "0" (negative class), a prediction close to "0" is good, and a prediction close to "1" is bad.
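As a small sketch of the point above, the error of a prediction can be measured against its true label instead of against a fixed value. The function names `absolute_error` and `predicted_class`, the 0.5 threshold and the sample labels are all assumptions chosen for illustration. Note that an error computed as `1 - prediction` agrees with this measure only when the true label is 1.

```python
import numpy as np

def absolute_error(predictions, true_labels):
    """Distance between each prediction and its true label (0 is best, 1 is worst)."""
    return np.abs(true_labels - predictions)

def predicted_class(predictions, threshold=0.5):
    """Map sigmoid outputs to class 1 (>= threshold) or class 0."""
    return (predictions >= threshold).astype(int)

sample_predictions = np.array([0.94, 0.14, 0.87])
sample_labels = np.array([1, 0, 0])  # hypothetical true labels

errors = absolute_error(sample_predictions, sample_labels)
classes = predicted_class(sample_predictions)

print("Errors:", errors)
print("Predicted classes:", classes)
print("Correct:", classes == sample_labels)
```

The third sample shows the bad case from the list: the true label is 0 but the prediction is close to 1, so its error is large.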



### This example demonstrates how to calculate the output of a layer of neurons using a dot product.

- The input data is a 1x4 matrix: 1 sample with 4 inputs.
- The weights define the number of neurons in the layer. This layer has 3 neurons with 4 weights each, one per input.
- The bias is a 1x3 matrix: one bias for each of the 3 neurons in the layer.
- The output of the layer of neurons is a 1x3 matrix: 1 sample with 3 outputs, one per neuron.
- The prediction is the activation function applied to the output of each neuron in the layer.

$
\begin{align*}
O &= XW^T + B = \begin{bmatrix} 1.0 & 2.0 & 3.0 & 2.5 \end{bmatrix} \begin{bmatrix} 0.2 & 0.5 & -0.26 \\ 0.8 & -0.91 & -0.27 \\ -0.5 & 0.26 & 0.17 \\ 1.0 & -0.5 & 0.87 \end{bmatrix} + \begin{bmatrix} 0.0 & 0.0 & 0.0 \end{bmatrix} \\
O &= \begin{bmatrix} 2.8 & -1.79 & 1.885 \end{bmatrix}
\end{align*}
$

Next, let's take a closer look at this matrix multiplication in algebraic form:

$
\begin{align*}
O &= \begin{bmatrix} 1.0 \cdot 0.2 + 2.0 \cdot 0.8 + 3.0 \cdot -0.5 + 2.5 \cdot 1.0 & 1.0 \cdot 0.5 + 2.0 \cdot -0.91 + 3.0 \cdot 0.26 + 2.5 \cdot -0.5 & 1.0 \cdot -0.26 + 2.0 \cdot -0.27 + 3.0 \cdot 0.17 + 2.5 \cdot 0.87 \end{bmatrix} + \begin{bmatrix} 0.0 & 0.0 & 0.0 \end{bmatrix} \\
O &= \begin{bmatrix} 2.8 & -1.79 & 1.885 \end{bmatrix} + \begin{bmatrix} 0.0 & 0.0 & 0.0 \end{bmatrix} \\
O &= \begin{bmatrix} 2.8 & -1.79 & 1.885 \end{bmatrix}
\end{align*}
$
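The algebra above can be checked numerically. The sketch below compares the matrix product with the explicit sum of products for each neuron; the variable names are chosen for illustration only.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0, 2.5]])        # 1 sample, 4 inputs: shape (1, 4)
W = np.array([[0.2, 0.8, -0.5, 1.0],        # one row of weights per neuron: shape (3, 4)
              [0.5, -0.91, 0.26, -0.5],
              [-0.26, -0.27, 0.17, 0.87]])
B = np.zeros((1, 3))                        # one bias per neuron: shape (1, 3)

# Matrix form: O = X W^T + B
O_matrix = x @ W.T + B

# Algebraic form: one sum of products per neuron
O_manual = np.array([[
    sum(x[0, i] * W[j, i] for i in range(W.shape[1])) + B[0, j]
    for j in range(W.shape[0])
]])

print(O_matrix)
print(np.allclose(O_matrix, O_manual))
```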


In [93]:
# Layer of Neurons Example 
import numpy as np
from src.functions.activation import Sigmoid

# initialize the activation function
sigmoid = Sigmoid()

# Create the encoding sample input data
encoded_inputs = np.array(
    # Input sample 1: 4 inputs, passed to each of the 3 neurons
    [1.0, 2.0, 3.0, 2.5]
)
log.debug(f"Encoded Inputs: \n{encoded_inputs}")


# Our weights define the number of neurons in the layer. This 
# layer has 3 neurons with 4 inputs each to match your input data.
# initialize  weights and biases
weights = np.array([
    # Neuron 1: 4 inputs
    [0.2, 0.8, -0.5, 1.0],
    # Neuron 2: 4 inputs
    [0.5, -0.91, 0.26, -0.5],
    # Neuron 3: 4 inputs
    [-0.26, -0.27, 0.17, 0.87]])
log.debug(f"Weights: \n{weights}")

# initialize the bias (all zeros in this example)
# This layer has 3 neurons, so we need 3 biases.
# The bias is a 1x3 matrix: one bias per neuron.
bias  = np.array([[0.0, 0.0, 0.0]])
log.debug(f"Bias: \n{bias}")

# get the weighted sums of the inputs and add the bias
# Here we model the neurons output using sample input data, 
# a vector that is passed to each neuron in the layer.
outputs = np.dot(weights, encoded_inputs) + bias
log.debug(f"Outputs: \n{outputs}") # Outputs: [[2.8, -1.79, 1.885]]

# apply the activation function and get the predictions
# The prediction is a sort of transformation of the output of the 
# neurons in the layer.
predictions = sigmoid(outputs)
log.debug(f"Predictions: \n{predictions}")

# Error, assuming the true label for every prediction is 1
error_rate = 1 - predictions
log.debug(f"Error Ratio: \n{error_rate}")

[2024-09-01 05:05:49][DEBUG][root][13]:
Encoded Inputs: 
[1.  2.  3.  2.5]
[2024-09-01 05:05:49][DEBUG][root][26]:
Weights: 
[[ 0.2   0.8  -0.5   1.  ]
 [ 0.5  -0.91  0.26 -0.5 ]
 [-0.26 -0.27  0.17  0.87]]
[2024-09-01 05:05:49][DEBUG][root][33]:
Bias: 
[[0. 0. 0.]]
[2024-09-01 05:05:49][DEBUG][root][39]:
Outputs: 
[[ 2.8   -1.79   1.885]]
[2024-09-01 05:05:49][DEBUG][root][45]:
Predictions: 
[[0.94267582 0.14307272 0.86818438]]
[2024-09-01 05:05:49][DEBUG][root][49]:
Error Ratio: 
[[0.05732418 0.85692728 0.13181562]]



### This example demonstrates how to calculate the output of a layer of neurons using a dot product.

- The input data is a 3x4 matrix: 3 samples with 4 inputs each.
- The weights define the number of neurons in the layer. This layer has 3 neurons with 4 weights each, one per input.
- The bias is a 1x3 matrix: one bias for each of the 3 neurons, applied to every sample.
- The output of the layer of neurons is a 3x3 matrix: 3 samples with 3 outputs each, one per neuron.


$
\begin{align*}
O &= XW^T + B = \begin{bmatrix} 1.0 & 2.0 & 3.0 & 2.5 \\ 2.0 & 5.0 & -1.0 & 2.0 \\ -1.5 & 2.7 & 3.3 & -0.8 \end{bmatrix} \begin{bmatrix} 0.2 & 0.5 & -0.26 \\ 0.8 & -0.91 & -0.27 \\ -0.5 & 0.26 & 0.17 \\ 1.0 & -0.5 & 0.87 \end{bmatrix} + \begin{bmatrix} 0.0 & 0.0 & 0.0 \end{bmatrix} \\
O &= \begin{bmatrix} 2.8 & -1.79 & 1.885 \\ 6.9 & -4.81 & -0.3 \\ -0.59 & -1.949 & -0.474 \end{bmatrix} + \begin{bmatrix} 0.0 & 0.0 & 0.0 \end{bmatrix} \\
O &= \begin{bmatrix} 2.8 & -1.79 & 1.885 \\ 6.9 & -4.81 & -0.3 \\ -0.59 & -1.949 & -0.474 \end{bmatrix}
\end{align*}
$
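For a batch of samples, the shapes decide whether the product and the bias addition are valid. The following sketch, with illustrative variable names, prints the shapes involved and makes the broadcasting of the 1x3 bias across the 3 samples explicit with `np.broadcast_to`.

```python
import numpy as np

X_batch = np.array([[1.0, 2.0, 3.0, 2.5],
                    [2.0, 5.0, -1.0, 2.0],
                    [-1.5, 2.7, 3.3, -0.8]])   # 3 samples x 4 inputs
W = np.array([[0.2, 0.8, -0.5, 1.0],
              [0.5, -0.91, 0.26, -0.5],
              [-0.26, -0.27, 0.17, 0.87]])     # 3 neurons x 4 weights
B = np.array([[0.0, 0.0, 0.0]])                # 1 x 3, one bias per neuron

weighted_sums = X_batch @ W.T                  # (3, 4) @ (4, 3) -> (3, 3)
B_expanded = np.broadcast_to(B, weighted_sums.shape)  # bias row repeated per sample
O = weighted_sums + B                          # NumPy broadcasts B automatically

print("X:", X_batch.shape, "W^T:", W.T.shape, "B:", B.shape, "O:", O.shape)
print(O)
print(np.allclose(O, weighted_sums + B_expanded))
```

Each row of `O` belongs to one sample and each column to one neuron, so the first row reproduces the single-sample result from the previous example.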

In [94]:
# Layer of Neurons Example 
import numpy as np
from src.functions.activation import Sigmoid

# initialize the activation function
sigmoid = Sigmoid()

# Create the encoding sample input data
# The input data is a 3x4 matrix. Meaning we have 3 samples with 4 inputs each.
encoded_inputs = np.array([
    # Input sample 1: 4 inputs, passed to each of the 3 neurons
    [1.0, 2.0, 3.0, 2.5],
    # Input sample 2: 4 inputs, passed to each of the 3 neurons
    [2.0, 5.0, -1.0, 2.0],
    # Input sample 3: 4 inputs, passed to each of the 3 neurons
    [-1.5, 2.7, 3.3, -0.8]
])
log.debug(f"Encoded Inputs: \n{encoded_inputs}")


# Our weights define the number of neurons in the layer. This 
# layer has 3 neurons with 4 inputs each to match your input data.
# initialize  weights and biases
weights = np.array([
    # Neuron 1: 4 inputs
    [0.2, 0.8, -0.5, 1.0],
    # Neuron 2: 4 inputs
    [0.5, -0.91, 0.26, -0.5],
    # Neuron 3: 4 inputs
    [-0.26, -0.27, 0.17, 0.87]]).T
log.debug(f"Weights: \n{weights}")

# initialize the bias (all zeros in this example)
# This layer has 3 neurons, so we need 3 biases.
# The bias is a 1x3 matrix: one bias per neuron, applied to every sample.
bias  = np.array([[0.0, 0.0, 0.0]])
log.debug(f"Bias: \n{bias}")

# get the weighted sums of the inputs and add the bias
# Here we model the neurons' output for a batch of samples:
# each row of the input matrix is passed to every neuron in the layer.
# O = XW^T + B, where X is 3x4, W^T is 4x3 (`weights` is already transposed above) and B is 1x3
outputs = np.dot(encoded_inputs, weights) + bias
# Outputs: [[ 2.8   -1.79   1.885], [ 6.9   -4.81  -0.3  ], [-0.59  -1.949 -0.474]]
log.debug(f"Outputs: \n{outputs}") 

# apply the activation function and get the predictions
# The prediction is a sort of transformation of the output of the 
# neurons in the layer.
predictions = sigmoid(outputs)
log.debug(f"Predictions: \n{predictions}")

# Error, assuming the true label for every prediction is 1
error_rate = 1 - predictions 
log.debug(f"Error Ratio: \n{error_rate}")

[2024-09-01 05:05:50][DEBUG][root][18]:
Encoded Inputs: 
[[ 1.   2.   3.   2.5]
 [ 2.   5.  -1.   2. ]
 [-1.5  2.7  3.3 -0.8]]
[2024-09-01 05:05:50][DEBUG][root][31]:
Weights: 
[[ 0.2   0.5  -0.26]
 [ 0.8  -0.91 -0.27]
 [-0.5   0.26  0.17]
 [ 1.   -0.5   0.87]]
[2024-09-01 05:05:50][DEBUG][root][38]:
Bias: 
[[0. 0. 0.]]
[2024-09-01 05:05:50][DEBUG][root][46]:
Outputs: 
[[ 2.8   -1.79   1.885]
 [ 6.9   -4.81  -0.3  ]
 [-0.59  -1.949 -0.474]]
[2024-09-01 05:05:50][DEBUG][root][52]:
Predictions: 
[[0.94267582 0.14307272 0.86818438]
 [0.99899323 0.00808201 0.42555748]
 [0.35663485 0.12466244 0.38366994]]
[2024-09-01 05:05:50][DEBUG][root][56]:
Error Ratio: 
[[0.05732418 0.85692728 0.13181562]
 [0.00100677 0.99191799 0.57444252]
 [0.64336515 0.87533756 0.61633006]]


In [95]:
# Layer of Neurons Example 
import numpy as np
from src.functions.activation import Sigmoid
from src.encoder.label import encode as encode_labels

# initialize the activation function
sigmoid = Sigmoid()

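# convert the words to numeric values (judging by the logged output, each value
# is the mean Unicode code point of the lowercased word, not a sum)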
input_words = np.array(['Cat', 'Dog', 'Rabbit', 'Horse'])
encoded_inputs = encode_labels(input_words)
log.debug(f"Encoded Inputs: \n{encoded_inputs}")

# initialize seed for reproducibility
np.random.seed(10)


# Our weights define the number of neurons in the layer. This layer has 3 neurons
# with 4 inputs each to match the input data. In `np.random.rand(3, 4)`, the first
# argument is the number of neurons in the layer, and the second argument is the
# number of inputs to each neuron.
# initialize random weights
weights = np.random.rand(3, 4)
log.debug(f"Weights: \n{weights}")

# initialize random bias
# This layer has 3 neurons, so we need 3 biases.
# The bias is a 1x3 matrix: one bias per neuron.
bias  = np.random.rand(1, 3)
log.debug(f"Bias: \n{bias}")

# convert the words to sums of Unicode values
labels = np.array(['Cat', 'Dog', 'Rabbit', 'Horse'])
encoded_labels = encode_labels(labels)
log.debug(f"Labels: \n{labels}")
log.debug(f"Encoded Labels: \n{encoded_labels}")

# get the weighted sums of the inputs and add the bias
outputs = np.dot(encoded_inputs, weights.T) + bias
log.debug(f"Outputs: \n{outputs}")

# apply the activation function and get the predictions
# The prediction is a sort of transformation of the output of the neurons in the layer.
predictions = sigmoid(outputs)
log.debug(f"Predictions: \n{predictions}")

# Error, assuming the true label for every prediction is 1
error_rate = (1 - predictions).tolist()
log.debug(f"Error Ratio: \n{error_rate}")

[2024-09-01 05:05:50][DEBUG][root][12]:
Encoded Inputs: 
[[104.       104.666664 104.666664 109.      ]]
[2024-09-01 05:05:50][DEBUG][root][21]:
Weights: 
[[0.77132064 0.02075195 0.63364823 0.74880388]
 [0.49850701 0.22479665 0.19806286 0.76053071]
 [0.16911084 0.08833981 0.68535982 0.95339335]]
[2024-09-01 05:05:50][DEBUG][root][27]:
Bias: 
[[0.00394827 0.51219226 0.81262096]]
[2024-09-01 05:05:50][DEBUG][root][32]:
Labels: 
['Cat' 'Dog' 'Rabbit' 'Horse']
[2024-09-01 05:05:50][DEBUG][root][33]:
Encoded Labels: 
[[104.       104.666664 104.666664 109.      ]]
[2024-09-01 05:05:50][DEBUG][root][37]:
Outputs: 
[[230.33480265 179.51406351 203.30058227]]
[2024-09-01 05:05:50][DEBUG][root][42]:
Predictions: 
[[1. 1. 1.]]
[2024-09-01 05:05:50][DEBUG][root][46]:
Error Ratio: 
[[0.0, 0.0, 0.0]]
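All three predictions in the output above are exactly 1 because the weighted sums are in the hundreds, where the sigmoid is saturated, so an error of 0 here says little about the quality of the layer. A common remedy is to rescale the inputs before they reach the layer. The sketch below reuses the logged inputs, weights and bias as literals; the standardization step (subtract the mean, divide by the standard deviation) is an assumption added for illustration and is not something the original notebook does.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))

# Values copied from the logged output above
raw_inputs = np.array([[104.0, 104.666664, 104.666664, 109.0]])
W = np.array([[0.77132064, 0.02075195, 0.63364823, 0.74880388],
              [0.49850701, 0.22479665, 0.19806286, 0.76053071],
              [0.16911084, 0.08833981, 0.68535982, 0.95339335]])
b = np.array([[0.00394827, 0.51219226, 0.81262096]])

# Unscaled inputs: weighted sums are very large, so the sigmoid saturates
raw_predictions = sigmoid(raw_inputs @ W.T + b)

# Standardized inputs (zero mean, unit variance): weighted sums stay moderate
scaled_inputs = (raw_inputs - raw_inputs.mean()) / raw_inputs.std()
scaled_predictions = sigmoid(scaled_inputs @ W.T + b)

print("Raw predictions:   ", raw_predictions)
print("Scaled predictions:", scaled_predictions)
```

With scaled inputs the predictions fall strictly between 0 and 1, which leaves room for the weights and biases to make a visible difference.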


In [96]:
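# Graph the error on a line plot
from plotly import graph_objects as go

# error_rate holds one value per neuron (3 values), so label the x-axis by neuron
# rather than by the 4 input words.
neuron_names = [f"Neuron {i + 1}" for i in range(len(error_rate[0]))]

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=neuron_names, y=error_rate[0], mode='lines+markers',
        xaxis='x', yaxis='y', name='Error Ratio'
    ))
fig.update_layout(title='Error Ratio', xaxis_title='Neuron', yaxis_title='Error')
fig.show()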

In [97]:

inputs=[1, 2, 3, 2.5]

weights1 = [0.2, 0.8, -0.5, 1] 
weights2 = [0.5, -0.91, 0.26, -0.5] 
weights3 = [-0.26, -0.27, 0.17, 0.87]

bias1 = 2 
bias2 = 3 
bias3 = 0.5

# calculate the weighted sum of inputs and add the bias for three neurons.
outputs = np.array([
    # Neuron 1:
    inputs[0]*weights1[0] + inputs[1]*weights1[1] + inputs[2]*weights1[2] + inputs[3]*weights1[3] + bias1,
    # Neuron 2: 
    inputs[0]*weights2[0] + inputs[1]*weights2[1] + inputs[2]*weights2[2] + inputs[3]*weights2[3] + bias2,
    # Neuron 3: 
    inputs[0]*weights3[0] + inputs[1]*weights3[1] + inputs[2]*weights3[2] + inputs[3]*weights3[3] + bias3
    ])
log.debug(f"Outputs: \n{outputs}")  # [4.8, 1.21, 2.385]

# apply the activation function and get the predictions
predictions = sigmoid(outputs)
log.debug(f"Predictions: \n{predictions}")  # [0.99183743 0.77029895 0.9156763 ]

# Error ratio is the difference between the actual value (assumed to be 1 here) and the prediction
error_ratio = 1 - predictions
log.debug(f"Error Ratio: \n{error_ratio}")  # [0.00816257 0.22970105 0.0843237 ]

[2024-09-01 05:05:50][DEBUG][root][20]:
Outputs: 
[4.8   1.21  2.385]
[2024-09-01 05:05:50][DEBUG][root][24]:
Predictions: 
[0.99183743 0.77029895 0.9156763 ]
[2024-09-01 05:05:50][DEBUG][root][28]:
Error Ratio: 
[0.00816257 0.22970105 0.0843237 ]


In [98]:
# Graph the error on a line plot
from plotly import graph_objects as go

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=predictions, y=error_ratio, mode='lines+markers', 
    xaxis='x', yaxis='y', name='Outputs'
    ))
fig.update_layout(title='Error Ratio vs Predictions', xaxis_title='Predictions', yaxis_title='Error Ratio')
fig.show()

In [99]:
from src.encoder.label import encode as encode_labels

# Test the label encoder
labels = ["cat", "dog", "fish", "elephant", "lion", "tiger", "bear"]
encoded_labels = encode_labels(labels)
log.debug(f"Encoded labels: \n{encoded_labels}\n")
log.debug(f"Data type: \n{encoded_labels.dtype}\n")
log.debug(f"Shape: \n{encoded_labels.shape}\n")
log.debug(f"Size: \n{encoded_labels.size}\n")
log.debug(f"Number of dimensions: \n{encoded_labels.ndim}\n")
log.debug(f"Item size: \n{encoded_labels.itemsize}\n")
log.debug(f"Total bytes: \n{encoded_labels.nbytes}\n")
log.debug(f"Strides: \n{encoded_labels.strides}\n")
log.debug(f"Flags: \n{encoded_labels.flags}\n")
log.debug(f"Ctypes: \n{encoded_labels.ctypes}\n")
log.debug(f"Base: \n{encoded_labels.base}\n")
log.debug(f"Data: \n{encoded_labels.data}\n")
log.debug(f"Transpose: \n{encoded_labels.T}\n")
log.debug(f"Real part: \n{encoded_labels.real}\n")
log.debug(f"Imaginary part: \n{encoded_labels.imag}\n")
log.debug(f"Flat: \n{encoded_labels.flat}\n")
log.debug(f"Item: \n{encoded_labels.item}\n")
log.debug(f"List: \n{encoded_labels.tolist()}\n")
log.debug(f"Bytes: {encoded_labels.tobytes()}\n")

[2024-09-01 05:05:50][DEBUG][root][6]:
Encoded labels: 
[[104.       104.666664 106.5      106.125    108.5      107.8
  102.5     ]]
[2024-09-01 05:05:50][DEBUG][root][7]:
Data type: 
float32
[2024-09-01 05:05:50][DEBUG][root][8]:
Shape: 
(1, 7)
[2024-09-01 05:05:50][DEBUG][root][9]:
Size: 
7
[2024-09-01 05:05:50][DEBUG][root][10]:
Number of dimensions: 
2
[2024-09-01 05:05:50][DEBUG][root][11]:
Item size: 
4
[2024-09-01 05:05:50][DEBUG][root][12]:
Total bytes: 
28
[2024-09-01 05:05:50][DEBUG][root][13]:
Strides: 
(28, 4)
[2024-09-01 05:05:50][DEBUG][root][14]:
Flags: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False

[2024-09-01 05:05:50][DEBUG][root][15]:
Ctypes: 
<numpy.core._internal._ctypes object at 0x120aa2c60>
[2024-09-01 05:05:50][DEBUG][root][16]:
Base: 
None
[2024-09-01 05:05:5

In [100]:
import numpy as np

# Create the input data
inputs = [
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]
]

# Create the weights and biases
weights = [
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87]
]
biases = [2.0, 3.0, 0.5]

# Calculate the output of the layer of neurons
layer_outputs = np.dot(inputs, np.array(weights).T) + biases
log.debug(f"Layer Outputs: \n{layer_outputs}")

# Predictions
predictions = sigmoid(layer_outputs)
log.debug(f"Predictions: \n{predictions}")

# Error
error_rate = 1 - predictions
log.debug(f"Error Ratio: \n{error_rate}")



[2024-09-01 05:05:50][DEBUG][root][20]:
Layer Outputs: 
[[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
[2024-09-01 05:05:50][DEBUG][root][24]:
Predictions: 
[[0.99183743 0.77029895 0.9156763 ]
 [0.99986363 0.14063813 0.549834  ]
 [0.80376594 0.74096688 0.50649963]]
[2024-09-01 05:05:50][DEBUG][root][28]:
Error Ratio: 
[[8.16257115e-03 2.29701051e-01 8.43236964e-02]
 [1.36370327e-04 8.59361874e-01 4.50166003e-01]
 [1.96234056e-01 2.59033120e-01 4.93500366e-01]]


In [101]:
from plotly import graph_objects as go

# Create 3D scatter plot
fig = go.Figure()
fig.add_trace(
    go.Scatter3d(
        x=inputs[0], y=inputs[1], z=inputs[2], mode='lines+markers', 
        marker=dict(size=12, color='blue', opacity=0.8), name='Inputs'
    ))
fig.add_trace(
    go.Scatter3d(
        x=weights[0], y=weights[1], z=weights[2], mode='lines+markers', 
        marker=dict(size=12, color='red', opacity=0.8), name='Weights'
    ))
fig.add_trace(
    go.Scatter3d(
        x=biases, y=biases, z=biases, mode='lines+markers', 
        marker=dict(size=12, color='green', opacity=0.8), name='Biases'
    ))
fig.add_trace(
    go.Scatter3d(
        x=layer_outputs[0], y=layer_outputs[1], z=layer_outputs[2], mode='lines+markers', 
        marker=dict(size=12, color='orange', opacity=0.8), name='Layer Outputs'
    ))
fig.add_trace(
    go.Scatter3d(
        x=predictions[0], y=predictions[1], z=predictions[2], mode='lines+markers', 
        marker=dict(size=12, color='purple', opacity=0.8), name='Predictions'
    ))
fig.add_trace(
    go.Scatter3d(
        x=error_rate[0], y=error_rate[1], z=error_rate[2], mode='lines+markers', 
        marker=dict(size=12, color='black', opacity=0.8), name='Error Rate'
    ))
fig.update_layout(title='Layer of Neurons', scene=dict(xaxis_title='X', yaxis_title='Y', zaxis_title='Z'))
fig.show()



In [102]:
# Graph the inputs, weights and biases on a line plot
from plotly import graph_objects as go

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=inputs[0], y=inputs[1], mode='lines+markers', 
    xaxis='x', yaxis='y', name='Inputs'
    ))
fig.add_trace(
    go.Scatter(
        x=weights[0], y=weights[1], mode='lines+markers', 
    xaxis='x', yaxis='y', name='Weights'
    ))
fig.add_trace(
    go.Scatter(
        x=biases, y=biases, mode='lines+markers', 
    xaxis='x', yaxis='y', name='Biases'
    ))
fig.add_trace(
    go.Scatter(
        x=layer_outputs[0], y=layer_outputs[1], mode='lines+markers', 
    xaxis='x', yaxis='y', name='Layer Outputs'
    ))
fig.add_trace(
    go.Scatter(
        x=predictions[0], y=predictions[1], mode='lines+markers', 
    xaxis='x', yaxis='y', name='Predictions'
    ))
fig.add_trace(
    go.Scatter(
        x=error_rate[0], y=error_rate[1], mode='lines+markers', 
    xaxis='x', yaxis='y', name='Error Rate'
    ))
fig.update_layout(title='Layer of Neurons', xaxis_title='X', yaxis_title='Y')
fig.show()



## Hidden Layers

A classic multilayer perceptron has multiple interconnected `perceptrons` (units) that are organized in sequential layers (`input layer`, one or more `hidden layers`, and an `output layer`). Each unit of a layer is connected to all units of the `next layer`. First, the information is presented to the `input layer`; then we use it to compute the output (or `activation`), $y_i$, for each unit of the first hidden layer. We propagate forward, with each layer's output serving as the input to the next layer (hence feedforward), until we reach the output layer. The most common way to train Neural Networks is by using `gradient descent` in combination with `backpropagation`.

Think of the `hidden layers` as an abstract representation of the input data. This is the way the Neural Network understands the features of the data with its internal logic. However, Neural Networks are `non-interpretable models`. This means that if we observe the $y_i$ activations of the `hidden layer`, we wouldn’t be able to understand them. For us, they are just a `vector of numerical values`. We need the output layer to bridge the gap between the network’s representation and the actual data we’re interested in. You can think of this as a `translator`; we use it to understand the network’s logic, and at the same time, we can convert it into the actual target values that we are interested in.
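The feedforward idea described above can be sketched as a loop over layers. Everything in this example is an assumption for illustration: the function name `feedforward`, the ReLU activation, the layer sizes (4 inputs, two hidden layers of 3 units, 2 output units) and the random weights, which stand in for weights that training would normally produce. The hidden activations come out as plain vectors of numbers, and the output layer's scores are "translated" into a class with `argmax`.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU activation."""
    return np.maximum(0, z)

def feedforward(x, hidden_layers, output_layer):
    """Propagate x through the hidden layers, then through a linear output layer."""
    hidden_activations = []
    a = x
    for W, b in hidden_layers:
        a = relu(a @ W.T + b)          # output of this layer is input to the next
        hidden_activations.append(a)
    W_out, b_out = output_layer
    scores = a @ W_out.T + b_out       # output layer turns the representation into scores
    return hidden_activations, scores

rng = np.random.default_rng(0)         # hypothetical, untrained weights
hidden_layers = [
    (rng.normal(size=(3, 4)), np.zeros(3)),   # hidden layer 1: 3 units, 4 inputs each
    (rng.normal(size=(3, 3)), np.zeros(3)),   # hidden layer 2: 3 units, 3 inputs each
]
output_layer = (rng.normal(size=(2, 3)), np.zeros(2))  # 2 output units

x = np.array([[1.0, 2.0, 3.0, 2.5]])
hidden_activations, scores = feedforward(x, hidden_layers, output_layer)
predicted_class = int(np.argmax(scores, axis=1)[0])

for i, a in enumerate(hidden_activations, start=1):
    print(f"Hidden layer {i} activations: {a}")
print("Output scores:", scores, "-> predicted class:", predicted_class)
```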

---

### Hidden Layer Forward Pass

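The input has 4 features and feeds 2 layers with 3 neurons each. Each neuron in the first layer has 4 weights and 1 bias; each neuron in the second layer has 3 weights (one per output of the first layer) and 1 bias. With a batch of 3 samples, the output of the first layer is a $3 \times 3$ matrix, which is passed to the second layer to produce another $3 \times 3$ matrix.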

**Note: inputs to layers are either inputs from the actual dataset you’re training with or outputs from a previous layer.**


In [103]:
from src.utils.mermaid import mm

mm('''
%%{
  init: {
    'theme': 'forest',
    'themeVariables': {
      'primaryColor': '#BB2528',
      'primaryTextColor': '#fff',
      'primaryBorderColor': '#7C0000',
      'lineColor': '#F8B229',
      'secondaryColor': '#006100',
      'secondaryBorderColor': '#003700',
      'secondaryTextColor': '#fff000',
      'tertiaryColor': '#fff999',
      'tertiaryBorderColor': '#000999',
      'orientation': 'landscape'
    }
  }
}%%

graph TB
    subgraph Input Layer
        direction LR
        A1["A1"]
        A2["A2"]
        A3["A3"]
    end 
    
    subgraph Hidden Layer 1
        direction LR
        B1["B1"]
        B2["B2"]
        B3["B3"]
        subgraph Biases B
            direction LR
            B1_bias[/"1"/]
            B2_bias[/"2"/]
            B3_bias[/"3"/]
        end
    end
    
    subgraph Hidden Layer 2
        direction LR
        C1["C1"]
        C2["C2"]
        C3["C3"]
        subgraph Biases C
            direction LR
            C1_bias[/"1"/]
            C2_bias[/"2"/]
            C3_bias[/"3"/]
        end
    end
    
    subgraph Output Layer D
        direction LR
        D["D"]
        subgraph Biases D
            direction LR
            D_bias[/"1"/]
        end
    end


    A1 --> |w1| B1[Neuron 1]
    A1 --> |w2| B2[Neuron 2]
    A1 --> |w3| B3[Neuron 3]

    A2 --> |w4| B1[Neuron 1]
    A2 --> |w5| B2[Neuron 2]
    A2 --> |w6| B3[Neuron 3]

    A3 --> |w7| B1[Neuron 1]
    A3 --> |w8| B2[Neuron 2]
    A3 --> |w9| B3[Neuron 3]

    B1 --> |w1| C1[Neuron 1]
    B1 --> |w2| C2[Neuron 2]
    B1 --> |w3| C3[Neuron 3]

    B2 --> |w4| C1[Neuron 1]
    B2 --> |w5| C2[Neuron 2]
    B2 --> |w6| C3[Neuron 3]

    B3 --> |w7| C1[Neuron 1]
    B3 --> |w8| C2[Neuron 2]
    B3 --> |w9| C3[Neuron 3]

    C1 --> |w1| D
    C2 --> |w2| D
    C3 --> |w3| D

    B1_bias --> B1
    B2_bias --> B2
    B3_bias --> B3

    C1_bias --> C1
    C2_bias --> C2
    C3_bias --> C3

    D_bias --> D
   ''')

In [104]:

import numpy as np

# Create the input data
inputs=np.array([
    [1,2,3,2.5], 
    [2.,5.,-1.,2], 
    [-1.5,2.7,3.3,-0.8]
])
# Create the weights and biases
weights=np.array([
    [0.2,0.8,-0.5,1],
    [0.5,-0.91,0.26,-0.5],
    [-0.26,-0.27,0.17,0.87]
])
biases=np.array([2,3,0.5])
weights2=np.array([
    [0.1,-0.14,0.5], 
    [-0.5,0.12,-0.33],
    [-0.44,0.73,-0.13]
])
biases2=np.array([-1,2,-0.5])
# Calculate the output of the first layer of neurons
layer1_outputs = np.dot(inputs, weights.T) + biases
log.info(layer1_outputs)
# Calculate the output of the second layer, using the first layer's output as its input
layer2_outputs = np.dot(layer1_outputs, weights2.T) + biases2
log.info(layer2_outputs)

[2024-09-01 05:05:51][INFO][root][24]:
[[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
[2024-09-01 05:05:51][INFO][root][27]:
[[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]
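The two layers in the cell above are applied without an activation function between them. In that case the stack is mathematically equivalent to a single linear layer, which is why hidden layers are normally followed by a nonlinearity. The sketch below checks this with the same numbers; the combined weights `W_combined` and `b_combined`, and the choice of ReLU for the comparison, are assumptions introduced for illustration.

```python
import numpy as np

X_batch = np.array([[1, 2, 3, 2.5],
                    [2., 5., -1., 2],
                    [-1.5, 2.7, 3.3, -0.8]])
W1 = np.array([[0.2, 0.8, -0.5, 1],
               [0.5, -0.91, 0.26, -0.5],
               [-0.26, -0.27, 0.17, 0.87]])
b1 = np.array([2, 3, 0.5])
W2 = np.array([[0.1, -0.14, 0.5],
               [-0.5, 0.12, -0.33],
               [-0.44, 0.73, -0.13]])
b2 = np.array([-1, 2, -0.5])

# Two stacked linear layers, as in the cell above
stacked = (X_batch @ W1.T + b1) @ W2.T + b2

# The same mapping expressed as one linear layer
W_combined = W2 @ W1            # shape (3, 4)
b_combined = W2 @ b1 + b2       # shape (3,)
single = X_batch @ W_combined.T + b_combined

# With a ReLU between the layers the result is no longer a single linear map
with_relu = np.maximum(0, X_batch @ W1.T + b1) @ W2.T + b2

print("Stacked equals single layer:", np.allclose(stacked, single))
print("ReLU changes the output:    ", not np.allclose(stacked, with_relu))
```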


In [105]:
from src.utils.datasets import create_random_nonlinear_3D_dataset

# Create a nonlinear dataset
X, y, z = create_random_nonlinear_3D_dataset(10, 10, 10, 3)

# Log the shapes of X, y and z
log.info(X.shape)  # (10, 3)
log.info(y.shape)  # (10, 3)
log.info(z.shape)  # (10, 3)

# Flatten each 2-D array into a 1-D vector for plotting
X_flat = X.flatten()
y_flat = y.flatten()
z_flat = z.flatten()
# Distance of each point from the origin, used to color the markers
n_flat = np.sqrt(X_flat**2 + y_flat**2 + z_flat**2)

# Log the shapes
log.info(X_flat.shape)  # (30,)
log.info(y_flat.shape)  # (30,)
log.info(z_flat.shape)  # (30,)
log.info(n_flat.shape)  # (30,)

[32m[2024-09-01 05:05:51][INFO][root][7]: 
(10, 3)[0m
[32m[2024-09-01 05:05:51][INFO][root][8]: 
(10, 3)[0m
[32m[2024-09-01 05:05:51][INFO][root][9]: 
(10, 3)[0m
[32m[2024-09-01 05:05:51][INFO][root][18]: 
(30,)[0m
[32m[2024-09-01 05:05:51][INFO][root][19]: 
(30,)[0m
[32m[2024-09-01 05:05:51][INFO][root][20]: 
(30,)[0m
[32m[2024-09-01 05:05:51][INFO][root][21]: 
(30,)[0m


In [106]:
# Graph the dataset
from plotly import graph_objects as go

# Create the 3D scatter plot
fig = go.Figure()
# Add trace to represent all the data points in the dataset
fig.add_trace(
    go.Scatter3d(
        x=X_flat, y=y_flat, z=z_flat, mode='markers', 
        marker=dict(size=5, color=n_flat, opacity=0.5, colorscale='Viridis'), name='Dataset'
    )
)
# Update the layout of the plot
fig.update_layout(
    title='Nonlinear Dataset', 
    scene=dict(xaxis_title='X', yaxis_title='Y', zaxis_title='Z'),
)
# Show the plot
fig.show()


In [107]:
# Now let's graph the 2D X, y coordinates
reshaped_X = X.reshape(-1, 3)
log.info(reshaped_X[:,0].shape)  # (10,)
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=X_flat, y=y_flat, mode='markers',
        marker=dict(size=8, color=n_flat, opacity=0.8, colorscale='curl'), name='Dataset'
    )
)
fig.update_layout(title='Nonlinear Dataset', xaxis_title='X', yaxis_title='Y')
fig.show()

[2024-09-01 05:05:51][INFO][root][3]:
(10,)


In [108]:
"""
{'Colorscale properties': 
['aggrnyl', 'agsunset', 'algae', 'amp', 'armyrose', 'balance',
 'blackbody', 'bluered', 'blues', 'blugrn', 'bluyl', 'brbg',
 'brwnyl', 'bugn', 'bupu', 'burg', 'burgyl', 'cividis', 'curl',
 'darkmint', 'deep', 'delta', 'dense', 'earth', 'edge', 'electric',
 'emrld', 'fall', 'geyser', 'gnbu', 'gray', 'greens', 'greys',
 'haline', 'hot', 'hsv', 'ice', 'icefire', 'inferno', 'jet',
 'magenta', 'magma', 'matter', 'mint', 'mrybm', 'mygbm', 'oranges',
 'orrd', 'oryel', 'oxy', 'peach', 'phase', 'picnic', 'pinkyl',
 'piyg', 'plasma', 'plotly3', 'portland', 'prgn', 'pubu', 'pubugn',
 'puor', 'purd', 'purp', 'purples', 'purpor', 'rainbow', 'rdbu',
 'rdgy', 'rdpu', 'rdylbu', 'rdylgn', 'redor', 'reds', 'solar',
 'spectral', 'speed', 'sunset', 'sunsetdark', 'teal', 'tealgrn',
 'tealrose', 'tempo', 'temps', 'thermal', 'tropic', 'turbid',
 'turbo', 'twilight', 'viridis', 'ylgn', 'ylgnbu', 'ylorbr',
 'ylorrd']}
 """

In [109]:
# Generate a spiral dataset
from src.utils.datasets import create_spiral_dataset

# Create a spiral dataset
X, y = create_spiral_dataset(100, 3)

log.info(X.shape)  # (300, 2)
log.info(y.shape)  # (300,)


[2024-09-01 05:05:51][INFO][root][7]:
(300, 2)
[2024-09-01 05:05:51][INFO][root][8]:
(300,)


In [110]:
# Graph the inputs
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=X[:,0], y=X[:,1], mode='markers', 
        marker=dict(size=8, color=y, opacity=0.8, colorscale='Viridis'), name='Dataset'
    )
)

fig.update_layout(
    title='Spiral Dataset',
    xaxis_title='X',
    yaxis_title='Y',
    margin=dict(l=0, r=0, t=0, b=0),  # Set margins to zero
    
)
fig.show()

# Dense Layers

A dense (fully connected) layer is a layer in which every neuron is connected to every output of the previous layer. The layer produces one output value per neuron; in the final layer of a network, these values are the predictions. The output of a dense layer is calculated as follows:

---

$
\begin{align*}
\text{Predictions} &= \text{Activation Function}(\text{weights} \cdot \text{inputs} + \text{bias}) \\
&= \text{Activation Function}(\text{weights}_1 \cdot \text{inputs}_1 + \text{weights}_2 \cdot \text{inputs}_2 + \dots + \text{weights}_n \cdot \text{inputs}_n + \text{bias})
\end{align*}
$

---
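The formula above can be sketched in a few lines of NumPy. This is a minimal illustration and not the notebook's `Dense` class: the names `dense_forward` and `relu`, the random inputs, and the small random weight initialization are assumptions made for the example. The weight matrix is stored as `(n_inputs, n_neurons)` and the bias as `(1, n_neurons)`, so the bias broadcasts across every row of the batch.

```python
import numpy as np


def dense_forward(inputs, weights, biases):
    """Weighted sum plus bias for a whole batch of samples."""
    # (n_samples, n_inputs) @ (n_inputs, n_neurons) -> (n_samples, n_neurons)
    return inputs @ weights + biases


def relu(x):
    """ReLU activation: negative values become 0."""
    return np.maximum(0, x)


# Hypothetical data: 300 samples with 2 features each
rng = np.random.default_rng(0)
inputs = rng.normal(size=(300, 2))

# Hypothetical parameters: 2 inputs feeding 3 neurons
weights = 0.01 * rng.normal(size=(2, 3))
biases = np.zeros((1, 3))

output = relu(dense_forward(inputs, weights, biases))
print(output.shape)
```

Each column of `output` holds one neuron's activation for every sample in the batch.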

In [111]:
# Simple dense layer with 2 inputs and 3 neurons; only one dense layer of 3 neurons
from src.layer.dense import Dense
from src.functions.loss import cross_entropy_loss
from src.functions.activation import ReLU

# Initialize activation function
relu = ReLU()

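# Create a dense layer with 2 input features and 3 neurons
dense = Dense(2, 3)

# Let's do the forward pass
dense.forward(X)

# Apply the ReLU activation to the layer output
predictions = relu(dense.output)

# Calculate the loss against the first neuron's output only.
# Note: ReLU outputs are not probabilities and are often exactly 0,
# so a log-based loss can evaluate to NaN, as the output below shows.
avg_loss, loss = dense.loss(np.array(y), predictions[:,0])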

# Log the outputs, weights and biases
log.info(f"Inputs: {X.shape}")
log.info(f"Weights: {dense.weights.shape}")
log.info(f"Biases: {dense.biases.shape}")
log.info(f"Output: {dense.output.shape}")
log.info(f"Predictions: {predictions.shape}")
log.info(f"Predictions data: {predictions[:5]}")
log.info(f"Loss: {avg_loss}, \n{loss[:5]}")


[2024-09-01 05:05:52][INFO][root][22]:
Inputs: (300, 2)
[2024-09-01 05:05:52][INFO][root][23]:
Weights: (2, 3)
[2024-09-01 05:05:52][INFO][root][24]:
Biases: (1, 3)
[2024-09-01 05:05:52][INFO][root][25]:
Output: (300, 3)
[2024-09-01 05:05:52][INFO][root][26]:
Predictions: (300, 3)
[2024-09-01 05:05:52][INFO][root][27]:
Predictions data: [[0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [7.72018239e-05 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00]]
[2024-09-01 05:05:52][INFO][root][28]:
Loss: nan, 
[           nan 7.72048041e-05            nan            nan
            nan]
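
The `nan` entries above are consistent with a binary cross-entropy of the form `-(y*log(p) + (1-y)*log(1-p))` evaluated at `p = 0`, where `0 * log(0)` is `nan` in floating point. The single finite value, 7.72048041e-05, matches `-log(1 - 7.72018239e-05)`. The source of `dense.loss` is not shown here, so this is an inference from the logs. Under that assumption, a common remedy is to clip predictions away from 0 and 1 before taking logarithms. The function name and the `eps` value below are illustrative choices.

```python
import numpy as np


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy with clipping to avoid log(0)."""
    # Keep predictions inside [eps, 1 - eps] so both logs stay finite
    y_pred = np.clip(y_pred, eps, 1 - eps)
    loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return loss.mean(), loss


# Hypothetical labels and predictions, including an exact 0 as in the log above
y_true = np.array([0.0, 0.0, 1.0])
y_pred = np.array([0.0, 7.72018239e-05, 1.0])

avg_loss, loss = binary_cross_entropy(y_true, y_pred)
print(avg_loss, loss)
```

Clipping only removes the numerical failure. A ReLU output is still not a probability, so a sigmoid or softmax output is the usual input to this kind of loss.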


In [112]:
# Now we stack two dense layers: the first has 3 neurons with 2 inputs each, and the second has 3 neurons with 3 inputs each.
from src.layer.dense import Dense
from src.functions.loss import cross_entropy_loss
from src.utils.datasets import create_spiral_dataset
from src.functions.activation import Softmax, ReLU

# Initialize activation function
softmax = Softmax()
relu = ReLU()

# Create a spiral dataset
X, y = create_spiral_dataset(100, 3)

# Create a dense layer with 3 neurons with 2 inputs each
dense1 = Dense(2, 3)

# Let's do the forward pass
dense1.forward(X)

# Run the activation function ReLU
dense1_output = relu(dense1.output)

# Create a dense layer with 3 neurons with 3 inputs each
dense2 = Dense(3, 3)

# Let's do the forward pass
dense2.forward(dense1_output)

# TODO: These final outputs are also our “confidence scores.” The higher the confidence score, the more confident the model is that the input belongs to that class.
# Get the predictions
predictions = softmax(dense2.output)

# Calculate the loss (only the first column of the predictions is passed in here)
avg_loss, loss = dense2.loss(np.array([y]), predictions[:,0])

# Log the outputs, weights and biases
log.info(f"Inputs: {X.shape}")
log.info(f"Y is a spiral dataset: {y.shape}")
log.info(f"Weights Layer 1: {dense1.weights.shape}")
log.info(f"Biases Layer 1: {dense1.biases.shape}")
log.info(f"Output Layer 1: {dense1.output.shape}")
log.info(f"Weights Layer 2: {dense2.weights.shape}")
log.info(f"Biases Layer 2: {dense2.biases.shape}")
log.info(f"Output Layer 2: {dense2.output.shape}")
log.info(f"Predictions: {predictions.shape}")
log.info(f"Predictions data: \n{predictions[:5]}")
log.info(f"Loss: {avg_loss}")

[2024-09-01 05:05:52][INFO][root][37]:
Inputs: (300, 2)
[2024-09-01 05:05:52][INFO][root][38]:
Y is a spiral dataset: (300,)
[2024-09-01 05:05:52][INFO][root][39]:
Weights Layer 1: (2, 3)
[2024-09-01 05:05:52][INFO][root][40]:
Biases Layer 1: (1, 3)
[2024-09-01 05:05:52][INFO][root][41]:
Output Layer 1: (300, 3)
[2024-09-01 05:05:52][INFO][root][42]:
Weights Layer 2: (3, 3)
[2024-09-01 05:05:52][INFO][root][43]:
Biases Layer 2: (1, 3)
[2024-09-01 05:05:52][INFO][root][44]:
Output Layer 2: (300, 3)
[2024-09-01 05:05:52][INFO][root][45]:
Predictions: (300, 3)
[2024-09-01 05:05:52][INFO][root][46]:
Predictions data: 
[[0.00333326 0.00333328 0.00333336]
 [0.00333326 0.00333328 0.00333336]
 [0.00333326 0.00333328 0.00333336]
 [0.00333326 0.00333328 0.00333336]
 [0.00333326 0.00333328 0.00333336]]
[2024-09-01 05:05:52][INFO][root][47]:
Loss: 5.7037819461538595
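
Every prediction above is about 0.00333, which is 1/300, and the logged loss is close to ln(300). This suggests that the normalization ran over the 300 samples rather than over the 3 classes, although that is an inference from the logs because the `Softmax` source is not shown here. For comparison, here is a minimal sketch of a row-wise softmax combined with categorical cross-entropy for integer class labels. The helper names, the all-zero logits, and the `eps` value are assumptions made for the example.

```python
import numpy as np


def softmax_rows(logits):
    """Softmax over the class axis, so each row sums to 1."""
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_cross_entropy(probs, labels, eps=1e-7):
    """Mean negative log-probability of the correct class."""
    clipped = np.clip(probs, eps, 1 - eps)
    # Pick the predicted probability of the true class for each sample
    picked = clipped[np.arange(len(labels)), labels]
    return -np.log(picked).mean()


# Hypothetical untrained network: identical logits for all 3 classes
logits = np.zeros((300, 3))
labels = np.repeat(np.arange(3), 100)  # 100 samples for each of 3 classes

probs = softmax_rows(logits)
loss = categorical_cross_entropy(probs, labels)
print(probs[0], loss)
```

With identical logits every row is `[1/3, 1/3, 1/3]` and the loss is ln(3), roughly 1.0986, which is the expected starting point for an untrained three-class classifier.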


In [113]:
# Graph the predictions
import plotly.graph_objects as go

fig = go.Figure()

# Plot the first input feature (x), the class label (y), and the second prediction column (z)
fig.add_trace(
    go.Scatter3d(
        x=X[:,0], y=y, z=predictions[:,1], mode='markers',
        marker=dict(size=8, color=y, opacity=0.8, colorscale='Viridis'), name='Predictions'
    )
)
fig.update_layout(
    title='Spiral Dataset Predictions',
    scene=dict(xaxis_title='X', yaxis_title='Y', zaxis_title='Predictions'),
    margin=dict(l=0, r=0, t=0, b=0)
)
fig.show()
