# NNFS – Chapter 3: Adding Layers

In this notebook we follow along with **Neural Networks from Scratch** (NNFS) Chapter 3: *Adding Layers*.

The goal is to:
- Start with a batch of inputs (each with 4 features).
- Pass them through a first dense layer with 3 neurons.
- Use the outputs of that layer as inputs to a **second** dense layer with 3 neurons.

This is our first taste of a **multi-layer (deep) network**: inputs → hidden layer 1 → hidden layer 2.

In [1]:
import numpy as np

np.set_printoptions(precision=5, suppress=True)  # nicer printing

## 1. Inputs

We start with a **batch** of 3 samples. Each sample has 4 features.

- Shape: `(3, 4)` → 3 rows (samples), 4 columns (features).

In [2]:
inputs = [
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8],
]

inputs = np.array(inputs)
print("Inputs:\n", inputs)
print("Shape of inputs:", inputs.shape)

Inputs:
 [[ 1.   2.   3.   2.5]
 [ 2.   5.  -1.   2. ]
 [-1.5  2.7  3.3 -0.8]]
Shape of inputs: (3, 4)


## 2. First hidden layer

Our first dense layer has **3 neurons**. Each neuron has:

- A weight for each input feature (so 4 weights per neuron).
- A single bias value.

So the weight matrix has shape `(3, 4)`:
- 3 rows → 3 neurons
- 4 columns → 4 inputs per neuron

The biases are stored as a 1D array of length 3.
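
To make "one weight per feature plus one bias" concrete, here is a minimal sketch of a single neuron acting on a single sample. The values are copied from this notebook (first sample, first neuron of layer 1); the names `sample`, `neuron_weights`, `neuron_bias`, and `neuron_output` are introduced here only for illustration.

```python
import numpy as np

# Values copied from this notebook: first sample, first neuron of layer 1.
sample = np.array([1.0, 2.0, 3.0, 2.5])           # 4 features
neuron_weights = np.array([0.2, 0.8, -0.5, 1.0])  # one weight per feature
neuron_bias = 2.0                                 # one bias per neuron

# A neuron's output is the weighted sum of its inputs plus its bias.
neuron_output = np.dot(sample, neuron_weights) + neuron_bias
print(round(float(neuron_output), 5))  # 4.8
```

This is the top-left entry of the layer 1 output matrix computed in the forward pass.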

In [3]:
weights = [
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87],
]
biases = [2.0, 3.0, 0.5]

weights = np.array(weights)
biases = np.array(biases)

print("Weights (layer 1):\n", weights)
print("Shape of weights (layer 1):", weights.shape)
print("Biases (layer 1):", biases)

Weights (layer 1):
 [[ 0.2   0.8  -0.5   1.  ]
 [ 0.5  -0.91  0.26 -0.5 ]
 [-0.26 -0.27  0.17  0.87]]
Shape of weights (layer 1): (3, 4)
Biases (layer 1): [2.  3.  0.5]


### Forward pass through layer 1

To get the outputs of the first layer, we use a **matrix product**:

\begin{align}
    \text{layer1\_outputs} = X W^T + b
\end{align}

where:
- `X` is the input matrix with shape `(3, 4)`
- `W` is the weight matrix with shape `(3, 4)`
- `W.T` (the transpose) has shape `(4, 3)`
- `b` is the bias vector with shape `(3,)`; NumPy broadcasts it across the rows of `X W^T`, so the same 3 biases are added to every sample

The result `layer1_outputs` has shape `(3, 3)` → one output per neuron for each of the 3 input samples.
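
As a sanity check on the formula, the sketch below computes the same layer twice: once with the matrix product and once with explicit loops over samples and neurons. The short names `X`, `W`, `b`, `vectorized`, and `looped` are assumptions made for this example; the values are the ones used in this notebook.

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])
W = np.array([[0.2, 0.8, -0.5, 1.0],
              [0.5, -0.91, 0.26, -0.5],
              [-0.26, -0.27, 0.17, 0.87]])
b = np.array([2.0, 3.0, 0.5])

# Vectorized form: (3, 4) @ (4, 3) -> (3, 3); b is broadcast over the rows.
vectorized = np.dot(X, W.T) + b

# Loop form: one dot product per (sample, neuron) pair.
looped = np.zeros((X.shape[0], W.shape[0]))
for i, sample in enumerate(X):
    for j, (neuron_weights, neuron_bias) in enumerate(zip(W, b)):
        looped[i, j] = np.dot(sample, neuron_weights) + neuron_bias

print(np.allclose(vectorized, looped))  # True
```

The loop version is slower but shows where each entry comes from: row `i` is a sample, column `j` is a neuron.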

In [4]:
layer1_outputs = np.dot(inputs, weights.T) + biases
print("Layer 1 outputs:\n", layer1_outputs)
print("Shape of layer1_outputs:", layer1_outputs.shape)

Layer 1 outputs:
 [[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
Shape of layer1_outputs: (3, 3)


## 3. Second hidden layer

Now we **add another layer**. Stacking hidden layers is what makes a network *deep*; the term is usually applied once there are two or more of them.

Key rule:
- The number of inputs to a layer must match the number of outputs from the previous layer.

Our first hidden layer has 3 neurons → it outputs 3 values per sample.
So, each neuron in the second layer must have **3 weights** (one per output from layer 1).

We again choose 3 neurons for the second layer, so:
- Weight matrix: shape `(3, 3)` (3 neurons × 3 inputs per neuron)
- Bias vector: shape `(3,)`
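
The sketch below shows what happens when the key rule is broken. It uses zero-filled placeholder arrays rather than the notebook's values, and `bad_weights` is a hypothetical mistake (4 weights per neuron instead of 3) introduced only to trigger the error.

```python
import numpy as np

layer1_out = np.zeros((3, 3))    # stand-in: 3 samples x 3 layer-1 neurons
good_weights = np.zeros((3, 3))  # 3 neurons, 3 weights each
bad_weights = np.zeros((3, 4))   # hypothetical mistake: 4 weights per neuron

# Inner dimensions agree: (3, 3) @ (3, 3) works.
ok_shape = np.dot(layer1_out, good_weights.T).shape
print(ok_shape)  # (3, 3)

# Inner dimensions disagree: (3, 3) @ (4, 3) raises ValueError.
try:
    np.dot(layer1_out, bad_weights.T)
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print("Shape mismatch:", err)
```

Checking `.shape` before the dot product is usually the quickest way to locate this kind of bug.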

In [5]:
weights2 = [
    [0.1, -0.14, 0.5],
    [-0.5, 0.12, -0.33],
    [-0.44, 0.73, -0.13],
]
biases2 = [-1.0, 2.0, -0.5]

weights2 = np.array(weights2)
biases2 = np.array(biases2)

print("Weights (layer 2):\n", weights2)
print("Shape of weights (layer 2):", weights2.shape)
print("Biases (layer 2):", biases2)

Weights (layer 2):
 [[ 0.1  -0.14  0.5 ]
 [-0.5   0.12 -0.33]
 [-0.44  0.73 -0.13]]
Shape of weights (layer 2): (3, 3)
Biases (layer 2): [-1.   2.  -0.5]


### Forward pass through layer 2

Now the **inputs to layer 2** are the outputs from layer 1 (`layer1_outputs`).

\begin{align}
    \text{layer2\_outputs} = \text{layer1\_outputs} \, W_2^T + b_2
\end{align}

Shapes:
- `layer1_outputs`: `(3, 3)`
- `weights2.T`: `(3, 3)`
- result: `(3, 3)` (3 samples × 3 neurons in layer 2)

In [6]:
layer2_outputs = np.dot(layer1_outputs, weights2.T) + biases2
print("Layer 2 outputs:\n", layer2_outputs)
print("Shape of layer2_outputs:", layer2_outputs.shape)

Layer 2 outputs:
 [[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]
Shape of layer2_outputs: (3, 3)


If you're following along with the book, your values should match (up to minor rounding):

```text
array([[ 0.5031 , -1.04185, -2.03875],
       [ 0.2434 , -2.7332 , -5.7633 ],
       [-0.99314,  1.41254, -0.35655]])
```

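Differences only in the last printed digit usually come from print precision or floating-point rounding. Larger differences usually mean a weight or bias value differs from the ones above.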
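
To check the values programmatically rather than by eye, one option is to wrap the forward pass in a small helper and compare against the expected array with `np.allclose`. This is a sketch, not code from the book: `dense_forward`, `expected`, and `matches` are hypothetical names, and the `atol=1e-5` tolerance is an assumption chosen to match the 5 printed decimals.

```python
import numpy as np

def dense_forward(x, w, b):
    """Hypothetical helper: one dense layer without an activation function."""
    return np.dot(x, w.T) + b

inputs = np.array([[1.0, 2.0, 3.0, 2.5],
                   [2.0, 5.0, -1.0, 2.0],
                   [-1.5, 2.7, 3.3, -0.8]])
weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5],
                    [-0.26, -0.27, 0.17, 0.87]])
biases = np.array([2.0, 3.0, 0.5])
weights2 = np.array([[0.1, -0.14, 0.5],
                     [-0.5, 0.12, -0.33],
                     [-0.44, 0.73, -0.13]])
biases2 = np.array([-1.0, 2.0, -0.5])

# Chain the layers: the output of layer 1 is the input of layer 2.
layer1_outputs = dense_forward(inputs, weights, biases)
layer2_outputs = dense_forward(layer1_outputs, weights2, biases2)

# Expected values as printed above.
expected = np.array([[0.5031, -1.04185, -2.03875],
                     [0.2434, -2.7332, -5.7633],
                     [-0.99314, 1.41254, -0.35655]])

matches = np.allclose(layer2_outputs, expected, atol=1e-5)
print(matches)  # True
```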

## 4. Summary

In this notebook we:

- Built a batch of inputs with 4 features each.
- Defined a first dense layer (3 neurons) and computed its outputs.
- Used those outputs as inputs to a second dense layer (3 neurons).
- Saw how the **shapes** of inputs, weights, and outputs must line up.

Conceptually, we now have a small **2-layer neural network** (both layers hidden for now). In later chapters, we'll add activation functions, an output layer, and eventually train the network using real data and gradient descent.
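
As an aside that goes slightly beyond this chapter: because neither layer applies an activation function yet, the two layers compose into a single linear map. Using the notation from the forward passes above (`W`, `b` for layer 1 and `W_2`, `b_2` for layer 2), the algebra can be sketched as:

\begin{align}
    \text{layer2\_outputs} &= (X W^T + b) \, W_2^T + b_2 \\
                           &= X (W_2 W)^T + (b \, W_2^T + b_2)
\end{align}

Here `W_2 W` has shape `(3, 4)` and `b W_2^T + b_2` has shape `(3,)`, so the stacked layers currently compute nothing that a single 3-neuron layer could not. This is the usual motivation for placing nonlinear activation functions between layers.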