# Layer of Neurons & Batch of Data with NumPy

Previously we:

- computed a single layer’s output for **one sample** using NumPy:
  - `np.dot(weights, inputs) + biases`
- introduced **batches** of samples (inputs as a matrix)
- learned that matrix multiplication is just a **grid of dot products**

Now we combine all of this:

- inputs → a **batch** of samples (matrix)
- weights → a **layer** of neurons (matrix)
- biases → one bias per neuron (vector)

We’ll use:

$$
\text{layer\_outputs} = X \cdot W^T + \mathbf{b}
$$

Where:

- $X$ has shape $(\text{batch\_size}, \text{n\_features})$
- $W$ has shape $(\text{n\_neurons}, \text{n\_features})$
- $W^T$ has shape $(\text{n\_features}, \text{n\_neurons})$
- $\mathbf{b}$ has shape $(\text{n\_neurons},)$
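
Written out element by element, the formula above says that the output for sample $i$ and neuron $j$ is one dot product plus one bias. The index names $i$, $j$, and $k$ are introduced here only for illustration and are not part of the original notation:

$$
\text{layer\_outputs}_{ij} = \sum_{k=1}^{\text{n\_features}} X_{ik} \, W_{jk} + b_j
$$

Row $i$ of $X$ is paired with row $j$ of $W$, which is why the matrix form uses $W^T$.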


In [1]:
import numpy as np

# Batch of 3 input samples, 4 features each
inputs = [
    [1.0,  2.0,  3.0,  2.5],
    [2.0,  5.0, -1.0,  2.0],
    [-1.5, 2.7,  3.3, -0.8]
]

# 3 neurons, each with 4 weights (one per input feature)
weights = [
    [0.2,   0.8,  -0.5,  1.0],
    [0.5,  -0.91,  0.26, -0.5],
    [-0.26, -0.27, 0.17,  0.87]
]

# One bias per neuron
biases = [2.0, 3.0, 0.5]

# Convert weights to a NumPy array so we can transpose them
weights = np.array(weights)

layer_outputs = np.dot(inputs, weights.T) + biases

print("Layer outputs:\n", layer_outputs)

Layer outputs:
 [[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
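
To make the "grid of dot products" idea concrete, here is a minimal sketch that rebuilds the same result one cell at a time and compares it with the single `np.dot` call. The names `grid` and `batched` are made up for this illustration.

```python
import numpy as np

inputs = np.array([
    [1.0,  2.0,  3.0,  2.5],
    [2.0,  5.0, -1.0,  2.0],
    [-1.5, 2.7,  3.3, -0.8],
])
weights = np.array([
    [0.2,   0.8,  -0.5,  1.0],
    [0.5,  -0.91,  0.26, -0.5],
    [-0.26, -0.27, 0.17,  0.87],
])
biases = np.array([2.0, 3.0, 0.5])

n_samples, n_neurons = inputs.shape[0], weights.shape[0]

# Fill the output grid cell by cell: row i = sample i, column j = neuron j
grid = np.zeros((n_samples, n_neurons))
for i in range(n_samples):
    for j in range(n_neurons):
        grid[i, j] = np.dot(inputs[i], weights[j]) + biases[j]

# The batched formula computes every cell in one call
batched = np.dot(inputs, weights.T) + biases

print(np.allclose(grid, batched))  # True
```

The loop version is only for understanding; the batched call does the same work without Python-level loops.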


## Shapes and Why We Transpose the Weights

Let’s inspect the shapes.

- `inputs` → batch of 3 samples, each with 4 features  
  → shape: $(3, 4)$

- `weights` → 3 neurons, each with 4 weights  
  → shape: $(3, 4)$

We want:

- for each **sample** (row of `inputs`)
- to get outputs from each **neuron** (row of `weights`)

For matrix multiplication:

- we need $(\text{batch\_size}, \text{n\_features}) \cdot (\text{n\_features}, \text{n\_neurons})$
- so we transpose `weights` from shape $(3, 4)$ to $(4, 3)$

Then:

$$
(3, 4) \times (4, 3) \rightarrow (3, 3)
$$

So the result has shape:

- $(\text{batch\_size}, \text{n\_neurons})$
- here: 3 samples × 3 neurons → $(3, 3)$

Each **row** holds the outputs of all neurons for one input sample.  
This is exactly what we want to pass to the **next layer** as a batch.
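
The need for the transpose can be checked directly. The sketch below tries the product without it and catches the resulting error. The flag `mismatch_raised` is a hypothetical helper name used only here.

```python
import numpy as np

inputs = np.array([
    [1.0,  2.0,  3.0,  2.5],
    [2.0,  5.0, -1.0,  2.0],
    [-1.5, 2.7,  3.3, -0.8],
])
weights = np.array([
    [0.2,   0.8,  -0.5,  1.0],
    [0.5,  -0.91,  0.26, -0.5],
    [-0.26, -0.27, 0.17,  0.87],
])

# (3, 4) times (3, 4): the inner dimensions (4 and 3) do not match
try:
    np.dot(inputs, weights)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Without transpose:", err)

# (3, 4) times (4, 3): the inner dimensions match, so the product is defined
product = np.dot(inputs, weights.T)
print("With transpose:", product.shape)  # (3, 3)
```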


In [2]:
inputs_arr = np.array(inputs)

print("inputs shape :", inputs_arr.shape)   # (3, 4)
print("weights shape:", weights.shape)      # (3, 4)
print("weights.T shape:", weights.T.shape)  # (4, 3)

layer_outputs = np.dot(inputs_arr, weights.T) + biases

print("layer_outputs:\n", layer_outputs)
print("layer_outputs shape:", layer_outputs.shape)  # (3, 3)

inputs shape : (3, 4)
weights shape: (3, 4)
weights.T shape: (4, 3)
layer_outputs:
 [[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
layer_outputs shape: (3, 3)
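
Because each row of the result is one sample's outputs, the matrix can be fed into another layer with the same formula. The sketch below assumes a second layer that is not in the original: `weights2` and `biases2` are hypothetical, and their values are random placeholders chosen only to show the shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

inputs = np.array([
    [1.0,  2.0,  3.0,  2.5],
    [2.0,  5.0, -1.0,  2.0],
    [-1.5, 2.7,  3.3, -0.8],
])
weights = np.array([
    [0.2,   0.8,  -0.5,  1.0],
    [0.5,  -0.91,  0.26, -0.5],
    [-0.26, -0.27, 0.17,  0.87],
])
biases = np.array([2.0, 3.0, 0.5])

# First layer: (3, 4) inputs -> (3, 3) outputs
layer1_outputs = np.dot(inputs, weights.T) + biases

# Hypothetical second layer: 2 neurons, each with 3 weights
# (one weight per neuron in the first layer)
weights2 = rng.standard_normal((2, 3))
biases2 = np.zeros(2)

# Same formula, with layer 1's outputs acting as the input batch
layer2_outputs = np.dot(layer1_outputs, weights2.T) + biases2
print(layer2_outputs.shape)  # (3, 2)
```

The batch dimension (3 samples) is carried through unchanged; only the second dimension changes, to the number of neurons in the layer.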


## How Biases Are Added for a Batch

`biases` is a vector of shape $(3,)$ — one bias per neuron.

After the matrix product, we have:

- `np.dot(inputs, weights.T)` → shape $(3, 3)$

When we add `biases` (shape $(3,)$) to this matrix, NumPy broadcasts
the bias vector to every **row**, so each sample gets the same per-neuron biases:

- `biases[0]` is added to every output of neuron 0 (column 0)
- `biases[1]` is added to every output of neuron 1 (column 1)
- `biases[2]` is added to every output of neuron 2 (column 2)

This matches what we want:

> Each neuron’s bias is added to all of its outputs
> across every sample in the batch.
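
One way to see what broadcasting does is to build the expanded bias matrix explicitly with `np.broadcast_to` and compare the two additions. This is a sketch for illustration; the names `bias_matrix`, `explicit`, and `implicit` are not from the original.

```python
import numpy as np

inputs = np.array([
    [1.0,  2.0,  3.0,  2.5],
    [2.0,  5.0, -1.0,  2.0],
    [-1.5, 2.7,  3.3, -0.8],
])
weights = np.array([
    [0.2,   0.8,  -0.5,  1.0],
    [0.5,  -0.91,  0.26, -0.5],
    [-0.26, -0.27, 0.17,  0.87],
])
biases = np.array([2.0, 3.0, 0.5])

dot_only = np.dot(inputs, weights.T)          # shape (3, 3)

# Spell out the broadcast: every row of bias_matrix is a copy of biases
bias_matrix = np.broadcast_to(biases, dot_only.shape)
print(bias_matrix)

explicit = dot_only + bias_matrix             # (3, 3) + (3, 3)
implicit = dot_only + biases                  # (3, 3) + (3,), broadcast by NumPy

print(np.array_equal(explicit, implicit))     # True
```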


In [3]:
dot_only = np.dot(inputs_arr, weights.T)
print("Dot product only:\n", dot_only)

with_bias = dot_only + biases
print("\nWith biases added:\n", with_bias)

Dot product only:
 [[ 2.8   -1.79   1.885]
 [ 6.9   -4.81  -0.3  ]
 [-0.59  -1.949 -0.474]]

With biases added:
 [[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
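
A pitfall worth noting: this example has 3 samples and 3 neurons, so a bias vector shaped as a column also broadcasts without any error, but it adds one bias per **sample** instead of one per neuron. The sketch below deliberately makes that mistake for comparison. The `dot_only` values are copied from the printed output above, and `wrong` is a hypothetical name.

```python
import numpy as np

# Values copied from the "Dot product only" output
dot_only = np.array([
    [ 2.8,  -1.79,   1.885],
    [ 6.9,  -4.81,  -0.3  ],
    [-0.59, -1.949, -0.474],
])
biases = np.array([2.0, 3.0, 0.5])

# Intended: shape (3,) lines up with columns, so bias j goes to neuron j
correct = dot_only + biases

# Deliberate mistake: shape (3, 1) lines up with rows, so bias i goes to sample i
wrong = dot_only + biases.reshape(3, 1)

print(np.allclose(correct, wrong))  # False
print("correct row 0:", correct[0])
print("wrong row 0  :", wrong[0])
```

With a non-square output the column version would raise a shape error; with a square one it fails silently, so checking shapes is a useful habit.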


## Big Picture

We’ve now gone from:

- **Single neuron, single sample**  
  → `np.dot(inputs, weights) + bias` (scalar)

to:

- **Layer of neurons, single sample**  
  → `np.dot(weights, inputs) + biases` (vector)

to:

- **Layer of neurons, batch of samples**  
  → `np.dot(inputs, weights.T) + biases` (matrix)

The pattern is the same:

- linear combination (dot products)
- plus bias
- just scaled up to work on many samples at once.
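
The three stages can be run side by side on the same numbers to confirm that they agree. This is a minimal sketch; `sample`, `neuron_out`, `layer_out`, and `batch_out` are names chosen for this illustration.

```python
import numpy as np

inputs = np.array([
    [1.0,  2.0,  3.0,  2.5],
    [2.0,  5.0, -1.0,  2.0],
    [-1.5, 2.7,  3.3, -0.8],
])
weights = np.array([
    [0.2,   0.8,  -0.5,  1.0],
    [0.5,  -0.91,  0.26, -0.5],
    [-0.26, -0.27, 0.17,  0.87],
])
biases = np.array([2.0, 3.0, 0.5])

sample = inputs[0]  # first sample in the batch

# 1. Single neuron, single sample -> scalar
neuron_out = np.dot(sample, weights[0]) + biases[0]

# 2. Layer of neurons, single sample -> vector of shape (3,)
layer_out = np.dot(weights, sample) + biases

# 3. Layer of neurons, batch of samples -> matrix of shape (3, 3)
batch_out = np.dot(inputs, weights.T) + biases

# Each stage contains the previous one as a special case
print(np.isclose(neuron_out, layer_out[0]))   # True
print(np.allclose(layer_out, batch_out[0]))   # True
```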

This is how dense (fully connected) layers are typically computed in practice:
a batch of inputs goes in, and a batch of outputs comes out.
