# Understanding Neuron Calculations Using Proper Matrix Operations

In this lecture, we explore how a single neuron processes data using matrix multiplication. 

---

### A Single Neuron with Single and Multiple Training Samples

We'll start with a single training sample and later expand to multiple samples, explicitly including the bias as a vector.


---

**Part 1: Inputs, Weights, and Bias for a Single Training Sample**

1. *Inputs*: Denoted as $ x_1, x_2, \dots, x_n $ for $ n $ features, represented as a vector $ \mathbf{x} \in \mathbb{R}^n $:
   $$
   \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
   $$

2. *Weights*: Denoted as $ w_1, w_2, \dots, w_n $, represented as a vector $ \mathbf{w} \in \mathbb{R}^n $:
   $$
   \mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}
   $$

3. *Bias*: A single scalar $ b \in \mathbb{R} $:
   $$
   b
   $$

The neuron then computes the **weighted sum** of its inputs plus the bias:
$$
z = \mathbf{w}^\top \mathbf{x} + b
$$

Finally, an **activation function** $ \sigma $ is applied:
$$
a = \sigma(z)
$$




---
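The symbol $ \sigma $ stands for whichever activation function is chosen. As a minimal sketch, the two activations used in this lecture (ReLU and sigmoid) can be compared on the same pre-activation values. The values in `z_values` and the variable names are illustrative assumptions, not part of the lecture's data.

```python
import numpy as np

def relu(z):
    # ReLU: max(0, z), applied element-wise
    return np.maximum(0, z)

def sigmoid(z):
    # Sigmoid: squashes any real z into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

# Illustrative pre-activation values (assumed for this sketch)
z_values = np.array([-2.0, 0.0, 2.0])

relu_out = relu(z_values)
sigmoid_out = sigmoid(z_values)

print("z       :", z_values)
print("ReLU    :", relu_out)
print("sigmoid :", sigmoid_out)
```

ReLU sets negative values to zero and leaves positive values unchanged, while sigmoid maps $ z = 0 $ to $ 0.5 $ and keeps every output strictly between 0 and 1.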

In [13]:
import numpy as np

# Single training sample

x = np.array([2.0, 3.0])  # inputs with n=2 features
w = np.array([0.5, -0.2]) # weights
b = -3.9                   # bias

# Weighted sum: z = w^T x + b

z = np.dot(w, x) + b
print("sum (z):", z)

# ReLU activation function

def relu(z):
    return np.maximum(0, z)

# Activated output

a = relu(z)
print("Activated output (a):", a)


sum (z): -3.5
Activated output (a): 0.0


---
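To connect the code with the formula $ z = \mathbf{w}^\top \mathbf{x} + b $, the sketch below expands the dot product into an explicit sum over features, using the same values as the single-sample cell. The names `z_loop` and `z_dot` are introduced here for illustration only.

```python
import numpy as np

x = np.array([2.0, 3.0])   # inputs (n=2 features)
w = np.array([0.5, -0.2])  # weights
b = -3.9                   # bias

# Explicit form: z = w_1*x_1 + w_2*x_2 + ... + w_n*x_n + b
z_loop = b
for w_i, x_i in zip(w, x):
    z_loop += w_i * x_i

# Vectorized form: z = w^T x + b
z_dot = np.dot(w, x) + b

print("Loop and dot product agree:", np.isclose(z_loop, z_dot))
```

Written out by hand, the same arithmetic is $ 0.5 \cdot 2.0 + (-0.2) \cdot 3.0 - 3.9 = 1.0 - 0.6 - 3.9 = -3.5 $. Because this value is negative, ReLU returns 0.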

### Summary for Single Training Sample
1. The neuron computes:
   - Weighted sum: $ z = \mathbf{w}^\top \mathbf{x} + b $,
   - Activation: $ a = \sigma(z) $.
2. Weights $ \mathbf{w} $ and inputs $ \mathbf{x} $ are vectors in $ \mathbb{R}^n $, while the bias $ b $ is a scalar.
3. Next, we generalize to **multiple training samples**.

---

**Part 2: Multiple Training Samples**

When there are $ m $ training samples, each with $ n $ features:
1. **Inputs**: Represented as a matrix $ \mathbf{X} \in \mathbb{R}^{m \times n} $, with one training sample per row:
   $$
   \mathbf{X} = \begin{bmatrix}
   x_{1,1} & x_{1,2} & \dots & x_{1,n} \\
   x_{2,1} & x_{2,2} & \dots & x_{2,n} \\
   \vdots & \vdots & \ddots & \vdots \\
   x_{m,1} & x_{m,2} & \dots & x_{m,n}
   \end{bmatrix}
   $$

2. **Weights**: Same as before, $ \mathbf{w} \in \mathbb{R}^n $:
   $$
   \mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}
   $$

3. **Bias**: A vector $ \mathbf{b} \in \mathbb{R}^m $ in which every entry equals the same scalar bias $ b $ (the neuron still has only one bias parameter):
   $$
   \mathbf{b} = \begin{bmatrix} b \\ b \\ \vdots \\ b \end{bmatrix}
   $$

The neuron computes the weighted sums for all $ m $ samples at once:
$$
\mathbf{z} = \mathbf{X} \mathbf{w} + \mathbf{b}
$$

Finally, apply the activation function element-wise:
$$
\mathbf{a} = \sigma(\mathbf{z})
$$


---
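Because the bias vector $ \mathbf{b} $ holds the same scalar in every entry, NumPy can also add the scalar directly and broadcast it across all $ m $ entries. The sketch below checks that both forms give the same result. The names `b_scalar`, `b_vector`, `z_explicit`, and `z_broadcast` are assumptions made for this example.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])        # m=3 samples, n=2 features
w = np.array([0.5, -0.2])         # weights, shape (n,)
b_scalar = 1.0                    # the neuron's single bias parameter

m = X.shape[0]
b_vector = np.full(m, b_scalar)   # explicit bias vector, shape (m,)

z_explicit = X @ w + b_vector     # matches z = Xw + b as written above
z_broadcast = X @ w + b_scalar    # scalar is broadcast across all m entries

print("Shapes:", X.shape, w.shape, z_explicit.shape)
print("Same result:", np.allclose(z_explicit, z_broadcast))
```

The explicit vector mirrors the mathematical notation, while the broadcast form avoids building a length-$ m $ array. Either way the neuron has one bias parameter, not $ m $ of them.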


In [10]:
# Multiple training samples (m=3, n=2)

X = np.array([[1.0, 2.0],      # First sample
              [3.0, 4.0],      # Second sample
              [5.0, 6.0]])     # Third sample (m=3, n=2)

w = np.array([0.5, -0.2])      # Weights (n=2)

b = np.array([1.0, 1.0, 1.0])  # Bias vector: the scalar bias 1.0 repeated m=3 times

# Sigmoid activation function

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Linear output

z = np.dot(X, w) + b
print("Linear output (z)     :", z)

# Activated output

a = sigmoid(z)
print("Activated output (a)  :", a)


Linear output (z)     : [1.1 1.7 2.3]
Activated output (a)  : [0.75026011 0.84553473 0.90887704]


---
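Each entry of $ \mathbf{z} $ is the single-sample formula applied to one row of $ \mathbf{X} $. As a quick check with the same values as the multi-sample cell, the sketch below computes $ z_i = \mathbf{w}^\top \mathbf{x}_i + b $ row by row and compares the result with the matrix product. The names `z_rows` and `z_batch` are illustrative.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])     # m=3 samples, n=2 features
w = np.array([0.5, -0.2])      # weights
b = 1.0                        # scalar bias shared by all samples

# One sample at a time: iterating over X yields its rows
z_rows = np.array([np.dot(w, x_i) + b for x_i in X])

# All samples at once
z_batch = X @ w + b

print("Row-by-row:", z_rows)
print("Batched   :", z_batch)
print("Agree     :", np.allclose(z_rows, z_batch))
```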

### Summary for Multiple Training Samples
1. The neuron computes:
   - Weighted sums: $ \mathbf{z} = \mathbf{X} \mathbf{w} + \mathbf{b} $,
   - Activation: $ \mathbf{a} = \sigma(\mathbf{z}) $.
2. **Key Difference**:
   - Inputs $ \mathbf{X} $: A matrix with all training samples.
   - Bias $ \mathbf{b} $: Explicitly represented as a vector of length $ m $ whose entries all equal the same scalar $ b $.
3. This matrix formulation processes all $ m $ samples with a single matrix-vector product instead of a loop over samples.


---

## Conclusion

1. For a **single training sample**, the neuron performs:
   - Weighted sum: $ z = \mathbf{w}^\top \mathbf{x} + b $,
   - Activation: $ a = \sigma(z) $.

2. For **multiple training samples**, the computation scales to matrix operations:
   - Weighted sums: $ \mathbf{z} = \mathbf{X} \mathbf{w} + \mathbf{b} $,
   - Activation: $ \mathbf{a} = \sigma(\mathbf{z}) $.

This framework is the foundation of neural network operations and scales naturally to larger architectures.
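As a closing sketch, both cases can be wrapped in one helper. The function name `neuron_forward` and its signature are hypothetical, chosen only for this example. It relies on NumPy's `@` operator accepting either a 1-D sample or a 2-D batch on the left-hand side.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def neuron_forward(inputs, w, b, activation):
    """Hypothetical helper: forward pass of a single neuron.

    inputs: shape (n,) for one sample, or (m, n) for m samples
    w:      weights, shape (n,)
    b:      scalar bias (broadcast over samples)
    """
    z = inputs @ w + b   # scalar for one sample, shape (m,) for a batch
    return activation(z)

w = np.array([0.5, -0.2])
b = 1.0

x_single = np.array([1.0, 2.0])
X_batch = np.array([[1.0, 2.0],
                    [3.0, 4.0],
                    [5.0, 6.0]])

a_single = neuron_forward(x_single, w, b, sigmoid)
a_batch = neuron_forward(X_batch, w, b, sigmoid)

print("Single sample:", a_single)
print("Batch        :", a_batch)
```

The first entry of the batch output equals the single-sample output, since `x_single` is the first row of `X_batch`.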
