# Module 1: Neurons and Layers

In this notebook, you'll learn:
- What a neuron computes mathematically
- How layers are built from neurons
- How the forward pass through a layer works
- How to implement a neuron from scratch

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

# For reproducibility
np.random.seed(42)
torch.manual_seed(42)

## 1.1 The Neuron: Basic Building Block

A neuron computes:

```
output = activation(sum(inputs * weights) + bias)
```

Or mathematically:

$$y = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right) = \sigma(\mathbf{w}^T \mathbf{x} + b)$$

Where:
- $\mathbf{x}$ = input vector
- $\mathbf{w}$ = weight vector
- $b$ = bias term
- $\sigma$ = activation function
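
As a quick check of the formula, here is a worked example with hand-picked numbers. The weights, input, and bias below are illustrative values chosen for easy arithmetic; they are not the values produced by the random initialization used later in this notebook.

```python
import numpy as np

# Illustrative values (assumed for this example only)
w = np.array([0.5, -1.0, 2.0])   # weights
x = np.array([1.0, 2.0, 3.0])    # inputs
b = 0.1                          # bias

# Weighted sum: 0.5*1 + (-1.0)*2 + 2.0*3 + 0.1 = 4.6
z = np.dot(w, x) + b

# Two choices of activation applied to the same pre-activation z
y_relu = max(0.0, z)                # ReLU passes positive values through unchanged
y_sigmoid = 1 / (1 + np.exp(-z))    # sigmoid squashes z into the interval (0, 1)

print(f"z = {z:.2f}")
print(f"ReLU(z) = {y_relu:.2f}")
print(f"sigmoid(z) = {y_sigmoid:.4f}")
```

Because $z$ is positive here, ReLU returns it unchanged, while the sigmoid maps it to a value just below 1.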

In [None]:
# Let's implement a single neuron from scratch

class Neuron:
    """A single artificial neuron."""
    
    def __init__(self, n_inputs, activation='relu'):
        # Small random weights (standard normal scaled by 0.1); bias starts at zero
        self.weights = np.random.randn(n_inputs) * 0.1
        self.bias = 0.0
        self.activation = activation
    
    def forward(self, x):
        """Compute neuron output."""
        # Weighted sum
        z = np.dot(self.weights, x) + self.bias
        
        # Apply activation
        if self.activation == 'relu':
            return max(0.0, z)  # float zero keeps the return type a float
        elif self.activation == 'sigmoid':
            return 1 / (1 + np.exp(-z))
        else:
            return z  # any other value: linear (identity) activation

# Create a neuron with 3 inputs
neuron = Neuron(n_inputs=3)
print(f"Weights: {neuron.weights}")
print(f"Bias: {neuron.bias}")

# Test with some input
x = np.array([1.0, 2.0, 3.0])
output = neuron.forward(x)
print(f"\nInput: {x}")
print(f"Output: {output}")

## 1.2 Exercise: Complete the Layer Implementation

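A layer is a collection of neurons that all receive the same input vector. The `forward` method below is the part to complete; a reference solution is already filled in so you can check your work: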

In [None]:
class Layer:
    """A layer of neurons."""
    
    def __init__(self, n_inputs, n_outputs, activation='relu'):
        # Weight matrix: (n_outputs, n_inputs)
        self.weights = np.random.randn(n_outputs, n_inputs) * 0.1
        self.biases = np.zeros(n_outputs)
        self.activation = activation
    
    def forward(self, x):
        """Compute layer output.
        
        TODO: Implement the forward pass
        1. Compute z = Wx + b (matrix multiplication)
        2. Apply activation function
        3. Return the result
        """
        # YOUR CODE HERE
        # Hint: Use np.dot for matrix multiplication
        z = np.dot(self.weights, x) + self.biases
        
        if self.activation == 'relu':
            return np.maximum(0, z)
        elif self.activation == 'sigmoid':
            return 1 / (1 + np.exp(-z))
        else:
            return z

# Test your implementation
layer = Layer(n_inputs=4, n_outputs=3)
x = np.array([1.0, 2.0, 3.0, 4.0])
output = layer.forward(x)

print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")
print(f"Output: {output}")
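
To connect the layer back to Section 1.1, the sketch below checks that one matrix-vector product gives the same result as evaluating each neuron separately, one weight row at a time. The names `W`, `b`, and `per_neuron` are introduced here for illustration, and a local random generator with an assumed seed is used so the notebook's global seed state is not disturbed.

```python
import numpy as np

rng = np.random.default_rng(0)       # local generator, assumed seed
W = rng.normal(size=(3, 4)) * 0.1    # 3 neurons, 4 inputs each
b = np.zeros(3)
x = np.array([1.0, 2.0, 3.0, 4.0])

# Vectorized: all neurons at once
z_layer = np.dot(W, x) + b

# Per neuron: row i of W holds the weights of neuron i
per_neuron = np.array([np.dot(W[i], x) + b[i] for i in range(W.shape[0])])

print(np.allclose(z_layer, per_neuron))

# A batch of inputs with shape (batch, n_inputs) uses X @ W.T instead
X = np.stack([x, 2 * x])
Z = X @ W.T + b                      # shape (2, 3)
print(Z.shape)
```

Note that `Layer.forward` as written expects a single 1-D input; the batched form at the end of the sketch is one possible extension, not part of the exercise.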

## 1.3 Activation Functions

Activation functions introduce non-linearity; without them, any stack of layers reduces to a single linear (affine) map. Let's visualize the common ones:

In [None]:
x = np.linspace(-5, 5, 100)

# Define activations
relu = np.maximum(0, x)
sigmoid = 1 / (1 + np.exp(-x))
tanh = np.tanh(x)
leaky_relu = np.where(x > 0, x, 0.01 * x)

# Plot
fig, axes = plt.subplots(2, 2, figsize=(12, 8))

axes[0,0].plot(x, relu)
axes[0,0].set_title('ReLU: max(0, x)')
axes[0,0].axhline(y=0, color='k', linestyle='-', linewidth=0.5)
axes[0,0].axvline(x=0, color='k', linestyle='-', linewidth=0.5)

axes[0,1].plot(x, sigmoid)
axes[0,1].set_title('Sigmoid: 1/(1+e^-x)')
axes[0,1].axhline(y=0.5, color='r', linestyle='--', alpha=0.5)

axes[1,0].plot(x, tanh)
axes[1,0].set_title('Tanh')
axes[1,0].axhline(y=0, color='k', linestyle='-', linewidth=0.5)

axes[1,1].plot(x, leaky_relu)
axes[1,1].set_title('Leaky ReLU')
axes[1,1].axhline(y=0, color='k', linestyle='-', linewidth=0.5)

plt.tight_layout()
plt.show()
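
To see why the non-linearity is needed, note that two layers without an activation compose into a single affine map: $W_2(W_1 \mathbf{x} + \mathbf{b}_1) + \mathbf{b}_2 = (W_2 W_1)\mathbf{x} + (W_2 \mathbf{b}_1 + \mathbf{b}_2)$. The sketch below checks this numerically with small hand-picked matrices (all values are assumptions chosen for the example) and shows that placing a ReLU between the layers changes the result.

```python
import numpy as np

# Hand-picked values (assumed for illustration)
W1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
b1 = np.array([0.5, -0.5])
W2 = np.array([[2.0, 1.0]])
b2 = np.array([0.1])
x = np.array([1.0, 2.0])

# Two layers with no activation in between
two_linear = W2 @ (W1 @ x + b1) + b2

# The same map written as one layer
W_combined = W2 @ W1
b_combined = W2 @ b1 + b2
one_linear = W_combined @ x + b_combined

# With a ReLU between the layers, the negative hidden value is zeroed out
with_relu = W2 @ np.maximum(0, W1 @ x + b1) + b2

print(np.allclose(two_linear, one_linear))   # True: the two layers collapse into one
print(np.allclose(two_linear, with_relu))    # False: ReLU changes the output
```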

## 1.4 Using PyTorch

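Now let's see how PyTorch makes this easier. `nn.Linear` stores the weight matrix and bias for us and computes only the affine part, $W\mathbf{x} + \mathbf{b}$; the activation is applied as a separate step: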

In [None]:
# PyTorch layer
torch_layer = nn.Linear(in_features=4, out_features=3)

# Input (needs to be a tensor)
x_torch = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Forward pass
output_torch = torch_layer(x_torch)

print(f"PyTorch layer weights shape: {torch_layer.weight.shape}")
print(f"Output: {output_torch}")
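
One way to confirm what `nn.Linear` computes is to reproduce its output by hand. The sketch below uses the layer's `weight` and `bias` attributes and compares the layer's output with `x @ W.T + b`; the activation is then applied separately with `torch.relu`. The variable names (`linear`, `manual`, `activated`) are introduced here for illustration.

```python
import torch
import torch.nn as nn

linear = nn.Linear(in_features=4, out_features=3)
x = torch.tensor([1.0, 2.0, 3.0, 4.0])

with torch.no_grad():  # gradients are not needed for a comparison
    out = linear(x)
    # weight has shape (out_features, in_features) = (3, 4)
    manual = x @ linear.weight.T + linear.bias
    # nn.Linear applies no activation, so it is added as a separate step
    activated = torch.relu(out)

print(torch.allclose(out, manual, atol=1e-6))
print(activated)
```

The weight layout matches the `(n_outputs, n_inputs)` convention used by the NumPy `Layer` class above, which is why the same $W\mathbf{x} + \mathbf{b}$ reading applies to both.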

## Key Takeaways

1. A **neuron** computes a weighted sum plus bias, then applies an activation
2. A **layer** is multiple neurons processing the same input in parallel
3. **Activation functions** introduce non-linearity (essential for learning complex patterns)
4. **PyTorch** provides optimized implementations of these building blocks

**Next:** Continue to `02_forward_pass.ipynb` to learn how data flows through networks!