<h2 style="text-align:center;">Neural Network Architecture</h2>

**Author:** Mubasshir Ahmed  
**Module:** Deep Learning - FSDS  
**Notebook:** 02_Neural_Network_Architecture  
**Objective:** Understand how a Neural Network is structured, how data flows through it, and how its components interact during learning.

---

A **Neural Network Architecture** defines the **arrangement of layers, neurons, and connections** that transform raw data into meaningful predictions.  
It determines how the network learns, how complex patterns can be recognized, and how efficiently it performs.


### <h3 style="text-align:center;">1️⃣ What is Neural Network Architecture?</h3>

A **neural network architecture** is like the *blueprint* of a brain-inspired machine: it specifies how neurons are connected and how information flows between them.

Each network has:
- A **defined number of layers** (input, hidden, output)
- A **specific number of neurons** per layer
- A **weight** for every connection and a **bias** for every neuron
- **Activation functions** that introduce non-linearity

**Analogy:**  
> Think of a neural network as a multi-floor building:  
> - Each floor = a layer  
> - Each room = a neuron  
> - Wires connecting rooms = weights  
> - Power switches controlling flow = activation functions  
> Data enters from the ground floor and exits the top floor after transformation.
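
One way to make the blueprint idea concrete is to describe an architecture as a list of layer sizes and count the trainable parameters it implies. The sketch below assumes a fully connected network; the function name `count_parameters` and the example sizes are illustrative choices, not part of this notebook.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the neurons per layer, input first and output last,
    e.g. [4, 8, 3] means 4 inputs, one hidden layer of 8, and 3 outputs.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per connection between the two layers
        biases = n_out          # one bias per neuron in the receiving layer
        total += weights + biases
    return total


# Hypothetical architecture: 4 features, 8 hidden neurons, 3 output classes.
architecture = [4, 8, 3]
n_params = count_parameters(architecture)
print(n_params)  # (4*8 + 8) + (8*3 + 3) = 67
```

Even this small network has 67 adjustable numbers, which is why layer count and layer width are treated as capacity controls.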


### <h3 style="text-align:center;">2️⃣ Layers in a Neural Network</h3>

There are three main types of layers in an ANN:

| Layer | Function | Example |
|--------|-----------|----------|
| **Input Layer** | Receives raw input data | Pixels of an image, features in dataset |
| **Hidden Layers** | Transform and extract patterns | Intermediate representations |
| **Output Layer** | Produces final prediction | Classification label or numeric value |

- The number of **neurons in the input layer** equals the number of features in your dataset.  
- The **output layer neurons** depend on the problem type:  
  - Regression → 1 neuron  
  - Binary classification → 1 neuron (sigmoid)  
  - Multiclass → n neurons (softmax)
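
The output-layer rules above can be written as a small lookup helper. This is a minimal sketch: the function name `output_layer_spec` and the task labels are hypothetical, and the `"linear"` (identity) activation for regression is an assumed common default that the list above does not state.

```python
def output_layer_spec(task, n_classes=None):
    """Return (neuron_count, activation_name) for the output layer."""
    if task == "regression":
        return 1, "linear"   # assumption: identity activation for numeric targets
    if task == "binary":
        return 1, "sigmoid"  # one probability for the positive class
    if task == "multiclass":
        if n_classes is None or n_classes < 2:
            raise ValueError("multiclass needs n_classes >= 2")
        return n_classes, "softmax"  # one probability per class
    raise ValueError(f"unknown task: {task!r}")


print(output_layer_spec("binary"))                   # (1, 'sigmoid')
print(output_layer_spec("multiclass", n_classes=5))  # (5, 'softmax')
```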


### <h3 style="text-align:center;">3️⃣ How Information Flows: Feedforward Path</h3>

Information in a neural network flows **forward** through connections between neurons.

Each neuron in one layer connects to every neuron in the next layer (fully connected).  

**Computation per neuron:**  
\[ z_j = \sum_i w_{ij}x_i + b_j \]  
\[ a_j = f(z_j) \]

Where:  
- \( x_i \) = input from previous layer  
- \( w_{ij} \) = weight of the connection from input *i* to neuron *j*  
- \( b_j \) = bias term  
- \( f(z) \) = activation function  
- \( a_j \) = output of neuron *j*

This process is repeated across all layers until the output is produced.
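
The per-neuron computation can be sketched in plain Python for one fully connected layer. The helper names (`dense_layer`, `relu`) and all numeric values are made up for illustration; ReLU is used here only as an example choice of \( f \).

```python
def relu(z):
    """ReLU activation: pass positive values, clamp negatives to zero."""
    return max(0.0, z)


def dense_layer(x, weights, biases, activation):
    """Compute a_j = f(sum_i w_ij * x_i + b_j) for every neuron j.

    weights[i][j] is the weight from input i to neuron j, matching w_{ij}.
    """
    outputs = []
    for j, b_j in enumerate(biases):
        z_j = sum(weights[i][j] * x_i for i, x_i in enumerate(x)) + b_j
        outputs.append(activation(z_j))
    return outputs


x = [1.0, 2.0]             # two inputs from the previous layer
weights = [[0.5, -1.0],    # weights from x_1 to neurons 1 and 2
           [0.25, 0.25]]   # weights from x_2 to neurons 1 and 2
biases = [0.5, 0.0]

a = dense_layer(x, weights, biases, relu)
print(a)  # z = [1.5, -0.5], so a = [1.5, 0.0]
```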


### <h3 style="text-align:center;">4️⃣ Weights and Bias: The Core of Learning</h3>

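- **Weights (W):** Control the influence of one neuron over another.  
  Larger weight magnitude → stronger influence; the sign decides whether the input pushes the neuron's activation up or down.

- **Bias (b):** Allows flexibility in shifting the activation threshold.  
  Without a bias, \( z = 0 \) whenever every input is zero, so each neuron's decision boundary is forced through the origin, which limits what the network can learn.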

**Example:**  
For a simple neuron:  
\[ y = f(w_1x_1 + w_2x_2 + b) \]

During training, the weights and biases get adjusted to minimize prediction errors.
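
To see what the bias contributes, the sketch below evaluates the two-input neuron at the origin with and without a bias. Sigmoid is assumed for \( f \), and the weight and bias values are arbitrary illustrations.

```python
import math


def neuron(x1, x2, w1, w2, b):
    """Single neuron y = f(w1*x1 + w2*x2 + b) with a sigmoid activation."""
    z = w1 * x1 + w2 * x2 + b
    return 1.0 / (1.0 + math.exp(-z))


# With all inputs at zero, the weights cannot change the output: z is always 0.
no_bias = neuron(0.0, 0.0, w1=3.0, w2=-2.0, b=0.0)
# A bias shifts z, so the neuron can output something other than 0.5 here.
with_bias = neuron(0.0, 0.0, w1=3.0, w2=-2.0, b=-4.0)

print(no_bias)    # 0.5
print(with_bias)  # about 0.018
```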


### <h3 style="text-align:center;">5️⃣ Activation Functions in Architecture</h3>

Activation functions define **when** and **how much** a neuron should activate.

| Function | Role | Output Range |
|-----------|------|--------------|
| **ReLU** | Mitigates vanishing gradients, speeds up learning | [0, ∞) |
| **Sigmoid** | Used in binary classification | (0, 1) |
| **Tanh** | Smooth symmetric output | (-1, 1) |
| **Softmax** | Multi-class probability distribution | (0, 1), outputs sum to 1 |

**Tip:** Hidden layers commonly use ReLU; output layer depends on the task type.
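
The four functions in the table are short enough to write out directly. This is a dependency-free sketch using only the standard library; subtracting the maximum inside `softmax` is a common numerical-stability trick and an addition of this example.

```python
import math


def relu(z):
    """Range [0, inf): zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def sigmoid(z):
    """Range (0, 1): squashes any real number into a probability-like value."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Range (-1, 1): like sigmoid but symmetric around zero."""
    return math.tanh(z)


def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(zs)  # subtract the max so exp() never sees a large positive number
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(relu(-2.0), sigmoid(0.0), tanh(0.0))  # 0.0 0.5 0.0
print(sum(probs))                           # 1.0, up to floating-point rounding
```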


### <h3 style="text-align:center;">6️⃣ Forward Propagation (Mathematical View)</h3>

**Step-by-step example:**

1. Input vector \( X = [x_1, x_2, x_3] \) enters the network.  
2. Each hidden layer neuron computes weighted input and adds bias:  
   \[ z = W \cdot X + b \]
3. The activation function transforms \( z \) into the output activation \( a = f(z) \).  
4. The output of this layer becomes input to the next layer.  
5. Final output layer produces prediction \( \hat{y} \).

This process is known as **feedforward propagation**.
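
The five steps can be traced with a tiny network of 3 inputs, 2 hidden neurons, and 1 output. This is a sketch under assumed values: the weights, biases, and the choice of ReLU for the hidden layer and sigmoid for the output are illustrative. Each row of `W` holds the incoming weights of one neuron, so `W[j][i]` plays the role of \( w_{ij} \).

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, layers):
    """Run x through a list of (W, b, activation) layers and return the output."""
    a = x
    for W, b, f in layers:
        # Step 2: z = W . a + b, computed one neuron (one row of W) at a time.
        z = [sum(w * a_i for w, a_i in zip(row, a)) + b_j
             for row, b_j in zip(W, b)]
        # Steps 3 and 4: apply the activation; the result feeds the next layer.
        a = [f(z_j) for z_j in z]
    return a


X = [1.0, 2.0, 3.0]
hidden = ([[0.1, 0.2, 0.3],
           [-0.5, 0.0, 0.1]], [0.0, 0.1], relu)
output = ([[1.0, -1.0]], [0.0], sigmoid)

y_hat = forward(X, [hidden, output])
print(y_hat)  # hidden activations are about [1.4, 0.0]; y_hat[0] is about 0.802
```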


### <h3 style="text-align:center;">7️⃣ Backward Propagation (Weight Adjustment)</h3>

Once the network predicts \( \hat{y} \), it compares it with the true value \( y \) using a **loss function**.

The error is then propagated **backward**: the **chain rule** gives the gradient of the loss with respect to every weight, and those gradients are used to adjust the weights. This is called **backpropagation**.

**Formula:**  
\[ W_{new} = W_{old} - \eta \frac{\partial L}{\partial W} \]

Where:  
- \( L \) = loss function  
- \( \eta \) = learning rate  
- \( \partial L/\partial W \) = gradient of loss w.r.t weight

**Goal:** Reduce the loss step by step until the model learns optimal parameters.
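
The update rule can be watched in action on the smallest possible case. This sketch assumes a single linear neuron \( \hat{y} = wx + b \), a squared-error loss \( L = \tfrac{1}{2}(\hat{y} - y)^2 \), and one training example; these choices and all numbers are illustrative, not taken from this notebook.

```python
def train_step(w, b, x, y, lr):
    """One gradient-descent update for y_hat = w*x + b with squared-error loss."""
    y_hat = w * x + b                # forward pass
    loss = 0.5 * (y_hat - y) ** 2
    dL_dyhat = y_hat - y             # derivative of the loss w.r.t. the prediction
    dL_dw = dL_dyhat * x             # chain rule: dL/dw = dL/dy_hat * dy_hat/dw
    dL_db = dL_dyhat                 # chain rule: dy_hat/db = 1
    w_new = w - lr * dL_dw           # W_new = W_old - eta * dL/dW
    b_new = b - lr * dL_db
    return w_new, b_new, loss


w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=2.0, y=4.0, lr=0.1)
    losses.append(loss)

print(losses[0])   # 8.0, the loss before any update
print(losses[-1])  # very close to 0 after 50 updates
```

With these particular values the prediction error halves on every step, so the loss shrinks steadily; a learning rate that is too large would make it grow instead.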


### <h3 style="text-align:center;">8️⃣ Hyperparameters That Define Architecture</h3>

| Hyperparameter | Description | Typical Range |
|-----------------|--------------|----------------|
| **Number of Hidden Layers** | Controls model depth | 1–5 (ANN) |
| **Neurons per Layer** | Affects model capacity | 8–512 |
| **Learning Rate** | Step size in weight updates | 0.0001–0.1 |
| **Batch Size** | Samples per gradient update | 16–128 |
| **Epochs** | Number of complete training cycles | 10–500 |
| **Optimizer** | Controls how gradients update weights | SGD, Adam, RMSProp |

Choosing these well helps avoid both underfitting and overfitting.
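
Batch size and epochs together fix how many weight updates a training run performs. The sketch below groups the table's hyperparameters into a hypothetical `TrainingConfig` container (the field names and default values are illustrative) and assumes the final, smaller batch of each epoch is still used.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Illustrative container for the hyperparameters in the table above."""
    hidden_layers: int = 2
    neurons_per_layer: int = 32
    learning_rate: float = 0.001
    batch_size: int = 32
    epochs: int = 50
    optimizer: str = "adam"


def total_updates(n_samples, config):
    """Number of gradient updates over the whole training run."""
    steps_per_epoch = math.ceil(n_samples / config.batch_size)  # keep the partial batch
    return steps_per_epoch * config.epochs


config = TrainingConfig()
updates = total_updates(n_samples=1000, config=config)
print(updates)  # ceil(1000 / 32) = 32 steps per epoch, times 50 epochs = 1600
```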


### <h3 style="text-align:center;">9️⃣ Designing a Good ANN Architecture</h3>

Guidelines for building effective architectures:

✅ Start simple: 1 or 2 hidden layers, 16–64 neurons each.  
✅ Use **ReLU** in hidden layers, **Sigmoid/Softmax** in classification output layers.  
✅ Normalize input data for faster convergence.  
✅ Experiment with learning rates (0.001–0.01).  
✅ Regularize with dropout for large networks.  
✅ Always monitor training vs validation accuracy/loss.

> Simplicity first, then add complexity if the model underfits.
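
As one example of the input normalization recommended in the guidelines, the sketch below applies z-score standardization to a single feature column. Treating "normalize" as standardization is an assumption of this example (min-max scaling is another common option), and the `ages` data is made up.

```python
import statistics


def standardize(column):
    """Rescale a feature column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mean) / std for value in column]


ages = [20.0, 30.0, 40.0]  # hypothetical raw feature values
scaled = standardize(ages)
print(scaled)  # approximately [-1.22, 0.0, 1.22]
```

In practice the mean and standard deviation are computed on the training split only and then reused for validation and test data.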


### <h3 style="text-align:center;">🔟 Summary & Takeaways</h3>

- Neural Network Architecture defines the arrangement of layers and neurons.  
- Data flows **forward** for prediction and **backward** for learning.  
- **Weights** and **biases** determine how strongly neurons interact.  
- **Activation functions** make learning non-linear and powerful.  
- Well-chosen **hyperparameters** strongly affect how well the model trains and generalizes.

**Next:** Proceed to `03_Forward_Backward_Propagation/` to explore mathematical details of learning and gradient computation.
