# 🔍 Deep Learning
Deep learning is a subset of machine learning that uses artificial neural networks with many layers to learn patterns from data. It is loosely inspired by the human brain: layers of artificial neurons each transform their input and pass the result on to the next layer.

## Common Deep Learning Frameworks:
- TensorFlow (Google)
- PyTorch (originally developed at Facebook, now Meta)
- Keras (high-level API, bundled with TensorFlow as `tf.keras`)

---
# 🔍 Neural Network
A Neural Network is a computational model inspired by the human brain. It consists of neurons (also called nodes) arranged in layers that work together to learn patterns from data.

Each neuron:
- Takes one or more inputs
- Multiplies each input by a weight, sums the results, and adds a bias
- Passes that sum through an activation function
- Produces an output

These outputs then feed into the next layer of neurons, like a pipeline.

<img src="https://www.simplyblock.io/wp-content/media/a7fbb2_553adcbda4b346b28831a3f94d2994camv2.jpg" alt="Diagram of a neural network" width="500" />
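
As a minimal sketch of the four steps a single neuron performs, the NumPy snippet below computes a weighted sum plus bias and passes it through a sigmoid. The function names, the choice of sigmoid, and the input values are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b):
    """One neuron: weighted sum of the inputs plus a bias, then an activation."""
    z = np.dot(w, x) + b   # apply the weights and the bias
    return sigmoid(z)      # pass the sum through the activation function

# Hypothetical values chosen only for illustration.
x = np.array([1.0, 2.0])     # inputs
w = np.array([0.5, -0.25])   # one weight per input
b = 0.0                      # bias

output = neuron_output(x, w, b)
print(output)  # 0.5, because the weighted sum is exactly 0 and sigmoid(0) = 0.5
```

In a real network, `output` would become one of the inputs to every neuron in the next layer.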

---
## 🧩 Types of Neural Networks

There are several types, each good for different tasks:

| Type                                | Description                                                             | Use Case                                              |
|-------------------------------------|-------------------------------------------------------------------------|--------------------------------------------------------|
| **1. Feedforward Neural Network (FNN)** | Simple network where data flows one way (input → output).              | Basic classification and regression tasks.             |
| **2. Convolutional Neural Network (CNN)** | Uses filters to detect patterns in images.                             | Image classification, object detection, facial recognition. |
| **3. Recurrent Neural Network (RNN)** | Has loops: the hidden state from one time step is fed into the next.   | Time series, speech recognition, text generation.      |
| **4. Long Short-Term Memory (LSTM)** | A type of RNN whose gates help it retain information over long sequences and reduce vanishing gradients. | Language modeling, chatbots, translation.              |
| **5. Generative Adversarial Network (GAN)** | Two networks (Generator & Discriminator) compete to create realistic data. | Image generation, deepfakes, art generation.           |
| **6. Autoencoder**                   | Learns to compress and reconstruct data.                               | Noise reduction, anomaly detection, dimensionality reduction. |
| **7. Transformer**                  | Uses self-attention to process all positions of a sequence in parallel (no recurrence). | Language models such as BERT and the GPT models behind ChatGPT, translation. |


---
# 🔍 Activation Functions in Deep Learning
An activation function is a mathematical function applied to a neuron's weighted sum to produce its output. It controls how strongly the neuron is "activated" for a given input and, just as importantly, introduces non-linearity: without it, any stack of layers would collapse into a single linear transformation.

## 🧠 Popular Activation Functions

| Activation Function     | Formula / Behavior                            | Use Case / Notes                                                                 |
|-------------------------|-----------------------------------------------|----------------------------------------------------------------------------------|
| **Sigmoid**             | $\sigma(x) = \frac{1}{1 + e^{-x}}$          | Output between 0 and 1. Common in the output layer for binary classification. Saturates for large positive or negative inputs, which can cause vanishing gradients. |
| **Tanh**                | $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ | Output between -1 and 1. Zero-centered, which often helps training compared with sigmoid, but it still saturates and has vanishing gradient issues. |
| **ReLU (Rectified Linear Unit)** | $f(x) = \max(0, x)$                   | A common default for hidden layers. Fast and simple. Can suffer from "dead neurons" that output 0 for every input. |
| **Leaky ReLU**          | $f(x) = \max(0.01x, x)$                     | Mitigates ReLU's dead neuron issue with a small slope (here 0.01) for negative values. |
| **Softmax**             | $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$ | Converts a vector of outputs into probabilities that sum to 1. Used in the output layer for multi-class classification. |
| **Swish / GELU / ELU**  | Smooth, ReLU-like modern functions            | Often used in advanced models (GELU is common in transformers). Smoother, and can perform better than ReLU in some cases. |
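
The formulas in the table translate almost directly into NumPy. The sketch below is one possible way to write them; the function names, the default `slope=0.01`, and the max-subtraction trick in `softmax` (a common numerical-stability measure that does not change the result) are choices made here for illustration.

```python
import numpy as np

def sigmoid(x):
    """1 / (1 + e^-x): output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent: output in (-1, 1), zero-centered."""
    return np.tanh(x)

def relu(x):
    """max(0, x): negative inputs become 0."""
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    """Like ReLU, but negative inputs are scaled by a small slope instead of zeroed."""
    return np.where(x > 0, x, slope * x)

def softmax(x):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = x - np.max(x)   # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

x = np.array([-2.0, 0.0, 3.0])   # sample inputs, chosen arbitrarily
print(relu(x))         # negatives clipped to zero
print(leaky_relu(x))   # -2.0 becomes -0.02 instead of 0
print(softmax(x))      # three probabilities, largest for the input 3.0
```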

---
## 🔁 Step-by-Step: How Weights & Biases are Updated

### ⚙️ 1. Forward Propagation
The input data goes through the network layer by layer.

At each neuron:

$$
z = w \cdot x + b
$$
$$
a = f(z)
$$

- `w`: weight  
- `x`: input  
- `b`: bias  
- `z`: weighted sum before the activation is applied  
- `f`: activation function  
- `a`: the neuron's output, `f(z)`
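
For a whole layer, the same two equations are usually applied to all neurons at once with a weight matrix. The following is a small sketch of a forward pass through a hypothetical network with 2 inputs, 3 hidden neurons (ReLU), and 1 linear output; the shapes, values, and the name `layer_forward` are assumptions for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def layer_forward(x, W, b, activation):
    """Compute a = f(W x + b) for every neuron in a layer at once."""
    z = W @ x + b          # one weighted sum per neuron (one row of W each)
    return activation(z)

# Hypothetical parameters: 2 inputs -> 3 hidden neurons -> 1 output.
x = np.array([1.0, 2.0])
W1 = np.array([[1.0, -1.0],
               [0.5,  0.5],
               [-1.0, 0.0]])
b1 = np.array([0.0, 0.0, 0.5])
W2 = np.array([[1.0, 2.0, 3.0]])
b2 = np.array([0.1])

hidden = layer_forward(x, W1, b1, relu)               # [0.0, 1.5, 0.0]
output = layer_forward(hidden, W2, b2, lambda z: z)   # identity activation
print(hidden, output)
```

Each row of `W1` holds the weights of one hidden neuron, so the matrix product computes every `z` in the layer in a single step.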

---

### 📉 2. Loss Calculation
- Compare the predicted output with the true label using a **loss function** (e.g., MSE for regression, Cross-Entropy for classification).
- This gives us a single **loss value** representing the error: the larger the value, the worse the prediction.
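
As an illustration, the two loss functions named above can be sketched in NumPy as follows. The function names, the `eps` clipping (used here to avoid `log(0)`), and the sample values are assumptions; the cross-entropy shown is the categorical form for a single one-hot label.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, probs, eps=1e-12):
    """Categorical cross-entropy for one one-hot label and a probability vector."""
    return -np.sum(y_true * np.log(np.clip(probs, eps, 1.0)))

# Hypothetical predictions and labels.
mse_loss = mse(np.array([1.0, 0.0]), np.array([0.5, 0.5]))
ce_loss = cross_entropy(np.array([0.0, 1.0, 0.0]),
                        np.array([0.25, 0.5, 0.25]))

print(mse_loss)  # 0.25
print(ce_loss)   # -ln(0.5), roughly 0.693
```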

---

### 🔄 3. Backpropagation
- The error (loss) is **propagated backward** through the network.
- We calculate the **gradient (slope)** of the loss with respect to each weight and bias using the **chain rule** from calculus.
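
To make the chain rule concrete, here is a sketch for the smallest possible case: one sigmoid neuron with one input and a squared-error loss `0.5 * (a - y)**2`. These choices, and all names and values, are assumptions for illustration. The analytic gradients are compared against a finite-difference estimate as a sanity check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(w, b, x, y):
    """Forward pass followed by a squared-error loss."""
    a = sigmoid(w * x + b)
    return 0.5 * (a - y) ** 2

def gradients(w, b, x, y):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw (and likewise for b)."""
    z = w * x + b
    a = sigmoid(z)
    dL_da = a - y             # derivative of 0.5 * (a - y)^2 w.r.t. a
    da_dz = a * (1.0 - a)     # derivative of the sigmoid
    dL_dz = dL_da * da_dz
    return dL_dz * x, dL_dz   # dz/dw = x, dz/db = 1

# Hypothetical parameter, input, and label values.
w, b, x, y = 0.5, 0.1, 2.0, 1.0
grad_w, grad_b = gradients(w, b, x, y)

# Finite-difference check: nudge each parameter and measure the change in loss.
h = 1e-6
numeric_w = (loss_fn(w + h, b, x, y) - loss_fn(w - h, b, x, y)) / (2 * h)
numeric_b = (loss_fn(w, b + h, x, y) - loss_fn(w, b - h, x, y)) / (2 * h)
print(grad_w, numeric_w)
print(grad_b, numeric_b)
```

In a multi-layer network the same idea is applied layer by layer, reusing `dL_dz` from later layers to compute the gradients of earlier ones.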

---

### 🧮 4. Update Rule: Gradient Descent

Weights and biases are updated using the **Gradient Descent** algorithm:

$$
w := w - \eta \cdot \frac{\partial \text{Loss}}{\partial w}
$$
$$
b := b - \eta \cdot \frac{\partial \text{Loss}}{\partial b}
$$

Where:
- `w` = weight  
- `b` = bias  
- `η` = learning rate  
- `∂Loss/∂w`, `∂Loss/∂b` = gradients

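These gradients tell us how much a small change in a weight or bias will affect the loss. Subtracting the gradient, scaled by the learning rate, moves each parameter in the direction that reduces the loss.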
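
A single application of the update rule can be sketched as below. The function name `gradient_descent_step`, the gradient values, and the learning rate of 0.1 are all hypothetical.

```python
def gradient_descent_step(w, b, grad_w, grad_b, lr):
    """Apply w := w - lr * dLoss/dw and b := b - lr * dLoss/db once."""
    return w - lr * grad_w, b - lr * grad_b

# Hypothetical current parameters and gradients.
w, b = 0.5, 0.1
grad_w, grad_b = 0.2, -0.4

new_w, new_b = gradient_descent_step(w, b, grad_w, grad_b, lr=0.1)
print(new_w, new_b)  # w decreases (positive gradient), b increases (negative gradient)
```

A positive gradient means the loss grows as the parameter grows, so the parameter is decreased, and vice versa.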

---

### 🔁 This process repeats for many epochs:
- **Forward pass** → compute output  
- **Compute loss**  
- **Backward pass** → compute gradients  
- **Update** weights & biases using gradient descent
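
Putting the four steps together, the sketch below trains a single linear neuron (identity activation) on made-up data that follows `y = 2x + 1`. It uses full-batch gradient descent with an MSE loss; the data, learning rate, and epoch count are assumptions chosen so this toy example converges.

```python
import numpy as np

# Toy data following y = 2x + 1 (made up for illustration).
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = 2.0 * X + 1.0

w, b = 0.0, 0.0   # initial parameters
lr = 0.05         # learning rate
losses = []

for epoch in range(2000):
    pred = w * X + b                      # forward pass
    error = pred - Y
    loss = np.mean(error ** 2)            # compute loss (MSE)
    grad_w = 2.0 * np.mean(error * X)     # backward pass: dLoss/dw
    grad_b = 2.0 * np.mean(error)         # backward pass: dLoss/db
    w -= lr * grad_w                      # update weight
    b -= lr * grad_b                      # update bias
    losses.append(loss)

print(w, b)                    # close to 2 and 1
print(losses[0], losses[-1])   # the loss shrinks over the epochs
```

Because the whole dataset is used in every iteration, each pass through the loop here is one epoch. With mini-batches, one epoch would consist of several such updates.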
