# <span style="color:#2E86C1"><b>What is Deep Learning?</b></span>

### <span style="color:#D35400"><b>Definition and Scope of Deep Learning</b></span>

- **Deep Learning (DL)** is a specialized subset of **machine learning** that utilizes **neural networks** with multiple layers, allowing models to automatically learn complex patterns from large amounts of data.
- It automates **feature extraction**, reducing the manual feature engineering that traditional machine learning typically requires.
- Deep learning is especially powerful for tasks involving **unstructured data** (e.g., images, text, audio).
- **Neural Networks** in DL are inspired by the structure of the human brain, where **neurons** connect in layers and process information hierarchically.

---

### <span style="color:#28B463"><b>Difference Between Machine Learning and Deep Learning</b></span>


| **Aspect**                        | **Machine Learning (ML)**                                          | **Deep Learning (DL)**                                        |
|-----------------------------------|-------------------------------------------------------------------|---------------------------------------------------------------|
| **Feature Selection**             | Features need to be manually selected by domain experts.          | Automatically extracts features through multiple network layers. |
| **Data Requirement**              | Performs well with small to medium-sized datasets.                | Requires large datasets to perform effectively.               |
| **Computational Power**           | Can work with regular computational resources (CPUs).             | Requires high computational power (GPUs/TPUs).                |
| **Architecture**                  | Models are shallow, e.g., decision trees, linear regression.      | Uses deep neural networks with multiple hidden layers.         |
| **Training Time**                 | Typically shorter training time for simpler models.               | Requires longer training times due to complex architectures.   |
| **Interpretability**              | Easier to interpret and explain model decisions.                  | Harder to interpret due to complexity (black-box models).      |
| **Application Areas**             | Works well for simpler tasks, such as regression or classification on tabular data. | Excels in complex tasks, such as image recognition and NLP.    |
| **Accuracy**                      | Limited accuracy for complex problems (depends on feature quality).| Generally achieves higher accuracy with large datasets.        |
| **Feature Engineering**           | Requires domain knowledge for feature engineering.                | Automatically discovers important features without human input.|
| **Data Processing**               | Requires structured/tabular data for good performance.            | Can process unstructured data (e.g., images, text, audio).     |


### <span style="color:#D35400"><b>Importance of Deep Learning</b></span>

- **Deep Learning** has revolutionized artificial intelligence by achieving **state-of-the-art performance** in various domains.
- It excels at handling **high-dimensional data** and uncovering complex relationships that traditional machine learning models struggle with.
- The ability to learn features from raw data has enabled advances in image recognition, speech processing, and natural language understanding.

---

### <span style="color:#2E86C1"><b>Real-World Applications of Deep Learning</b></span>

- **Image and Video Recognition**:
  - Used in **facial recognition**, **object detection** (e.g., in **self-driving cars**), and **medical image analysis** (e.g., identifying tumors).
  
- **Natural Language Processing (NLP)**:
  - Powers systems like **virtual assistants** (Siri, Alexa), **language translation services** (Google Translate), and **text generation** (chatbots).
  
- **Speech Recognition**:
  - Enables applications such as **voice-controlled assistants** (Google Assistant, Siri) and **speech-to-text** technologies.

- **Recommendation Systems**:
  - Platforms like **Netflix**, **YouTube**, and **Amazon** use deep learning to personalize user recommendations.

- **Healthcare**:
  - Used in **medical diagnosis**, **drug discovery**, and **personalized medicine**.

- **Autonomous Vehicles**:
  - Self-driving cars rely on deep learning models for **real-time decision making** and **object detection**.

---

# <span style="color:#3498DB"><b>What is a Neural Network?</b></span>


A **Neural Network** is a computational model inspired by the way biological neurons work in the human brain. It is made up of several interconnected nodes, called **neurons**, organized in layers. Neural networks are designed to recognize patterns, perform classification, and make predictions by learning from data.

---

### <span style="color:#28B463"><b>Key Components of a Neural Network</b></span>

| **Component**         | **Description**                                                                                                   |
|-----------------------|-------------------------------------------------------------------------------------------------------------------|
| **Neuron**            | The fundamental unit of a neural network that receives input, processes it, and passes an output to other neurons. |
| **Input Layer**        | The first layer of the network that receives the input data. Each input feature is represented by a neuron.        |
| **Output Layer**       | The final layer of the network that provides the predicted output, such as a class label or a value.              |
| **Hidden Layers**      | Layers between the input and output layers where the network learns to transform the input into meaningful outputs.|
| **Weights**           | Parameters that determine the importance of input data, adjusted during training to minimize errors.               |
| **Bias**              | A constant added to the weighted sum to shift the activation function, helping the model make better predictions.   |
| **Activation Function**| A non-linear function applied to a neuron's weighted sum; its result is the output the neuron passes to the next layer. |

---

### <span style="color:#F39C12"><b>1. Neuron</b></span>

A **neuron** (also called a node) is the basic building block of a neural network. It takes inputs from other neurons or directly from the data, processes the inputs by applying weights and bias, and then passes the result through an **activation function** to produce an output.

- **Function of a Neuron**: 
  - Inputs are multiplied by weights and summed with a bias term. This sum is passed through an **activation function**, which determines the final output of the neuron.
  
  $$
  \text{Output of a neuron} = f(w_1 \cdot x_1 + w_2 \cdot x_2 + ... + w_n \cdot x_n + b)
  $$
  Where:
  - $ w_1, w_2, ..., w_n $ are the weights
  - $ x_1, x_2, ..., x_n $ are the inputs
  - $ b $ is the bias term
  - $ f $ is the activation function
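
The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `sigmoid` and `neuron_output`, the choice of sigmoid as $f$, and the input values are all assumptions made for the example.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, used here as one possible activation function f."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute f(w_1*x_1 + ... + w_n*x_n + b) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum of the inputs plus the bias term
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    # The activation function turns the sum into the neuron's output
    return activation(weighted_sum)

# Illustrative values only (not taken from a trained model)
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.1]
b = 0.2
y = neuron_output(x, w, b)
print(y)
```

Here the weighted sum is $0.5 - 0.5 + 0.3 + 0.2 = 0.5$, so the neuron outputs $\sigma(0.5)$, a value between 0.5 and 1.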

---

### <span style="color:#F39C12"><b>2. Input and Output Layers</b></span>

- **Input Layer**: 
  - This is where the input data is fed into the network. Each feature of the data corresponds to a neuron in the input layer. For example, if the input is a 3-feature vector $[x_1, x_2, x_3]$, the input layer will have 3 neurons.
  
- **Output Layer**: 
  - The final layer that provides the output prediction. In a classification problem, it can output a probability for each class; in regression, it outputs a continuous value. The number of neurons in this layer depends on the task (e.g., 1 for binary classification or single-value regression, one per class for multiclass classification).
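
As a rough sketch of how layer sizes follow from the task, the snippet below feeds a 3-feature input into two different output layers. The helper names (`dense`, `softmax`), the weights, and the use of sigmoid and softmax to produce probabilities are assumptions chosen for illustration; the text above does not prescribe a specific output function.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

x = [0.2, -1.0, 0.5]  # 3 features -> 3 input neurons

# Binary classification: 1 output neuron (hypothetical weights)
binary_w = [[0.4, -0.6, 0.1]]
binary_b = [0.0]
p_positive = sigmoid(dense(x, binary_w, binary_b)[0])

# Multiclass classification with 3 classes: 3 output neurons (hypothetical weights)
multi_w = [[0.4, -0.6, 0.1], [-0.3, 0.2, 0.8], [0.1, 0.1, -0.5]]
multi_b = [0.0, 0.1, -0.1]
class_probs = softmax(dense(x, multi_w, multi_b))

print(p_positive, class_probs)
```

The input size is fixed by the number of features, while the output size is fixed by what the network has to predict.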

---

### <span style="color:#F39C12"><b>3. Activation Function</b></span>

An **activation function** is applied to the weighted sum computed by each neuron. It adds **non-linearity** to the model, enabling the network to solve complex problems. Without activation functions, stacked layers would collapse into a single linear transformation, so the network would be no more expressive than a linear model, regardless of its depth.

- **Common Activation Functions**:
  - **Sigmoid**: Maps the output between 0 and 1.
    $$
    \sigma(x) = \frac{1}{1 + e^{-x}}
    $$
  - **ReLU (Rectified Linear Unit)**: Outputs the input directly if it's positive; otherwise, it outputs zero.
    $$
    \text{ReLU}(x) = \max(0, x)
    $$
  - **Tanh**: Maps the output between -1 and 1.
    $$
    \tanh(x) = \frac{2}{1 + e^{-2x}} - 1
    $$
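
The three functions can be written directly from their formulas. The sketch below uses only Python's `math` module and is deliberately naive: for very large negative inputs `math.exp` can overflow, which production libraries guard against.

```python
import math

def sigmoid(x):
    """Maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Returns x when x is positive, otherwise 0."""
    return max(0.0, x)

def tanh(x):
    """Same form as the formula above; agrees with math.tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

for value in (-2.0, 0.0, 2.0):
    print(f"x={value:+.1f}  sigmoid={sigmoid(value):.4f}  "
          f"relu={relu(value):.1f}  tanh={tanh(value):+.4f}")
```

Comparing the formulas also shows that $\tanh(x) = 2\sigma(2x) - 1$, so tanh is a rescaled and shifted sigmoid.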

#### <span style="color:#E74C3C"><b>Why Do We Need Activation Functions?</b></span>
- **Non-Linearity**: They allow the network to learn non-linear relationships between input and output, making it capable of solving complex tasks like image recognition or language translation.
- **Differentiability**: Most activation functions are differentiable everywhere or almost everywhere (ReLU has a single kink at 0), enabling the use of backpropagation to adjust weights and biases during training.
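
The non-linearity point can be checked numerically. In the sketch below, two layers with no activation between them give exactly the same result as one combined layer with weights $W = W_2 W_1$ and bias $b = W_2 b_1 + b_2$; inserting a ReLU between them breaks that equivalence. The helper functions and all numeric values are made up for the illustration.

```python
def matvec(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Matrix-matrix product."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def add(u, v):
    """Element-wise vector sum."""
    return [p + q for p, q in zip(u, v)]

W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # 2 inputs -> 3 hidden neurons
b1 = [0.5, 0.0, -1.0]
W2 = [[1.0, -2.0, 0.5]]                     # 3 hidden neurons -> 1 output
b2 = [0.25]
x = [2.0, -3.0]

# Two layers with NO activation function in between
hidden = add(matvec(W1, x), b1)
two_layer = add(matvec(W2, hidden), b2)

# A single layer built from the combined weights and bias
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)
one_layer = add(matvec(W, x), b)

# The same two layers with ReLU applied to the hidden values
relu_hidden = [max(0.0, h) for h in hidden]
with_relu = add(matvec(W2, relu_hidden), b2)

print(two_layer, one_layer, with_relu)
```

For this input, `two_layer` and `one_layer` both evaluate to `[-8.25]`, while `with_relu` evaluates to `[-4.75]`: only the version with an activation function computes something a single linear layer cannot.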

---

# <span style="color:#9B59B6"><b>Perceptron: The Simplest Neural Network</b></span>


The **Perceptron** is the simplest type of neural network, consisting of only one **input layer** and one **output layer**, with no hidden layers. It is a binary classifier and can only solve **linearly separable problems**.

<center><img src="../../images/perceptron-math.png" alt="perceptron" width="500"/></center>

#### **Architecture of a Perceptron**:
- **Input Layer**: 
  - Receives input data, e.g., $x_1, x_2, ..., x_n$.
- **Weights**: 
  - Each input is associated with a weight, determining its importance.
- **Bias**: 
  - A bias term is added to shift the activation function.
- **Activation Function**: 
  - Typically uses a **step function** to determine if the neuron fires (activates) or not.
  
  $$
  \text{Output} = 
  \begin{cases} 
  1, & \text{if } w_1x_1 + w_2x_2 + ... + w_nx_n + b > 0 \\
  0, & \text{otherwise}
  \end{cases}
  $$

#### **How It Works**:
1. The perceptron multiplies each input by its weight and sums the results.
2. It adds the bias to this weighted sum.
3. If the result is greater than 0, the perceptron outputs 1 (positive class); otherwise, it outputs 0 (negative class).
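
These three steps map directly onto code. The sketch below is a minimal illustration; the function names and the AND-gate weights are hypothetical choices, picked by hand rather than learned.

```python
def step(z):
    """Step activation: 1 if z is greater than 0, otherwise 0."""
    return 1 if z > 0 else 0

def perceptron(inputs, weights, bias):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))  # step 1: weighted inputs
    total = weighted_sum + bias                                 # step 2: add the bias
    return step(total)                                          # step 3: compare with 0

# Hand-picked (hypothetical) weights that make the perceptron act as a logical AND
and_weights = [1.0, 1.0]
and_bias = -1.5

truth_table = {
    (a, b): perceptron([a, b], and_weights, and_bias)
    for a in (0, 1)
    for b in (0, 1)
}
print(truth_table)
```

Only the input `(1, 1)` pushes the sum above 0 (it gives $1 + 1 - 1.5 = 0.5$), so it is the only input classified as 1.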

---

### <span style="color:#9B59B6"><b>Perceptron Example:</b></span>

Consider a perceptron designed to classify points as either above or below a line in 2D space:

- Inputs: $ x_1 = 1.5, x_2 = 2.0 $
- Weights: $ w_1 = 0.8, w_2 = 0.5 $
- Bias: $ b = -1.0 $

$$
\text{Output} = \text{Step}(0.8 \cdot 1.5 + 0.5 \cdot 2.0 - 1.0) = \text{Step}(2.2 - 1.0) = \text{Step}(1.2) = 1
$$
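
The arithmetic for this example can be verified with a few lines of Python, using the inputs, weights, and bias listed above (the variable names are only illustrative):

```python
inputs = [1.5, 2.0]
weights = [0.8, 0.5]
bias = -1.0

# 0.8 * 1.5 + 0.5 * 2.0
weighted_sum = sum(w * x for w, x in zip(weights, inputs))
# Add the bias, then apply the step function
total = weighted_sum + bias
output = 1 if total > 0 else 0

print(f"weighted sum = {weighted_sum:.1f}, with bias = {total:.1f}, output = {output}")
```

The weighted sum is $1.2 + 1.0 = 2.2$, adding the bias gives $1.2$, and because that is greater than 0 the perceptron outputs 1.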

---

### <span style="color:#D35400"><b>Why Is the Perceptron Limited?</b></span>

- A perceptron can only solve **linearly separable** problems (i.e., problems where a straight line, or a hyperplane in higher dimensions, can separate the classes).
- To solve more complex problems, we need networks with one or more **hidden layers**, which led to the development of **Multilayer Perceptrons (MLPs)**.
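
A classic illustration of this limit is the XOR function, whose two classes cannot be split by a single straight line. The sketch below first runs a coarse grid search over single-perceptron weights (a sanity check over a limited grid, not a proof) and then wires up a small two-layer network of step units that does compute XOR. The grid range, the function names, and the hand-picked weights are all assumptions made for the example.

```python
from itertools import product

def step(z):
    return 1 if z > 0 else 0

def perceptron(x1, x2, w1, w2, b):
    return step(w1 * x1 + w2 * x2 + b)

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Try every (w1, w2, b) on a coarse grid from -4.0 to 4.0 in steps of 0.5
grid = [i / 2 for i in range(-8, 9)]
single_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(perceptron(x1, x2, w1, w2, b) == y for (x1, x2), y in XOR.items())
]

def xor_mlp(x1, x2):
    """Two hidden step units (OR and NAND) feeding an AND output unit."""
    h_or = perceptron(x1, x2, 1.0, 1.0, -0.5)        # 1 if at least one input is 1
    h_nand = perceptron(x1, x2, -1.0, -1.0, 1.5)     # 1 unless both inputs are 1
    return perceptron(h_or, h_nand, 1.0, 1.0, -1.5)  # 1 only if both hidden units fire

print("single-perceptron solutions found:", len(single_solutions))
print("MLP outputs:", {inputs: xor_mlp(*inputs) for inputs in XOR})
```

The grid search finds no single perceptron that reproduces XOR, while the hidden layer lets the network combine two linear boundaries into a non-linear one.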

---