
# Learning Deep Learning with Python

## Introduction
Deep learning is a subset of machine learning that focuses on using neural networks with multiple layers to learn and make decisions from data. It is part of a broader field called artificial intelligence (AI). In this notebook, we will explore the fundamentals of deep learning and dive into applications in computer vision.


### Artificial Neuron

An **artificial neuron** is the basic building block of artificial neural networks (ANNs) and is inspired by the biological neurons found in the human brain. Just like a biological neuron receives input signals from other neurons, processes them, and generates an output signal, an artificial neuron follows a similar approach but in a simplified, mathematical form.

### Key Components of an Artificial Neuron:

1. **Inputs (x₁, x₂, ..., xₙ)**:
   - These are the features or values fed into the neuron. They represent data or signals coming from other neurons or the external environment.

2. **Weights (w₁, w₂, ..., wₙ)**:
   - Each input is associated with a weight, which determines the strength or importance of that input. The weights are learned during the training process to adjust the network's behavior.

3. **Bias (b)**:
   - The bias is an additional learnable parameter that is added to the weighted sum before the activation function is applied. It effectively shifts the activation function to the left or right, so the neuron can produce a non-zero output even when all of its inputs are zero.

4. **Summation**:
   - The neuron computes a weighted sum of all the inputs, plus the bias:

$$ z = w_1 \cdot x_1 + w_2 \cdot x_2 + \ldots + w_n \cdot x_n + b $$
   
   This summation forms the input for the activation function.

5. **Activation Function (f)**:
   - The result of the summation is passed through an activation function. This function decides whether the neuron should "fire" (output a signal). Activation functions introduce non-linearity into the system, allowing the network to model complex patterns. Common activation functions include:
#### Sigmoid:
$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

#### ReLU:
$$
\text{ReLU}(z) = \max(0, z)
$$

#### Tanh:
$$
\tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}
$$

6. **Output**:
   - The final result of the activation function is the output of the neuron, which can be passed on to the next layer or used as the final result.
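
The three activation functions listed above can be sketched in plain Python as follows. This is a minimal illustration using only the standard library; the function names and the sample values of `z` are chosen here for demonstration and are not part of the original notebook.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: squashes z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """ReLU: passes positive values through and clamps negatives to 0."""
    return max(0.0, z)


def tanh(z: float) -> float:
    """Hyperbolic tangent: squashes z into the open interval (-1, 1)."""
    return math.tanh(z)


# Compare the three functions on a few illustrative inputs.
for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.4f}  relu={relu(z):.1f}  tanh={tanh(z):+.4f}")
```

Note that sigmoid and tanh saturate for large positive or negative `z`, while ReLU grows without bound for positive `z`.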

### How It Works:
- The artificial neuron processes the inputs, calculates the weighted sum, adds the bias, and then applies the activation function to produce an output.
- If the neuron is part of a network, its output becomes the input for other neurons in subsequent layers, leading to more complex decision-making in deep neural networks.
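
The steps above can be sketched as a single function. This is a minimal illustration, assuming a sigmoid activation; the name `neuron_output` and the input, weight, and bias values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def neuron_output(inputs, weights, bias):
    """Compute z = w1*x1 + ... + wn*xn + b, then apply a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values, chosen only for illustration.
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.1]
b = 0.2

# Weighted sum: 0.5*1 - 0.25*2 + 0.1*3 + 0.2 = 0.5, so the output is sigmoid(0.5).
print(neuron_output(x, w, b))  # roughly 0.62
```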

## Types of Neural Networks
![image.png](attachment:image.png)

### 1. Feedforward Neural Networks (FNNs)
Description: The simplest type of neural network, where data flows in one direction, from input to output, without loops.

Applications: Regression, classification tasks.

Example: Multi-Layer Perceptron (MLP).
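
As a rough sketch of the one-directional data flow, the following code runs a forward pass through a tiny MLP with one hidden layer. The layer sizes, the weights, and the helper names (`dense`, `mlp_forward`) are assumptions made for illustration; a real network would learn its weights during training and would normally use a library such as NumPy or a deep learning framework.

```python
def relu(z):
    """ReLU activation for the hidden layer."""
    return max(0.0, z)


def identity(z):
    """Linear (pass-through) activation for the output layer."""
    return z


def dense(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters: 2 inputs -> 3 hidden neurons -> 1 output.
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 1.0, 1.0]]
b2 = [0.5]


def mlp_forward(x):
    """Data flows input -> hidden -> output, with no loops."""
    hidden = dense(x, W1, b1, relu)
    return dense(hidden, W2, b2, identity)


print(mlp_forward([2.0, 1.0]))  # [3.0]
```

For the input `[2.0, 1.0]`, the hidden layer produces `[1.0, 1.5, 0.0]`, and the output neuron sums these and adds its bias of 0.5.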

![image-2.png](attachment:image-2.png)

### 2. Recurrent Neural Networks (RNNs)
Description: Designed to process sequential data by maintaining a hidden state to remember past information.

Key Variants:
- Long Short-Term Memory (LSTM)
- Gated Recurrent Unit (GRU)

Applications: Time-series prediction, natural language processing (NLP), speech recognition.
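
To illustrate how a hidden state carries past information forward, here is a minimal sketch of a vanilla (non-gated) RNN reduced to scalar values. The update rule `h_t = tanh(w_x * x_t + w_h * h_prev + b)` is the standard simple-RNN form; the function names and the weight values are assumptions for illustration, and LSTM and GRU cells add gating on top of this idea.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One time step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Process a sequence left to right, returning the hidden state at each step."""
    h = 0.0  # initial hidden state
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Only the first input is non-zero, yet later hidden states stay non-zero:
# the network "remembers" the earlier input through h.
print(run_rnn([1.0, 0.0, 0.0]))
```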



| **Type**                     | **Description**                                                                                 | **Applications**                                           |
|------------------------------|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------|
| **Feedforward Neural Networks (FNNs)** | Data flows in one direction, from input to output. No loops.                               | Regression, classification                                 |
| **Convolutional Neural Networks (CNNs)** | Extracts spatial features from grid-like data such as images.                              | Image recognition, object detection, video processing     |
| **Recurrent Neural Networks (RNNs)**   | Processes sequential data by maintaining a hidden state to remember past information.      | Time-series prediction, NLP, speech recognition           |
| **Transformer Networks**      | Uses self-attention mechanisms for efficient sequence processing.                              | Machine translation, NLP tasks, large language models     |
| **Generative Adversarial Networks (GANs)** | Comprises a generator and a discriminator competing to produce realistic outputs.         | Image generation, style transfer, synthetic data creation |
| **Autoencoders**              | Learns to compress and reconstruct data (unsupervised).                                        | Data compression, anomaly detection, image denoising      |
| **Radial Basis Function Networks (RBFNs)** | Uses radial basis functions as activation functions in the hidden layer.                  | Time-series prediction, regression, classification        |
| **Self-Organizing Maps (SOMs)** | Organizes data into clusters (unsupervised).                                                  | Data visualization, clustering tasks                      |
| **Modular Neural Networks**   | Consists of independent networks solving sub-problems in parallel.                             | Complex tasks requiring parallel computation              |
| **Spiking Neural Networks (SNNs)** | Mimics biological neurons with discrete spike-based computation.                            | Neuromorphic computing, edge devices                      |
| **Graph Neural Networks (GNNs)** | Processes graph-structured data.                                                             | Social network analysis, molecular property prediction    |
| **Capsule Networks**          | Captures hierarchical and spatial relationships in data.                                       | Image recognition that is robust to changes in viewpoint  |


### Loss Function
A loss function (or cost function) maps the values of one or more variables to a real number that intuitively represents the "cost" of an outcome. This cost is negatively correlated with a measure of success such as accuracy: the lower the loss, the better the model performs. An optimization problem seeks to minimize a loss function.

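For example, the mean squared error (MSE) over $m$ samples is:

$$ \text{MSE} = \frac{1}{m} \sum_{n=1}^{m} (y_{p,n} - y_n)^2 $$

where $y_{p,n}$ is the predicted value and $y_n$ is the true value for sample $n$.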

![image.png](attachment:image.png)
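
The MSE formula can be computed directly, as in the short sketch below. The function name `mean_squared_error` and the sample values are chosen here for illustration only.

```python
def mean_squared_error(y_pred, y_true):
    """Average of the squared differences between predictions and targets."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(y_true)
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / m


# Hypothetical targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Squared errors: 0.25, 0.25, 0.0, 1.0 -> mean = 1.5 / 4
print(mean_squared_error(y_pred, y_true))  # 0.375
```

Because the errors are squared, large mistakes are penalized much more heavily than small ones.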

### Conclusion and Takeaways

#### **Deep Learning (DL)**
- DL is a type of Machine Learning (ML).
- DL models are distinguished from other ML models by learning many intermediate representations of the data (as noted by Goodfellow and Chollet).
- In practice, DL models are artificial neural networks (ANNs) trained using gradient descent algorithms.

#### **Single Neuron**
- An artificial neuron computes a weighted sum of its inputs, adds a bias, and feeds the result into an activation function.
- The activation function can be:
  - **Linear**: A 'pass-through' function.
  - **Non-linear**: Functions like sigmoid or ReLU.
- The output of the activation function is the neuron’s output.

#### **Loss Function**
- Learning from data in DL involves training a neural network.
- Training requires an objective to numerically measure performance: the **loss function**.
- Example:
  - For regression problems, a common loss function is **Mean Squared Error (MSE)**.

#### **Gradient Descent**
- Training a neural network means minimizing the value of its loss function on a given dataset.
- This is achieved by repeatedly taking small steps in the direction opposite to the gradient of the loss with respect to the network's parameters.
- This process is known as **gradient descent**.
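
As a small worked example, the sketch below uses gradient descent to fit a single linear neuron `y = w * x + b` by minimizing the MSE. Each step applies the update `w ← w − η · ∂L/∂w` and `b ← b − η · ∂L/∂b`, where `η` is the step size (learning rate). The data, the learning rate of 0.05, and the number of steps are assumptions chosen so that this toy problem converges; real networks compute gradients with backpropagation and usually need tuned settings.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b, xs, ys):
    """Partial derivatives of the MSE with respect to w and b."""
    m = len(xs)
    dw = (2.0 / m) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
    db = (2.0 / m) * sum((w * x + b - y) for x, y in zip(xs, ys))
    return dw, db


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step against the gradient, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradients(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


# Toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print(f"w={w:.4f}, b={b:.4f}, loss={mse(w, b, xs, ys):.6f}")
```

If the learning rate is too large, the steps overshoot the minimum and the loss can grow instead of shrinking; if it is too small, convergence becomes slow.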
