# Deep Learning Day 1

## Introduction and Simple Models

### Deep Learning vs. Classic Machine Learning
1. **Classification and Regression**: Deep Learning excels in both tasks, especially when dealing with large, complex datasets.
2. **Using Big Data**: Classical ML models often stop improving once a dataset grows past a certain size, while Deep Learning models tend to keep improving as more data becomes available.

### Emerging Topics
- **Large Language Models (LLMs)**: Powerful for understanding and generating human-like text.
- **Generative Models**: Used for creating new content, such as images, text, or audio.
- **Reinforcement Learning (RL)**: Focuses on learning optimal policies through interaction with an environment.

## Frameworks
1. **Scikit-Learn**: Often used for simpler ML models.
2. **TensorFlow**: Widely adopted for production applications.
3. **PyTorch**: Favored in research, especially by organizations like Hugging Face.

## Curriculum Overview
1. **Introduction and Basic Models**
2. **Training and Improving Neural Networks (NNs)**
3. **Neural Networks for Image Processing**
4. **Neural Networks for Language Processing**
5. **Advanced Neural Network Architectures**
6. **Generative Models**
7. **Reinforcement Learning**

## Notes on Tools and Training
- **TensorFlow**: Predominantly used in production settings.
- **PyTorch**: The go-to choice for research.
- **Reinforcement Learning from Human Feedback (RLHF)**: A critical area in modern RL applications.

### Key Dates
- **Regular Exam**: 15–16 February 2025  
  (Project and summaries deadline: 13 February 2025).
- **Retake Exam**: 1–2 March 2025.
- **Homework Assignments**: Provided regularly.

### Exam Structure
1. **Theory**: 10 questions in 30 minutes (30% of grade).
2. **Practice**: Two article summaries (application-focused and theory/concept-focused) worth 20%, and a project worth 80%.

## Introduction to Deep Learning

### Basic Models: Advantages and Disadvantages
- **Advantages**:  
  - Utilizes GPUs and specialized hardware.  
  - Handles larger datasets and more parameters.  
  - Reduces the need for complex feature selection.  
  - Supports multi-output models.
- **Disadvantages**:  
  - High computational cost.  
  - Not necessary for small datasets or simple tasks ("If your model trains in less than a day, you don’t need a Deep Learning model").

## Introduction to Neural Networks (NNs)
1. **Computational Graphs**: Represent operations and data flow.
2. **Directed Acyclic Graphs (DAGs)**: The computational graph of a feedforward network is a DAG, so data flows in one direction with no cycles.
3. **Feedforward Neural Networks**: The simplest NN architecture.
4. **Network Structure**: Input > Input layer > Hidden layer(s) (≥1) > Output layer > Output.
5. **Key Features**: No links between units within the same layer.
6. **Setup**: Install TensorFlow, PyTorch, and Nvidia CUDA.
7. **AST (Abstract Syntax Tree)**: Useful for programmatic NN construction.
8. **Propagation**: Includes Forward and Backward Propagation.
9. **Tensors**: Scalars, vectors, matrices, and higher-order tensors (e.g., RGB channels).
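
To make the ideas of a computational graph, forward propagation, and backward propagation concrete, here is a minimal sketch in plain Python. The single-neuron graph, the squared-error loss, the starting values, and the learning rate are all assumptions chosen for illustration; they are not taken from the course material.

```python
# Tiny computational graph (a DAG): x -> z = w*x + b -> err = z - target -> loss = err**2

def forward(w, b, x, target):
    """Forward propagation: evaluate each node of the graph in order."""
    z = w * x + b        # linear node
    err = z - target     # difference node
    loss = err ** 2      # squared-error node
    return z, err, loss

def backward(x, err):
    """Backward propagation: apply the chain rule from the loss back to w and b."""
    dloss_dz = 2 * err       # d(err**2)/d(err) * d(err)/dz, where d(err)/dz = 1
    grad_w = dloss_dz * x    # dz/dw = x
    grad_b = dloss_dz        # dz/db = 1
    return grad_w, grad_b

# Illustrative starting values (assumed)
w, b, x, target = 0.5, 0.0, 2.0, 3.0
learning_rate = 0.1

z, err, loss = forward(w, b, x, target)
grad_w, grad_b = backward(x, err)

# One gradient-descent update, followed by a second forward pass
new_w = w - learning_rate * grad_w
new_b = b - learning_rate * grad_b
_, _, new_loss = forward(new_w, new_b, x, target)

print(f"loss before: {loss}, gradients: ({grad_w}, {grad_b}), loss after: {new_loss}")
```

Frameworks such as TensorFlow and PyTorch build this kind of graph and derive the backward pass automatically, so the gradients never have to be written by hand.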

## Framework-Specific Notes

### TensorFlow
1. **High-Level API**: TensorFlow’s primary high-level API is Keras.
2. **Components**: Import `Sequential` and `Input` from `tf.keras`, and `Dense` from `tf.keras.layers`.
3. **Model Summary**: Provides details about layers and parameters.
4. **Loss Function**: Pass `sparse_categorical_crossentropy` to `model.compile` for multi-class classification with integer labels.
5. **Activation Function**: Use `softmax` on the output layer for multi-class classification.
6. **Optimizer**: Typically, `adam`.
7. **Metrics**: Commonly track `accuracy`.
8. **Training**: Via `model.fit`.
9. **Epochs**: The number of complete passes over the training data.
10. **Batch Size**: The number of samples used for each gradient update.
11. **Session Management**: Call `tf.keras.backend.clear_session()` to reset Keras state and free memory between experiments.
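
The steps above can be sketched end to end as follows. This is a minimal example under stated assumptions: the random data, the layer sizes, the three classes, and the epoch and batch-size values are placeholders for illustration, not the course's actual exercise.

```python
import numpy as np
from tensorflow import keras

# Reset Keras state left over from earlier experiments
keras.backend.clear_session()

# Hypothetical data: 120 samples, 4 features, integer labels in {0, 1, 2}
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4)).astype("float32")
y = rng.integers(0, 3, size=120)

# Feedforward network: input -> one hidden layer -> softmax output
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),  # one probability per class
])
model.summary()  # prints layers and parameter counts

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # expects integer class labels
    metrics=["accuracy"],
)

# 5 passes over the data, 32 samples per gradient update
history = model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# Predicted class probabilities for the first two samples
probs = model.predict(X[:2], verbose=0)
print(probs.shape)
```

Because the labels here are random, the accuracy itself is meaningless; the point is only the order of the calls: build, summarize, compile, fit, predict.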

### PyTorch
1. **API**: Object-oriented and modular.
2. **Tensors**: Convert data into tensors for computation.
3. **Loss Function**: Defined as a criterion.
4. **Optimization**: Managed through optimizers.
5. **Training Loop**: Conducted within a `for` loop.
6. **Gradient Updates**: Call `optimizer.zero_grad()` before each backward pass, because PyTorch accumulates gradients by default.
7. **Extensions**: `PyTorch Lightning` is a separate library that streamlines the training loop.
8. **Evaluation**: Use `torcheval` for performance metrics.
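
A manual PyTorch training loop following the points above might look like the sketch below. The random data, the layer sizes, the learning rate, and the number of epochs are assumptions for illustration; real data would first be converted with `torch.tensor(...)` or `torch.from_numpy(...)`.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical data, already in tensor form: 120 samples, 4 features, 3 classes
X = torch.randn(120, 4)
y = torch.randint(0, 3, (120,))

# Modular, object-oriented model definition
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 3),  # raw logits; CrossEntropyLoss applies log-softmax itself
)

criterion = nn.CrossEntropyLoss()                           # loss function
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)   # optimizer

losses = []
for epoch in range(20):
    optimizer.zero_grad()          # clear gradients accumulated in the previous step
    logits = model(X)              # forward propagation
    loss = criterion(logits, y)
    loss.backward()                # backward propagation
    optimizer.step()               # parameter update
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

This sketch trains on the full dataset in every step for brevity; a `DataLoader` would normally supply mini-batches inside the loop.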

### Notes
- GPUs can accelerate training by up to 104x compared to CPUs.
- Install `lightning` and `torcheval` for streamlined training loops and evaluation metrics.
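
GPU acceleration only applies when the model and its data are placed on the GPU. The following sketch, which assumes PyTorch is installed and uses an arbitrary small layer for illustration, selects a GPU when one is available and falls back to the CPU otherwise.

```python
import torch

# Pick the GPU if CUDA is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Both the data and the model must live on the same device
x = torch.ones(2, 3, device=device)
layer = torch.nn.Linear(3, 1).to(device)
out = layer(x)

print(device, out.shape)
```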
