Date: 10/09/2024

# Introduction to Machine Learning

![alt text](../../ML_AI_notes_Aug_Sep_24/image.png)

### Prerequisites
1. **Python, Math, Statistics, and Probability**: Basic understanding of these concepts is essential.
2. **OSEMN**: A data science process whose name stands for *Obtain*, *Scrub*, *Explore*, *Model*, and *INterpret*.
3. **Data Modeling**: Ability to represent real-world processes with mathematical models.
4. **Hypothesis Testing**: Forming and testing hypotheses to validate predictions.
5. **Scientific Mindset**: A structured, analytical approach to solving problems.

### Main Goal
1. Develop a strong intuition for mathematical concepts and data analysis.
2. Build models to understand, predict, and potentially influence processes.
3. Evaluate the model’s performance and make informed adjustments.

### Project Requirement
1. Complete at least one full project involving:
   - Data preparation
   - Choosing the right algorithms
   - Training and refining models

### Curriculum Overview
1. **Supervised Learning**: Learning from labeled data to predict outcomes.
2. **Unsupervised Learning**: Discovering patterns in data without labels.
3. **Neural Networks**: Introduction to neural networks and their limitations with tabular data.
4. **Time Series**: Analysis of time-dependent data.

### Machine Learning in Real Life
1. The application of machine learning to solve real-world problems.
2. Upcoming exams: 23-24 November. The project and two paper summaries are due by 21 November.
3. Evaluation split: 80% project and theory, 20% paper summaries, plus an additional Discord bonus.

### Recommended Books
1. **Hands-On Machine Learning with Scikit-Learn, TensorFlow, and Keras**.
2. **Designing Machine Learning Systems**.
3. **The Hundred-Page Machine Learning Book**.

---

## Detailed Introduction to Machine Learning

### The Scientific Method

1. **Steps**:
   - Asking questions, conducting research, forming hypotheses, and testing them.
   - If the results support the hypothesis, it is retained; otherwise, revise it (new question, new hypothesis).
   - Communication of results is essential.

2. **Machine Learning as Hypothesis Testing**:
   - Every ML model is essentially a hypothesis about how data can be modeled.
   - This makes the entire process iterative and research-based.

3. **Applied Machine Learning Process (OSEMN)**:
   - *Obtain* data, *Scrub* it (clean and preprocess), *Explore* for patterns, *Model* to make predictions, and *INterpret* results.

---

## Machine Learning Overview

- **Definition**: Machine learning is the process by which a computer learns from data and experience by minimizing a loss function that measures its errors; separate metrics are used to evaluate performance.

### Supervised Learning
1. We compare the predicted outputs $\tilde{y}$ with the real outcomes $y$.
2. The goal is to minimize the difference between $\tilde{y}$ and $y$, summed over all training examples $i$:
   $$ \text{Loss Function} = \frac{1}{2} \sum_{i} (\tilde{y}_i - y_i)^2 $$
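
As a small illustration of the loss above, the following sketch computes half the sum of squared differences between predictions and targets. The function name and the numbers are made up for the example.

```python
def half_sum_squared_error(y_pred, y_true):
    """Return 0.5 * sum((y_pred_i - y_true_i)^2) over paired values."""
    if len(y_pred) != len(y_true):
        raise ValueError("y_pred and y_true must have the same length")
    return 0.5 * sum((p - t) ** 2 for p, t in zip(y_pred, y_true))


y_true = [3.0, 5.0, 7.0]  # made-up real outcomes
y_pred = [2.0, 5.0, 9.0]  # made-up model predictions

loss = half_sum_squared_error(y_pred, y_true)
print(loss)  # 0.5 * (1 + 0 + 4) = 2.5
```

The factor of 1/2 does not change where the minimum lies; it only cancels the 2 that appears when the square is differentiated.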

### Unsupervised Learning
1. No labeled outcomes $y$, only input data $X$.
2. A typical goal is to segment the data into meaningful clusters.
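
To make the clustering idea concrete, here is a minimal sketch of k-means on one-dimensional data. The notes do not name a specific clustering algorithm, so k-means is an assumption chosen for illustration, and the data and starting centroids are made up.

```python
def kmeans_1d(points, centroids, n_iters=10):
    """Tiny 1-D k-means: alternate an assignment step and an update step."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = sum(members) / len(members)
    return centroids, clusters


points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]  # made-up data with two obvious groups
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 9.5]
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 9.5, 10.0]]
```

No labels $y$ are used anywhere: the groups emerge only from the distances between the inputs.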

### Reinforcement Learning
1. A program continuously learns by interacting with its environment.
2. It adapts based on rewards or penalties to optimize outcomes.
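3. Sometimes described as sitting between supervised and unsupervised learning: there are no labeled targets, but the reward signal provides feedback on each action.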
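
The reward-driven loop described above can be sketched with a very simple setting: an epsilon-greedy agent on a multi-armed bandit. This is an assumed toy example, not a method from the notes; the fixed rewards, `epsilon`, and the function name are all hypothetical.

```python
import random


def run_bandit(rewards, n_steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay fixed rewards."""
    rng = random.Random(seed)
    n_arms = len(rewards)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # how often each arm was pulled

    for step in range(n_steps):
        if step < n_arms:
            arm = step                   # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rewards[arm]            # feedback from the environment
        counts[arm] += 1
        # Incremental mean update of the estimate for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = run_bandit(rewards=[0.0, 1.0])
print(estimates)  # [0.0, 1.0]: the agent has learned which arm pays
```

The agent is never told which arm is correct; it only observes rewards and shifts its behaviour toward the actions that paid off.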

### Machine Learning Pitfalls
- Data quality issues, bias, improper training, incorrect loss functions, poor data sampling, and faulty algorithms.

### Algorithm Categories by Task
1. **Regression**: Predicting continuous values.
2. **Classification**: Assigning class labels.
3. **Clustering**: Grouping data points.
4. **Dimensionality Reduction**: Reducing features while maintaining information.
5. **Recommendation Systems**: Predicting user preferences.

---

## Gradient Descent

- Gradient descent is an iterative, greedy algorithm used to minimize a model's loss. At each step it moves the parameters in the direction opposite to the gradient of the loss function, so it converges toward a local minimum, which is not guaranteed to be the global one:
   $$ \theta_{\text{new}} = \theta_{\text{old}} - \eta \cdot \nabla J(\theta_{\text{old}}) $$
   where $\eta$ is the learning rate and $\nabla J(\theta_{\text{old}})$ is the gradient of the loss function evaluated at the current parameters.
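
The update rule can be tried on a one-parameter problem. The sketch below assumes a made-up loss $J(\theta) = (\theta - 3)^2$, whose gradient is $2(\theta - 3)$ and whose minimum is at $\theta = 3$; the function names and the step count are illustrative choices.

```python
def gradient_descent(grad, theta, eta=0.1, n_steps=100):
    """Repeatedly apply theta_new = theta_old - eta * grad(theta_old)."""
    for _ in range(n_steps):
        theta = theta - eta * grad(theta)
    return theta


def grad_J(theta):
    """Gradient of the assumed loss J(theta) = (theta - 3)^2."""
    return 2.0 * (theta - 3.0)


theta_min = gradient_descent(grad_J, theta=0.0)
print(theta_min)  # very close to 3.0
```

With $\eta = 0.1$ the distance to the minimum shrinks by a factor of 0.8 per step. For this particular loss, a learning rate above 1.0 would make the iterates diverge, which is why $\eta$ usually needs tuning.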

---

## Reproducibility
1. Ensure that experiments can be repeated by documenting every step in data handling and modeling.
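
One practical piece of this is fixing random seeds and recording the settings used. The sketch below is an assumed example of that idea: the `config` keys and the `split_indices` helper are hypothetical, not part of any library.

```python
import json
import random

config = {"seed": 42, "test_fraction": 0.25}  # hypothetical experiment settings


def split_indices(n_rows, test_fraction, seed):
    """Shuffle row indices with a fixed seed and split into train and test."""
    rng = random.Random(seed)  # local generator, independent of global state
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


# Running the split twice with the same seed gives the same result.
train_a, test_a = split_indices(20, config["test_fraction"], config["seed"])
train_b, test_b = split_indices(20, config["test_fraction"], config["seed"])
print(train_a == train_b and test_a == test_b)  # True

# Store the settings next to the results so the run can be repeated later.
config_record = json.dumps(config, sort_keys=True)
print(config_record)
```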

---

## Common Python Libraries

- **Scikit-learn**: For machine learning algorithms.
- **Pandas, NumPy**: For data manipulation and analysis.
- **SciPy**: For scientific computing.
- **Matplotlib**: For data visualization.

---

## Exploratory Data Analysis (EDA)

1. **Explore Data**: Understand the structure and distribution of your data.
2. **Feature Engineering**: Creating new features to improve model accuracy.
3. **Normalization & Scaling**: Use `StandardScaler` (Z-score) or `MinMaxScaler` to handle varying data ranges.
4. **Handling Categorical Variables**:
   - One-hot encoding (`pd.get_dummies()`).
   - Label encoding.
   - Multi-hot encoding.
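
To show what the scaling and encoding steps compute, here is a plain-Python sketch of Z-score scaling, min-max scaling, and one-hot encoding. It mirrors the arithmetic only and is not the scikit-learn or pandas implementation; the helper names and data are made up, and the scalers assume the values are not all equal.

```python
from statistics import mean, pstdev


def z_score(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted categories and one 0/1 row per label."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


ages = [20.0, 30.0, 40.0]        # made-up numeric feature
colors = ["red", "blue", "red"]  # made-up categorical feature

scaled_z = z_score(ages)         # centered on 0 with unit variance
scaled_mm = min_max(ages)
categories, encoded = one_hot(colors)

print(scaled_mm)   # [0.0, 0.5, 1.0]
print(categories)  # ['blue', 'red']
print(encoded)     # [[0, 1], [1, 0], [0, 1]]
```

In practice the scaling statistics should be computed on the training data only and then reused on the test data, so that no information leaks from the test set.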

---

## Logistic Regression

- One of the fundamental models used for binary classification. It uses the sigmoid (logistic) function to map a real-valued score to a probability between 0 and 1:
  $$ \sigma(z) = \frac{1}{1 + e^{-z}} $$
  where $z = w \cdot x + b$ is the dot product of the weight vector $w$ and the input vector $x$, plus the bias $b$.
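
The prediction step can be sketched directly from the formula. The weights, bias, and input below are made-up values chosen so that $z = 0$; in a real model they would be learned from data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability of the positive class for input x, weights w, bias b."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b  # z = w . x + b
    return sigmoid(z)


w = [2.0, -1.0]  # made-up weights
b = 0.5          # made-up bias
x = [1.0, 2.5]   # made-up input

p = predict_proba(x, w, b)  # z = 2.0 - 2.5 + 0.5 = 0.0
print(p)  # 0.5
```

A probability of 0.5 corresponds to a point exactly on the decision boundary $z = 0$. Note that this simple form of `sigmoid` can overflow in `math.exp` for very large negative `z`, so a production implementation would treat that case separately.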

---

