## 🤖 **What is Machine Learning (ML)?**

**Machine Learning** is a subset of Artificial Intelligence (AI) that enables computers to **learn from data** and make decisions or predictions **without being explicitly programmed**.

In simple terms:
> "Instead of writing rules for every situation, we feed the machine data and let it figure out patterns and rules on its own."

---

## 📂 **Types of Machine Learning**

ML is commonly divided into **four main types**:

| Type | Description | Uses Labels? | Common Use Cases |
|------|-------------|--------------|------------------|
| **Supervised Learning** | Learns from labeled data | Yes | Email spam detection, price prediction |
| **Unsupervised Learning** | Finds patterns in unlabeled data | No | Customer segmentation, anomaly detection |
| **Semi-Supervised Learning** | Uses small labeled + large unlabeled data | Partially | Medical image classification |
| **Reinforcement Learning** | Learns through trial and error via rewards | No (uses reward signals) | Game AI, robotics, recommendation engines |

---

## 🧠 **What is Supervised Machine Learning?**

Supervised learning is a type of ML in which the model is trained on a labeled dataset: each training example pairs an input with its correct output.

**Goal**: Learn a mapping from inputs \(X\) to outputs \(Y\), so that we can predict \(Y\) for new \(X\).

---

### 🔀 Types of Supervised Learning

There are mainly **two types**:

#### 1. **Regression**
- Predict **continuous** values.
- Examples: predicting house prices, temperature, or stock prices.

📌 Common Algorithms:
- Linear Regression
- Ridge/Lasso Regression
- Support Vector Regression (SVR)
- Decision Tree Regression
- Random Forest Regression
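
To make regression concrete, here is a minimal pure-Python sketch of one-feature linear regression fitted by ordinary least squares. The function name `fit_line` and the toy house data are assumptions made for illustration; in practice a library class such as scikit-learn's `LinearRegression` would normally be used instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: house size (m^2) -> price (thousands), built as y = 3x + 50
sizes = [50, 60, 80, 100]
prices = [200, 230, 290, 350]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 70 + intercept  # continuous prediction for an unseen 70 m^2 house
print(slope, intercept, predicted_price)
```

The output is a continuous number rather than a class label, which is what separates regression from classification.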

#### 2. **Classification**
- Predict **discrete labels/classes**.
- Examples: email spam detection (spam/ham), digit recognition (0–9), cancer detection (malignant/benign).

📌 Common Algorithms:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Decision Trees
- Random Forest
- Support Vector Machines (SVM)
- Naive Bayes
- Gradient Boosting (XGBoost, LightGBM, CatBoost)
- Neural Networks (basic ones)
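
As an illustration of classification, the sketch below implements K-Nearest Neighbors in plain Python. The function `knn_predict` and the two email features are hypothetical and chosen only to keep the example small; scikit-learn's `KNeighborsClassifier` is the usual choice for real work.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point
    by_distance = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (number of links, number of exclamation marks) per email
emails = [(0, 0), (1, 0), (0, 1), (7, 5), (8, 6), (6, 7)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(emails, labels, (1, 1)))  # ham
print(knn_predict(emails, labels, (7, 6)))  # spam
```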

---

### ✅ Let's Do It Step-by-Step

Here’s how we can structure your learning:
1. **Pick a simple dataset** (Iris, Titanic, California Housing, etc.; note that the older Boston Housing dataset was removed from scikit-learn in version 1.2)
2. **Understand the problem type** (regression or classification)
3. **Apply algorithms one by one**
4. **Visualize and evaluate** performance (accuracy, confusion matrix, RMSE, etc.)
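
The evaluation metrics named in step 4 can be sketched in a few lines of plain Python. The helper names and the sample labels below are assumptions for illustration; scikit-learn's `sklearn.metrics` module provides tested versions of these metrics.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred):
    """Counts keyed by (actual, predicted) label pairs."""
    return Counter(zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error for regression outputs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical classification results
actual = ["spam", "ham", "spam", "ham", "ham"]
predicted = ["spam", "ham", "ham", "ham", "spam"]
acc = accuracy(actual, predicted)
cm = confusion_matrix(actual, predicted)

# Hypothetical regression results
err = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])

print(acc, dict(cm), err)
```

Accuracy and the confusion matrix apply to classification, while RMSE applies to regression, so the metric follows from the problem type identified in step 2.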

---

## 🧩 **Unsupervised Learning**

### 🧠 What it does:
Finds **hidden patterns** or **intrinsic structure** in **unlabeled data**.

### 🔧 Use Cases:
- Customer segmentation
- Topic modeling
- Anomaly detection

### 📚 Algorithms:
- **Clustering:**
  - K-Means
  - DBSCAN
  - Hierarchical Clustering
- **Dimensionality Reduction:**
  - PCA (Principal Component Analysis)
  - t-SNE / UMAP
  - Autoencoders
- **Association Rules:**
  - Apriori
  - Eclat
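
To show how clustering finds structure without labels, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. The function signature, the fixed starting centroids, and the toy points are assumptions for illustration; real implementations such as scikit-learn's `KMeans` add smarter initialization and convergence checks.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled 2-D points forming two visible groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
# Assumed starting centroids: one point picked from each region
centroids, clusters = kmeans(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

No labels are supplied at any point: the grouping emerges purely from distances between the points.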

---

## 🌗 **Semi-Supervised Learning**

### 🧠 What it does:
Trains on a small set of labeled data combined with a large amount of unlabeled data.

### 🔧 Use Cases:
- Image classification when labels are expensive
- NLP tasks with limited annotations

### 📚 Techniques:
- Self-training
- Label propagation
- Consistency-based methods (e.g., MixMatch, FixMatch)
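
Self-training, the first technique above, can be sketched as a loop that pseudo-labels only the points the current model is confident about. Everything below is an assumption made for illustration: the nearest-centroid "model", the `margin` confidence rule, and the one-dimensional toy data. It is not a reference implementation of any particular library.

```python
def centroids_of(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def self_train(labeled, unlabeled, margin=2.0, rounds=5):
    """Repeatedly pseudo-label the unlabeled points the current model is confident about."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        centers = centroids_of(labeled)  # "retrain" on everything labeled so far
        still_unlabeled = []
        for x in unlabeled:
            # Rank classes by distance from x to each class centroid
            ranked = sorted(centers, key=lambda y: abs(x - centers[y]))
            best, runner_up = ranked[0], ranked[1]
            confident = abs(x - centers[runner_up]) - abs(x - centers[best]) >= margin
            if confident:
                labeled.append((x, best))  # accept the pseudo-label
            else:
                still_unlabeled.append(x)  # try again next round
        if len(still_unlabeled) == len(unlabeled):
            break  # nothing new was labeled, so stop
        unlabeled = still_unlabeled
    return labeled, unlabeled


# Hypothetical data: two labeled examples and four unlabeled ones
labeled = [(1.0, "small"), (9.0, "large")]
unlabeled = [2.0, 3.0, 8.0, 5.0]
final_labeled, leftover = self_train(labeled, unlabeled)
print(final_labeled)
print(leftover)
```

The ambiguous point halfway between the two classes stays unlabeled, which is the intended behavior: accepting low-confidence pseudo-labels tends to reinforce the model's own mistakes.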

---

## 🎮 **Reinforcement Learning (RL)**

### 🧠 What it does:
An **agent** learns how to act in an environment by **taking actions** and **receiving rewards**, with the aim of maximizing its cumulative reward over time.

### 🔧 Use Cases:
- Game-playing bots (e.g., AlphaGo for Go, OpenAI Five for Dota 2)
- Robotic control
- Personalized recommendations

### 📚 Algorithms:
- **Value-based:** Q-Learning, Deep Q Network (DQN)
- **Policy-based:** REINFORCE, PPO (Proximal Policy Optimization)
- **Actor-Critic Methods:** A3C, DDPG
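
As a small illustration of the value-based family, the sketch below runs tabular Q-Learning on a made-up five-state corridor. The environment, the constants, and the purely random exploration policy are all assumptions chosen to keep the example short; practical agents usually explore with a strategy such as epsilon-greedy.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Hypothetical environment: reward 1 only when the goal state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):  # episodes
    state = 0
    while state != N_STATES - 1:
        # Explore uniformly at random (Q-Learning is off-policy, so this still works)
        action = rng.choice(ACTIONS)
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-Learning update: move Q(s, a) toward reward + discounted best future value
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# Greedy policy read off the learned Q-table for every non-goal state
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The learned greedy policy should point toward the goal in every state, even though the agent was never told which direction is correct and only ever saw rewards.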

---

## 🧠 **Emerging & Advanced Categories (2025)**

### **Self-Supervised Learning**
- Learns useful features from unlabeled data by predicting part of the input (masked tokens or image patches) or by contrasting augmented views of the same example
- Example models and methods: BERT, SimCLR, MAE, CLIP
- Used in: Foundation models (GPT, DALL·E), vision-language models
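
The "predict part of the input" idea can be sketched as a function that turns raw, unlabeled tokens into a training pair. This is a simplified illustration under stated assumptions: the function name, the `[MASK]` string, and the masking probability are hypothetical, and BERT's actual masking procedure has more cases than shown here.

```python
import random


def make_masked_example(tokens, mask_prob=0.15, rng=None, mask_token="[MASK]"):
    """Turn an unlabeled token sequence into (masked_input, targets) for masked prediction."""
    rng = rng or random.Random()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok  # the "label" comes from the data itself
        else:
            masked.append(tok)
    return masked, targets


tokens = "the cat sat on the mat and looked at the dog".split()
masked, targets = make_masked_example(tokens, mask_prob=0.3, rng=random.Random(0))
print(masked)
print(targets)
```

A model trained to recover `targets` from `masked` needs no human annotation, which is why this approach scales to very large unlabeled corpora.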

### **Online Learning**
- Learns incrementally as data arrives
- Used in: Real-time fraud detection, adaptive recommender systems
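
Incremental learning can be sketched as a model that updates its parameters after every single example instead of retraining on a full dataset. The class name, the learning rate, and the synthetic data stream below are assumptions for illustration; scikit-learn offers a similar workflow through estimators that support `partial_fit`, such as `SGDRegressor`.

```python
class OnlineLinearModel:
    """One-feature linear model updated one example at a time with SGD."""

    def __init__(self, learning_rate=0.05):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.weight * x + self.bias

    def learn_one(self, x, y):
        # One gradient step on the squared error of this single example
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


model = OnlineLinearModel()
# Hypothetical stream: examples arrive one by one and follow y = 2x + 1
for t in range(2000):
    x = t % 5
    model.learn_one(x, 2 * x + 1)

print(model.weight, model.bias)
```

Because each example is used once and then discarded, memory use stays constant no matter how long the stream runs.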

### **Federated Learning**
- Models are trained across multiple devices without centralizing the raw data, which helps preserve privacy
- Used in: Healthcare, mobile applications (e.g., Gboard next-word prediction)
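
The server-side aggregation step of federated averaging (FedAvg) can be sketched as a weighted mean of client model weights. The function name, the data layout, and the numbers are assumptions for illustration; a real system also handles local training, client sampling, and secure communication.

```python
def federated_average(client_updates):
    """FedAvg aggregation: average client weights, weighted by local example counts.

    client_updates: list of (weights, num_examples) pairs, one per device.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in client_updates) / total
        for i in range(dim)
    ]


# Hypothetical round: three devices send locally trained weights, never their raw data
updates = [([1.0, 0.0], 10), ([3.0, 2.0], 30), ([2.0, 4.0], 60)]
global_weights = federated_average(updates)
print(global_weights)
```

Devices with more local examples pull the global model further toward their own weights, while the server only ever sees model parameters.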

---

## 📊 Summary Table

| Algorithm | Type | Use Case |
|----------|------|----------|
| Linear Regression | Supervised (Regression) | Price prediction |
| Decision Trees | Supervised (Classification/Regression) | Loan approval |
| K-Means | Unsupervised (Clustering) | Customer segmentation |
| PCA | Unsupervised (Dim. Reduction) | Visualization, preprocessing |
| Q-Learning | Reinforcement Learning | Game AI |
| BERT | Self-Supervised | NLP (text understanding) |
| SimCLR | Self-Supervised | Image feature learning |
| FedAvg | Federated Learning | Private mobile ML |

---

## **Steps to Install Machine Learning Libraries**

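- Create and activate a new conda environment, for example `conda create -n ml-env python` followed by `conda activate ml-env` (the name `ml-env` is only an example).
- Install the following libraries using `pip install`:
  - numpy
  - pandas
  - matplotlib
  - seaborn
  - scipy
  - scikit-learn
  - plotly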
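
After installation, a short check like the one below can confirm that each package listed above is importable in the active environment. This is a sketch with a hypothetical helper name, `missing_packages`; note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names for the packages listed above (scikit-learn imports as "sklearn")
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "scipy", "sklearn", "plotly"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("All set!" if not missing else f"Missing: {', '.join(missing)}")
```

If anything is reported as missing, check that the conda environment is activated before running `pip install` again.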