# 📘 Machine Learning Notes

## 📌 Table of Contents
1. [What is Machine Learning?](#what-is-machine-learning)
2. [Types of Machine Learning](#types-of-machine-learning)
3. [Common ML Algorithms](#common-ml-algorithms)
4. [Machine Learning Pipeline](#machine-learning-pipeline)
5. [Model Evaluation Metrics](#model-evaluation-metrics)
6. [Overfitting & Underfitting](#overfitting--underfitting)
7. [Feature Engineering](#feature-engineering)
8. [Hyperparameter Tuning](#hyperparameter-tuning)
9. [Saving and Deploying Models](#saving-and-deploying-models)
10. [Popular Libraries](#popular-libraries)

---

## 🔹 What is Machine Learning?

Machine Learning is a subset of AI in which systems **learn patterns from data** and improve at a task as they see more examples, **without being explicitly programmed** with task-specific rules.

### Key Components:
- **Data**: Input used to learn patterns  
- **Model**: Mathematical representation of a real-world process  
- **Training**: Process of fitting the model's parameters to the data  
- **Prediction**: Using the trained model to estimate outputs for new, unseen inputs  
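
The four components above can be seen in a very small sketch. The data here is hypothetical (points that follow `y = 2x + 1` exactly), and NumPy's least-squares line fit stands in for "training"; real projects would usually involve noisy data and a dedicated ML library.

```python
import numpy as np

# Data: hypothetical inputs x and targets y that follow y = 2x + 1 exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Model: a straight line, y = slope * x + intercept.
# Training: least squares picks the slope and intercept that best fit the data.
slope, intercept = np.polyfit(x, y, deg=1)

# Prediction: apply the learned parameters to an input not seen during training.
x_new = 10.0
y_pred = slope * x_new + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={y_pred:.2f}")
```

Nothing in the code states the rule `y = 2x + 1`; the slope and intercept are recovered from the examples alone.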

---

## 🔹 Types of Machine Learning

| Type | Description | Examples |
|------|-------------|----------|
| **Supervised Learning** | Learns from labeled data | Regression, Classification |
| **Unsupervised Learning** | Learns from unlabeled data | Clustering, Dimensionality Reduction |
| **Semi-supervised Learning** | Learns from a small labeled set plus a larger unlabeled set | Text classification with few labeled examples |
| **Reinforcement Learning** | Learns via rewards/punishments | Game AI, Robotics |
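
As a rough illustration of the first two rows, the sketch below fits a supervised and an unsupervised model to the same toy data. It assumes scikit-learn is installed; the six data points and the choice of `LogisticRegression` and `KMeans` are illustrative assumptions, not the only options.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Hypothetical 1-D data: two well-separated groups of points.
X = np.array([[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]])
y = np.array([0, 0, 0, 1, 1, 1])  # labels, used only by the supervised model

# Supervised: the classifier learns the mapping from X to the given labels y.
clf = LogisticRegression().fit(X, y)
supervised_pred = clf.predict(np.array([[0.1], [5.1]]))

# Unsupervised: KMeans sees only X and groups the points on its own.
# Cluster ids are arbitrary, so they need not match the label values in y.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_

print("supervised predictions:", supervised_pred)
print("cluster assignments:", cluster_ids)
```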

---

## 🔹 Common ML Algorithms

### ✅ Supervised:
- Linear Regression  
- Logistic Regression  
- Decision Trees  
- Random Forest  
- Support Vector Machines (SVM)  
- K-Nearest Neighbors (KNN)  
- Gradient Boosting (XGBoost, LightGBM)  

### ✅ Unsupervised:
- K-Means Clustering  
- Hierarchical Clustering  
- Principal Component Analysis (PCA)  
- t-SNE  

---

## 🔹 Machine Learning Pipeline

1. Problem Definition  
2. Data Collection  
3. Data Cleaning  
4. Exploratory Data Analysis (EDA)  
5. Feature Engineering  
6. Splitting Data (Train/Test)  
7. Model Selection  
8. Training the Model  
9. Model Evaluation  
10. Model Tuning (Hyperparameters)  
11. Saving/Deploying the Model  
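
A compressed sketch of steps 2 through 9 might look like the following. It assumes scikit-learn and uses its built-in Iris dataset, so collection and cleaning are trivial here; the 25% test split, the scaler, and the logistic regression model are illustrative choices rather than recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 2-3: load a small, already-clean built-in dataset.
X, y = load_iris(return_X_y=True)

# Step 6: hold out 25% of the rows for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Steps 5 and 7: scaling plus a classifier, chained so that the scaler
# is fitted on the training rows only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Step 8: train on the training split.
model.fit(X_train, y_train)

# Step 9: evaluate on the held-out rows.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline is one way to keep statistics from the test rows out of training.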

---

## 🔹 Model Evaluation Metrics

### For Classification:
- Accuracy  
- Precision, Recall, F1 Score  
- Confusion Matrix  
- ROC-AUC Score  
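
To make the definitions concrete, here is a small worked sketch in plain Python. The labels and predictions are made up for illustration. ROC-AUC is left out because it needs predicted scores or probabilities rather than hard labels.

```python
# Hypothetical true labels and predictions for a binary classifier.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Confusion matrix cells, counted pair by pair.
pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct
precision = tp / (tp + fp)          # share of predicted positives that are real
recall = tp / (tp + fn)             # share of real positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```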

### For Regression:
- Mean Absolute Error (MAE)  
- Mean Squared Error (MSE)  
- Root Mean Squared Error (RMSE)  
- R² Score  
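
The four regression metrics can be computed by hand as in the sketch below. The target and predicted values are hypothetical and chosen so the arithmetic is easy to follow.

```python
import math

# Hypothetical targets and model predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]  # residuals: 1, 0, -2, -1

mae = sum(abs(e) for e in errors) / n   # average absolute error
mse = sum(e * e for e in errors) / n    # average squared error
rmse = math.sqrt(mse)                   # same units as the target

# R^2 compares the model against always predicting the mean of y_true.
mean_true = sum(y_true) / n
ss_res = sum(e * e for e in errors)
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1.0 - ss_res / ss_tot

print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```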

---

## 🔹 Overfitting & Underfitting

| Term | Description | Solution |
|------|-------------|----------|
| **Overfitting** | Model performs well on training data but poorly on test data | Reduce model complexity, add regularization, collect more training data |
| **Underfitting** | Model performs poorly on both training and test data | Increase model complexity, add more informative features |
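
One way to see both effects is to fit polynomials of increasing degree to a small noisy sample and compare training and test error. The sketch below uses NumPy only; the sine-shaped target, the noise level, the sample sizes, and the degrees 1, 3, and 12 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples from a hypothetical target, y = sin(3x) + noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data(20)    # small training set
x_test, y_test = make_data(200)     # larger held-out set

def mse(coeffs, x, y):
    """Mean squared error of the polynomial given by coeffs on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree={degree:2d} train MSE={results[degree][0]:.4f} "
          f"test MSE={results[degree][1]:.4f}")
```

Training error cannot increase as the degree grows, because each higher-degree polynomial includes the lower-degree ones as special cases. A straight line is usually too simple for this target (underfitting), while a very high degree tends to chase the noise, which typically shows up as a widening gap between training and test error (overfitting).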

---

## 🔹 Feature Engineering

- Handling Missing Values  
- Encoding Categorical Data (OneHot, Label Encoding)  
- Feature Scaling (StandardScaler, MinMaxScaler)  
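- Feature Selection (correlation analysis) and Dimensionality Reduction (PCA, which builds new combined features rather than selecting existing ones)  
- Creating new features (interaction terms, polynomial terms)  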
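
Several of these steps can be sketched on a tiny made-up table. The example assumes scikit-learn; the `ages` and `cities` columns are hypothetical, and mean imputation is just one of several reasonable strategies for missing values.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical columns: a numeric one with a missing value, and a categorical one.
ages = np.array([[25.0], [np.nan], [35.0], [45.0]])
cities = np.array([["paris"], ["tokyo"], ["paris"], ["lima"]])

# Handling missing values: replace NaN with the column mean (35.0 here).
ages_filled = SimpleImputer(strategy="mean").fit_transform(ages)

# Feature scaling: rescale to zero mean and unit variance.
ages_scaled = StandardScaler().fit_transform(ages_filled)

# Encoding categorical data: one 0/1 column per distinct city.
encoder = OneHotEncoder(handle_unknown="ignore")
cities_encoded = encoder.fit_transform(cities).toarray()

# Creating a new feature: a squared (polynomial) term of the scaled age.
ages_squared = ages_scaled ** 2

features = np.hstack([ages_scaled, ages_squared, cities_encoded])
print(encoder.categories_[0])
print(features.shape)
```

In a real project the imputer, scaler, and encoder would be fitted on the training split only and then applied to the test split.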

---

## 🔹 Hyperparameter Tuning

- Grid Search  
- Random Search  
- Bayesian Optimization (e.g., Optuna)  
- Cross-validation (K-Fold), used to score each candidate configuration during the search  
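
Grid search combined with K-Fold cross-validation could be sketched as follows, assuming scikit-learn. The K-Nearest Neighbors model, the Iris dataset, and the small parameter grid are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Candidate values to try: 4 x 2 = 8 combinations in total.
param_grid = {
    "n_neighbors": [1, 3, 5, 7],
    "weights": ["uniform", "distance"],
}

# Each combination is scored with 5-fold cross-validation on accuracy.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```

Random search and Bayesian optimization follow the same pattern of proposing configurations and scoring them, but they sample the space instead of enumerating every combination.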

---

## 🔹 Saving and Deploying Models

### Saving:
```python
import joblib

# Serialize the fitted estimator `model` to disk; joblib.load('model.pkl') restores it.
joblib.dump(model, 'model.pkl')
