### Supervised Learning
Supervised learning is a type of machine learning in which the algorithm is trained on a labeled dataset, meaning each input comes with the correct output. The model learns a mapping from inputs to outputs that it can then apply to new, unseen inputs.

It’s called “supervised” because a teacher (the labels) guides the learning.

| Component         | Description                                            |
| ----------------- | ------------------------------------------------------ |
| **Features (X)**  | Input data or independent variables                    |
| **Labels (y)**    | Target/output or dependent variable                    |
| **Model**         | Learns relationship between X and y                    |
| **Loss Function** | Measures how far predictions are from actual labels    |
| **Training**      | Process of adjusting model parameters to minimize loss |

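To make the components in the table concrete, here is a minimal sketch in plain Python. It assumes a one-feature linear model and toy data invented for illustration. The names `predict`, `mse_loss`, and `train`, the learning rate, and the number of epochs are all assumptions; gradient descent is only one common way to minimize a loss.

```python
# Features (X) and labels (y): toy data generated from y = 2x + 1
# (illustrative values, not from a real dataset).
X = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]


def predict(w, b, x):
    """Model: a straight line with parameters w (slope) and b (intercept)."""
    return w * x + b


def mse_loss(w, b):
    """Loss function: mean squared error between predictions and labels."""
    return sum((predict(w, b, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(X)


def train(learning_rate=0.05, epochs=2000):
    """Training: repeatedly adjust w and b in the direction that lowers the loss."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (predict(w, b, xi) - yi) * xi for xi, yi in zip(X, y)) / n
        grad_b = sum(2 * (predict(w, b, xi) - yi) for xi, yi in zip(X, y)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


initial_loss = mse_loss(0.0, 0.0)
w, b = train()
final_loss = mse_loss(w, b)
print(f"w={w:.3f}, b={b:.3f}, loss: {initial_loss:.3f} -> {final_loss:.6f}")
```

On this toy data the learned parameters should approach the slope 2 and intercept 1 that were used to generate the labels.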
### Types of Supervised Learning

1. Regression
   - Output is a continuous numerical value
   - Examples: predicting house prices, predicting temperature
   - Algorithms: Linear Regression, Decision Tree Regressor, Random Forest Regressor

2. Classification
   - Output is a categorical (discrete) label
   - Examples: email spam detection (spam / not spam), disease prediction (yes / no)
   - Algorithms: Logistic Regression, Decision Tree, Random Forest, SVM, KNN

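The contrast between the two types can be sketched with scikit-learn. The data below is hypothetical and tiny (house sizes with prices, and a made-up count of suspicious words per email); it only illustrates that a regressor returns a number while a classifier returns a class label.

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: the target is a continuous number.
# Hypothetical house sizes (square metres) and prices (thousands),
# constructed so that price is exactly 3 * size.
sizes = [[50], [80], [100], [120], [150]]
prices = [150.0, 240.0, 300.0, 360.0, 450.0]

regressor = LinearRegression().fit(sizes, prices)
predicted_price = regressor.predict([[110]])[0]
print("Predicted price:", predicted_price)

# Classification: the target is a discrete label (1 = spam, 0 = not spam).
# Hypothetical feature: number of suspicious words in an email.
suspicious_words = [[1], [2], [3], [6], [7], [8]]
is_spam = [0, 0, 0, 1, 1, 1]

classifier = LogisticRegression().fit(suspicious_words, is_spam)
predicted_class = classifier.predict([[7.5]])[0]
print("Predicted class:", predicted_class)
```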
Workflow:
1. Collect labeled data (for example, read it from a CSV file).
2. Split the data into training and test sets (for example, with train_test_split).
3. Optionally preprocess the features: handle missing values, then scale. StandardScaler standardizes each feature to zero mean and unit variance; MinMaxScaler rescales each feature to the range 0 to 1. Fit the scaler on the training set only, then apply it to the test set.
4. Train a model on the training set (for example, LinearRegression).
5. Predict on the test set.
6. Evaluate the predictions using metrics:
   - Regression: MSE, RMSE, R²
   - Classification: Accuracy, Precision, Recall, F1-score

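The workflow steps could look roughly like this for a regression task. This is a sketch under stated assumptions: synthetic data generated with NumPy stands in for the CSV file, and the coefficients, noise level, test size, and random seeds are arbitrary choices. For classification, the same steps would apply with a classifier and metrics such as `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` from `sklearn.metrics`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Collect labeled data (synthetic stand-in for a CSV file; an assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(0, 0.5, size=200)

# 2. Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Scale features: fit on the training set only, then transform both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 4. Train a model on the training set.
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# 5. Predict on the test set.
y_pred = model.predict(X_test_scaled)

# 6. Evaluate with regression metrics.
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```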
Difference between Linear and Logistic Regression

| Feature       | Linear Regression                  | Logistic Regression                     |
| ------------- | ---------------------------------- | --------------------------------------- |
| Used For      | **Regression** (continuous output) | **Classification** (categorical output) |
| Output        | Any real number                    | Probability (0 to 1)                    |
| Activation    | None                               | Sigmoid                                 |
| Equation Type | Straight Line                      | S-shaped Curve (Sigmoid)                |

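The "Activation" and "Output" rows can be illustrated in plain Python. In this sketch the weight `w` and bias `b` are arbitrary illustrative values, not fitted parameters, and the 0.5 decision threshold is a common default rather than a requirement.

```python
import math


def linear_output(w, b, x):
    """Linear regression: the output is w*x + b and can be any real number."""
    return w * x + b


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_output(w, b, x):
    """Logistic regression: the same linear score, passed through the sigmoid."""
    return sigmoid(linear_output(w, b, x))


w, b = 1.5, -3.0  # illustrative parameters (assumed, not learned)
for x in [-4.0, 0.0, 2.0, 4.0, 8.0]:
    score = linear_output(w, b, x)
    probability = logistic_output(w, b, x)
    label = int(probability >= 0.5)  # threshold the probability to get a class
    print(f"x={x:5.1f}  linear={score:6.2f}  probability={probability:.4f}  class={label}")
```

The linear score grows without bound as `x` increases, while the probability stays between 0 and 1 and crosses 0.5 exactly where the linear score is 0.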


Difference between KNN and Decision Tree

| Feature / Point                   | **KNN (K-Nearest Neighbors)**                                                                                  | **Decision Tree**                                                           |
| --------------------------------- | -------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- |
| **Type of Algorithm**             | **Lazy learner** (does not learn during training)                                                              | **Eager learner** (learns patterns during training)                         |
| **Working Principle**             | Finds the **k closest neighbors** and predicts based on majority vote (classification) or average (regression) | Splits data into branches based on **if/else conditions** to make decisions |
| **Training Time**                 | **Very fast** (no model built)                                                                                 | **Slower** (must build the tree)                                            |
| **Prediction Time**               | **Slow** (must search neighbors at prediction time)                                                            | **Fast** (just follow tree branches)                                        |
| **Requires Feature Scaling?**     | **Yes** (distance-based algorithm) → use StandardScaler/MinMaxScaler                                           | **No need** (values compared by thresholds, not distance)                   |
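| **Handles Categorical Data**      | Hard (requires encoding and a suitable distance measure)                                                       | Good in principle (splits work on categories and numbers), though some implementations, such as scikit-learn, still require encoding |
| **Sensitive to Outliers / Noise** | **Highly sensitive**, especially when k is small                                                               | Less sensitive to outliers in feature values (splits depend on ordering, not magnitude), but a deep tree can still fit noise |
| **Overfitting Tendency**          | High when k is small; decreases as k grows (a very large k can underfit)                                       | **High chance** if the tree grows too deep                                  |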
| **Interpretability**              | Hard to interpret                                                                                              | **Very easy** to visualize and understand                                   |
| **Use Cases**                     | Recommendation systems, handwriting recognition, similarity search                                             | Rule-based decision making, credit approval, medical decisions              |

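The feature-scaling row can be checked with a small experiment. The sketch below uses synthetic data built for this purpose (an assumption): one informative feature in the range 0 to 1 and one pure-noise feature in the range 0 to 1000. The choices of `n_neighbors=5`, `max_depth=3`, and the random seeds are arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: the label depends only on the small-range feature.
rng = np.random.default_rng(0)
n = 400
informative = rng.uniform(0, 1, size=n)   # decides the label
noise = rng.uniform(0, 1000, size=n)      # large range, carries no signal
X = np.column_stack([informative, noise])
y = (informative > 0.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# KNN with and without scaling: distances on raw data are dominated by the noise feature.
knn_raw = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_scaled = make_pipeline(
    StandardScaler(), KNeighborsClassifier(n_neighbors=5)
).fit(X_train, y_train)

# Decision tree with and without scaling: threshold splits ignore feature magnitude.
tree_raw = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
tree_scaled = make_pipeline(
    StandardScaler(), DecisionTreeClassifier(max_depth=3, random_state=0)
).fit(X_train, y_train)

knn_raw_acc = knn_raw.score(X_test, y_test)
knn_scaled_acc = knn_scaled.score(X_test, y_test)
tree_raw_acc = tree_raw.score(X_test, y_test)
tree_scaled_acc = tree_scaled.score(X_test, y_test)

print(f"KNN  raw={knn_raw_acc:.2f}  scaled={knn_scaled_acc:.2f}")
print(f"Tree raw={tree_raw_acc:.2f}  scaled={tree_scaled_acc:.2f}")
```

On data like this, KNN should improve noticeably once the features are scaled, while the decision tree should give the same predictions either way.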