# *Regression*

Regression is a supervised machine learning technique used to predict continuous numerical values by studying the relationship between a dependent variable (target) and one or more independent variables (features).

- **Key Points** 
    - Used when the output is numeric (like salary, marks, temperature, price).
    - Helps understand how variables are related.
    - Uses a mathematical equation to map inputs to an output.
    - Commonly fitted using the Least Squares method, which minimizes the sum of squared errors.

- **Example Scenarios**
    - Predicting house prices based on size, location, rooms
    - Predicting salary based on experience
    - Predicting sales based on advertising spend
    - Predicting marks based on study hours

- **Types of Regression**
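    - *Linear Regression* : Straight-line relationship between features and target
    - *Polynomial Regression* : Curved relationship, modeled with polynomial terms of the features
    - *Logistic Regression* : Despite its name, used for classification (it predicts class probabilities rather than a continuous value)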
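The difference between a straight-line fit and a curved fit can be sketched as follows. This is only an illustration: the data is hypothetical (generated from `y = x^2`), and NumPy's `polyfit` is used here as one convenient least-squares fitter, not as the only way to do it.

```python
import numpy as np

# Hypothetical data following a curved (quadratic) pattern: y = x^2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2

# Degree-1 fit (straight line) and degree-2 fit (curve), both by least squares
line_coeffs = np.polyfit(x, y, deg=1)
curve_coeffs = np.polyfit(x, y, deg=2)


def sum_squared_error(coeffs, x, y):
    """Sum of squared differences between fitted and actual values."""
    predictions = np.polyval(coeffs, x)
    return float(np.sum((y - predictions) ** 2))


line_sse = sum_squared_error(line_coeffs, x, y)
curve_sse = sum_squared_error(curve_coeffs, x, y)

print(f"Straight-line fit error: {line_sse:.4f}")
print(f"Polynomial fit error:    {curve_sse:.4f}")
```

On this data the straight line leaves a noticeable error while the degree-2 polynomial fits almost exactly, which is the situation where polynomial regression is the better choice.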

# *Linear Regression*

Linear Regression is a supervised learning algorithm that models the relationship between a dependent variable (target) and one or more independent variables (features) by fitting a linear equation to the data.

It predicts a continuous output (e.g., salary, house price, temperature).

- **Types of Linear Regression**
    1. Simple Linear Regression
    2. Multiple Linear Regression

- **Objective**
    - To find the best-fit line through the data, i.e. the line that minimizes the error between predicted and actual values.

- **Assumptions**
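    1. **Linearity** : The relationship between the features and the target is linear.
    2. **Independence** : Observations (and their errors) are independent of each other.
    3. **Homoscedasticity** : The errors have constant variance across all values of the features.
    4. **Normality of Errors** : The errors (residuals) are normally distributed.
    5. **No Multicollinearity** : The features are not highly correlated with one another (relevant in multiple regression).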

- **Evaluation Metrics**
    - **MSE** (Mean Squared Error)
    - **MAE** (Mean Absolute Error)
    - **RMSE** (Root Mean Squared Error)
    - **R²** (R-squared, Coefficient of Determination)

- **Advantages**
    - Simple and easy to interpret
    - Fast to train
    - Works well when the relationship is truly linear
    - Provides explainable coefficients

- **Disadvantages**
    - Assumes linearity
    - Sensitive to outliers
    - Not suitable for complex patterns
    - Affected by multicollinearity

- **When to use**
    - When the target variable is continuous
    - When the relationship between variables is approximately linear
    - When the dataset has low noise and few outliers
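The evaluation metrics listed above can be computed directly from the actual and predicted values. The following is a minimal sketch using NumPy; the function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE, RMSE and R^2 for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mse = float(np.mean(errors ** 2))      # Mean Squared Error
    mae = float(np.mean(np.abs(errors)))   # Mean Absolute Error
    rmse = float(np.sqrt(mse))             # Root Mean Squared Error

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    # Note: undefined when all actual values are identical (ss_tot == 0).
    ss_res = float(np.sum(errors ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical actual and predicted values
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 7.5, 9.0]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MSE and RMSE penalize large errors more heavily than MAE because the errors are squared, which is also why they are more sensitive to outliers. RMSE and MAE are expressed in the same units as the target.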

## *Simple Linear Regression*

Simple Linear Regression is a statistical technique that models the relationship between a single independent variable (X) and a dependent variable (Y) using a straight line.

It tries to find the best-fitting line that describes how Y changes with X.

- **Equation**
    - `Y = b0 + b1X`
        - Y = dependent variable
        - X = independent variable
        - b0 = intercept
        - b1 = slope (rate of change)

- **Objective**
    - To find the line that minimizes the sum of squared errors between predicted and actual values.

- **Use Cases**
    - Predicting salary based on experience
    - Predicting marks based on study hours
    - Predicting price based on size
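For a single feature, the least-squares line `Y = b0 + b1X` has a well-known closed-form solution: `b1` is the covariance of X and Y divided by the variance of X, and `b0 = mean(Y) - b1 * mean(X)`. The sketch below applies it to hypothetical experience and salary values; the function name and the data are assumptions made for illustration.

```python
import numpy as np


def fit_simple_linear_regression(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()

    # Slope: how much y changes, on average, per unit change in x
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: value of y when x is zero
    b0 = y_mean - b1 * x_mean
    return float(b0), float(b1)


# Hypothetical data: years of experience vs. salary (in thousands)
experience = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

b0, b1 = fit_simple_linear_regression(experience, salary)
predicted_salary = b0 + b1 * 6  # prediction for 6 years of experience

print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
print(f"predicted salary for 6 years: {predicted_salary:.2f}")
```

Here the slope `b1` is read as the expected change in salary for each additional year of experience, which is what makes the coefficients easy to interpret.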

## *Multiple Linear Regression*

Multiple Linear Regression models the relationship between a dependent variable (Y) and two or more independent variables (X₁, X₂, X₃…).

It extends Simple Linear Regression (SLR) to cases where multiple factors influence the output.

- **Equation**
    - `Y = b0 + b1X1 + b2X2 + b3X3 + ... + bnXn`
        - Y = dependent variable
        - X1, X2, X3, ..., Xn = independent variables
        - b0 = intercept
        - b1, b2, b3, ..., bn = coefficients, one for each feature

- **Objective**
    - To build a model that explains the relationship between several predictors and the target variable while minimizing prediction error.

- **Use Cases**
    - Predicting house price using size, location, number of rooms, floor, etc.
    - Predicting sales using budget spent on TV, radio, newspaper ads
    - Predicting crop yield based on rainfall, fertilizer, soil quality
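With several features, the coefficients `b0 ... bn` can be estimated by solving a least-squares problem over all features at once. The sketch below uses NumPy's `np.linalg.lstsq` on hypothetical, noise-free data generated from `y = 1 + 2*x1 + 3*x2`; the data and variable names are assumptions for illustration, and in practice a library such as scikit-learn is often used instead.

```python
import numpy as np

# Hypothetical data with two features, generated from y = 1 + 2*x1 + 3*x2
X = np.array([
    [1.0, 1.0],
    [2.0, 1.0],
    [3.0, 2.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first coefficient acts as the intercept b0
X_design = np.column_stack([np.ones(len(X)), X])

# Solve for the coefficients that minimize the sum of squared errors
coeffs, residuals, rank, singular_values = np.linalg.lstsq(X_design, y, rcond=None)
b0, b1, b2 = coeffs

# Predict for a new observation with x1 = 6, x2 = 4
prediction = b0 + b1 * 6.0 + b2 * 4.0

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
print(f"prediction: {prediction:.2f}")
```

Each coefficient describes the expected change in Y for a one-unit change in its feature while the other features are held fixed. When features are highly correlated, these estimates become unstable, which is why multicollinearity is listed as a concern.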

### SLR vs MLR (Quick Comparison Table)

| Feature               | SLR                    | MLR                      |
| --------------------- | ---------------------- | ------------------------ |
| Independent variables | 1                      | 2 or more                |
| Fitted shape          | Straight line          | Plane or hyperplane      |
| Complexity            | Low                    | Moderate                 |
| Interpretability      | Very easy              | Can get harder           |
| Use case              | One cause → one effect | Many causes → one effect |
