1. [Linear Regression](#linear_reg)
   1. [Assumption](#assumption)
   2. [Accuracy metrics](#accuracy_metrics)
   3. [Regularization](#regularization)
2. [Non-Linear Regression](#non_linear)
   1. Logistic Regression

<a id='linear_reg'></a>
# Linear Regression

<a id="assumption"></a>
### Assumption

In [1]:
import sys
import numpy as np
import pandas as pd
sys.path.append(r'../../')
from utils.plots import plot_linear_plot
np.random.seed(42)  # seed NumPy's global RNG; np.random.seed() returns None, so there is nothing to assign

In [2]:
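m = 10          # true slope
b = 2           # true intercept
noise_max = 50  # exclusive upper bound for the integer noise

x = np.arange(start=0, stop=10, step=1)
# non-negative integer noise in [0, noise_max); kept in its own variable instead of overwriting the bound
noise = np.random.randint(low=0, high=noise_max, size=len(x))

y = m * x + b + noise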

In [3]:
# first order polynomial - simple linear regression
coeff = np.polyfit(x=x, y=y, deg=1)
y_pred = (coeff[0] * x**1) + coeff[1]
plot_linear_plot(x, y, y_pred).show()

In [4]:
# third order polynomial
coeff = np.polyfit(x=x, y=y, deg=3)
y_pred = (coeff[0] * x**3) + (coeff[1] * x**2) + (coeff[2] * x**1) + coeff[3]
plot_linear_plot(x, y, y_pred).show()

In [5]:
# eighth-order polynomial, evaluated with np.polyval instead of a hand-written expansion
coeff = np.polyfit(x=x, y=y, deg=8)
y_pred = np.round(np.polyval(coeff, x),2)
plot_linear_plot(x, y, y_pred).show()
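
The hand-written expansions in the cells above should give the same values as `np.polyval`, which expects coefficients ordered from the highest power down. The sketch below checks that equivalence and records the training error for the three degrees used above. It regenerates similar data with a local generator so it runs on its own; `x_demo`, `y_demo`, and `train_mse` are illustrative names, not part of the notebook.

```python
import numpy as np

# Assumption: a local generator, independent of the notebook's global seed.
rng = np.random.default_rng(42)

# Same shape of data as the cells above: a line plus non-negative integer noise.
x_demo = np.arange(10)
y_demo = 10 * x_demo + 2 + rng.integers(low=0, high=50, size=x_demo.size)

# Degree 3: manual expansion vs. np.polyval (highest power first).
c = np.polyfit(x_demo, y_demo, deg=3)
manual = c[0] * x_demo**3 + c[1] * x_demo**2 + c[2] * x_demo + c[3]
via_polyval = np.polyval(c, x_demo)
same = np.allclose(manual, via_polyval)

# Training MSE per degree. A higher degree can always reproduce a lower-degree
# fit, so the training error never increases with the degree.
train_mse = {}
for deg in (1, 3, 8):
    coeff_d = np.polyfit(x_demo, y_demo, deg=deg)
    residual = y_demo - np.polyval(coeff_d, x_demo)
    train_mse[deg] = float(np.mean(residual**2))

print("manual expansion matches np.polyval:", same)
print("training MSE by degree:", train_mse)
```

Because the training error can only shrink as the degree grows, it cannot be used on its own to pick the degree.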

<a id='accuracy_metrics'></a>
### Accuracy metrics

1. **MAE (Mean Absolute Error)**
   - Pros:
     - The MAE is expressed in the same unit as the output variable.
     - Robust to outliers.
   - Cons:
     - Not differentiable at zero error, so gradient-based training needs subgradients or a smooth substitute such as the Huber loss.

2. **MSE (Mean Squared Error)**
   - Pros:
     - Differentiable and can be used as a loss function.
   - Cons:
     - Output is in squared units.
     - Not robust to outliers due to squared differences.

3. **RMSE (Root Mean Squared Error)**
   - Pros:
     - Output is in the same unit as the target variable.

4. **RMSLE (Root Mean Squared Log Error)**
   - Pros:
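     - Measures relative rather than absolute error, so large errors on large target values are damped by the logarithm.
     - Useful when underestimation is costlier than overestimation.
   - Cons:
     - Asymmetric: underestimates are penalized more heavily than overestimates of the same size.
     - Requires targets and predictions greater than -1, because it takes log(1 + value).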

5. **MAPE (Mean Absolute Percentage Error)**
   - Pros:
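     - Scale-independent: errors are expressed as a percentage of the actual value, so small and large targets are comparable.
   - Cons:
     - Undefined when an actual value is zero, and inflated when actual values are close to zero.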

6. **R2 (R-Squared)**
   $$R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}$$
   - Pros:
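     - Compares the fitted model against a baseline that always predicts the mean.
     - Useful for comparing models fitted to the same target.
     - Reports the fraction of variance explained: 1 is a perfect fit, 0 matches the mean baseline, and values can be negative when the model is worse than that baseline (e.g., on held-out data).
   - Cons:
     - Never decreases on the training data when a feature is added, even a useless one, so it rewards complexity.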

7. **Adj R2 (Adjusted R-Squared)**
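   $$\text{Adjusted } R^2 = 1 - \left(1 - R^2\right) \cdot \frac{n - 1}{n - k - 1}$$
   where $n$ is the number of samples and $k$ is the number of features.
   - Pros:
     - Penalizes the number of features, so models with different feature counts can be compared fairly.
     - Decreases when an added feature does not improve the fit enough to justify it.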
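
The metrics listed above can be computed directly with NumPy. This is a minimal sketch using the standard textbook definitions; the function name `regression_metrics`, the dictionary keys, and the toy values are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def regression_metrics(y_true, y_pred, n_features):
    """Return the metrics above as a dict; n_features is k in adjusted R2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    err = y_true - y_pred

    mae = np.mean(np.abs(err))
    mse = np.mean(err**2)
    rmse = np.sqrt(mse)
    # RMSLE compares log(1 + value), so both arrays must be greater than -1.
    rmsle = np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))
    # MAPE divides by the actual values, so it is undefined if any of them is 0.
    mape = np.mean(np.abs(err / y_true)) * 100

    ss_res = np.sum(err**2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)

    return {
        "mae": float(mae), "mse": float(mse), "rmse": float(rmse),
        "rmsle": float(rmsle), "mape": float(mape),
        "r2": float(r2), "adj_r2": float(adj_r2),
    }

# Toy values chosen so the results are easy to verify by hand.
y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(y_true_demo, y_pred_demo, n_features=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For the toy values the errors are 1, 0, -1, 0, so MAE and MSE are both 0.5, R2 is 1 - 2/20 = 0.9, and adjusted R2 with one feature is 1 - 0.1 * 3/2 = 0.85.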

- [Reference 1](https://www.analyticsvidhya.com/blog/2021/05/know-the-best-evaluation-metrics-for-your-regression-model/)
- [Reference 2](https://www.linkedin.com/pulse/regression-metrics-all-why-mse-aishwarya-b/)

<a id="regularization"></a>
### Regularization

**Regularization** manages the bias-variance trade-off: it limits overfitting, at the cost of underfitting if applied too strongly.

1. **Bias-Variance Trade-off**:
    - Choosing a polynomial degree means balancing **bias** (underfitting) against **variance** (overfitting).
    - High-degree polynomials can fit the training data perfectly but may generalize poorly to unseen data (overfitting).
    - Regularization helps control this trade-off.

2. **Why Regularization?**:
    - When fitting polynomials, we often face a dilemma:
        - **Low-degree polynomials** (e.g., linear or quadratic) may underfit the data.
        - **High-degree polynomials** (e.g., cubic or higher) may overfit the data.
    - Regularization provides a way to address this by introducing a **penalty term**.

3. **Penalty Term**:
    - Regularization adds a penalty to the loss function.
    - The total loss becomes: **Loss = Data Loss + λ · Penalty**, where λ ≥ 0 sets the regularization strength.
    - The penalty discourages large coefficients, preventing overfitting.

4. **Types of Regularization**:
    - **L2 (Ridge) Regularization**:
        - Adds the sum of squared coefficients to the loss function.
        - Encourages small coefficients.
        - Helps prevent overfitting.
    - **L1 (Lasso) Regularization**:
        - Adds the sum of absolute coefficients to the loss function.
        - Encourages sparse models (sets some coefficients to exactly zero).
        - Useful for feature selection.
    - **Elastic Net Regularization**:
        - Combines L1 and L2 regularization.
        - Balances sparsity (from L1) against smooth coefficient shrinkage (from L2).

5. **Effect on Coefficients**:
    - Regularization shrinks the coefficients toward zero.
    - Smaller coefficients lead to simpler models.
    - It helps prevent overfitting by reducing the model's complexity.

6. **Continuous Complexity Range**:
    - The regularization strength is a **continuous** parameter, so model complexity can be adjusted smoothly.
    - Unlike choosing a fixed polynomial degree, you can fine-tune the regularization strength.
    - This flexibility allows finding the right balance between bias and variance.
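
The shrinking effect described above can be seen with a small ridge (L2) fit on polynomial features. This sketch uses the closed-form ridge solution with NumPy only. It makes several assumptions that are not in the notebook: the helper name `ridge_polyfit`, scaling `x` to [0, 1], leaving the intercept unpenalized, and the chosen strengths in `lams`.

```python
import numpy as np

def ridge_polyfit(x, y, degree, lam):
    """Polynomial coefficients (lowest power first) with an L2 penalty of strength lam."""
    # Scale x to [0, 1] so that high powers stay numerically well behaved.
    x_scaled = (x - x.min()) / (x.max() - x.min())
    X = np.vander(x_scaled, N=degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    penalty = lam * np.eye(degree + 1)
    penalty[0, 0] = 0.0  # common convention: do not penalize the intercept
    # Closed-form ridge solution: (X^T X + lam * I) w = X^T y
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Data of the same shape as the notebook's: a line plus integer noise.
rng = np.random.default_rng(0)
x_demo = np.arange(10, dtype=float)
y_demo = 10 * x_demo + 2 + rng.integers(0, 50, size=x_demo.size)

lams = (1e-4, 1e-2, 1.0, 100.0)
coef_norms = []
for lam in lams:
    w = ridge_polyfit(x_demo, y_demo, degree=8, lam=lam)
    coef_norms.append(float(np.linalg.norm(w[1:])))  # norm without the intercept
    print(f"lambda={lam:g}  norm of penalized coefficients={coef_norms[-1]:.3f}")
```

The degree stays fixed at 8 throughout; only the strength changes, and the norm of the penalized coefficients falls as it grows. Lasso and Elastic Net have no closed form and need an iterative solver, so they are left out of this sketch.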


### Methods to detect overfitting
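1. **Visual Inspection:** Plot the fitted curve against the data points; a curve that passes through every training point but swings widely between them suggests overfitting.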
2. **Cross-Validation:** Use techniques like k-fold cross-validation to assess model performance on unseen data.
3. **Learning Curves:**
   - Plot the model’s training and validation performance (e.g., MSE or R2) against the size of the training dataset.
   - A persistent gap, with good training performance and clearly worse validation performance, indicates overfitting.
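4. **Feature Importance Analysis:** If a few features dominate, or coefficients are implausibly large, it may indicate overfitting; treat this as a hint to confirm with cross-validation, not as proof.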
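
The cross-validation check from the list above can be sketched with NumPy alone. The helper `kfold_mse`, the synthetic data, and the choice of 5 folds are assumptions made for illustration; a library routine such as scikit-learn's cross-validation utilities would normally be used instead.

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Mean training and validation MSE of a polynomial fit over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(x.size), k)  # shuffled index folds
    train_err, val_err = [], []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeff = np.polyfit(x[train], y[train], deg=degree)
        train_err.append(np.mean((y[train] - np.polyval(coeff, x[train])) ** 2))
        val_err.append(np.mean((y[val] - np.polyval(coeff, x[val])) ** 2))
    return float(np.mean(train_err)), float(np.mean(val_err))

# Synthetic data: a straight line plus Gaussian noise (values are illustrative).
rng = np.random.default_rng(1)
x_demo = np.linspace(0, 10, 40)
y_demo = 10 * x_demo + 2 + rng.normal(scale=15, size=x_demo.size)

results = {}
for degree in (1, 3, 8):
    results[degree] = kfold_mse(x_demo, y_demo, degree)
    train_mse, val_mse = results[degree]
    print(f"degree={degree}  train MSE={train_mse:.1f}  validation MSE={val_mse:.1f}")
```

The same folds are reused for every degree, so the training error cannot rise as the degree increases. A validation error that grows while the training error keeps falling is the overfitting signal to look for.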


<a id="non_linear"></a>
# Non-Linear Regression

### Logistic Regression