<a href="https://colab.research.google.com/github/cedamusk/AI-N-ML/blob/main/Quantile_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Quantile Regression
## Importing Libraries

1. `pandas`: Used for handling and manipulating datasets in tabular form.

2. `numpy`: Provides mathematical functions and operations for working with arrays and numerical data.

3. `QuantReg`: A class from the `statsmodels` library for performing quantile regression.

4. `matplotlib.pyplot`: Used for creating visualizations and plotting graphs.

In [None]:
import pandas as pd
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg
import matplotlib.pyplot as plt

## DataFrame
This DataFrame represents a small dataset with the following structure.

## Columns Description
1. **Year**: Represents the year from 2010 to 2019. Acts as a time variable in the dataset.

2. **X**: Represents an independent variable or feature. It could be a measured value that increases over time.

3. **Y_Median**: Represents the median (50th percentile) value of a dependent variable Y for each year.

4. **Y_Upper_Quantile**: Represents the 90th percentile (upper quantile) of Y for each year.

5. **Y_Lower_Quantile**: Represents the 10th percentile (lower quantile) of Y for each year. Indicates the value below which 10% of the Y observations fall.

In [None]:
data=pd.DataFrame({
    'Year':[2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
    'X':[5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0],
    'Y_Median':[19.83, 24.16, 45.71, 53.76, 63.96, 65.45, 87.01, 82.99, 102.93, 120.95],
    'Y_Upper_Quantile':[15.08, 29.7, 45.80, 51.97, 54.59, 78.55, 80.50, 103.79, 103.64, 134.40],
    'Y_Lower_Quantile':[14.65, 25.03, 37.44, 40.31, 53.68, 65.92, 66.18, 80.55, 89.78, 100.35]
})
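
Before modelling, it can help to check whether the three Y columns are ordered the way their names suggest (lower at or below median, median at or below upper). The sketch below is one possible check and is not part of the original notebook; the name `ordered` is introduced here purely for illustration.

```python
import pandas as pd

# Same values as the DataFrame defined above.
data = pd.DataFrame({
    'Year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
    'X': [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0],
    'Y_Median': [19.83, 24.16, 45.71, 53.76, 63.96, 65.45, 87.01, 82.99, 102.93, 120.95],
    'Y_Upper_Quantile': [15.08, 29.7, 45.80, 51.97, 54.59, 78.55, 80.50, 103.79, 103.64, 134.40],
    'Y_Lower_Quantile': [14.65, 25.03, 37.44, 40.31, 53.68, 65.92, 66.18, 80.55, 89.78, 100.35],
})

# True for each year where lower <= median <= upper holds.
ordered = (
    (data['Y_Lower_Quantile'] <= data['Y_Median'])
    & (data['Y_Median'] <= data['Y_Upper_Quantile'])
)

# Show the years where the ordering does not hold.
cols = ['Year', 'Y_Lower_Quantile', 'Y_Median', 'Y_Upper_Quantile']
print(data.loc[~ordered, cols])
```

Any rows printed are years where the ordering does not hold, which is worth keeping in mind when reading the fitted lines later on.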

## Fit Quantile Regression
The function `fit_quantile_regression(X, y, q)` fits a quantile regression model using the `QuantReg` class from `statsmodels`.

## Functionality
1. `QuantReg(y, sm.add_constant(X))`:

  `QuantReg`: This creates a quantile regression model using the dependent variable `y` and the independent variable `X`.

  `sm.add_constant(X)`: This adds a constant term (intercept) to the independent variable matrix `X`, which is needed for the model to calculate the intercept. Without this, the model would not have an intercept, and the regression line would always pass through the origin (0,0).

2. `model.fit(q=q)`:

  This fits the quantile regression model for the specified quantile `q`. The quantile `q` is a value between 0 and 1 that specifies the desired quantile (e.g., 0.5 for the median, 0.25 for the first quartile).

  The `.fit(q=q)` method returns the fitted results for that specific quantile, which can then be used to inspect the model's coefficients, generate predictions, etc.
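
For intuition about what the quantile `q` does, quantile regression is usually described as minimising the check (pinball) loss, which penalises under- and over-predictions asymmetrically. The following is a minimal NumPy sketch of that idea for a constant prediction only; `pinball_loss` and `best_constant` are hypothetical helper names, the data is synthetic, and this is not how `statsmodels` fits the model internally.

```python
import numpy as np

def pinball_loss(residuals, q):
    """Mean check loss: q*u when u >= 0, (q - 1)*u when u < 0."""
    residuals = np.asarray(residuals, dtype=float)
    return np.mean(np.where(residuals >= 0, q * residuals, (q - 1) * residuals))

def best_constant(y, q, n_grid=2001):
    """Grid-search the constant prediction with the smallest pinball loss."""
    candidates = np.linspace(y.min(), y.max(), n_grid)
    losses = [pinball_loss(y - c, q) for c in candidates]
    return candidates[int(np.argmin(losses))]

# Synthetic sample (an assumption for illustration only).
rng = np.random.default_rng(0)
y = rng.normal(loc=50.0, scale=10.0, size=1001)

for q in (0.25, 0.5, 0.75):
    print(f"q={q}: loss minimiser ~ {best_constant(y, q):.2f}, "
          f"empirical quantile = {np.quantile(y, q):.2f}")
```

The minimiser for each `q` should land close to the corresponding empirical quantile, which is why changing `q` moves the fitted line up or down through the data.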

In [None]:
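import statsmodels.api as sm  # provides add_constant, used below

def fit_quantile_regression(X, y, q):
  # Add an intercept column, then fit the model at quantile q (0 < q < 1).
  model=QuantReg(y, sm.add_constant(X))
  return model.fit(q=q)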

## Quantile Regression, Visualization, Interpretation

The code performs quantile regression for three quantiles (0.5, 0.75, and 0.25) and visualizes the results.

### Functionality
**Data Preparation**: Extract the independent variable (`X`) and the dependent variables (`Y_Median`, `Y_Upper_Quantile`, and `Y_Lower_Quantile`) from the `data` DataFrame.

**Adding Constant**: Adds an intercept term to `X` for the regression model, as required by `statsmodels`.

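**Quantile Regression Models**: Fits three quantile regression models, each on its own column: **Median Regression** with **q** = 0.5 (the 50th percentile) on `Y_Median`, **Upper Quantile Regression** with **q** = 0.75 (the 75th percentile) on `Y_Upper_Quantile`, and **Lower Quantile Regression** with **q** = 0.25 (the 25th percentile) on `Y_Lower_Quantile`. Note that the column descriptions above label the upper and lower series as the 90th and 10th percentiles, whereas the models here are fit at q = 0.75 and q = 0.25.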

The `fit_quantile_regression` function (defined earlier) is used to fit these models.


**Generating Predictions**:

  New X Values: `X_new` is created as a sequence of 100 evenly spaced values between the minimum and maximum of `X`, which will be used for predictions.

  Constant Term: Adds a constant term to `X_new` for predictions.

  Predictions: The fitted models are used to predict the dependent variable for the new `X_new` values.

**Plotting Results**:

  **Scatter plot**: The actual data points for `y_median`, `y_upper` and `y_lower` are plotted as scatter plots in different colors.

  **Regression Lines**: The predicted values (`pred_median`, `pred_upper`, `pred_lower`) are plotted as lines representing the regression results for the median, upper quantile and lower quantile, respectively.

  **Customization**: Labels, a title, a legend, and a grid are added for better visualization.

**Model Summary**: The intercept, slope, and pseudo R-squared of each fitted model are printed.



In [None]:
X=data['X']
y_median=data['Y_Median']
y_upper=data['Y_Upper_Quantile']
y_lower=data['Y_Lower_Quantile']

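import statsmodels.api as sm

# Design matrix with an intercept column. fit_quantile_regression builds the
# same matrix internally, so X_const is kept here only for inspection.
X_const=sm.add_constant(X)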

model_median=fit_quantile_regression(X, y_median, 0.5)
model_upper=fit_quantile_regression(X, y_upper, 0.75)
model_lower=fit_quantile_regression(X, y_lower, 0.25)

X_new=np.linspace(X.min(), X.max(), 100)
X_new_const=sm.add_constant(X_new)

pred_median=model_median.predict(X_new_const)
pred_upper=model_upper.predict(X_new_const)
pred_lower=model_lower.predict(X_new_const)

plt.figure(figsize=(12, 8))

plt.scatter(X, y_median, color='blue', label='Median', alpha=0.6)
plt.scatter(X, y_upper, color='red', label='Upper Quantile', alpha=0.6)
plt.scatter(X, y_lower, color='green', label='Lower Quantile', alpha=0.6)

plt.plot(X_new, pred_median, 'b-', label='Median regression')
plt.plot(X_new, pred_upper, 'r-', label='Upper Quantile Regression')
plt.plot(X_new, pred_lower, 'g-', label='Lower Quantile Regression')

plt.xlabel('X')
plt.ylabel('Y')
plt.title('Quantile Regression Analysis')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()  # render the figure before printing the model summaries

# params is a pandas Series indexed by 'const' and 'X'; use .iloc for
# positional access, since integer keys on a labelled Series are deprecated.
print("Median Regression (q=0.5):")
print(f"Intercept: {model_median.params.iloc[0]:.4f}")
print(f"Slope: {model_median.params.iloc[1]:.4f}")
print(f"Pseudo R-Squared: {model_median.prsquared:.4f}\n")

print("Upper Quantile Regression (q=0.75):")
print(f"Intercept: {model_upper.params.iloc[0]:.4f}")
print(f"Slope: {model_upper.params.iloc[1]:.4f}")
print(f"Pseudo R-Squared: {model_upper.prsquared:.4f}\n")

print("Lower Quantile Regression (q=0.25):")
print(f"Intercept: {model_lower.params.iloc[0]:.4f}")
print(f"Slope: {model_lower.params.iloc[1]:.4f}")
print(f"Pseudo R-Squared: {model_lower.prsquared:.4f}")
