<a href="https://colab.research.google.com/github/cedamusk/AI-N-ML/blob/main/Weighted_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Weighted Regression
## Setup essential imports
1. `pandas`: For data manipulation and analysis, e.g., loading CSV files or handling dataframes.

2. `numpy`: For numerical computations, e.g., creating arrays, mathematical operations.

3. `sklearn.linear_model.LinearRegression`: Implements the Linear Regression model for supervised learning tasks.

4. `matplotlib.pyplot`: For creating visualizations, such as plotting regression results.

In [None]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

## Data
A DataFrame with four columns: `Year`, `X`, `Weight` and `Y`.

In [None]:
data=pd.DataFrame({
    'Year': [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,2018, 2019],
    'X':[10.0, 14.44, 18.89, 23.33, 27.78, 32.22, 36.67, 41.11, 45.56, 50.0],
    'Weight':[1.32, 1.36, 0.51, 1.01, 0.92, 0.72, 0.62, 0.84, 1.44, 0.82],
    'Y': [30.46, 27.27, 30.50, 33.44, 31.94, 40.72, 44.45, 54.08, 50.61, 55.74]
})


## Weighted Regression
The function below performs a weighted regression on the dataset above.

### Functionality
1. **Input Parameters**:

    `X`: Independent variable(s).

    `y`: Dependent variable.

    `weights`: Weights applied to each data point.

2. **Steps**:

  Reshape `X` to ensure it's a 2D array for `LinearRegression`.

  Create and fit a weighted linear regression model using the `sample_weight` parameter.

  Calculate predictions (`y_pred`) for the given `X`.

  Compute the weighted mean of `y`.

  Calculate the weighted total sum of squares (`weighted_total_ss`).

  Calculate the weighted residual sum of squares (`weighted_residual_ss`).

  Derive the weighted R^2 score.

3. **Output**:

  Returns the trained model, predictions and weighted R^2 score.
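
As a compact summary of the steps above, the weighted quantities can be written out as follows. The symbols are chosen here for illustration and do not appear in the original notebook: $w_i$ are the weights, $y_i$ the observed values and $\hat{y}_i$ the model predictions.

$$
\bar{y}_w = \frac{\sum_i w_i y_i}{\sum_i w_i}, \qquad
\mathrm{SS}_{\text{tot}} = \sum_i w_i \,(y_i - \bar{y}_w)^2, \qquad
\mathrm{SS}_{\text{res}} = \sum_i w_i \,(y_i - \hat{y}_i)^2, \qquad
R_w^2 = 1 - \frac{\mathrm{SS}_{\text{res}}}{\mathrm{SS}_{\text{tot}}}
$$

These correspond to `weighted_mean`, `weighted_total_ss`, `weighted_residual_ss` and `weighted_r2` in the function.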


In [None]:
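def perform_weighted_regression(X, y, weights):
  # LinearRegression expects a 2D feature array of shape (n_samples, n_features)
  X=X.values.reshape(-1, 1)

  # Fit a linear model in which each sample contributes according to its weight
  weighted_model=LinearRegression()
  weighted_model.fit(X, y, sample_weight=weights)

  # Fitted values for the training inputs
  y_pred=weighted_model.predict(X)

  # Weighted R^2 = 1 - (weighted residual SS / weighted total SS)
  weighted_mean=np.average(y, weights=weights)
  weighted_total_ss=np.sum(weights*(y-weighted_mean)**2)
  weighted_residual_ss=np.sum(weights*(y-y_pred)**2)
  weighted_r2=1- (weighted_residual_ss/ weighted_total_ss)

  return weighted_model, y_pred, weighted_r2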
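
One way to sanity-check the manual weighted R^2 is to compare it with scikit-learn's own `score` method, which accepts a `sample_weight` argument. The sketch below repeats the notebook's data as plain NumPy arrays so it runs on its own; the names `x_vals`, `y_vals`, `w_vals` and `check_model` are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same X, Y and Weight values as in the DataFrame above
x_vals = np.array([10.0, 14.44, 18.89, 23.33, 27.78, 32.22, 36.67, 41.11, 45.56, 50.0])
y_vals = np.array([30.46, 27.27, 30.50, 33.44, 31.94, 40.72, 44.45, 54.08, 50.61, 55.74])
w_vals = np.array([1.32, 1.36, 0.51, 1.01, 0.92, 0.72, 0.62, 0.84, 1.44, 0.82])

X_2d = x_vals.reshape(-1, 1)
check_model = LinearRegression().fit(X_2d, y_vals, sample_weight=w_vals)
y_hat = check_model.predict(X_2d)

# Manual weighted R^2, following the same steps as perform_weighted_regression
y_bar_w = np.average(y_vals, weights=w_vals)
manual_r2 = 1 - np.sum(w_vals * (y_vals - y_hat) ** 2) / np.sum(w_vals * (y_vals - y_bar_w) ** 2)

# Built-in R^2 computed with the same sample weights
builtin_r2 = check_model.score(X_2d, y_vals, sample_weight=w_vals)

print(f"Manual weighted R^2:   {manual_r2:.6f}")
print(f"Built-in weighted R^2: {builtin_r2:.6f}")
```

The two values should agree up to floating-point rounding, which suggests the hand-written formula and the library's weighted score measure the same thing.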

## Visualization of the Model
### Functionality

1. **Extract Variables**:

  `X`, `y` and `weights` are extracted from the `data` DataFrame.

2. **Perform Weighted Regression**:

  Uses the `perform_weighted_regression` function to calculate the model, predictions and weighted R^2.

3. **Visualization**:

  A scatter plot of the data points, where the size of each point is proportional to the `weights` (scaled for better visibility).

  The weighted regression line is overlaid as a solid red line (`r-`).

4. **Output results**:

  Prints the model's intercept, slope and weighted R^2 value, formatted to four decimal places for clarity.

In [None]:
X= data['X']
y=data['Y']
weights=data['Weight']

model, predictions, r2=perform_weighted_regression(X, y, weights)

plt.figure(figsize=(10, 6))
plt.scatter(X, y, c='blue', alpha=0.5, s=weights*100, label='Data Points (Size=weight)')
plt.plot(X, predictions, 'r-', label='Weighted Regression Line')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Weighted Linear Regression')
plt.legend()

print(f"Intercept: {model.intercept_:.4f}")
print(f"Slope: {model.coef_[0]:.4f}")
print(f"Weighted R-squared: {r2:.4f}")
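
As a cross-check on the printed intercept and slope, the same coefficients can be obtained from the weighted normal equations, (AᵀWA)β = AᵀWy, where A is the design matrix with a column of ones and W is the diagonal matrix of weights. This is a minimal sketch under the assumption that the data matches the DataFrame above; the names `A`, `W`, `beta` and `sk_model` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same X, Y and Weight values as in the DataFrame above
x_vals = np.array([10.0, 14.44, 18.89, 23.33, 27.78, 32.22, 36.67, 41.11, 45.56, 50.0])
y_vals = np.array([30.46, 27.27, 30.50, 33.44, 31.94, 40.72, 44.45, 54.08, 50.61, 55.74])
w_vals = np.array([1.32, 1.36, 0.51, 1.01, 0.92, 0.72, 0.62, 0.84, 1.44, 0.82])

# Design matrix: a column of ones (intercept) next to the X values
A = np.column_stack([np.ones_like(x_vals), x_vals])
W = np.diag(w_vals)

# Solve the weighted normal equations (A^T W A) beta = A^T W y
beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y_vals)
wls_intercept, wls_slope = beta

# Reference fit with scikit-learn using the same weights
sk_model = LinearRegression().fit(x_vals.reshape(-1, 1), y_vals, sample_weight=w_vals)

print(f"Normal equations: intercept={wls_intercept:.4f}, slope={wls_slope:.4f}")
print(f"scikit-learn:     intercept={sk_model.intercept_:.4f}, slope={sk_model.coef_[0]:.4f}")
```

Both lines should report the same coefficients up to rounding, since `LinearRegression` with `sample_weight` minimizes the same weighted sum of squared residuals.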

## Make a prediction
The code demonstrates how to use the trained weighted regression model to predict a value for a new input.

### Functionality
1. **Input new value (`X_new`)**:

  The value `X_new=30` represents a new independent variable value.

2. **Prediction**:

  The `predict` method of the trained `model` is used to calculate the predicted value of `Y` for `X_new`.

  Since `LinearRegression` expects a 2D array, `X_new` is wrapped in double brackets (`[[X_new]]`).

3. **Output the Prediction**:

  Print the predicted `Y` value formatted to four decimal places.

In [None]:
X_new=30
prediction=model.predict([[X_new]])
print(f"\nPredicted Y for X = {X_new}: {prediction[0]:.4f}")
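
The same `predict` call also accepts several new inputs at once, as long as they are passed as a 2D array with one row per sample. The sketch below refits the model on the notebook's data so it runs standalone, then compares the batch predictions against the line equation `intercept + slope * x`. The names `demo_model` and `X_batch`, and the three input values, are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same X, Y and Weight values as in the DataFrame above
x_vals = np.array([10.0, 14.44, 18.89, 23.33, 27.78, 32.22, 36.67, 41.11, 45.56, 50.0])
y_vals = np.array([30.46, 27.27, 30.50, 33.44, 31.94, 40.72, 44.45, 54.08, 50.61, 55.74])
w_vals = np.array([1.32, 1.36, 0.51, 1.01, 0.92, 0.72, 0.62, 0.84, 1.44, 0.82])

demo_model = LinearRegression().fit(x_vals.reshape(-1, 1), y_vals, sample_weight=w_vals)

# Several new inputs, reshaped to (n_samples, 1) as predict expects
X_batch = np.array([20.0, 30.0, 40.0]).reshape(-1, 1)
batch_pred = demo_model.predict(X_batch)

# For a single feature, predict is equivalent to evaluating the fitted line
manual_pred = demo_model.intercept_ + demo_model.coef_[0] * X_batch.ravel()

for x_val, y_val in zip(X_batch.ravel(), batch_pred):
    print(f"Predicted Y for X = {x_val:.1f}: {y_val:.4f}")
```

Keep in mind that all three inputs lie inside the range of the training `X` values (10 to 50); predictions far outside that range rely on the linear trend continuing to hold.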