prepared by Muhirwa Salomon

Simple Linear Regression:
Simple linear regression is used when there is only one input feature. The equation of the model is:

Y = b0 + b1X + e

where Y is the output variable, X is the input feature, b0 and b1 are the regression coefficients, and e is the random error term.

The goal of simple linear regression is to find the values of b0 and b1 that minimize the sum of squared errors between the predicted values and the actual values in the training data.

Simple Linear Regression Formula:
The formula for simple linear regression is:

b1 = Σ((Xi - X_mean)*(Yi - Y_mean)) / Σ((Xi - X_mean)^2)

b0 = Y_mean - b1*X_mean

where Xi and Yi are the input and output values in the training data, X_mean and Y_mean are the mean values of the input and output variables, and Σ denotes the sum of the values over all observations in the training data

In [51]:
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
import numpy as np

diabetes = load_diabetes()
df = pd.DataFrame(data=diabetes.data, columns=diabetes.feature_names)
X = df.values
y = diabetes.target.reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

class sLregression:
    def __init__(self):
        self.b0 = None
        self.b1 = None
        
    def fit(self, x, y):
        y = y.reshape(-1, 1)
        # calculate the coefficients 
        self.b1 = np.sum((x - np.mean(x,axis=0))*(y - np.mean(y))) / np.sum((x - np.mean(x,axis=0))**2,axis=0)
        self.b0 = np.mean(y) - self.b1*np.mean(x,axis=0)
        return self
    
    def predict(self, x):
        # predict the output
        y_pred = self.b0 + (x*self.b1)
        return y_pred
    
    def loss(self, y, y_pred):
        n = len(y)
        mse = (1/len(y))*np.sum((y-y_pred)**2)
        return mse

# Initialize the model
model = sLregression()

# Fit the model to the training data
model.fit(X_train, y_train)

# Print the coefficients for each feature
coefficients = pd.Series(model.b1.ravel(), index=diabetes.feature_names)
print("Coefficients:\n", coefficients)

# Use the model to make predictions on the test data
y_pred = model.predict(X_test)

# Calculate the mean squared error between the predicted and actual target values
mse = model.loss(y_test, y_pred)
print("Mean Squared Error:", mse)


Coefficients:
 age    4510.693166
sex    4266.067184
bmi    4332.872681
bp     4122.123722
s1     4240.784315
s2     4268.359157
s3     4375.642193
s4     4240.046293
s5     4233.261114
s6     4137.203283
dtype: float64
Mean Squared Error: 390077.0305393243
