# Linear Regression from Scratch: Module Demo & Testing


# 1. Introduction
Linear Regression is one of the simplest and most widely used supervised learning algorithms for regression tasks. It models the relationship between input features and a continuous target variable by fitting a linear equation to the observed data.

Linear Regression assumes a linear relationship between features and the target, making it interpretable and efficient for prediction. Regularization techniques like L2 (Ridge) help prevent overfitting by penalizing large coefficients.

In this notebook, we test a custom Linear Regression model implemented from scratch using gradient descent with L2 regularization. We apply it to the Diabetes dataset to evaluate how well the model can predict disease progression based on clinical measurements.

## 2. Import Libraries

In [14]:
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

In [5]:
# Connect to Google Drive and access my custom KNN model
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
import sys
sys.path.append('/content/drive/MyDrive/scratch/')
from models.linear_regression import LinearRegression

## 3. Load Dataset

In [7]:
# Load diabetes dataset

cancer =  datasets.load_diabetes()
X = cancer.data
y = cancer.target

## 4. Train-Test Split


In [8]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape , X_test.shape)

(353, 10) (89, 10)


## 5. Train the model

In [9]:
# Standardize features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [17]:
model = LinearRegression()
model.fit(X_train_scaled, y_train, lr=0.01, landa=0.1, max_iter=1000)

Iteration 0: Loss = 28435.2766
Iteration 100: Loss = 3463.0777
Iteration 200: Loss = 3072.6100
Iteration 300: Loss = 3068.5430
Iteration 400: Loss = 3069.0578
Iteration 500: Loss = 3069.2636
Iteration 600: Loss = 3069.4978
Iteration 700: Loss = 3069.8748
Iteration 800: Loss = 3070.4209
Iteration 900: Loss = 3071.1369
Iteration 999: Loss = 3072.0066


## 6. predict

In [12]:
y_pred = model.predict(X_test)
y_pred

array([153.21511186, 155.16258322, 153.19278203, 160.55968962,
       152.30754204, 150.9741337 , 158.75981224, 155.44326599,
       150.53709755, 151.7993324 , 151.05665245, 154.30028225,
       149.55359968, 156.32735873, 151.24185692, 152.82453449,
       157.08409622, 158.29164834, 155.89220116, 156.77019038,
       156.40949044, 150.74396684, 149.96299248, 155.51151899,
       153.97022078, 154.24605778, 155.56927836, 154.97914838,
       148.90521923, 151.78184796, 155.07651287, 150.87466527,
       152.76668573, 155.14758088, 154.74692932, 155.65029211,
       152.35285862, 152.13755272, 153.45362893, 149.42136451,
       150.08179226, 151.68193254, 154.25944731, 153.63351779,
       154.90659851, 149.64464872, 150.24151822, 151.66245726,
       149.30540013, 154.23382378, 153.98889267, 149.64116885,
       151.89929757, 151.68223059, 154.65869835, 154.14154982,
       151.02129918, 156.43444845, 152.15672545, 149.82613008,
       155.33325632, 156.18898113, 153.2657169 , 151.55

## 7. Evaluate

In [15]:
# Evaluate
print("MAE:", mean_absolute_error(y_test, y_pred))
print("MSE:", mean_squared_error(y_test, y_pred))
print("R² Score:", r2_score(y_test, y_pred))

MAE: 62.53662519680951
MSE: 5111.04848001325
R² Score: 0.03531481871121134
