 # Comparison of SGD for Linear Regression: From Scratch vs Scikit-learn

 ## Introduction

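 This notebook compares two stochastic gradient descent (SGD) implementations of linear regression on the same student dataset: one written from scratch and one built on scikit-learn. Both are trained with the same hyperparameters and compared on test-set MSE, R², and training time.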

 ## Data Loading

 Load the preprocessed student train/test splits and print their shapes.

In [None]:
import os
import sys

import numpy as np

# Set project root directory and add it to the system path
project_root = os.path.abspath(os.path.join(os.getcwd(), "..", "..", ".."))
sys.path.append(project_root)


from src.scratch.utils.viz_utils import plot_scatter_for_regression


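# Build the data directory from project_root instead of repeating the relative path
data_dir = os.path.join(project_root, "data", "processed")

X_train = np.load(os.path.join(data_dir, "student_X_train.npy"))
X_test = np.load(os.path.join(data_dir, "student_X_test.npy"))
y_train = np.load(os.path.join(data_dir, "student_y_train.npy"))
y_test = np.load(os.path.join(data_dir, "student_y_test.npy"))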

print("Training features shape:", X_train.shape)
print("Test features shape:", X_test.shape)
print("Training target shape:", y_train.shape)
print("Test target shape:", y_test.shape)

 ## Exploratory Data Analysis

 Plot the first feature (index 0) against the target to check whether the relationship looks roughly linear.

In [None]:
plot_scatter_for_regression(X_train, y_train, feature_index=0, title="Feature 1 vs Target", filename="feature1_vs_target_sgd_comp.png")


 ## Training Both Models

 Train both implementations with the same learning rate (0.01) and iteration count (1000), and time each fit.
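
 For orientation, the per-sample update that SGD applies to a linear model can be sketched in a few lines of NumPy. This is a minimal illustration only, not the `LinearRegression` class imported below: the function name `sgd_linear_regression`, its `n_epochs` and `seed` parameters, and the toy data are assumptions made for this example, and it expects a 1-D target array.

```python
import numpy as np


def sgd_linear_regression(X, y, learning_rate=0.01, n_epochs=1000, seed=0):
    """Fit y ≈ X @ w + b with plain per-sample SGD on squared error.

    Hypothetical helper for illustration; y is expected to be 1-D.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    loss_history = []

    for _ in range(n_epochs):
        # Visit the samples in a fresh random order each epoch.
        for i in rng.permutation(n_samples):
            error = X[i] @ w + b - y[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= learning_rate * error * X[i]
            b -= learning_rate * error
        # Record the full-data MSE once per epoch.
        loss_history.append(float(np.mean((X @ w + b - y) ** 2)))

    return w, b, loss_history


# Toy data with known coefficients: y = 3*x0 - 2*x1 + 1 (no noise).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 2))
y_demo = X_demo @ np.array([3.0, -2.0]) + 1.0

w, b, history = sgd_linear_regression(X_demo, y_demo, n_epochs=50)
print("weights:", w, "bias:", b, "final MSE:", history[-1])
```

 On this noise-free toy problem the recovered weights and bias should land close to the true values, and the recorded loss should fall over the epochs.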

In [None]:
from src.scratch.models.linear_regression import LinearRegression
from src.sklearn_impl.linear_regression_sk import LinearRegressionSK
import time

# From scratch: time only the fit call (perf_counter is the clock meant for measuring intervals)
model_scratch = LinearRegression(method='stochastic_gd', learning_rate=0.01, n_iterations=1000)
start_time = time.perf_counter()
model_scratch.fit(X_train, y_train)
time_scratch = time.perf_counter() - start_time

# Scikit-learn wrapper: same learning rate and iteration count
model_sk = LinearRegressionSK(method='sgd', learning_rate=0.01, n_iterations=1000)
start_time = time.perf_counter()
model_sk.fit(X_train, y_train)
time_sk = time.perf_counter() - start_time
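
 A single timed fit can be noisy. One possible way to get a steadier number is to repeat the call and take the median, as in the sketch below. The helper name `time_call` and its `repeats` parameter are hypothetical, and the workload is a stand-in rather than either model.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock time, in seconds, of calling fn() `repeats` times."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


# Stand-in workload. In this notebook, fn could instead build a fresh model
# and call its fit method on the training data (an assumption about usage).
elapsed = time_call(lambda: sum(i * i for i in range(10_000)))
print(f"median time: {elapsed:.6f} seconds")
```

 Building a fresh model inside `fn` keeps each repetition independent, so later runs do not start from already-trained weights.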


 ## Performance Metrics

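 Compare mean squared error (MSE) and the coefficient of determination (R²) on the test set, alongside the training times measured above.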
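
 As a reminder of what the two metrics measure, here is a small NumPy sketch of their standard definitions. The names `mse_sketch` and `r2_sketch` are hypothetical and are kept separate from the project's own `mean_squared_error` and `r2_score`, whose implementation is not shown here.

```python
import numpy as np


def mse_sketch(y_true, y_pred):
    """Mean of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))


def r2_sketch(y_true, y_pred):
    """1 - SS_res / SS_tot: the share of target variance explained by the model."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Tiny worked example: only the last prediction is off, by 1.
y_true_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_pred_demo = np.array([1.0, 2.0, 3.0, 5.0])
print("MSE:", mse_sketch(y_true_demo, y_pred_demo))
print("R2:", r2_sketch(y_true_demo, y_pred_demo))
```

 Lower MSE is better. R² equals 1 for a perfect fit and 0 for a model that always predicts the mean of the target.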

In [None]:
from src.scratch.utils.metrics import mean_squared_error, r2_score

y_pred_scratch = model_scratch.predict(X_test)
mse_scratch = mean_squared_error(y_test, y_pred_scratch)
r2_scratch = r2_score(y_test, y_pred_scratch)

y_pred_sk = model_sk.predict(X_test)
mse_sk = mean_squared_error(y_test, y_pred_sk)
r2_sk = r2_score(y_test, y_pred_sk)

print(f"From Scratch - MSE: {mse_scratch:.4f}, R²: {r2_scratch:.4f}, Time: {time_scratch:.4f} seconds")
print(f"Scikit-learn - MSE: {mse_sk:.4f}, R²: {r2_sk:.4f}, Time: {time_sk:.4f} seconds")


 ## Visual Comparison

 Plot each model's learning curve and its actual-vs-predicted values on the test set.

In [None]:
from src.scratch.utils.viz_utils import plot_actual_vs_predicted, plot_learning_curve

plot_learning_curve(model_scratch.get_loss_history(), title="Learning Curve (SGD Scratch)", filename="learning_curve_sgd_scratch_comp.png")
plot_learning_curve(model_sk.get_loss_history(), title="Learning Curve (SGD SK)", filename="learning_curve_sgd_sk_comp.png")
plot_actual_vs_predicted(y_test, y_pred_scratch, title="Actual vs Predicted (SGD Scratch)", filename="actual_vs_predicted_sgd_scratch_comp.png")
plot_actual_vs_predicted(y_test, y_pred_sk, title="Actual vs Predicted (SGD SK)", filename="actual_vs_predicted_sgd_sk_comp.png")


 ## Insights

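 The scikit-learn model typically trains faster in wall-clock time because its SGD loop runs in compiled code, and it may score slightly better on MSE and R². The from-scratch version is usually slower, but it exposes every step of the update, which makes the algorithm easier to inspect and modify. A single timed fit is noisy, so small differences in time or metrics should be treated as inconclusive.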
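
 For reference, a like-for-like scikit-learn setup can be sketched directly with `SGDRegressor`. How `LinearRegressionSK` configures scikit-learn internally is not shown in this notebook, so the settings below (constant learning rate, regularization switched off via `alpha=0.0`, one `partial_fit` pass per epoch to record a loss history) are assumptions chosen to mirror plain SGD, and the toy data is illustrative only.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Toy data with known coefficients: y = 3*x0 - 2*x1 + 1 (no noise).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = X_demo @ np.array([3.0, -2.0]) + 1.0

sgd = SGDRegressor(
    alpha=0.0,                 # no regularization (the default is a small L2 penalty)
    learning_rate="constant",  # fixed step size instead of the default decaying schedule
    eta0=0.01,                 # the step size itself
    random_state=0,
)

loss_history_sk = []
for _ in range(50):
    # Each partial_fit call performs one pass of SGD over the given samples.
    sgd.partial_fit(X_demo, y_demo)
    loss_history_sk.append(float(np.mean((sgd.predict(X_demo) - y_demo) ** 2)))

print("weights:", sgd.coef_, "bias:", sgd.intercept_, "final MSE:", loss_history_sk[-1])
```

 With its defaults, `SGDRegressor` applies an L2 penalty, a decaying learning rate, and a tolerance-based stopping rule, any of which can produce small metric differences from a plain constant-rate implementation.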