#### Introduction



In this guide, we'll implement linear regression from scratch using gradient descent. Starting with dataset loading, we'll cover the mathematical foundations and step-by-step code implementation.

The goal is to understand how linear regression works, how gradient descent optimizes model parameters, and how to build it without high-level machine learning libraries.

Table of Contents
Importing Libraries
Setting up the necessary libraries for data manipulation, model implementation, and visualization.

Loading and Exploring the Dataset
Understanding the structure of the dataset and initial data exploration.

Preparing the Data
Preprocessing the data by scaling features and splitting into training and testing sets.

Initializing Parameters
Defining the initial parameters for the model, including weights and bias.

Defining the Prediction Function
Implementing the model's prediction function to make estimates based on input data.

Defining the Cost Function
Formulating the cost function to measure the accuracy of predictions against actual values.

Computing the Gradients
Calculating the gradients for weights and bias to optimize the cost function.

Updating Parameters Using Gradient Descent
Applying gradient descent to adjust parameters and minimize the cost function.

Training the Model
Training the model using the data and updating parameters through iterative optimization.

Evaluating Model Performance with Test Data
Assessing the model's performance using test data and relevant metrics.

Conclusion
Summarizing the key findings and insights from the model implementation.

Comparison with Sklearn Linear Regression
Comparing our implementation side by side with scikit-learn's linear regression to check performance.

##### 1. Importing Libraries

The following code imports essential libraries for linear regression and dataset loading:

numpy: For numerical computing and array manipulation.

load_diabetes: Loads the Diabetes dataset for regression tasks.

matplotlib.pyplot: For visualizations such as loss curves and predictions.

In [11]:
!pip install scikit-learn


In [14]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes

##### 2. Loading and Exploring the Dataset

This code loads the Diabetes dataset into two arrays:

X: Feature matrix (442 samples, 10 features).
y: Target vector (442 values, diabetes progression).

It then prints the feature names and the first five target values.

In [15]:
data = load_diabetes()

X = data.data         # Feature matrix (shape: [442, 10])
y = data.target       # Target vector (shape: [442,])

print('Feature names:', data.feature_names)
print('First five target values:', y[:5])

Feature names: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
First five target values: [151.  75. 141. 206. 135.]


##### 3. Preparing the Data

The following code standardizes the features and splits the dataset into training and testing sets:

StandardScaler: Standardizes the feature matrix by removing the mean and scaling to unit variance.

train_test_split: Splits the dataset into training (80%) and testing (20%) sets.

The feature matrix X is standardized first, and the data is then split. After the split, X and y hold the training set (353 samples), while X_test and y_test hold the test set (89 samples).

In [16]:
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

feature_scaler = StandardScaler()
target_scaler = StandardScaler()

X = feature_scaler.fit_transform(X)

# y = y.reshape(-1, 1)
# y = target_scaler.fit_transform(y)
# y = y.ravel()

X, X_test, y, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
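
The cell above fits the scaler on all 442 samples before splitting, so statistics from the test rows influence the scaling. A common alternative is to split first and compute the mean and standard deviation from the training rows only. The sketch below shows that order using plain NumPy on synthetic data. The names `X_all`, `y_all`, `mu`, and `sigma` are illustrative assumptions and are not part of the notebook.

```python
import numpy as np

# Synthetic stand-in for a feature matrix and target vector (hypothetical data)
rng = np.random.default_rng(42)
X_all = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
y_all = rng.normal(size=100)

# Shuffle the row indices and hold out 20% of them for testing
indices = rng.permutation(len(X_all))
n_test = int(np.ceil(0.2 * len(X_all)))
test_idx, train_idx = indices[:n_test], indices[n_test:]

X_train, X_test = X_all[train_idx], X_all[test_idx]
y_train, y_test = y_all[train_idx], y_all[test_idx]

# Compute the scaling statistics from the training rows only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# Apply the same training statistics to both splits
X_train_scaled = (X_train - mu) / sigma
X_test_scaled = (X_test - mu) / sigma

print(X_train_scaled.shape, X_test_scaled.shape)
```

With this order, the training features have zero mean and unit variance exactly, while the test features are only approximately standardized, which is what the model would see on genuinely new data.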


##### 4. Initializing Parameters

This code initializes the parameters for the linear regression model:

m, n: The shape of the feature matrix X. After the split, X holds only the training set, so m is the number of training samples (353 of the original 442) and n is the number of features (10).

w: The weight vector, initialized to zeros with shape [10,].

b: The bias term, initialized to 0.

In [18]:
m, n = X.shape   # m = 353 (training samples), n = 10
w = np.zeros(n)  # Weight vector (shape: [10,])
b = 0.0          # Bias term (scalar)

##### 5. Defining the Prediction Function

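The prediction function is given by:

$$\hat{y} = Xw + b$$

where X is the feature matrix, w is the weight vector, and b is the bias term.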

In [19]:
def predict(X, w, b):
    return np.dot(X, w) + b
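
To see what `predict` returns, here is a small worked example on made-up numbers. The values in `X_toy`, `w_toy`, and `b_toy` are illustrative assumptions and do not come from the Diabetes dataset.

```python
import numpy as np

def predict(X, w, b):
    # One prediction per row: dot product of the row with w, plus the bias
    return np.dot(X, w) + b

# Illustrative toy values (not from the dataset)
X_toy = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
w_toy = np.array([0.5, -1.0])
b_toy = 2.0

# Row 1: 1*0.5 + 2*(-1) + 2 =  0.5
# Row 2: 3*0.5 + 4*(-1) + 2 = -0.5
y_hat = predict(X_toy, w_toy, b_toy)
print(y_hat)
```

Because `np.dot` handles the whole matrix at once, the function returns one prediction per sample without an explicit loop.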

##### 6. Defining the Cost Function

In [20]:
def compute_cost(X, y, w, b):
    m = len(y)                                        # Number of samples
    y_pred = predict(X, w, b)
    cost = (1 / (2 * m)) * np.sum((y_pred - y) ** 2)  # Half the mean squared error

    return cost
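
The cost computed above is half the mean squared error, $J(w, b) = \frac{1}{2m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2$. As a quick sanity check, the sketch below evaluates it on made-up numbers. `X_toy`, `y_toy`, `w_toy`, and `b_toy` are illustrative values, not taken from the dataset.

```python
import numpy as np

def predict(X, w, b):
    return np.dot(X, w) + b

def compute_cost(X, y, w, b):
    m = len(y)
    y_pred = predict(X, w, b)
    return (1 / (2 * m)) * np.sum((y_pred - y) ** 2)

# Illustrative toy values (not from the Diabetes dataset)
X_toy = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
y_toy = np.array([1.0, 0.0])
w_toy = np.array([0.5, -1.0])
b_toy = 2.0

# Predictions are 0.5 and -0.5, so both errors are -0.5:
# cost = (0.25 + 0.25) / (2 * 2) = 0.125
cost_toy = compute_cost(X_toy, y_toy, w_toy, b_toy)
print(cost_toy)
```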

##### 7. Computing the Gradients

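Gradients are computed to update the weights and bias during training. The partial derivatives of the cost function J with respect to each weight and the bias are:

$$\frac{\partial J}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)\, x_{ij}$$

$$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)$$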

In [21]:
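def compute_gradients(X, y, w, b):
    m = len(y)                         # Number of samples
    y_pred = predict(X, w, b)
    error = y_pred - y                 # Residuals (shape: [m,])

    dw = (1 / m) * np.dot(X.T, error)  # Gradient w.r.t. weights (shape: [n,])
    db = (1 / m) * np.sum(error)       # Gradient w.r.t. bias (scalar)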

    return dw, db
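
One way to gain confidence in hand-derived gradients is to compare them with a numerical approximation. The sketch below applies central finite differences to the cost function on random toy data. The step size `eps`, the tolerance, and the `*_toy` and `*_num` names are assumptions chosen for illustration.

```python
import numpy as np

def predict(X, w, b):
    return np.dot(X, w) + b

def compute_cost(X, y, w, b):
    m = len(y)
    return (1 / (2 * m)) * np.sum((predict(X, w, b) - y) ** 2)

def compute_gradients(X, y, w, b):
    m = len(y)
    error = predict(X, w, b) - y
    return (1 / m) * np.dot(X.T, error), (1 / m) * np.sum(error)

# Random toy problem (hypothetical data)
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 3))
y_toy = rng.normal(size=20)
w_toy = rng.normal(size=3)
b_toy = 0.5

# Analytic gradients from the formulas
dw, db = compute_gradients(X_toy, y_toy, w_toy, b_toy)

# Numerical gradients: nudge one parameter at a time by +/- eps
eps = 1e-6
dw_num = np.zeros_like(w_toy)
for j in range(len(w_toy)):
    step = np.zeros_like(w_toy)
    step[j] = eps
    cost_plus = compute_cost(X_toy, y_toy, w_toy + step, b_toy)
    cost_minus = compute_cost(X_toy, y_toy, w_toy - step, b_toy)
    dw_num[j] = (cost_plus - cost_minus) / (2 * eps)

db_num = (compute_cost(X_toy, y_toy, w_toy, b_toy + eps)
          - compute_cost(X_toy, y_toy, w_toy, b_toy - eps)) / (2 * eps)

weights_match = np.allclose(dw, dw_num, atol=1e-6)
bias_matches = np.isclose(db, db_num, atol=1e-6)
print(weights_match, bias_matches)
```

If the two sets of gradients disagree, the usual suspects are a missing 1/m factor or a sign error in the residual.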

##### 8. Updating Parameters Using Gradient Descent

In [22]:
def update_parameters(w, b, dw, db, learning_rate):
    # Step against the gradient, scaled by the learning rate
    w = w - learning_rate * dw
    b = b - learning_rate * db

    return w, b
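
The update rule moves each parameter a small step against its gradient: w becomes w minus the learning rate times dw, and likewise for b. The sketch below applies a single step on synthetic data and compares the cost before and after. The data, the true coefficients, and `learning_rate=0.1` are assumptions chosen for illustration.

```python
import numpy as np

def predict(X, w, b):
    return np.dot(X, w) + b

def compute_cost(X, y, w, b):
    m = len(y)
    return (1 / (2 * m)) * np.sum((predict(X, w, b) - y) ** 2)

def compute_gradients(X, y, w, b):
    m = len(y)
    error = predict(X, w, b) - y
    return (1 / m) * np.dot(X.T, error), (1 / m) * np.sum(error)

def update_parameters(w, b, dw, db, learning_rate):
    return w - learning_rate * dw, b - learning_rate * db

# Synthetic linear data: y = 3*x1 - 2*x2 + 1 (hypothetical)
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(50, 2))
y_toy = X_toy @ np.array([3.0, -2.0]) + 1.0

# Start from zeros, as in the parameter initialization step
w_toy, b_toy = np.zeros(2), 0.0
cost_before = compute_cost(X_toy, y_toy, w_toy, b_toy)

# One gradient descent step
dw, db = compute_gradients(X_toy, y_toy, w_toy, b_toy)
w_toy, b_toy = update_parameters(w_toy, b_toy, dw, db, learning_rate=0.1)
cost_after = compute_cost(X_toy, y_toy, w_toy, b_toy)

print(cost_before, cost_after)
```

With a suitably small learning rate the cost drops after the step. A rate that is too large can overshoot the minimum and make the cost grow instead.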