# Introduction to Linear Models

### Regression with an Abalone Dataset

Playground Series - Season 4, Episode 4

https://www.kaggle.com/competitions/playground-series-s4e4/data

In [None]:
import numpy as np
import pandas as pd
from ydata_profiling import ProfileReport

import warnings
from tqdm import tqdm

# Suppress warnings
warnings.filterwarnings("ignore")


Simplified EDA

In [24]:

train_df = pd.read_csv("./data/playground-series-s4e4/train.csv")
train_report = ProfileReport(train_df, title="Train", progress_bar=False)
train_report.to_file("./profile/abalone.html")

test_df = pd.read_csv("./data/playground-series-s4e4/test.csv")
test_report = ProfileReport(test_df, title="Test", progress_bar=False)

comparison_report = train_report.compare(test_report)
comparison_report.to_file("abalone_comp.html")



In [None]:
Linear Regression

What is the problem?

How can we solve it?

How do we measure whether we are efficient?



Linear Regression (here we go...)

The loss function is the **Mean Squared Error (MSE)**. The training loop never evaluates it explicitly; gradient descent only needs its gradients.

\begin{align}
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\end{align}

Here, $\hat{y}$ is the predicted value: $\hat{y} = X \cdot \text{weights} + \text{bias}$.

The gradients $dw$ and $db$ come from the partial derivatives of the MSE with respect to the weights and the bias, respectively. The exact derivatives carry a factor of 2, which is dropped here because it only rescales the learning rate (equivalently, the loss can be taken as $\frac{1}{2}\text{MSE}$).

- **Gradients**:

\begin{align}
dw = \frac{1}{n} X^T (\hat{y} - y)
\end{align}

\begin{align}
db = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)
\end{align}
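The expressions for $dw$ and $db$ can be sanity-checked against finite differences. The sketch below is only an illustration on random data: it assumes the loss is scaled as half the MSE, so that its gradients match $dw$ and $db$ exactly (with the plain MSE they would differ by a constant factor of 2). The helper names `half_mse`, `analytic_gradients` and `numerical_gradients` are hypothetical and not part of the notebook.

```python
import numpy as np


def half_mse(X, y, w, b):
    """Half of the mean squared error; its gradients are exactly dw and db."""
    residual = X @ w + b - y
    return 0.5 * np.mean(residual ** 2)


def analytic_gradients(X, y, w, b):
    """dw = (1/n) X^T (y_hat - y) and db = (1/n) sum(y_hat - y)."""
    n = X.shape[0]
    residual = X @ w + b - y
    return X.T @ residual / n, residual.sum() / n


def numerical_gradients(X, y, w, b, eps=1e-6):
    """Central finite differences of half_mse, one parameter at a time."""
    dw = np.zeros_like(w)
    for j in range(w.size):
        step = np.zeros_like(w)
        step[j] = eps
        dw[j] = (half_mse(X, y, w + step, b) - half_mse(X, y, w - step, b)) / (2 * eps)
    db = (half_mse(X, y, w, b + eps) - half_mse(X, y, w, b - eps)) / (2 * eps)
    return dw, db


# Random data purely for the check (not the abalone dataset)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = rng.normal(size=50)
w_demo = rng.normal(size=3)
b_demo = 0.5

dw_a, db_a = analytic_gradients(X_demo, y_demo, w_demo, b_demo)
dw_n, db_n = numerical_gradients(X_demo, y_demo, w_demo, b_demo)
print("dw matches:", np.allclose(dw_a, dw_n, atol=1e-6))
print("db matches:", np.isclose(db_a, db_n, atol=1e-6))
```

If the two sets of gradients agree, the update rule is following the slope of the squared-error loss as intended.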


In [None]:

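# Linear Regression using Gradient Descent
class LinearRegressionGD:
    """Linear regression fitted with batch gradient descent on the MSE loss."""

    def __init__(self, learning_rate=0.01, epochs=1000):
        self.learning_rate = learning_rate  # step size of each update
        self.epochs = epochs  # number of full passes over the data
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # Work on float arrays so DataFrame/Series inputs behave like ndarrays;
        # flatten y so an (n, 1) target cannot broadcast into an (n, n) error
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).ravel()

        # Initialize weights and bias
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        # Gradient Descent
        for _ in range(self.epochs):
            y_predicted = np.dot(X, self.weights) + self.bias
            error = y_predicted - y
            dw = (1 / n_samples) * np.dot(X.T, error)
            db = (1 / n_samples) * np.sum(error)

            # Update weights and bias
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db

    def predict(self, X):
        # Apply the learned linear map: y_hat = X . weights + bias
        return np.dot(np.asarray(X, dtype=float), self.weights) + self.bias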
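One way to check that these updates converge is to compare the result with the closed-form least-squares solution. The sketch below is an illustration under assumptions, not the notebook's own experiment. It uses synthetic data rather than the abalone set, and it re-implements the update loop as a hypothetical function `gd_fit` so the block runs on its own. The learning rate of 0.1 is an arbitrary choice that suits standard-normal features.

```python
import numpy as np


def gd_fit(X, y, learning_rate=0.1, epochs=1000):
    """Batch gradient descent with the same dw/db updates as LinearRegressionGD."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(epochs):
        error = X @ weights + bias - y
        weights -= learning_rate * (X.T @ error) / n_samples
        bias -= learning_rate * error.sum() / n_samples
    return weights, bias


# Synthetic regression problem with known coefficients (assumed values)
rng = np.random.default_rng(42)
X_syn = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
true_b = 3.0
y_syn = X_syn @ true_w + true_b + rng.normal(scale=0.1, size=200)

w_gd, b_gd = gd_fit(X_syn, y_syn)

# Closed-form least squares: append a column of ones to model the bias
X_aug = np.column_stack([X_syn, np.ones(len(X_syn))])
solution, *_ = np.linalg.lstsq(X_aug, y_syn, rcond=None)
w_ls, b_ls = solution[:-1], solution[-1]

mse = np.mean((X_syn @ w_gd + b_gd - y_syn) ** 2)
print("gradient descent:", w_gd, b_gd)
print("least squares:   ", w_ls, b_ls)
print("training MSE:    ", mse)
```

On real data the features are usually not on a common scale, so the learning rate and the number of epochs may need tuning, or the features may need standardizing first, before gradient descent gets this close to the least-squares solution.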