
Pooja Bisht

In [1]:
import pandas as pd
import numpy as np

In [2]:
# Load dataset
data = pd.read_csv("multiple_linear_regression_dataset.csv")

# Inspect data
print(data.head())
print(data.columns)
print(data.shape)

   age  experience  income
0   25           1   30450
1   30           3   35670
2   47           2   31580
3   32           5   40130
4   43          10   47830
Index(['age', 'experience', 'income'], dtype='object')
(20, 3)


Inputs (features): age, experience

Output (target): income

Model must learn from: 2 input features

# Separate Inputs and Output

In [3]:
# Inputs (features)
X = data[["age", "experience"]].values

# Output (target)
y = data["income"].values

# Model Parameter Initialization

In [4]:
n_features = X.shape[1]

# Initialize weights and bias
w = np.zeros(n_features)
b = 0.0
# One weight per feature: each input affects income differently

# Bias allows a nonzero prediction even when all inputs are zero

# Large initial values can cause unstable learning, so both start at zero

# Forward Pass

In [5]:
def predict(X, w, b):
    y_hat = X.dot(w) + b
    return y_hat

# No activation function because this is regression

# Output can take any real value

# Unlike logistic regression, we are predicting numbers, not probabilities
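
As a quick check of the forward pass, the sketch below applies the same formula to the first two rows of the data preview. The parameters `w_demo` and `b_demo` are made-up values chosen to keep the arithmetic easy to follow; they are not the trained weights.

```python
import numpy as np

def predict(X, w, b):
    # Linear model: one dot product per row, plus a shared bias
    return X.dot(w) + b

# First two rows of the printed data: [age, experience]
X_demo = np.array([[25.0, 1.0],
                   [30.0, 3.0]])

# Hypothetical parameters, for illustration only
w_demo = np.array([100.0, 1000.0])
b_demo = 500.0

y_hat_demo = predict(X_demo, w_demo, b_demo)
print(y_hat_demo.shape)  # one prediction per row
print(y_hat_demo)        # 25*100 + 1*1000 + 500 and 30*100 + 3*1000 + 500
```

Each row produces a single unbounded number, which is the sense in which no activation function is applied.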

# Loss Function (Mean Squared Error)

In [6]:
def mean_squared_error(y, y_hat):
    loss = ((y_hat - y) ** 2).mean()
    return loss
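
A small worked example may help make the loss concrete. It uses the first two incomes from the data preview together with hypothetical predictions (`y_hat_demo` is invented for illustration).

```python
import numpy as np

def mean_squared_error(y, y_hat):
    # Average of the squared differences between predictions and targets
    return ((y_hat - y) ** 2).mean()

# First two incomes from the data preview
y_demo = np.array([30450.0, 35670.0])
# Hypothetical predictions, chosen only for illustration
y_hat_demo = np.array([30000.0, 36000.0])

# Errors are -450 and 330, so the loss is (450**2 + 330**2) / 2
loss_demo = mean_squared_error(y_demo, y_hat_demo)
print(loss_demo)
```

Because the errors are squared, the loss is expressed in squared income units, which is why the loss values in a training log for this data are so large.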

# Compute Gradients

In [7]:
def compute_gradients(X, y, y_hat):
    N = len(y)

    dw = (2 / N) * X.T.dot(y_hat - y)
    db = (2 / N) * (y_hat - y).sum()

    return dw, db

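# X appears in dw because each weight multiplies its feature, so the derivative of y_hat with respect to a weight is that feature's value

# Bias is added to every sample equally, so db has no X

# If the error is zero for every sample, the gradients are zero and learning stops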
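
One way to gain confidence in the gradient formulas is to compare them with finite differences. The sketch below does this on a small random problem (synthetic values, not the income dataset); the names `X_chk`, `y_chk`, `w_chk`, `b_chk`, and `eps` are introduced here for illustration.

```python
import numpy as np

def predict(X, w, b):
    return X.dot(w) + b

def mean_squared_error(y, y_hat):
    return ((y_hat - y) ** 2).mean()

def compute_gradients(X, y, y_hat):
    N = len(y)
    dw = (2 / N) * X.T.dot(y_hat - y)
    db = (2 / N) * (y_hat - y).sum()
    return dw, db

# Small synthetic problem (random values, not the income data)
rng = np.random.default_rng(0)
X_chk = rng.normal(size=(8, 2))
y_chk = rng.normal(size=8)
w_chk = rng.normal(size=2)
b_chk = 0.3

# Analytic gradients from the formulas above
dw, db = compute_gradients(X_chk, y_chk, predict(X_chk, w_chk, b_chk))

# Central finite differences: nudge one parameter at a time
eps = 1e-6
dw_num = np.zeros_like(w_chk)
for j in range(len(w_chk)):
    step = np.zeros_like(w_chk)
    step[j] = eps
    loss_plus = mean_squared_error(y_chk, predict(X_chk, w_chk + step, b_chk))
    loss_minus = mean_squared_error(y_chk, predict(X_chk, w_chk - step, b_chk))
    dw_num[j] = (loss_plus - loss_minus) / (2 * eps)

loss_plus = mean_squared_error(y_chk, predict(X_chk, w_chk, b_chk + eps))
loss_minus = mean_squared_error(y_chk, predict(X_chk, w_chk, b_chk - eps))
db_num = (loss_plus - loss_minus) / (2 * eps)

print("max |dw - dw_num|:", np.abs(dw - dw_num).max())
print("|db - db_num|:", abs(db - db_num))
```

Both differences should be close to zero; a large gap would point to a sign or scaling mistake in `compute_gradients`.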

# Update Parameters (Gradient Descent)

In [8]:
def update_parameters(w, b, dw, db, lr):
    w = w - lr * dw
    b = b - lr * db
    return w, b

# Learning rate controls step size

# Training Loop

In [9]:
lr = 0.0001
epochs = 1000

for epoch in range(epochs):
    y_hat = predict(X, w, b)
    loss = mean_squared_error(y, y_hat)
    dw, db = compute_gradients(X, y, y_hat)
    w, b = update_parameters(w, b, dw, db, lr)

    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss: {loss}")

# Loss should decrease over time

# Increasing loss = learning rate too high

# Epochs and learning rate together control convergence; in the run below the loss is still falling at epoch 900, so training has not fully converged

Epoch 0, Loss: 1727049635.0
Epoch 100, Loss: 66491868.55311352
Epoch 200, Loss: 61752567.201190114
Epoch 300, Loss: 58616531.07847049
Epoch 400, Loss: 56528801.53951118
Epoch 500, Loss: 55126542.02946697
Epoch 600, Loss: 54172526.94885703
Epoch 700, Loss: 53511656.14292054
Epoch 800, Loss: 53042523.72795741
Epoch 900, Loss: 52698829.56325033
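
The note that a rising loss signals too high a learning rate can be illustrated on the five rows shown in the data preview. The sketch below is a compact re-implementation of the same batch gradient descent; the helper `run_gd` and the larger learning rate `1e-2` are assumptions made for this comparison, not part of the original run.

```python
import numpy as np

# First five rows from the data preview: [age, experience] -> income
X_small = np.array([[25, 1], [30, 3], [47, 2], [32, 5], [43, 10]], dtype=float)
y_small = np.array([30450, 35670, 31580, 40130, 47830], dtype=float)

def run_gd(X, y, lr, epochs):
    """Run batch gradient descent from zeros and return the loss at each epoch."""
    w = np.zeros(X.shape[1])
    b = 0.0
    losses = []
    for _ in range(epochs):
        err = X.dot(w) + b - y               # y_hat - y
        losses.append((err ** 2).mean())     # MSE before this update
        w = w - lr * (2 / len(y)) * X.T.dot(err)
        b = b - lr * (2 / len(y)) * err.sum()
    return losses

stable = run_gd(X_small, y_small, lr=1e-4, epochs=20)    # the notebook's learning rate
unstable = run_gd(X_small, y_small, lr=1e-2, epochs=20)  # hypothetical, 100x larger

print("lr=1e-4:", stable[0], "->", stable[-1])
print("lr=1e-2:", unstable[0], "->", unstable[-1])
```

With unscaled features such as age, the range of stable learning rates is narrow, so the smaller rate reduces the loss while the larger one makes it grow.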


# Final Evaluation

In [10]:
print("Final weights:", w)
print("Final bias:", b)

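# Predict for a new candidate
# Feature order must match X: [age, experience],
# so [4.5, 68] is read as age 4.5 and experience 68
new_candidate = np.array([4.5, 68])
predicted_salary = new_candidate.dot(w) + b
print("Predicted Salary:", predicted_salary)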

Final weights: [ 764.75405919 1371.03430441]
Final bias: 321.73641174472493
Predicted Salary: 96993.4623777421


# Is the prediction reasonable?

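Only partly. The model does capture the trend of income rising with age and experience, but the features are ordered [age, experience], so the input [4.5, 68] is read as age 4.5 with experience 68. Both values lie far outside the rows previewed above (ages 25 to 47, experience 1 to 10), and the predicted income of about 96,993 is roughly twice the largest income in that preview (47,830). The result is an extrapolation and should be treated with caution.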


# Does it interpolate smoothly?

Yes. The prediction is a linear function of the inputs, so it is continuous, and the prediction for a point between two inputs lies proportionally between their two predictions.
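
The interpolation property can be checked directly. The sketch below copies the final weights and bias from the printed output and compares two hypothetical candidates (`a` and `c`, invented for illustration) with their midpoint.

```python
import numpy as np

# Parameters copied from the printed training output
w_final = np.array([764.75405919, 1371.03430441])
b_final = 321.73641174472493

def predict(X, w, b):
    return X.dot(w) + b

# Two hypothetical candidates: [age, experience]
a = np.array([30.0, 3.0])
c = np.array([40.0, 7.0])
mid = (a + c) / 2  # halfway between them

pred_a = predict(a, w_final, b_final)
pred_c = predict(c, w_final, b_final)
pred_mid = predict(mid, w_final, b_final)

print(pred_a, pred_mid, pred_c)
# For a linear model the midpoint prediction equals the average prediction
print(np.isclose(pred_mid, (pred_a + pred_c) / 2))
```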


# Why is this better than threshold rules?

This approach is better than threshold rules because it gives a continuous numeric prediction that changes with every input, instead of a coarse yes/no decision that only changes when a cutoff is crossed.
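
To make the contrast concrete, the sketch below places a hypothetical threshold rule next to the linear model. The rule, its cutoff of 5, and the fixed age of 35 are assumptions made purely for comparison; the weights and bias are copied from the printed output.

```python
import numpy as np

# Parameters copied from the printed training output
w_final = np.array([764.75405919, 1371.03430441])
b_final = 321.73641174472493

def linear_income(age, experience):
    # Continuous prediction: changes with every change in the inputs
    return np.array([age, experience]).dot(w_final) + b_final

def threshold_rule(experience, cutoff=5):
    # Hypothetical rule for comparison only: a coarse yes/no on experience
    return experience >= cutoff

for exp in [4, 5, 6]:
    print(exp, threshold_rule(exp), round(linear_income(35, exp), 2))
```

The rule returns the same answer for experience 5 and 6, while the linear prediction rises by one weight's worth for each extra unit of experience.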