# 📉 Linear Regression from Scratch — Predicting House Prices

This notebook demonstrates how **Linear Regression** works using **NumPy**, without relying on any machine learning libraries.  
We’ll build a simple model to predict house prices based on their size (in square feet).

---

## 🎯 Objectives
- Understand the concept of linear regression.
- Implement gradient descent manually.
- Visualize the regression line and loss reduction.

---


In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Sample data: size in square feet, price in $1000s
X = np.array([650, 785, 900, 1100, 1200, 1400, 1500])
y = np.array([100, 120, 130, 150, 170, 180, 195])

plt.scatter(X, y, color='blue', label='Data Points')
plt.title("House Price vs Size")
plt.xlabel("Size (sq ft)")
plt.ylabel("Price ($1000s)")
plt.legend()
plt.grid(True)
plt.show()

## 🧮 Step 1: Normalize the data
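Standardizing both variables to zero mean and unit variance puts them on comparable scales, so gradient descent converges quickly with a moderate learning rate. On the raw square-footage values, the learning rate used below (0.1) would make the updates diverge.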
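
As a quick sanity check, the sketch below standardizes the same size values and confirms that the result has mean 0 and standard deviation 1. The helper name `standardize` is hypothetical and introduced only for this illustration. It relies on NumPy's default population standard deviation (`ddof=0`), which is also what the `np.std` call in the next cell uses.

```python
import numpy as np

def standardize(values):
    """Hypothetical helper: z-score an array (zero mean, unit variance)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

# Same house sizes as in the sample data above
sizes = np.array([650, 785, 900, 1100, 1200, 1400, 1500])
sizes_norm = standardize(sizes)

# Mean should be ~0 and standard deviation ~1 (up to floating-point error)
print(f"mean = {sizes_norm.mean():.6f}, std = {sizes_norm.std():.6f}")
```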


In [None]:
X_mean, X_std = np.mean(X), np.std(X)
y_mean, y_std = np.mean(y), np.std(y)

X_norm = (X - X_mean) / X_std
y_norm = (y - y_mean) / y_std

plt.scatter(X_norm, y_norm, color='orange')
plt.title("Normalized Data")
plt.xlabel("Size (normalized)")
plt.ylabel("Price (normalized)")
plt.grid(True)
plt.show()

## ⚙️ Step 2: Initialize parameters
We'll initialize the slope and intercept to zero and use **gradient descent** to minimize the loss (Mean Squared Error).
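
For reference, the loss and the gradients computed in the training loop can be written out as follows, with $n$ data points, normalized inputs $x_i$ and targets $y_i$, slope $m$, and intercept $b$. The symbol $\alpha$ for the learning rate (`lr` in the code) is notation introduced here.

$$
\hat{y}_i = m x_i + b, \qquad
L(m, b) = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2
$$

$$
\frac{\partial L}{\partial m} = \frac{2}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right) x_i, \qquad
\frac{\partial L}{\partial b} = \frac{2}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)
$$

$$
m \leftarrow m - \alpha \, \frac{\partial L}{\partial m}, \qquad
b \leftarrow b - \alpha \, \frac{\partial L}{\partial b}
$$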


In [None]:
# Initialize parameters
m = 0.0  # slope
b = 0.0  # intercept
lr = 0.1  # learning rate
epochs = 100

# Store losses for visualization
losses = []

# Gradient Descent Loop
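for i in range(epochs):
    # Forward pass: predictions and residuals with the current parameters
    y_pred = m * X_norm + b
    error = y_pred - y_norm

    # Gradients of the MSE with respect to m and b
    dm = (2 / len(X_norm)) * np.dot(error, X_norm)
    db = (2 / len(X_norm)) * np.sum(error)

    # Gradient descent step
    m -= lr * dm
    b -= lr * db

    # Record the MSE of the predictions made before this epoch's update
    loss = np.mean(error ** 2)
    losses.append(loss)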

print(f"Trained parameters: slope={m:.3f}, intercept={b:.3f}")
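
One optional way to check the trained parameters is to compare them with a closed-form least-squares fit. The sketch below is not part of the original walkthrough. The function name `fit_gradient_descent` is hypothetical and simply wraps the loop above, and `np.polyfit` with degree 1 is used only for verification.

```python
import numpy as np

def fit_gradient_descent(x, y, lr=0.1, epochs=100):
    """Hypothetical wrapper around the training loop above."""
    m, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        error = (m * x + b) - y
        m -= lr * (2 / n) * np.dot(error, x)
        b -= lr * (2 / n) * np.sum(error)
    return m, b

# Same sample data and standardization as in the notebook
X = np.array([650, 785, 900, 1100, 1200, 1400, 1500], dtype=float)
y = np.array([100, 120, 130, 150, 170, 180, 195], dtype=float)
X_norm = (X - X.mean()) / X.std()
y_norm = (y - y.mean()) / y.std()

m_gd, b_gd = fit_gradient_descent(X_norm, y_norm)
# np.polyfit returns coefficients from highest degree to lowest
m_ls, b_ls = np.polyfit(X_norm, y_norm, deg=1)

print(f"gradient descent: slope={m_gd:.4f}, intercept={b_gd:.4f}")
print(f"least squares:    slope={m_ls:.4f}, intercept={b_ls:.4f}")
```

On standardized data the least-squares slope equals the Pearson correlation between size and price, and the intercept is zero, so the two fits should agree closely after 100 epochs.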


## 📉 Step 3: Visualize loss during training

In [None]:
plt.plot(losses, color='purple')
plt.title("Loss Reduction Over Epochs")
plt.xlabel("Epochs")
plt.ylabel("Mean Squared Error")
plt.grid(True)
plt.show()

## 🧾 Step 4: Visualize regression line
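
The trained parameters describe a line in normalized units, so they have to be mapped back to square feet and thousands of dollars before plotting. A short derivation of that mapping follows, writing $\mu_X, \sigma_X, \mu_y, \sigma_y$ for `X_mean`, `X_std`, `y_mean`, `y_std` (symbols introduced here for brevity):

$$
\frac{y - \mu_y}{\sigma_y} = m \, \frac{x - \mu_X}{\sigma_X} + b
\quad \Longrightarrow \quad
y = \underbrace{m \, \frac{\sigma_y}{\sigma_X}}_{m_{\text{real}}} \, x
  + \underbrace{\mu_y + \sigma_y \left( b - m \, \frac{\mu_X}{\sigma_X} \right)}_{b_{\text{real}}}
$$
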

In [None]:
# Denormalize the line for plotting
m_real = m * (y_std / X_std)
b_real = y_mean + y_std * (b - m * X_mean / X_std)

x_line = np.linspace(min(X), max(X), 100)
y_line = m_real * x_line + b_real

plt.scatter(X, y, color='blue', label='Data Points')
plt.plot(x_line, y_line, color='red', label='Regression Line')
plt.title("Linear Regression: House Price Prediction")
plt.xlabel("Size (sq ft)")
plt.ylabel("Price ($1000s)")
plt.legend()
plt.grid(True)
plt.show()

## 🧮 Step 5: Make predictions
Let's predict the price of a house whose size is not in the dataset, e.g., **1300 sq ft**.


In [None]:
new_house = 1300
predicted_price = m_real * new_house + b_real
print(f"Predicted price for a {new_house} sq ft house: ${predicted_price * 1000:,.2f}")
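
An equivalent way to predict, sketched below, is to normalize the new size, apply the model in normalized space, and convert the result back to thousands of dollars. The function `predict_price` and its parameter names are hypothetical, and the numbers in the demo call are illustrative placeholders, not the trained values.

```python
import numpy as np

def predict_price(size_sqft, m, b, X_mean, X_std, y_mean, y_std):
    """Hypothetical helper: predict price (in $1000s) from size in sq ft.

    m and b are the slope and intercept learned on normalized data.
    """
    size_norm = (np.asarray(size_sqft, dtype=float) - X_mean) / X_std
    price_norm = m * size_norm + b
    return price_norm * y_std + y_mean

# Illustrative placeholder values (assumptions, not the trained parameters)
demo = predict_price(1200, m=0.5, b=0.0,
                     X_mean=1000, X_std=200, y_mean=150, y_std=30)
print(demo)  # 165.0
```

With the notebook's variables, the call would be `predict_price(new_house, m, b, X_mean, X_std, y_mean, y_std)`, which should agree with `m_real * new_house + b_real` up to floating-point error.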

---
### ✅ Summary
In this notebook, you learned:
- How linear regression fits a straight line to data.
- How gradient descent adjusts parameters to minimize error.
- How to visualize loss reduction and regression lines.

---