# Linear Regression from Scratch
This notebook walks through a beginner-friendly implementation of linear regression using Python and NumPy.

## Step 1: Generate Synthetic Dataset
We simulate 50 points whose true relationship is linear, `y = 2x + 1`, with `X` drawn uniformly from [0, 10) and Gaussian noise of standard deviation 2 added to `y`.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, r2_score

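# Generate data: 50 points, X uniform on [0, 10)
np.random.seed(0)  # fixed seed so the results are reproducible
X = np.random.rand(50) * 10
true_slope = 2
true_intercept = 1
noise = np.random.randn(50) * 2  # Gaussian noise, standard deviation 2
y = true_slope * X + true_intercept + noise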

## Step 2: Compute Slope and Intercept
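We compute the best-fit line using the closed-form ordinary least squares solution for a single feature: the slope is the sum of cross-deviations of `X` and `y` divided by the sum of squared deviations of `X`, and the intercept is chosen so the line passes through the point of means.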
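
Written out, the estimates that the following cell computes are shown here, where $\bar{x}$ and $\bar{y}$ denote the sample means of `X` and `y` and $n$ is the number of points (this notation is introduced for illustration and is not used elsewhere in the notebook):

$$
w = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - w\,\bar{x}
$$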

In [None]:
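# Sample means of the inputs and targets
x_mean = np.mean(X)
y_mean = np.mean(y)

# Slope: sum of cross-deviations over sum of squared deviations of X
numerator = np.sum((X - x_mean) * (y - y_mean))
denominator = np.sum((X - x_mean)**2)
w = numerator / denominator
# Intercept: forces the line through the point (x_mean, y_mean)
b = y_mean - w * x_mean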
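
As a sanity check, the closed-form result can be compared with NumPy's own least-squares line fit. The sketch below rebuilds the Step 1 data so it runs on its own. The names `w_ref` and `b_ref` are introduced here for illustration and are not part of the original notebook.

```python
import numpy as np

# Recreate the synthetic data from Step 1 so this cell is self-contained.
np.random.seed(0)
X = np.random.rand(50) * 10
y = 2 * X + 1 + np.random.randn(50) * 2

# Closed-form estimates, same formulas as in the cell above.
x_mean, y_mean = X.mean(), y.mean()
w = np.sum((X - x_mean) * (y - y_mean)) / np.sum((X - x_mean) ** 2)
b = y_mean - w * x_mean

# np.polyfit with deg=1 returns [slope, intercept] of the least-squares line.
w_ref, b_ref = np.polyfit(X, y, deg=1)

print(f"closed form: w={w:.4f}, b={b:.4f}")
print(f"np.polyfit:  w={w_ref:.4f}, b={b_ref:.4f}")
```

The two pairs should agree to within floating-point error, since both minimize the same sum of squared residuals.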

## Step 3: Make Predictions
We use the learned slope and intercept to compute the fitted value for every point in X.

In [None]:
y_pred = w * X + b

## Step 4: Plot the Results with Error Lines
We visualize the actual data, the predicted line, and the error lines between them.

In [None]:
plt.figure(figsize=(10, 6))
plt.scatter(X, y, label='Actual data')
plt.plot(X, y_pred, color='red', linewidth=2, label='Prediction line')
# Draw a dashed vertical segment from each point to the line (its residual)
for i in range(len(X)):
    plt.plot([X[i], X[i]], [y[i], y_pred[i]], color='gray', linestyle='--', linewidth=0.5)
plt.xlabel("X")
plt.ylabel("y")
plt.title("Linear Regression with Error Lines")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

## Step 5: Evaluate the Model
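We evaluate the fit with Mean Squared Error (MSE, the average squared residual; lower is better) and R-squared (R², the fraction of the variance in y explained by the line; closer to 1 is better). Both are computed on the same data used for fitting.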

In [None]:
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)

print(f"Learned slope (w): {w:.2f}")
print(f"Learned intercept (b): {b:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"R-squared (R²): {r2:.2f}")
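
In keeping with the from-scratch theme, both metrics can also be computed directly with NumPy. This is a minimal sketch: `mse_manual` and `r2_manual` are hypothetical helper names, and the small arrays are made-up illustration values, not notebook data.

```python
import numpy as np

def mse_manual(y_true, y_hat):
    """Mean of the squared residuals (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return np.mean((y_true - y_hat) ** 2)

def r2_manual(y_true, y_hat):
    """R² = 1 - SS_res / SS_tot (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_true - y_hat) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Tiny made-up example: every prediction is off by exactly 0.5.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.5, 1.5, 3.5, 3.5])

mse_value = mse_manual(y_true, y_hat)
r2_value = r2_manual(y_true, y_hat)
print(f"MSE: {mse_value:.2f}")
print(f"R²:  {r2_value:.2f}")
```

Applied to the notebook's `y` and `y_pred`, these should match `mean_squared_error` and `r2_score` up to floating-point error.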


## 🎨 Creative Exercise – Regression Artist

Let's make regression fun and visual! In this exercise, you will become a **Regression Artist**:

### 🎯 Your Goal:
Design a dataset that makes your linear regression model perform *poorly* — even though it **looks linear** at a glance.

You can control:
- The slope (`w`)
- The intercept (`b`)
- The noise level (`noise_scale`), i.e. the factor that multiplies `np.random.randn` in Step 1 (set to 2 there)
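
One possible starting point for the exercise is sketched below. `make_linear_data` and `fit_and_score` are hypothetical helpers, not part of the notebook. The sketch also uses `np.random.default_rng` instead of `np.random.seed`, so the numbers differ from those in Step 1.

```python
import numpy as np

def make_linear_data(w=2.0, b=1.0, noise_scale=2.0, n=50, seed=0):
    """Hypothetical helper: y = w * X + b + Gaussian noise with std noise_scale."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 10, size=n)
    y = w * X + b + rng.normal(0.0, noise_scale, size=n)
    return X, y

def fit_and_score(X, y):
    """Closed-form fit from Step 2, plus MSE and R² computed with NumPy."""
    w_hat = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
    b_hat = y.mean() - w_hat * X.mean()
    resid = y - (w_hat * X + b_hat)
    mse = np.mean(resid ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return w_hat, b_hat, mse, r2

# Same slope and intercept, increasing noise: watch MSE grow and R² shrink.
for noise_scale in (0.5, 2.0, 8.0):
    X, y = make_linear_data(noise_scale=noise_scale)
    w_hat, b_hat, mse, r2 = fit_and_score(X, y)
    print(f"noise_scale={noise_scale:>4}: w={w_hat:.2f}, b={b_hat:.2f}, "
          f"MSE={mse:.2f}, R²={r2:.2f}")
```

Keeping the seed fixed while changing only `noise_scale` makes it easier to attribute any change in the metrics to the noise level alone.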

### 🔍 Explore:
- How do noise and outliers affect the model?
- What happens to the Mean Squared Error and R² score?
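
To explore the outlier question, one option is to fit the same line twice, once on clean data and once after appending a few extreme points. The sketch below assumes three hand-picked outliers near the right edge of the data. Those values, and the helper names `fit_line` and `score`, are illustrative choices rather than anything prescribed by the exercise.

```python
import numpy as np

def fit_line(X, y):
    """Closed-form slope and intercept, as in Step 2."""
    w = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
    b = y.mean() - w * X.mean()
    return w, b

def score(X, y, w, b):
    """Return (MSE, R²) of the line w * X + b on the given data."""
    resid = y - (w * X + b)
    mse = np.mean(resid ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return mse, r2

# Clean, nearly linear data (low noise so the outlier effect is easy to see).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
y = 2 * X + 1 + rng.normal(0.0, 0.5, size=50)

# Append three illustrative outliers: large X, strongly negative y.
X_out = np.append(X, [9.5, 9.8, 10.0])
y_out = np.append(y, [-40.0, -45.0, -50.0])

w_clean, b_clean = fit_line(X, y)
w_out, b_out = fit_line(X_out, y_out)
mse_clean, r2_clean = score(X, y, w_clean, b_clean)
mse_out, r2_out = score(X_out, y_out, w_out, b_out)

print(f"clean:         w={w_clean:.2f}, MSE={mse_clean:.2f}, R²={r2_clean:.2f}")
print(f"with outliers: w={w_out:.2f}, MSE={mse_out:.2f}, R²={r2_out:.2f}")
```

Because least squares penalizes squared residuals, a handful of distant points can pull the slope away from the trend that most of the data follows.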
