# 📘 Practical 2 — Multiple Linear Regression on California Housing Dataset

**Objective:** Implement Multiple Linear Regression to predict Median House Value using 8 features.

## 📖 Theory
Multiple Linear Regression models the relationship between several independent variables and one dependent variable:

$$ y = b_0 + b_1x_1 + b_2x_2 + \dots + b_nx_n $$

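Matrix form (the normal equation, where $X$ carries a leading column of ones for the intercept $b_0$ and $\mathbf{b}$ is the vector of coefficients $b_0, \dots, b_n$):
$$ \mathbf{b} = (X^T X)^{-1} X^T \mathbf{y} $$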
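
The closed-form solution above can be sketched in NumPy. This is a minimal illustration, not what `LinearRegression` does internally: the synthetic data, the seed, and the names `X_design` and `coef` are assumptions made for the example. It solves the linear system instead of forming the inverse explicitly, which is the usual numerically safer choice.

```python
import numpy as np

# Synthetic, noise-free data (an assumption for illustration):
# y = 2 + 3*x1 - 1*x2
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

# Prepend a column of ones so the first coefficient is the intercept b_0
X_design = np.column_stack([np.ones(len(X)), X])

# Solve (X^T X) coef = X^T y rather than computing the inverse directly
coef = np.linalg.solve(X_design.T @ X_design, X_design.T @ y)

print(coef)  # values should be close to 2, 3 and -1
```

Because the data contain no noise, the recovered coefficients match the generating ones up to floating-point error.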

Evaluation Metrics: Mean Squared Error (MSE, lower is better) and the coefficient of determination R² (closer to 1 is better).
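
To make the two metrics concrete, the sketch below computes them by hand on a tiny made-up example. The helper names `mse` and `r2` and the sample values are hypothetical; in the practical itself, `mean_squared_error` and `r2_score` from scikit-learn do this work.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def r2(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration only
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.0]

print('MSE:', mse(y_true, y_pred))  # squared errors 0.25, 0.25, 0, 1 -> mean 0.375
print('R2:', r2(y_true, y_pred))    # 1 - 1.5 / 20 = 0.925
```

An R² of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean of `y_true`.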

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn import datasets
import matplotlib.pyplot as plt

In [None]:
data = datasets.fetch_california_housing()
X = data.data
y = data.target

print('Features:', data.feature_names)
print('Dataset Shape:', X.shape)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=11
)

In [None]:
model = LinearRegression()
model.fit(X_train, y_train)

In [None]:
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)

In [None]:
print('MSE Train:', mean_squared_error(y_train, y_train_pred))
print('MSE Test:', mean_squared_error(y_test, y_test_pred))
print('R2 Train:', r2_score(y_train, y_train_pred))
print('R2 Test:', r2_score(y_test, y_test_pred))

In [None]:
plt.figure(figsize=(8,6))
plt.scatter(y_test, y_test_pred, alpha=0.5)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], color='red')
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Actual vs Predicted (Perfect Fit Line)')
plt.show()

## ✅ Conclusion
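The model explains roughly 60% of the variance in median house value. Train and test scores that are close to each other indicate that it generalizes to unseen data without substantial overfitting, although about 40% of the variance remains unexplained by a purely linear fit.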