In [5]:
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.pipeline import Pipeline
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
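import sys

# Make the project's src/ directory importable so data_loader can be found
sys.path.append("../src")
from data_loader import load_data

# Load the preprocessed train/test split
X_train, X_test, y_train, y_test = load_data()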

#Creating pipeline
pipeline = Pipeline([
    ("dummy", DummyRegressor(strategy="mean"))
])

# Training pipeline
pipeline.fit(X_train, y_train)

# Predicting
y_pred = pipeline.predict(X_test)

#Evaluating predictions
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

#Displaying results
print("Dummy Regressor (Mean Strategy) Results:")
print(f"MAE:  {mae:,.2f}")
print(f"MSE:  {mse:,.2f}")
print(f"RMSE: {rmse:,.2f}")
print(f"R²:   {r2:.4f}")


 Dropping potential leakage column: Log_Price
Dummy Regressor (Mean Strategy) Results:
MAE:  1,352,127.61
MSE:  3,252,814,346,692.91
RMSE: 1,803,556.03
R²:   -0.0001


### Baseline Prediction: Dummy Regressor
This notebook builds a baseline model: a `DummyRegressor` with `strategy="mean"` wrapped in a `Pipeline`. It loads the preprocessed train/test split with `load_data()`, fits the pipeline, predicts on the test set, and evaluates the predictions with MAE, MSE, RMSE, and R². Because the model always predicts the mean of the training target, its R² on the test set is approximately zero (-0.0001 here), and its RMSE of about 1.8 million is the error that any more complex model must beat. The pipeline can also be saved with `joblib` for future reuse, although that step does not appear in the cell above.
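
The saving step is not shown in the cell above, so here is a minimal sketch of how a fitted pipeline could be written to disk and restored with `joblib`. The synthetic data and the file name `dummy_pipeline.joblib` are assumptions made for illustration; they are not taken from the project.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the notebook's training data (assumption).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))
y_train = rng.normal(loc=4_000_000, scale=1_500_000, size=50)

pipeline = Pipeline([("dummy", DummyRegressor(strategy="mean"))])
pipeline.fit(X_train, y_train)

# Hypothetical file name; a temporary directory keeps the sketch free of side effects.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "dummy_pipeline.joblib"
    joblib.dump(pipeline, model_path)   # serialize the fitted pipeline
    restored = joblib.load(model_path)  # read it back into memory

# The restored pipeline should reproduce the original predictions.
same_predictions = np.allclose(pipeline.predict(X_train), restored.predict(X_train))
print("Round trip preserved predictions:", same_predictions)
```

In a real project the path would point somewhere persistent, such as a models directory, rather than a temporary folder.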


In [6]:
import pandas as pd

comparison_df = pd.DataFrame({
    "Actual Price": y_test.values,
    "Predicted Price": y_pred
})

comparison_df.head(10)


|   | Actual Price | Predicted Price |
|---|--------------|-----------------|
| 0 | 3140000.0 | 4202679.0 |
| 1 | 3390000.0 | 4202679.0 |
| 2 | 3400000.0 | 4202679.0 |
| 3 | 5850000.0 | 4202679.0 |
| 4 | 2490000.0 | 4202679.0 |
| 5 | 7500000.0 | 4202679.0 |
| 6 | 3200000.0 | 4202679.0 |
| 7 | 6090000.0 | 4202679.0 |
| 8 | 8640000.0 | 4202679.0 |
| 9 | 5090000.0 | 4202679.0 |


### Actual vs Predicted Prices
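This cell builds a DataFrame that places the actual test-set house prices next to the baseline model's predictions and displays the first 10 rows for quick inspection. Every predicted value is identical (4,202,679) because the mean strategy returns the mean of the training target regardless of the input features.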
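
The constant prediction column can be checked directly. The sketch below uses synthetic data (an assumption, since `load_data()` is project-specific) to confirm that a mean-strategy `DummyRegressor` predicts the training mean for every row. It also shows that R² is zero on the training set and cannot exceed zero on a held-out set.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import r2_score

# Synthetic features and prices standing in for the real split (assumption).
rng = np.random.default_rng(42)
X_train, X_test = rng.normal(size=(80, 2)), rng.normal(size=(20, 2))
y_train = rng.uniform(1_000_000, 9_000_000, size=80)
y_test = rng.uniform(1_000_000, 9_000_000, size=20)

baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
y_pred = baseline.predict(X_test)

# Every prediction equals the mean of the training target; the features are ignored.
constant_prediction = np.allclose(y_pred, y_train.mean())

# Predicting the training mean gives R² = 0 on the training data. On held-out
# data a constant prediction can never beat the held-out mean, so R² <= 0.
r2_train = r2_score(y_train, baseline.predict(X_train))
r2_test = r2_score(y_test, y_pred)

print("Constant prediction:", constant_prediction)
print(f"Train R²: {r2_train:.4f}")
print(f"Test R²:  {r2_test:.4f}")
```

This is why a slightly negative test R², such as the -0.0001 reported earlier, is the expected result for this baseline rather than a sign of a bug.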