<a href="https://colab.research.google.com/github/yashaswinidinesh/pycaret-assignment-yashaswinidinesh/blob/main/notebooks/a_regression_california_housing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Regression — California Housing (PyCaret 3)

This notebook runs a complete **PyCaret 3** regression pipeline on the **California Housing** dataset.
It has three cells: (1) installs, (2) the full pipeline, run on CPU for stability in Colab, and (3) a version check of the installed libraries.

In [None]:
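# Install PyCaret 3 (version-bounded) plus the boosting libraries used below, for Colab / Jupyter.
# Only pycaret and pandas-datareader carry version bounds; xgboost, lightgbm, and catboost
# resolve to their latest compatible releases.
# After this cell, restart the runtime so the new binaries load.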
%pip -q install "pycaret>=3.0.4,<4" "pandas-datareader>=0.10.0" xgboost lightgbm catboost --upgrade


In [None]:
# === Regression — California Housing (PyCaret 3) ===
# Runs on CPU for stability; flip use_gpu=True later if your session is stable.

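import pandas as pd
from IPython.display import display  # explicit import so display() does not rely on the notebook builtin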
from sklearn.datasets import fetch_california_housing
from pycaret.regression import (
    setup, compare_models, tune_model, finalize_model,
    plot_model, save_model, predict_model
)

# 1) Load dataset into a DataFrame
cal = fetch_california_housing(as_frame=True)
df = cal.frame.copy()
print("Shape:", df.shape)
print(df.head())

# 2) PyCaret setup
exp = setup(
    data=df,
    target="MedHouseVal",
    session_id=42,
    use_gpu=False,   # set True if your GPU run is stable
    fold=3,
    n_jobs=1
)

# 3) AutoML compare and tune
top = compare_models(
    include=["lightgbm", "xgboost", "catboost", "lr", "ridge", "lasso", "rf"],
    sort="R2"
)
best = tune_model(top, optimize="R2", choose_better=True)

# 4) Evaluate, finalize, save, quick inference
plot_model(best, plot="residuals")
final = finalize_model(best)
path = save_model(final, "california_housing_regressor_cpu")
print("Saved:", path)

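# Smoke test only: these 5 rows come from the data the model was fit on,
# so the predictions here are not a held-out evaluation.
sample = df.sample(5, random_state=7)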
display(predict_model(final, data=sample))
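
The `fold=3` argument passed to `setup` above requests 3-fold cross-validation on the training split. As a rough illustration of what that means for fold sizes, the sketch below partitions a row count the way scikit-learn's `KFold` does, where the first `n_samples % k` folds receive one extra row. The helper name `kfold_sizes` and the row count of 10 are made up for this example; PyCaret performs the actual splitting internally, and its default train/test split is not modelled here.

```python
def kfold_sizes(n_samples: int, k: int) -> list[int]:
    """Return the validation-fold sizes for k-fold CV over n_samples rows."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    # The first `extra` folds absorb the remainder, one row each.
    return [base + 1 if i < extra else base for i in range(k)]


# Hypothetical row count, chosen only to show an uneven split.
sizes = kfold_sizes(10, 3)
print("validation rows per fold:", sizes)
print("training rows per fold  :", [10 - s for s in sizes])
```

With only three folds, each model is fit three times, which keeps `compare_models` reasonably fast at the cost of a noisier score estimate than a larger fold count would give.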


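Both `compare_models` and `tune_model` above rank models by R2. For readers who want the definition spelled out, here is a minimal pure-Python sketch of the coefficient of determination, R2 = 1 - SS_res / SS_tot. The function name `r2` and the toy numbers are illustrative only; PyCaret computes the metric itself during cross-validation.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Toy values, not taken from the housing data.
actual = [1.0, 2.0, 3.0]
print(r2(actual, [1.0, 2.0, 3.0]))  # perfect predictions
print(r2(actual, [2.0, 2.0, 2.0]))  # always predicting the mean
```

A score near 1 means the model explains most of the variance in the target, a score of 0 matches a constant mean predictor, and negative values mean the model does worse than that baseline.
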
In [None]:
import pycaret, pandas, numpy, sklearn
print("pycaret:", pycaret.__version__)
print("pandas :", pandas.__version__)
print("numpy  :", numpy.__version__)
print("sklearn:", sklearn.__version__)
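
The version check above imports each library, so it fails outright if one is missing, and it does not cover the boosting libraries installed in the first cell. A possible alternative, sketched here with the standard-library `importlib.metadata` module, reads the installed distribution metadata without importing anything. The helper name `installed_versions` is an assumption made for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Distribution names as passed to pip in the install cell.
packages = ["pycaret", "xgboost", "lightgbm", "catboost"]
for name, ver in installed_versions(packages).items():
    print(f"{name:<10}: {ver or 'not installed'}")
```

Because nothing is imported, this also runs quickly before the runtime restart, although the reported versions are those on disk rather than those already loaded in memory.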
