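# 模型演算者 Chapter 4: Cross-Validation and Model Comparison
Compare the performance of several models, using cross-validation to make the evaluation more stable.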

## 1. Import Packages and Data

In [None]:
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Load the data: a single feature (area) and the target (price)
df = pd.read_csv('house_data_processed.csv')
X = df[['area']]
y = df['price']

## 2. Define the Models

In [None]:
models = {
    'LinearRegression': LinearRegression(),
    'DecisionTree': DecisionTreeRegressor(random_state=42),
    'RandomForest': RandomForestRegressor(random_state=42)
}

## 3. Run Cross-Validation

In [None]:
for name, model in models.items():
    # This scorer returns negated MSE (higher is better), so flip the sign when reporting
    scores = cross_val_score(model, X, y, scoring='neg_mean_squared_error', cv=3)
    print(f'{name}: mean MSE {-scores.mean():.2f}')
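
A mean alone hides how much the folds disagree, so it can help to look at the per-fold spread as well. The sketch below is one possible way to do that, not part of the original notebook. It uses synthetic data as a stand-in for `house_data_processed.csv` (the names `X_demo` and `y_demo` and the generated values are assumptions), and it passes a shuffled `KFold` in case the rows are stored in a meaningful order.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for house_data_processed.csv (assumption, not the real data)
rng = np.random.default_rng(0)
area = rng.uniform(30, 200, size=60)
price = 5.0 * area + rng.normal(0, 20, size=60)
X_demo = pd.DataFrame({'area': area})
y_demo = pd.Series(price, name='price')

# Shuffle before splitting so each fold sees a mix of rows
cv = KFold(n_splits=3, shuffle=True, random_state=42)
scores_demo = cross_val_score(
    LinearRegression(), X_demo, y_demo,
    scoring='neg_mean_squared_error', cv=cv,
)

# Scores are negated MSE, so flip the sign to get one MSE per fold
mse_per_fold = -scores_demo
print(f'MSE per fold: {mse_per_fold}')
print(f'mean {mse_per_fold.mean():.2f}, std {mse_per_fold.std():.2f}')
```

A large standard deviation relative to the mean suggests the ranking between models may not be reliable with only three folds.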

## 4. Conclusion

Based on each model's mean cross-validated MSE, we can select the model with the lowest error for subsequent deployment.
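
The selection step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the notebook's own procedure: the data is synthetic (a stand-in for `house_data_processed.csv`), and the names `candidates`, `mean_mse`, `best_name`, and `best_model` are hypothetical. It picks the candidate with the lowest mean MSE and refits it on all available rows, since `cross_val_score` works on clones and leaves the original estimators unfitted.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for house_data_processed.csv (assumption, not the real data)
rng = np.random.default_rng(0)
area = rng.uniform(30, 200, size=60)
X_demo = pd.DataFrame({'area': area})
y_demo = pd.Series(5.0 * area + rng.normal(0, 20, size=60), name='price')

candidates = {
    'LinearRegression': LinearRegression(),
    'DecisionTree': DecisionTreeRegressor(random_state=42),
    'RandomForest': RandomForestRegressor(random_state=42),
}

# Mean cross-validated MSE per model (sign flipped because the scorer is negated)
mean_mse = {
    name: -cross_val_score(
        model, X_demo, y_demo, scoring='neg_mean_squared_error', cv=3
    ).mean()
    for name, model in candidates.items()
}

# Lowest mean MSE wins; refit the winner on the full dataset before using it
best_name = min(mean_mse, key=mean_mse.get)
best_model = candidates[best_name].fit(X_demo, y_demo)
print(f'Selected {best_name} with mean MSE {mean_mse[best_name]:.2f}')
```

Which model wins depends entirely on the data, so the result on this synthetic example says nothing about the real housing dataset.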