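# 21. Delta Feature Ablation Study

## Purpose
Quantify the value of the Delta features (Δ) by comparing model performance with and without them.

## Hypothesis
Delta features capture the change between the two input time points, so adding them should improve predictive performance.

## Ablation Design
- **Full model**: base (sex, Age) + T1 + T2 + Delta features (26 features)
- **Ablated model**: base (sex, Age) + T1 + T2 only (18 features, no Delta)
- Two further sets, T2 + Delta (18 features) and T2 only (10 features), measure what Δ adds when T1 is left out.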

## Date: 2026-01-13
## Runtime: about 47 min 13 s (10 models × 4 feature sets × 3 targets × 5-fold CV = 600 model fits)

In [1]:
# Import packages
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

from sklearn.model_selection import StratifiedGroupKFold
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
import xgboost as xgb
import lightgbm as lgb
from sklearn.metrics import roc_auc_score, average_precision_score

print("Packages loaded")

Packages loaded


## 1. Load Data and Define Feature Sets

In [2]:
# Load the sliding-window dataset
df = pd.read_csv('../../data/01_primary/SUA/processed/SUA_sliding_window.csv')
print(f"Data: {len(df):,} samples, {df['patient_id'].nunique():,} patients")

# Feature groups
base_features = ['sex', 'Age']

t1_features = ['FBG_Tinput1', 'TC_Tinput1', 'Cr_Tinput1', 'UA_Tinput1',
               'GFR_Tinput1', 'BMI_Tinput1', 'SBP_Tinput1', 'DBP_Tinput1']

t2_features = ['FBG_Tinput2', 'TC_Tinput2', 'Cr_Tinput2', 'UA_Tinput2',
               'GFR_Tinput2', 'BMI_Tinput2', 'SBP_Tinput2', 'DBP_Tinput2']

delta_features = ['Delta_FBG', 'Delta_TC', 'Delta_Cr', 'Delta_UA',
                  'Delta_GFR', 'Delta_BMI', 'Delta_SBP', 'Delta_DBP']

# Feature sets for the ablation; later cells look results up by these exact keys
feature_sets = {
    '完整 (T1+T2+Δ)': base_features + t1_features + t2_features + delta_features,  # full: T1 + T2 + Δ
    '無 Δ (T1+T2)': base_features + t1_features + t2_features,  # no Δ: T1 + T2
    '僅 T2+Δ': base_features + t2_features + delta_features,  # T2 + Δ only
    '僅 T2': base_features + t2_features,  # T2 only
}

print("\nFeature sets:")
for name, features in feature_sets.items():
    print(f"  {name}: {len(features)} features")

Data: 13,514 samples, 6,056 patients

Feature sets:
  完整 (T1+T2+Δ): 26 features
  無 Δ (T1+T2): 18 features
  僅 T2+Δ: 18 features
  僅 T2: 10 features
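
The notebook does not show how the `Delta_*` columns were built. If they are plain differences (`X_Tinput2 - X_Tinput1`, which is an assumption here), a check along the following lines could confirm that before the ablation numbers are interpreted. The toy frame and the `check_delta_columns` helper are hypothetical and only illustrate the idea.

```python
import numpy as np
import pandas as pd

def check_delta_columns(frame, markers, atol=1e-6):
    """Report, per marker, whether Delta_<m> equals <m>_Tinput2 - <m>_Tinput1."""
    report = {}
    for m in markers:
        expected = frame[f"{m}_Tinput2"] - frame[f"{m}_Tinput1"]
        report[m] = bool(np.allclose(frame[f"Delta_{m}"], expected, atol=atol))
    return report

# Toy data using the notebook's column naming; values are made up
toy = pd.DataFrame({
    "UA_Tinput1": [5.0, 6.2, 7.1],
    "UA_Tinput2": [5.5, 6.0, 7.1],
    "Delta_UA": [0.5, -0.2, 0.0],
    "BMI_Tinput1": [22.0, 25.0, 30.0],
    "BMI_Tinput2": [23.0, 25.0, 29.0],
    "Delta_BMI": [1.0, 0.0, 1.0],  # last row deliberately has the wrong sign
})

report = check_delta_columns(toy, ["UA", "BMI"])
print(report)
```

On the real data the call would be `check_delta_columns(df, ['FBG', 'TC', 'Cr', 'UA', 'GFR', 'BMI', 'SBP', 'DBP'])`; any `False` entry would mean Δ is not a simple difference for that marker.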


In [3]:
# Build the binary targets; all windows from one patient share a group
groups = df['patient_id']

# A target code of 2 is treated as the positive class
targets = {
    'HTN': (df['hypertension_target'] == 2).astype(int),
    'HG': (df['hyperglycemia_target'] == 2).astype(int),
    'DL': (df['dyslipidemia_target'] == 2).astype(int)
}

print("Targets ready")
for name, y in targets.items():
    print(f"  {name}: {y.mean()*100:.1f}% positive")

Targets ready
  HTN: 19.3% positive
  HG: 5.9% positive
  DL: 7.9% positive
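
With positive rates this low, the two recorded metrics have different chance levels: ROC-AUC sits near 0.5 for an uninformative model, while PR-AUC (average precision) sits near the positive rate. The following sketch illustrates this on synthetic labels drawn at the three reported rates; the labels and scores are random and unrelated to the study data.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)
n = 100_000
prevalences = {"HTN": 0.193, "HG": 0.059, "DL": 0.079}

baselines = {}
for name, prevalence in prevalences.items():
    y = (rng.random(n) < prevalence).astype(int)  # synthetic labels at the reported rate
    scores = rng.random(n)                        # scores carrying no information about y
    baselines[name] = (roc_auc_score(y, scores), average_precision_score(y, scores))
    print(f"{name}: chance ROC-AUC={baselines[name][0]:.3f}, chance PR-AUC={baselines[name][1]:.3f}")
```

A PR-AUC of 0.20 would therefore be a strong result for HG but barely above chance for HTN, which is worth keeping in mind when reading `PR_AUC_mean` across targets.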


## 2. Run the Ablation Experiments

In [4]:
def get_models(random_state=42):
    """Return the 10 classifiers compared in the ablation."""
    return {
        'LR': LogisticRegression(max_iter=1000, class_weight='balanced', random_state=random_state),
        'NB': GaussianNB(),
        'LDA': LinearDiscriminantAnalysis(),
        'KNN': KNeighborsClassifier(n_neighbors=5, weights='uniform', n_jobs=-1),
        'DT': DecisionTreeClassifier(max_depth=5, class_weight='balanced', random_state=random_state),
        'RF': RandomForestClassifier(n_estimators=100, class_weight='balanced', random_state=random_state, n_jobs=-1),
        # Note: use_label_encoder is deprecated in recent XGBoost releases and can likely be dropped there
        'XGB': xgb.XGBClassifier(n_estimators=100, max_depth=5, learning_rate=0.1, random_state=random_state,
                                   use_label_encoder=False, eval_metric='logloss', verbosity=0),
        'LGBM': lgb.LGBMClassifier(n_estimators=100, max_depth=-1, num_leaves=31, learning_rate=0.1,
                                    is_unbalance=True, random_state=random_state, verbosity=-1),
        'SVM': SVC(kernel='rbf', class_weight='balanced', probability=True, random_state=random_state),
        'MLP': MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=random_state),
    }

def run_ablation_cv(X, y, groups, model_name, model, n_splits=5):
    """Run stratified, patient-grouped k-fold CV; return mean and std of ROC-AUC and PR-AUC."""
    cv = StratifiedGroupKFold(n_splits=n_splits, shuffle=True, random_state=42)

    aucs = []
    pr_aucs = []

    for train_idx, test_idx in cv.split(X, y, groups):
        X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
        y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]

        # Standardize using statistics from the training fold only
        scaler = StandardScaler()
        X_train_scaled = pd.DataFrame(scaler.fit_transform(X_train), columns=X_train.columns, index=X_train.index)
        X_test_scaled = pd.DataFrame(scaler.transform(X_test), columns=X_test.columns, index=X_test.index)

        # XGB has no class_weight option, so set scale_pos_weight from the fold's class ratio
        if model_name == 'XGB':
            neg_count = (y_train == 0).sum()
            pos_count = (y_train == 1).sum()
            model.set_params(scale_pos_weight=neg_count / pos_count)

        model.fit(X_train_scaled, y_train)

        # Predicted probability of the positive class
        y_prob = model.predict_proba(X_test_scaled)[:, 1]

        # Per-fold metrics
        aucs.append(roc_auc_score(y_test, y_prob))
        pr_aucs.append(average_precision_score(y_test, y_prob))

    return np.mean(aucs), np.std(aucs), np.mean(pr_aucs), np.std(pr_aucs)

print("Ablation helpers defined (10 models)")

Ablation helpers defined (10 models)


In [5]:
# Run the ablation for every target, model and feature set
results = []
models = get_models()

print("=" * 80)
print("Delta feature ablation study - all models")
print("=" * 80)

for target_name, y in targets.items():
    print(f"\n{'='*80}")
    print(f"Target: {target_name}")
    print(f"{'='*80}")

    for model_name, model in models.items():
        print(f"\n  --- {model_name} ---")

        for feature_set_name, features in feature_sets.items():
            X = df[features]

            # Build a fresh, unfitted estimator for every run
            fresh_models = get_models()
            fresh_model = fresh_models[model_name]

            auc_mean, auc_std, pr_auc_mean, pr_auc_std = run_ablation_cv(
                X, y, groups, model_name, fresh_model
            )

            results.append({
                'Target': target_name,
                'Model': model_name,
                'Feature_Set': feature_set_name,
                'N_Features': len(features),
                'AUC_mean': auc_mean,
                'AUC_std': auc_std,
                'PR_AUC_mean': pr_auc_mean,
                'PR_AUC_std': pr_auc_std
            })

            print(f"    {feature_set_name}：AUC={auc_mean:.3f}±{auc_std:.3f}")

print("\n" + "=" * 80)
print("Ablation experiments complete")
print("=" * 80)

Delta feature ablation study - all models

Target: HTN

  --- LR ---
    完整 (T1+T2+Δ)：AUC=0.721±0.017
    無 Δ (T1+T2)：AUC=0.721±0.017
    僅 T2+Δ：AUC=0.721±0.017
    僅 T2：AUC=0.698±0.017

  --- NB ---
    完整 (T1+T2+Δ)：AUC=0.709±0.022
    無 Δ (T1+T2)：AUC=0.701±0.017
    僅 T2+Δ：AUC=0.690±0.024
    僅 T2：AUC=0.688±0.018

  --- LDA ---
    完整 (T1+T2+Δ)：AUC=0.720±0.017
    無 Δ (T1+T2)：AUC=0.720±0.017
    僅 T2+Δ：AUC=0.720±0.017
    僅 T2：AUC=0.698±0.016

  --- KNN ---
    完整 (T1+T2+Δ)：AUC=0.630±0.018
    無 Δ (T1+T2)：AUC=0.633±0.017
    僅 T2+Δ：AUC=0.605±0.014
    僅 T2：AUC=0.599±0.012

  --- DT ---
    完整 (T1+T2+Δ)：AUC=0.717±0.013
    無 Δ (T1+T2)：AUC=0.717±0.013
    僅 T2+Δ：AUC=0.723±0.017
    僅 T2：AUC=0.686±0.018

  --- RF ---
    完整 (T1+T2+Δ)：AUC=0.735±0.011
    無 Δ (T1+T2)：AUC=0.735±0.015
    僅 T2+Δ：AUC=0.733±0.015
    僅 T2：AUC=0.684±0.013

  --- XGB ---
    完整 (T1+T2+Δ)：AUC=0.738±0.012
    無 Δ (T1+T2)：AUC=0.731±0.014
    僅 T2+Δ：AUC=0.736±0.011
    僅 T2：AUC=0.685±0.016

  --- LGBM ---
    完整 (T1+T2+Δ)：AUC=0.730±0.011
   

## 3. Results Analysis

In [6]:
# Collect the results into a DataFrame
results_df = pd.DataFrame(results)

# Key comparison: T2+Δ vs T2 only (the most informative contrast)
print("=" * 80)
print("Delta contribution per model: T2+Δ vs T2 only")
print("=" * 80)

print("\n| Model | Type | HTN ΔAUC | HG ΔAUC | DL ΔAUC | Mean ΔAUC |")
print("|-------|------|----------|---------|---------|-----------|")

model_types = {
    'LR': 'Statistical', 'NB': 'Statistical', 'LDA': 'Statistical',
    'KNN': 'Instance-based',
    'DT': 'Tree', 'RF': 'Tree', 'XGB': 'Tree', 'LGBM': 'Tree',
    'SVM': 'Kernel', 'MLP': 'Neural net'
}

delta_contributions = []

for model_name in ['LR', 'NB', 'LDA', 'KNN', 'DT', 'RF', 'XGB', 'LGBM', 'SVM', 'MLP']:
    deltas = []
    for target in ['HTN', 'HG', 'DL']:
        t2_delta = results_df[(results_df['Target'] == target) &
                              (results_df['Model'] == model_name) &
                              (results_df['Feature_Set'] == '僅 T2+Δ')]['AUC_mean'].values[0]
        t2_only = results_df[(results_df['Target'] == target) &
                             (results_df['Model'] == model_name) &
                             (results_df['Feature_Set'] == '僅 T2')]['AUC_mean'].values[0]
        deltas.append(t2_delta - t2_only)

    avg_delta = np.mean(deltas)
    model_type = model_types[model_name]
    print(f"| {model_name:5s} | {model_type:14s} | {deltas[0]:+.3f} | {deltas[1]:+.3f} | {deltas[2]:+.3f} | {avg_delta:+.3f} |")
    delta_contributions.append({'Model': model_name, 'Type': model_type, 'Avg_Delta': avg_delta})

# Aggregate by model type
print("\n" + "=" * 80)
print("Mean Delta contribution by model type")
print("=" * 80)

dc_df = pd.DataFrame(delta_contributions)
type_avg = dc_df.groupby('Type')['Avg_Delta'].mean()
for t in ['Statistical', 'Instance-based', 'Tree', 'Kernel', 'Neural net']:
    if t in type_avg.index:
        print(f"  {t}: {type_avg[t]:+.3f} AUC")

Delta contribution per model: T2+Δ vs T2 only

| Model | Type | HTN ΔAUC | HG ΔAUC | DL ΔAUC | Mean ΔAUC |
|-------|------|----------|---------|---------|-----------|
| LR    | Statistical    | +0.022 | +0.014 | +0.021 | +0.019 |
| NB    | Statistical    | +0.002 | -0.003 | -0.006 | -0.002 |
| LDA   | Statistical    | +0.022 | +0.014 | +0.021 | +0.019 |
| KNN   | Instance-based | +0.006 | -0.025 | +0.003 | -0.005 |
| DT    | Tree           | +0.037 | +0.013 | +0.019 | +0.023 |
| RF    | Tree           | +0.049 | +0.020 | +0.038 | +0.035 |
| XGB   | Tree           | +0.051 | +0.015 | +0.020 | +0.029 |
| LGBM  | Tree           | +0.049 | +0.023 | +0.035 | +0.036 |
| SVM   | Kernel         | +0.037 | +0.021 | +0.024 | +0.027 |
| MLP   | Neural net     | +0.031 | +0.032 | +0.014 | +0.026 |

Mean Delta contribution by model type
  Statistical: +0.012 AUC
  Instance-based: -0.005 AUC
  Tree: +0.031 AUC
  Kernel: +0.027 AUC
  Neural net: +0.026 AUC


In [7]:
# Redundancy check: full (T1+T2+Δ) vs no Δ (T1+T2)
print("=" * 80)
print("Redundancy check: full (T1+T2+Δ) vs no Δ (T1+T2)")
print("=" * 80)

print("\n| Model | Type | HTN ΔAUC | HG ΔAUC | DL ΔAUC | Mean ΔAUC |")
print("|-------|------|----------|---------|---------|-----------|")

for model_name in ['LR', 'NB', 'LDA', 'KNN', 'DT', 'RF', 'XGB', 'LGBM', 'SVM', 'MLP']:
    deltas = []
    for target in ['HTN', 'HG', 'DL']:
        full = results_df[(results_df['Target'] == target) &
                          (results_df['Model'] == model_name) &
                          (results_df['Feature_Set'] == '完整 (T1+T2+Δ)')]['AUC_mean'].values[0]
        no_delta = results_df[(results_df['Target'] == target) &
                              (results_df['Model'] == model_name) &
                              (results_df['Feature_Set'] == '無 Δ (T1+T2)')]['AUC_mean'].values[0]
        deltas.append(full - no_delta)

    avg_delta = np.mean(deltas)
    model_type = model_types[model_name]
    print(f"| {model_name:5s} | {model_type:14s} | {deltas[0]:+.3f} | {deltas[1]:+.3f} | {deltas[2]:+.3f} | {avg_delta:+.3f} |")

print("\nNote: near zero = the model can derive Δ from T1+T2 on its own (redundant information)")
print("Positive = the model needs an explicit Δ feature (cannot derive it automatically)")

Redundancy check: full (T1+T2+Δ) vs no Δ (T1+T2)

| Model | Type | HTN ΔAUC | HG ΔAUC | DL ΔAUC | Mean ΔAUC |
|-------|------|----------|---------|---------|-----------|
| LR    | Statistical    | -0.000 | +0.000 | +0.000 | +0.000 |
| NB    | Statistical    | +0.008 | -0.002 | -0.001 | +0.001 |
| LDA   | Statistical    | +0.000 | +0.000 | +0.000 | +0.000 |
| KNN   | Instance-based | -0.003 | -0.018 | -0.011 | -0.011 |
| DT    | Tree           | -0.000 | +0.003 | +0.007 | +0.003 |
| RF    | Tree           | +0.000 | -0.001 | +0.001 | +0.000 |
| XGB   | Tree           | +0.007 | +0.004 | +0.003 | +0.005 |
| LGBM  | Tree           | +0.002 | +0.002 | +0.004 | +0.003 |
| SVM   | Kernel         | +0.004 | +0.001 | +0.001 | +0.002 |
| MLP   | Neural net     | +0.004 | +0.003 | +0.010 | +0.006 |

Note: near zero = the model can derive Δ from T1+T2 on its own (redundant information)
Positive = the model needs an explicit Δ feature (cannot derive it automatically)
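
Both comparisons above repeat the same three-way boolean filter. One possible alternative is to pivot `results_df` so each feature set becomes a column and every contrast is a column subtraction. The sketch below rebuilds a small `results_df` from the HTN AUCs printed in section 2 for three of the ten models; since those printed values are rounded to three decimals, a contrast can differ from the tables above in the last digit. The output column names (`delta_given_T2`, `delta_given_T1_T2`) are hypothetical.

```python
import pandas as pd

# HTN AUCs for LR, RF and XGB, copied from the printed ablation output
rows = [
    ("HTN", "LR", "完整 (T1+T2+Δ)", 0.721), ("HTN", "LR", "無 Δ (T1+T2)", 0.721),
    ("HTN", "LR", "僅 T2+Δ", 0.721), ("HTN", "LR", "僅 T2", 0.698),
    ("HTN", "RF", "完整 (T1+T2+Δ)", 0.735), ("HTN", "RF", "無 Δ (T1+T2)", 0.735),
    ("HTN", "RF", "僅 T2+Δ", 0.733), ("HTN", "RF", "僅 T2", 0.684),
    ("HTN", "XGB", "完整 (T1+T2+Δ)", 0.738), ("HTN", "XGB", "無 Δ (T1+T2)", 0.731),
    ("HTN", "XGB", "僅 T2+Δ", 0.736), ("HTN", "XGB", "僅 T2", 0.685),
]
results_df = pd.DataFrame(rows, columns=["Target", "Model", "Feature_Set", "AUC_mean"])

# One row per (target, model), one column per feature set
wide = results_df.pivot_table(index=["Target", "Model"], columns="Feature_Set", values="AUC_mean")

contrasts = pd.DataFrame({
    "delta_given_T2": wide["僅 T2+Δ"] - wide["僅 T2"],                   # practical value of Δ
    "delta_given_T1_T2": wide["完整 (T1+T2+Δ)"] - wide["無 Δ (T1+T2)"],  # redundancy check
})
print(contrasts.round(3))
```

Because the pivot is indexed by both target and model, a contrast can never silently mix rows from different models.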


In [8]:
# T1 contribution for LR: full model vs T2+Δ
# Rows are restricted to LR explicitly; without a model filter, .values[0]
# would return whichever model happens to come first in results_df.
print("\n" + "=" * 80)
print("T1 feature contribution (LR)")
print("=" * 80)

print("\n| Target | Full | T2+Δ | T1 contribution |")
print("|--------|------|------|-----------------|")

lr_rows = results_df[results_df['Model'] == 'LR']
for target in ['HTN', 'HG', 'DL']:
    full = lr_rows[(lr_rows['Target'] == target) &
                   (lr_rows['Feature_Set'] == '完整 (T1+T2+Δ)')]['AUC_mean'].values[0]
    t2_delta = lr_rows[(lr_rows['Target'] == target) &
                       (lr_rows['Feature_Set'] == '僅 T2+Δ')]['AUC_mean'].values[0]
    t1_contrib = full - t2_delta

    print(f"| {target} | {full:.3f} | {t2_delta:.3f} | {t1_contrib:+.3f} |")


T1 feature contribution (LR)

| Target | Full | T2+Δ | T1 contribution |
|--------|------|------|-----------------|
| HTN | 0.721 | 0.721 | +0.000 |
| HG | 0.938 | 0.938 | +0.000 |
| DL | 0.867 | 0.867 | -0.000 |
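
The zeros in the T1 contribution table (which holds LR values) and in the LR and LDA rows of the redundancy check have an algebraic explanation, provided each Δ is defined as T2 − T1 (an assumption; the notebook does not show how Δ was computed). In that case the designs [T1, T2], [T1, T2, Δ] and [T2, Δ] span the same column space, so an unregularized linear fit produces the same predictions for all three. The sketch below shows this with ordinary least squares on synthetic data. It is not the notebook's models: scikit-learn's `LogisticRegression` applies L2 regularization by default, so there the identity holds only approximately.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
t1 = rng.normal(size=(n, 2))                  # two synthetic markers at T1
t2 = t1 + rng.normal(scale=0.5, size=(n, 2))  # the same markers at T2
delta = t2 - t1                               # assumed definition of Δ
y = 0.8 * t2[:, 0] + 1.5 * delta[:, 1] + rng.normal(scale=0.3, size=n)

def ols_fitted(features, target):
    """Fitted values of least squares with an intercept; lstsq tolerates collinear columns."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ coef

fit_t1_t2 = ols_fitted(np.hstack([t1, t2]), y)
fit_full = ols_fitted(np.hstack([t1, t2, delta]), y)
fit_t2_delta = ols_fitted(np.hstack([t2, delta]), y)
fit_t2_only = ols_fitted(t2, y)

print("T1+T2 vs full:   ", np.allclose(fit_t1_t2, fit_full))
print("T1+T2 vs T2+Δ:   ", np.allclose(fit_t1_t2, fit_t2_delta))
print("T1+T2 vs T2 only:", np.allclose(fit_t1_t2, fit_t2_only))
```

The first two comparisons agree and the third does not, mirroring the LR pattern in the tables: identical AUC for the three sets that contain two of {T1, T2, Δ}, and a lower AUC for T2 alone. For linear models, then, the redundancy check is settled by construction rather than by the data.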


In [9]:
# Save the results
results_df.to_csv('../../results/delta_ablation_all_models.csv', index=False)
print("Saved: results/delta_ablation_all_models.csv")
print(f"Total experiments: {len(results_df)} (10 models × 4 feature sets × 3 targets)")

Saved: results/delta_ablation_all_models.csv
Total experiments: 120 (10 models × 4 feature sets × 3 targets)


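## 4. Conclusions

### Key Research Questions

1. **Do the traditional statistical methods (NB, LDA) benefit more from explicit Δ features than the ML models?**
   - NB assumes conditionally independent features, so it cannot represent the T2 − T1 relationship on its own → expected to benefit most
   - LDA models the covariance between features, so it may capture part of the change → expected to benefit moderately
   - Tree models and the MLP can learn nonlinear interactions → expected to benefit least
   - Observed: the expectation is not supported. With T2 as the baseline, NB gains least (mean −0.002 AUC) and the tree models gain most (mean +0.031 AUC); LR and LDA each gain +0.019.

2. **Redundancy check (full model vs no Δ)**
   - Near zero: the model can derive Δ from T1 + T2 on its own
   - Positive: the model needs an explicit Δ feature
   - Observed: every model's mean change lies between −0.011 (KNN) and +0.006 (MLP) AUC, and LR and LDA show +0.000, so Δ is largely redundant once T1 and T2 are both present.

3. **Practical value check (T2+Δ vs T2 only)**
   - The most clinically relevant comparison
   - When only the most recent measurements plus the change information are available, how much does Δ help?
   - Observed: mean gains of +0.019 to +0.036 AUC for every model except NB (−0.002) and KNN (−0.005).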
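
How near is "near zero"? The fold-to-fold standard deviations printed in section 2 (roughly 0.01 to 0.02 AUC) are larger than most of the differences in the redundancy check, so means alone are hard to judge. `run_ablation_cv` uses the same `StratifiedGroupKFold` seed, labels and groups for every feature set, so the folds are identical across feature sets and per-fold AUCs could be compared pairwise. A minimal sketch of such a summary follows; it assumes the function were changed to return the per-fold `aucs` list, and both the `paired_fold_summary` helper and the fold values are invented for illustration.

```python
import numpy as np

def paired_fold_summary(aucs_a, aucs_b):
    """Mean and sample std of per-fold AUC differences (a - b) over shared folds."""
    diffs = np.asarray(aucs_a, dtype=float) - np.asarray(aucs_b, dtype=float)
    return diffs.mean(), diffs.std(ddof=1)

# Invented fold-level AUCs for illustration only; not results from this study
with_delta = [0.742, 0.731, 0.750, 0.728, 0.739]
without_delta = [0.735, 0.729, 0.741, 0.725, 0.730]

mean_diff, sd_diff = paired_fold_summary(with_delta, without_delta)
print(f"mean difference = {mean_diff:+.4f}, sd of differences = {sd_diff:.4f}")
```

A mean difference that is small relative to the spread of the paired differences would support reading a row as "near zero"; with only five folds this is a rough guide rather than a formal test.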