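# 02 · S3WD v02 main pipeline: reference tuples + similarity + drift closed loop

This notebook walks through the S3WD v02 pipeline: Reference Tuples, hybrid similarity, and a small batch-level threshold grid, combined with a graded drift response (S1/S2/S3).
Before running, make sure `configs/s3wd_airline_v02.yaml` has been updated to the v02 keys and that the Airlines data is available.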
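
To make the "batch-level threshold grid" idea concrete, the following is a minimal, generic sketch of a three-way decision (accept / defer / reject) with a small grid scan over the two thresholds. It is an illustration under stated assumptions, not the `s3wdlib` implementation: the helper names `three_way_decide` and `pick_thresholds`, the `defer_cost` weight, and the simple cost function are all hypothetical.

```python
import numpy as np


def three_way_decide(p, alpha, beta):
    """Map probabilities to +1 (accept), -1 (reject) or 0 (defer)."""
    p = np.asarray(p, dtype=float)
    out = np.zeros(p.shape, dtype=int)   # default: defer (boundary region)
    out[p >= alpha] = 1                  # positive region
    out[p <= beta] = -1                  # negative region
    return out


def pick_thresholds(p, y, alphas, betas, gap=0.02, defer_cost=0.3):
    """Scan a small (alpha, beta) grid on one batch; return (cost, alpha, beta).

    Assumed cost (hypothetical): misclassification rate plus a weighted
    deferral rate. Pairs with alpha - beta < gap are skipped.
    """
    y = np.asarray(y, dtype=int)
    best = None
    for a in alphas:
        for b in betas:
            if a - b < gap:
                continue
            d = three_way_decide(p, a, b)
            wrong = ((d == 1) & (y == 0)) | ((d == -1) & (y == 1))
            cost = wrong.mean() + defer_cost * (d == 0).mean()
            if best is None or cost < best[0]:
                best = (cost, a, b)
    return best


# Toy batch: predicted probabilities and true binary labels.
probs = [0.95, 0.8, 0.5, 0.2, 0.05]
labels = [1, 1, 0, 0, 0]
print(pick_thresholds(probs, labels, alphas=[0.7, 0.9], betas=[0.1, 0.3]))
```

Keeping the grid small is what makes a per-batch scan affordable; a real pipeline would presumably replace the toy cost with its own objective.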


In [1]:
import os
from pathlib import Path
import sys
import logging

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display

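# Font setup so that CJK text renders correctly in matplotlib
plt.rcParams['font.sans-serif'] = ['SimHei', 'Microsoft YaHei', 'Arial Unicode MS']
plt.rcParams['axes.unicode_minus'] = False

# Auto-detect the project root: the current directory or its parent must contain s3wdlib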
PROJECT_ROOT = Path.cwd()
if not (PROJECT_ROOT / 's3wdlib').exists() and (PROJECT_ROOT.parent / 's3wdlib').exists():
    PROJECT_ROOT = PROJECT_ROOT.parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

logging.basicConfig(level=logging.INFO, format='[%(levelname)s] %(message)s')

print("[Step 0] Environment:", {
    "python": sys.version.split()[0],
    "cwd": os.getcwd(),
    "project_root": str(PROJECT_ROOT),
})


[Step 0] Environment: {'python': '3.11.5', 'cwd': 'e:\\yan\\组\\三支决策\\机器学习\\C三支决策与不平衡数据集分类\\S3WD实验\\notebooks', 'project_root': 'e:\\yan\\组\\三支决策\\机器学习\\C三支决策与不平衡数据集分类\\S3WD实验'}


In [2]:
# Load the v02 config and show its key settings
from s3wdlib.config_loader import load_yaml_cfg, show_cfg

CONFIG_PATH = PROJECT_ROOT / "configs" / "s3wd_airline_v02.yaml"

cfg = load_yaml_cfg(str(CONFIG_PATH))
print("[Step 1] Config file loaded:", CONFIG_PATH)
show_cfg(cfg)


[INFO] Loading faiss.
[INFO] Successfully loaded faiss.


[Step 1] Config file loaded: e:\yan\组\三支决策\机器学习\C三支决策与不平衡数据集分类\S3WD实验\configs\s3wd_airline_v02.yaml
[Config snapshot]
- DATA: {'data_dir': '../data', 'data_file': 'airlines_train_regression_1000000.arff', 'continuous_label': 'DepDelay', 'threshold': 15, 'threshold_op': '>', 'test_size': 0.3, 'val_size': 0.3, 'random_state': 42, 'start_year': 1987}
- LEVEL: {'level_pcts': [0.6, 0.8, 1.0], 'ranker': 'mi'}
- KWB: {'k': 6, 'metric': 'euclidean', 'eps': 1e-06, 'use_faiss': True, 'faiss_gpu': True}
- GWB: {'k': 10, 'metric': 'euclidean', 'eps': 1e-06, 'mode': 'epanechnikov', 'bandwidth': 0.72, 'bandwidth_scale': 1.05, 'use_faiss': True, 'faiss_gpu': True, 'categorical_features': ['Origin', 'Dest'], 'category_penalty': 0.3}
- S3WD: {'c1': 0.37, 'c2': 0.63, 'xi_min': 0.1, 'theta_pos': 0.9, 'theta_neg': 0.1, 'sigma': 3.0, 'regret_mode': 'utility', 'penalty_large': 1000000.0, 'gamma_last': True, 'gap': 0.02}
- PSO: {'particles': 20, 'iters': 20, 'w_max': 0.9, 'w_min': 0.4, 'c1': 2.8, 'c2': 1.3, 'seed': 42, 'use_gpu': Tru
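
The `DATA` section in the snapshot above sets `continuous_label: DepDelay`, `threshold: 15` and `threshold_op: '>'`, which suggests the binary target is "departure delay greater than 15". A minimal sketch of that binarization, assuming the three keys are combined this way; `binarize_label` is a hypothetical helper, not part of `s3wdlib`.

```python
import operator

import pandas as pd

# Comparison operators that a `threshold_op` string could select (assumed set).
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def binarize_label(df, column, threshold, op=">"):
    """Return a 0/1 Series: 1 where `df[column] <op> threshold` holds."""
    return _OPS[op](df[column], threshold).astype(int)


# Toy data; the values mirror the config: DepDelay > 15 counts as delayed.
demo = pd.DataFrame({"DepDelay": [0, 15, 16, 120]})
labels = binarize_label(demo, "DepDelay", 15, ">")
print(labels.tolist())
print("positive rate:", labels.mean())
```

Printing the positive rate right after binarization is a quick way to see how imbalanced the resulting classes are.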

In [None]:
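# Run only the streaming v02 main pipeline; the static baseline experiment is no longer run separately
from s3wdlib.v02_flow import run_streaming_flow

print("\n[Step 2] Starting the streaming roll-forward v02 run (dynamic part only)...")
results = run_streaming_flow(str(CONFIG_PATH))

trace_df = results.get("threshold_trace")
metrics_df = results.get("window_metrics")
# Avoid `a or b` here: the truth value of a DataFrame is ambiguous and raises ValueError
yearly_df = results.get("metrics_by_year")
if yearly_df is None:
    yearly_df = results.get("yearly_metrics")

print("\n[Step 3] Streaming roll-forward finished. Window-level metrics preview:")
if metrics_df is not None:
    display(metrics_df.head())

print("\n[Step 4] Metrics aggregated by year:")
if yearly_df is not None:
    display(yearly_df)

# Plot F1 / BAC by year for a quick look at how the dynamic scheme performs
if yearly_df is not None:
    fig, ax = plt.subplots(figsize=(10, 5))
    if "F1" in yearly_df.columns:
        ax.plot(yearly_df.index, yearly_df["F1"], marker="o", label="F1")
    if "BAC" in yearly_df.columns:
        ax.plot(yearly_df.index, yearly_df["BAC"], marker="o", label="BAC")
    ax.set_xlabel("year")
    ax.set_ylabel("score")
    ax.set_title("Streaming v02 · F1/BAC by year")
    ax.legend()
    plt.tight_layout()
    plt.show()


[INFO] [Validation] VAL.inline_delay=true → the current month is evaluated using only labels that arrive after the delay.
[INFO] [Validation] VAL.inline_delay=true → the current month is evaluated using only labels that arrive after the delay.



[Step 2] Starting the streaming roll-forward v02 run (dynamic part only)...
[Config snapshot]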
- DATA: {'data_dir': '../data', 'data_file': 'airlines_train_regression_1000000.arff', 'continuous_label': 'DepDelay', 'threshold': 15, 'threshold_op': '>', 'test_size': 0.3, 'val_size': 0.3, 'random_state': 42, 'start_year': 1987}
- LEVEL: {'level_pcts': [0.6, 0.8, 1.0], 'ranker': 'mi'}
- KWB: {'k': 6, 'metric': 'euclidean', 'eps': 1e-06, 'use_faiss': True, 'faiss_gpu': True}
- GWB: {'k': 10, 'metric': 'euclidean', 'eps': 1e-06, 'mode': 'epanechnikov', 'bandwidth': 0.72, 'bandwidth_scale': 1.05, 'use_faiss': True, 'faiss_gpu': True, 'categorical_features': ['Origin', 'Dest'], 'category_penalty': 0.3}
- S3WD: {'c1': 0.37, 'c2': 0.63, 'xi_min': 0.1, 'theta_pos': 0.9, 'theta_neg': 0.1, 'sigma': 3.0, 'regret_mode': 'utility', 'penalty_large': 1000000.0, 'gamma_last': True, 'gap': 0.02}
- PSO: {'particles': 20, 'iters': 20, 'w_max': 0.9, 'w_min': 0.4, 'c1': 2.8, 'c2': 1.3, 'seed': 42, 'use_gpu': True}
- BUCKET: {'keys': ['UniqueCarrier', 'Origin', 'Des

[INFO] [Init] warmup (months)=['0.10', '0.11', '0.12'], stream (months)=['1.01', '1.02', '1.03', '1.04', '1.05', '1.06', '1.07', '1.08', '1.09', '1.10', '1.11', '1.12', '2.01', '2.02', '2.03', '2.04', '2.05', '2.06', '2.07', '2.08', '2.09', '2.10', '2.11', '2.12', '3.01', '3.02', '3.03', '3.04', '3.05', '3.06', '3.07', '3.08', '3.09', '3.10', '3.11', '3.12', '4.01', '4.02', '4.03', '4.04', '4.05', '4.06', '4.07', '4.08', '4.09', '4.10', '4.11', '4.12', '5.01', '5.02', '5.03', '5.04', '5.05', '5.06', '5.07', '5.08', '5.09', '5.10', '5.11', '5.12', '6.01', '6.02', '6.03', '6.04', '6.05', '6.06', '6.07', '6.08', '6.09', '6.10', '6.11', '6.12', '7.01', '7.02', '7.03', '7.04', '7.05', '7.06', '7.07', '7.08', '7.09', '7.10', '7.11', '7.12', '8.01', '8.02', '8.03', '8.04', '8.05', '8.06', '8.07', '8.08', '8.09', '8.10', '8.11', '8.12', '9.01', '9.02', '9.03', '9.04', '9.05', '9.06', '9.07', '9.08', '9.09', '9.10', '9.11', '9.12', '10.01', '10.02', '10.03', '10.04', '10.05', '10.06', '10.07', '10.08', '10.09

KeyboardInterrupt: 
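
The plotting cell above reads `F1` and `BAC` columns from the yearly metrics. As a reminder of what those two scores measure, here is a minimal sketch that computes both from confusion-matrix counts. It assumes `BAC` stands for balanced accuracy (the mean of recall and specificity); `f1_and_bac` is a hypothetical helper, not the `s3wdlib` implementation, and the counts are made-up toy values.

```python
def f1_and_bac(tp, fp, tn, fn):
    """Compute F1 and balanced accuracy from binary confusion counts.

    Ratios with a zero denominator are defined as 0.0 in this sketch.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    bac = (recall + specificity) / 2
    return f1, bac


# Toy imbalanced example: 13 positives, 87 negatives.
f1, bac = f1_and_bac(tp=8, fp=2, tn=85, fn=5)
print(f"F1={f1:.4f}  BAC={bac:.4f}")
```

Both scores ignore the sheer size of the majority class, which is why they are more informative than plain accuracy on an imbalanced target.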
