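# Problem 4: Proposing a Better System

## Research Objective
Design a new scoring system that balances **fairness** (technically stronger contestants should be recognized) with **entertainment value** (preserving suspense and fan engagement).

## Three Proposals
- **Proposal A**: Weighted Borda Count
- **Proposal B**: Elo-based Dynamic Weighting
- **Proposal C**: Dual Threshold Elimination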

## 1. Environment Setup

In [None]:
import pandas as pd
import numpy as np
from scipy import stats
from scipy.stats import kendalltau, spearmanr
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

matplotlib.rcParams['font.sans-serif'] = ['Arial Unicode MS', 'SimHei', 'DejaVu Sans']
matplotlib.rcParams['axes.unicode_minus'] = False

SCIENTIFIC_COLORS = ['#E64B35', '#4DBBD5', '#00A087', '#3C5488', '#F39B7F', '#8491B4']
plt.style.use('seaborn-v0_8-whitegrid')
np.random.seed(42)

print('Environment setup complete')

## 2. Load Data

In [None]:
df = pd.read_excel('../../data/processed/粉丝投票分析.xlsx')
print(f'Data shape: {df.shape}')
print(f'Season range: {df["赛季"].min()} - {df["赛季"].max()}')

## 3. Evaluation Metric Definitions

In [None]:
class ScoringSystemEvaluator:
    """
    Scoring system evaluator.
    Evaluation dimensions:
    1. Fairness: technically stronger contestants should rank higher
    2. Entertainment: preserve suspense and avoid one-sided outcomes
    3. Stability: results should not fluctuate excessively
    4. Fan engagement: the fan vote should have real influence
    """
    
    def __init__(self):
        self.metrics = {}
    
    def fairness_score(self, judge_ranks, final_ranks):
        """Fairness: correlation between judge ranks and final ranks."""
        tau, _ = kendalltau(judge_ranks, final_ranks)
        return (tau + 1) / 2  # map tau from [-1, 1] to [0, 1]
    
    def entertainment_score(self, weekly_changes):
        """Entertainment: standard deviation of rank changes (moderate change scores highest)."""
        std = np.std(weekly_changes)
        # Moderate change is best; too static or too chaotic both score lower
        optimal_std = 2.0
        return 1 - min(abs(std - optimal_std) / optimal_std, 1)
    
    def stability_score(self, rankings_history):
        """Stability: based on the mean rank change between consecutive weeks."""
        if len(rankings_history) < 2:
            return 1.0
        changes = np.abs(np.diff(rankings_history))
        return 1 / (1 + np.mean(changes))
    
    def fan_engagement_score(self, fan_influence):
        """Fan engagement: how strongly the fan vote affects the final result (capped at 1)."""
        return min(fan_influence, 1.0)
    
    def overall_score(self, fairness, entertainment, stability, engagement, weights=None):
        """Weighted overall score."""
        if weights is None:
            weights = [0.3, 0.25, 0.2, 0.25]  # fairness carries the largest weight
        return np.dot([fairness, entertainment, stability, engagement], weights)

evaluator = ScoringSystemEvaluator()
print('Evaluator initialized')
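
To make the two normalizations concrete, here is a minimal standalone sketch of the fairness and entertainment formulas used by the evaluator. The rank lists and weekly changes are hypothetical values chosen for illustration, and the helper names `fairness` and `entertainment` are not part of the notebook.

```python
import numpy as np
from scipy.stats import kendalltau

def fairness(judge_ranks, final_ranks):
    """Kendall's tau mapped from [-1, 1] to [0, 1]."""
    tau, _ = kendalltau(judge_ranks, final_ranks)
    return (tau + 1) / 2

def entertainment(weekly_changes, optimal_std=2.0):
    """1.0 when the std of rank changes equals optimal_std, falling to 0 as it deviates."""
    std = np.std(weekly_changes)
    return 1 - min(abs(std - optimal_std) / optimal_std, 1)

judge_ranks = [1, 2, 3, 4, 5]                        # hypothetical judge ranking
same = fairness(judge_ranks, [1, 2, 3, 4, 5])        # identical order
reverse = fairness(judge_ranks, [5, 4, 3, 2, 1])     # fully reversed order
one_swap = fairness(judge_ranks, [2, 1, 3, 4, 5])    # one adjacent swap: tau = 0.8

static = entertainment([0, 0, 0])                    # no movement at all
lively = entertainment([-2, 2, -2, 2])               # std is exactly 2.0

print(same, reverse, one_swap, static, lively)
```

Identical rankings score 1, a full reversal scores 0, and a single adjacent swap among five contestants scores 0.9, which shows how coarse the fairness scale is for small fields.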

---
# Proposal A: Weighted Borda Count
---

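## 4. Weighted Borda Count: Theory

**Classic Borda count**: with N contestants, 1st place earns N points, 2nd place earns N-1 points, ..., and last place earns 1 point.

**Improvement: a score-gap coefficient**
$$\mathrm{Score}_i = \sum_{j} w_j \cdot B_j(i) \cdot \alpha_j(i)$$

where:
- $B_j(i)$: the Borda points of contestant $i$ under ranking source $j$
- $w_j$: the weight of source $j$ (judges or fans)
- $\alpha_j(i)$: the gap coefficient of contestant $i$ under source $j$, reflecting the score gap to the adjacent contestant in that ranking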
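
As a small worked example of the formula above, the sketch below computes Borda points for four contestants and combines them with equal weights. The scores are hypothetical, the helper `borda_points` is not part of the notebook, and the gap coefficient is fixed at $\alpha = 1$ to keep the arithmetic easy to follow.

```python
import numpy as np
from scipy import stats

def borda_points(scores):
    """Borda points: the best of n scores earns n points, the worst earns 1."""
    ranks = stats.rankdata(-np.asarray(scores, dtype=float), method='min')
    return len(scores) - ranks + 1

# Hypothetical inputs for four contestants (illustrative values only)
judge_scores = np.array([27.0, 24.0, 30.0, 21.0])
fan_votes = np.array([0.20, 0.40, 0.10, 0.30])

judge_borda = borda_points(judge_scores)  # judge ranks 2, 3, 1, 4
fan_borda = borda_points(fan_votes)       # fan ranks 3, 1, 4, 2

# Equal weights, gap coefficient alpha = 1 for every contestant (assumption)
w_judge, w_fan = 0.5, 0.5
combined = w_judge * judge_borda + w_fan * fan_borda
print(combined)
```

The judges' favorite (third contestant) and the fans' favorite (second contestant) end up close together, which is the balancing effect the weighting is meant to produce.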

In [None]:
class WeightedBordaSystem:
    """
    Weighted Borda count.
    Accounts for both rank and score gaps, so that no single contestant dominates.
    """
    
    def __init__(self, judge_weight=0.5, fan_weight=0.5, gap_factor=0.1):
        self.judge_weight = judge_weight
        self.fan_weight = fan_weight
        self.gap_factor = gap_factor  # strength of the gap coefficient
    
    def borda_score(self, ranks, n):
        """Borda points: 1st place earns n points, last place earns 1."""
        return n - ranks + 1
    
    def gap_coefficient(self, scores):
        """Gap coefficient per contestant, returned in the same order as the input scores."""
        scores = np.asarray(scores, dtype=float)
        order = np.argsort(-scores, kind='stable')  # indices from best to worst
        gaps = np.abs(np.diff(scores[order]))       # gap to the contestant ranked just above
        coefficients = np.ones(len(scores))         # the top-ranked contestant keeps 1.0
        if len(gaps) > 0 and gaps.max() > 0:
            # Normalize gaps to [0, 1], then map back to input order so the
            # coefficients line up with the Borda points they multiply
            coefficients[order[1:]] = 1 + self.gap_factor * gaps / gaps.max()
        return coefficients
    
    def calculate_final_score(self, judge_scores, fan_votes):
        """Compute the final score and final ranks."""
        n = len(judge_scores)
        
        # Judge Borda points
        judge_ranks = stats.rankdata(-judge_scores, method='min')
        judge_borda = self.borda_score(judge_ranks, n)
        judge_gap = self.gap_coefficient(judge_scores)
        
        # Fan Borda points
        fan_ranks = stats.rankdata(-fan_votes, method='min')
        fan_borda = self.borda_score(fan_ranks, n)
        fan_gap = self.gap_coefficient(fan_votes)
        
        # Weighted combination
        final_score = (self.judge_weight * judge_borda * judge_gap + 
                       self.fan_weight * fan_borda * fan_gap)
        
        return final_score, stats.rankdata(-final_score, method='min')

borda_system = WeightedBordaSystem(judge_weight=0.5, fan_weight=0.5, gap_factor=0.1)
print('Weighted Borda system initialized')

---
# Proposal B: Elo-based Dynamic Weighting
---

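## 5. Elo Dynamic Weighting: Theory

**Core idea**: add a time dimension, adjusting the weights as the season progresses.

**Weight formulas**:
$$w_{judge}(t) = w_0 - \lambda \cdot t$$
$$w_{fan}(t) = 1 - w_{judge}(t)$$

**Elo rating update**:
$$R_{new} = R_{old} + K \cdot (S - E)$$

where:
- $t$: the current week number
- $w_0$: the initial judge weight
- $\lambda$: the weight decay rate
- $K$: the Elo update factor (K-factor)
- $S$: the actual performance score
- $E$: the expected performance score

The implementation below additionally clips $w_{judge}(t)$ from below at 0.3.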
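
The two formulas can be evaluated by hand with a few lines of code. The sketch below uses the default values from the `EloDynamicSystem` class defined in the next cell ($w_0 = 0.7$, $\lambda = 0.03$, $K = 32$, and the 0.3 floor on the judge weight); the function names are illustrative and not part of the notebook.

```python
def judge_weight(week, w0=0.7, decay=0.03, floor=0.3):
    """Linearly decaying judge weight, never dropping below the floor."""
    return max(floor, w0 - decay * week)

def elo_update(rating, actual, expected, k=32):
    """Standard Elo update: move the rating by K times the surprise (S - E)."""
    return rating + k * (actual - expected)

for week in (1, 5, 10):
    wj = judge_weight(week)
    print(f'week {week}: judge={wj:.2f}, fan={1 - wj:.2f}')

# A contestant who tops the week (S = 1.0) against an expectation of E = 0.5
new_rating = elo_update(1500, actual=1.0, expected=0.5)
print(new_rating)
```

With these defaults the judge weight falls from 0.67 in week 1 to 0.40 in week 10, so fans hold the majority of the weight from week 7 onward.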

In [None]:
class EloDynamicSystem:
    """
    Elo-based dynamic weighting system.
    Shifts weight from judges to fans as the season progresses, adding late-season suspense.
    """
    
    def __init__(self, initial_judge_weight=0.7, decay_rate=0.03, k_factor=32, initial_elo=1500):
        self.initial_judge_weight = initial_judge_weight
        self.decay_rate = decay_rate
        self.k_factor = k_factor
        self.initial_elo = initial_elo
        self.elo_ratings = {}
    
    def get_weights(self, week, total_weeks=10):
        """Return (judge_weight, fan_weight) for the given week.

        The judge weight decays linearly with the week number and is floored at 0.3;
        total_weeks is accepted for interface consistency but does not affect the schedule.
        """
        judge_weight = max(0.3, self.initial_judge_weight - self.decay_rate * week)
        fan_weight = 1 - judge_weight
        return judge_weight, fan_weight
    
    def expected_score(self, rating_a, rating_b):
        """Expected score of A against B under the standard Elo logistic curve."""
        return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    
    def update_elo(self, contestant, actual_rank, n_contestants):
        """Update a contestant's Elo rating from this week's rank."""
        if contestant not in self.elo_ratings:
            self.elo_ratings[contestant] = self.initial_elo
        
        # Actual performance: a better rank maps to a higher score in [0, 1]
        actual_score = (n_contestants - actual_rank) / (n_contestants - 1)
        expected = 0.5  # simplification: expect a mid-table performance
        
        self.elo_ratings[contestant] += self.k_factor * (actual_score - expected)
        return self.elo_ratings[contestant]
    
    def calculate_final_score(self, contestants, judge_scores, fan_votes, week, total_weeks=10):
        """Compute the final score and final ranks."""
        judge_weight, fan_weight = self.get_weights(week, total_weeks)
        n = len(contestants)
        
        # Min-max normalize the scores
        judge_norm = (judge_scores - judge_scores.min()) / (judge_scores.max() - judge_scores.min() + 1e-6)
        fan_norm = (fan_votes - fan_votes.min()) / (fan_votes.max() - fan_votes.min() + 1e-6)
        
        # Elo bonus
        elo_bonus = np.array([self.elo_ratings.get(c, self.initial_elo) for c in contestants])
        elo_norm = (elo_bonus - elo_bonus.min()) / (elo_bonus.max() - elo_bonus.min() + 1e-6)
        
        # Combined score
        final_score = judge_weight * judge_norm + fan_weight * fan_norm + 0.1 * elo_norm
        
        return final_score, stats.rankdata(-final_score, method='min')

elo_system = EloDynamicSystem(initial_judge_weight=0.7, decay_rate=0.03)
print('Elo dynamic system initialized')

---
# Proposal C: Dual Threshold Elimination
---

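## 6. Dual Threshold Elimination: Theory

**Core rules**:
1. **Immunity zone**: the top K contestants by judge score receive immunity and cannot be eliminated that week
2. **Elimination zone**: among the remaining contestants, elimination is decided entirely by the fan vote

**Advantages**:
- Fairness: the technically strongest contestants are protected from elimination in every week they place in the judges' top K
- Entertainment: the fate of mid-table contestants rests with the fans, which adds suspense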
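
The two rules can be traced on a toy week. The sketch below is a standalone illustration with hypothetical contestants, scores, and vote shares; the function `eliminate` is not part of the notebook, and the fallback for the case where everyone is immune is an assumption that mirrors the class in the next cell.

```python
import numpy as np
from scipy import stats

def eliminate(names, judge_scores, fan_votes, k):
    """Return (immune names, eliminated name) under the dual threshold rules."""
    judge_ranks = stats.rankdata(-np.asarray(judge_scores, dtype=float), method='min')
    immune = judge_ranks <= k                      # rule 1: judges' top k are safe
    at_risk = np.where(~immune)[0]
    if len(at_risk) == 0:                          # everyone immune: fall back to all
        at_risk = np.arange(len(names))
    fan_votes = np.asarray(fan_votes, dtype=float)
    loser = at_risk[np.argmin(fan_votes[at_risk])] # rule 2: lowest fan vote goes home
    return [n for n, m in zip(names, immune) if m], names[loser]

# Hypothetical week: A has the weakest fan support but the best judge score
names = ['A', 'B', 'C', 'D', 'E']
judge = [30, 27, 24, 22, 20]
fans = [0.05, 0.30, 0.25, 0.10, 0.30]

immune, out = eliminate(names, judge, fans, k=2)
print(immune, out)
```

Contestant A would be eliminated under a pure fan vote, but immunity protects A and the elimination falls on D, who has the lowest fan share among the at-risk group.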

In [None]:
class DualThresholdSystem:
    """
    Dual threshold elimination.
    The top K contestants by judge score are immune; the fans decide among the rest.
    """
    
    def __init__(self, immunity_ratio=0.3):
        self.immunity_ratio = immunity_ratio  # share of contestants granted immunity
    
    def get_immunity_count(self, n_contestants):
        """Number of immune contestants (at least 1)."""
        return max(1, int(n_contestants * self.immunity_ratio))
    
    def determine_elimination(self, contestants, judge_scores, fan_votes):
        """Determine which contestant is eliminated."""
        n = len(contestants)
        immunity_count = self.get_immunity_count(n)
        
        # Judge ranks
        judge_ranks = stats.rankdata(-judge_scores, method='min')
        
        # Immune contestants (top K by judge score)
        immune_mask = judge_ranks <= immunity_count
        immune_contestants = [c for c, m in zip(contestants, immune_mask) if m]
        
        # Among non-immune contestants, the lowest fan vote is eliminated
        at_risk_indices = np.where(~immune_mask)[0]
        if len(at_risk_indices) == 0:
            # Everyone is immune: eliminate the lowest fan vote overall
            eliminated_idx = np.argmin(fan_votes)
        else:
            at_risk_fan_votes = fan_votes[at_risk_indices]
            eliminated_idx = at_risk_indices[np.argmin(at_risk_fan_votes)]
        
        eliminated = contestants[eliminated_idx]
        
        return {
            'immune': immune_contestants,
            'eliminated': eliminated,
            'eliminated_idx': eliminated_idx,
            'immunity_count': immunity_count
        }
    
    def calculate_final_score(self, contestants, judge_scores, fan_votes):
        """Compute a combined score (used for displaying rankings)."""
        n = len(contestants)
        immunity_count = self.get_immunity_count(n)
        
        judge_ranks = stats.rankdata(-judge_scores, method='min')
        fan_ranks = stats.rankdata(-fan_votes, method='min')
        
        # Bonus that places every immune contestant above all non-immune ones
        immunity_bonus = np.where(judge_ranks <= immunity_count, 100, 0)
        
        # Combined score: immunity bonus plus fan Borda points
        final_score = immunity_bonus + (n - fan_ranks + 1)
        
        return final_score, stats.rankdata(-final_score, method='min')

dual_threshold_system = DualThresholdSystem(immunity_ratio=0.3)
print('Dual threshold system initialized')

---
# Part II: Simulation Comparison Experiments
---

## 7. Preparing the Simulation Data

In [None]:
def prepare_simulation_data(df, season):
    """
    Prepare simulation data for a single season.
    Contestants with a judge score of 0 are excluded.
    """
    season_df = df[df['赛季'] == season].copy()
    weeks = sorted(season_df['第几周'].unique())
    
    # Determine the scoring method used in this season
    if season <= 2:
        scoring_method = 'ranking_early'
    elif season <= 27:
        scoring_method = 'percentage'
    else:
        scoring_method = 'ranking_with_save'
    
    simulation_data = []
    for week in weeks:
        week_df = season_df[season_df['第几周'] == week]
        
        # Exclude contestants whose judge score is 0
        week_df = week_df[week_df['本周评委总分'] > 0]
        
        # Skip weeks with fewer than 3 scored contestants
        if len(week_df) < 3:
            continue
        
        simulation_data.append({
            'week': week,
            'contestants': week_df['选手姓名'].values,
            'judge_scores': week_df['本周评委总分'].values,
            'judge_pct': week_df['评委百分比'].values,
            'fan_votes': 100 - week_df['评委百分比'].values,  # proxy for the fan vote
            'actual_eliminated': week_df[week_df['是否被淘汰']==1]['选手姓名'].tolist(),
            'scoring_method': scoring_method,
            'has_judges_save': season >= 28
        })
    
    return simulation_data

# Select test seasons (covering all three scoring methods)
test_seasons = [1, 5, 15, 25, 28, 30, 32]
all_sim_data = {s: prepare_simulation_data(df, s) for s in test_seasons}
print(f'Prepared simulation data for {len(test_seasons)} seasons')
for s in test_seasons:
    method = 'ranking_early' if s <= 2 else ('percentage' if s <= 27 else 'ranking_with_save')
    print(f'  Season {s}: {len(all_sim_data[s])} weeks, method={method}')

## 8. Running the Simulation Comparison

In [None]:
def run_simulation(sim_data, system, system_name):
    """Run the simulation for a single system."""
    results = []
    
    for week_data in sim_data:
        contestants = week_data['contestants']
        judge_scores = week_data['judge_scores']
        fan_votes = week_data['fan_votes']
        week = week_data['week']
        
        if system_name == 'Borda':
            final_score, final_ranks = system.calculate_final_score(judge_scores, fan_votes)
        elif system_name == 'Elo':
            final_score, final_ranks = system.calculate_final_score(
                contestants, judge_scores, fan_votes, week, len(sim_data)
            )
            # Update Elo ratings from this week's final ranks
            for c, r in zip(contestants, final_ranks):
                system.update_elo(c, r, len(contestants))
        else:  # DualThreshold
            final_score, final_ranks = system.calculate_final_score(contestants, judge_scores, fan_votes)
        
        # Determine the elimination: the contestant ranked last goes home
        eliminated_idx = np.argmax(final_ranks)
        
        results.append({
            'week': week,
            'eliminated': contestants[eliminated_idx],
            'final_ranks': final_ranks.copy(),
            'judge_ranks': stats.rankdata(-judge_scores, method='min')
        })
    
    return results

# Run the simulation for every system
simulation_results = {}

for season in test_seasons:
    sim_data = all_sim_data[season]
    if not sim_data:
        continue
    
    # Reset the Elo ratings at the start of each season
    elo_system.elo_ratings = {}
    
    simulation_results[season] = {
        'Borda': run_simulation(sim_data, borda_system, 'Borda'),
        'Elo': run_simulation(sim_data, elo_system, 'Elo'),
        'DualThreshold': run_simulation(sim_data, dual_threshold_system, 'DualThreshold')
    }

print('Simulation complete!')

## 9. Evaluating System Performance

In [None]:
def evaluate_system(sim_results, original_data):
    """Evaluate system performance."""
    fairness_scores = []
    stability_scores = []
    
    for i, (result, orig) in enumerate(zip(sim_results, original_data)):
        # Fairness: correlation between judge ranks and final ranks
        tau, _ = kendalltau(result['judge_ranks'], result['final_ranks'])
        fairness_scores.append((tau + 1) / 2)
        
        # Stability: week-over-week rank change
        if i > 0:
            # Compare only contestants present in both weeks, matched by name
            prev_ranks = dict(zip(original_data[i-1]['contestants'],
                                  sim_results[i-1]['final_ranks']))
            changes = [abs(prev_ranks[c] - r)
                       for c, r in zip(orig['contestants'], result['final_ranks'])
                       if c in prev_ranks]
            if changes:
                stability_scores.append(1 / (1 + np.mean(changes)))
    
    return {
        'fairness': np.mean(fairness_scores) if fairness_scores else 0,
        'stability': np.mean(stability_scores) if stability_scores else 0,
        'entertainment': 1 - np.mean(fairness_scores) * 0.5 if fairness_scores else 0.5  # simplified proxy
    }

# Evaluate every system
evaluation_results = []

for season in test_seasons:
    if season not in simulation_results:
        continue
    
    orig_data = all_sim_data[season]
    
    for system_name in ['Borda', 'Elo', 'DualThreshold']:
        metrics = evaluate_system(simulation_results[season][system_name], orig_data)
        evaluation_results.append({
            'Season': season,
            'System': system_name,
            **metrics
        })

eval_df = pd.DataFrame(evaluation_results)
print('Table 1: System Evaluation Results')
print(eval_df.round(4))
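
Because the field shrinks every week, the stability metric is only meaningful when ranks are compared for the same contestant. The sketch below illustrates that name-based matching on two hypothetical weeks; the helper `mean_rank_change` and the data are illustrative and not part of the notebook.

```python
import numpy as np

def mean_rank_change(prev_names, prev_ranks, curr_names, curr_ranks):
    """Mean absolute rank change over contestants present in both weeks."""
    prev = dict(zip(prev_names, prev_ranks))
    changes = [abs(prev[name] - rank)
               for name, rank in zip(curr_names, curr_ranks)
               if name in prev]
    return float(np.mean(changes)) if changes else None

# Hypothetical weeks: B is eliminated after week 1
week1 = (['A', 'B', 'C', 'D'], [1, 2, 3, 4])
week2 = (['A', 'C', 'D'], [2, 1, 3])

change = mean_rank_change(*week1, *week2)   # A: 1, C: 2, D: 1
stability = 1 / (1 + change)
print(change, stability)
```

Comparing by array position instead would pair C's week 2 rank with B's week 1 rank, which is why the lookup goes through contestant names.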

In [None]:
# Summary statistics
summary_df = eval_df.groupby('System').agg({
    'fairness': ['mean', 'std'],
    'stability': ['mean', 'std'],
    'entertainment': ['mean', 'std']
}).round(4)

print('\nTable 2: System Performance Summary')
print(summary_df)

## 10. Visualization: System Performance Comparison

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(9, 7))

systems = ['Borda', 'Elo', 'DualThreshold']
system_colors = {s: SCIENTIFIC_COLORS[i] for i, s in enumerate(systems)}

# (A) Fairness comparison
ax1 = axes[0, 0]
fairness_by_system = eval_df.groupby('System')['fairness'].mean()
ax1.bar(fairness_by_system.index, fairness_by_system.values, 
        color=[system_colors[s] for s in fairness_by_system.index], alpha=0.8)
ax1.set_ylabel('Fairness Score', fontsize=11)
ax1.set_title('(A) Fairness Comparison', fontsize=12, fontweight='bold')
ax1.set_ylim(0, 1)

# (B) Stability comparison
ax2 = axes[0, 1]
stability_by_system = eval_df.groupby('System')['stability'].mean()
ax2.bar(stability_by_system.index, stability_by_system.values,
        color=[system_colors[s] for s in stability_by_system.index], alpha=0.8)
ax2.set_ylabel('Stability Score', fontsize=11)
ax2.set_title('(B) Stability Comparison', fontsize=12, fontweight='bold')
ax2.set_ylim(0, 1)

# (C) Radar chart: replace the Cartesian axes in this slot with a polar one
axes[1, 0].remove()
ax3 = fig.add_subplot(2, 2, 3, projection='polar')

categories = ['Fairness', 'Stability', 'Entertainment']
angles = [n / float(len(categories)) * 2 * np.pi for n in range(len(categories))]
angles += angles[:1]

for system in systems:
    system_data = eval_df[eval_df['System'] == system]
    values = [system_data['fairness'].mean(), system_data['stability'].mean(), system_data['entertainment'].mean()]
    values += values[:1]
    ax3.plot(angles, values, 'o-', linewidth=2, label=system, color=system_colors[system])
    ax3.fill(angles, values, alpha=0.1, color=system_colors[system])

ax3.set_xticks(angles[:-1])
ax3.set_xticklabels(categories)
ax3.set_ylim(0, 1)
ax3.legend(loc='upper right', bbox_to_anchor=(1.3, 1.0))
ax3.set_title('(C) Performance Radar Chart', fontsize=12, fontweight='bold', pad=20)

# (D) Fairness by season
ax4 = axes[1, 1]
for system in systems:
    system_data = eval_df[eval_df['System'] == system]
    ax4.plot(system_data['Season'], system_data['fairness'], 'o-', 
             label=system, color=system_colors[system], linewidth=2, markersize=8)
ax4.set_xlabel('Season', fontsize=11)
ax4.set_ylabel('Fairness Score', fontsize=11)
ax4.set_title('(D) Fairness by Season', fontsize=12, fontweight='bold')
ax4.legend()

plt.tight_layout()
plt.show()

## 11. Visualizing the Elo Dynamic Weights

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# (A) Weights by week
ax1 = axes[0]
weeks = np.arange(1, 11)
judge_weights = [elo_system.get_weights(w, 10)[0] for w in weeks]
fan_weights = [elo_system.get_weights(w, 10)[1] for w in weeks]

ax1.plot(weeks, judge_weights, 'o-', color=SCIENTIFIC_COLORS[0], linewidth=2, markersize=8, label='Judge Weight')
ax1.plot(weeks, fan_weights, 's-', color=SCIENTIFIC_COLORS[1], linewidth=2, markersize=8, label='Fan Weight')
ax1.fill_between(weeks, judge_weights, alpha=0.2, color=SCIENTIFIC_COLORS[0])
ax1.fill_between(weeks, fan_weights, alpha=0.2, color=SCIENTIFIC_COLORS[1])
ax1.set_xlabel('Week', fontsize=11)
ax1.set_ylabel('Weight', fontsize=11)
ax1.set_title('(A) Elo System: Dynamic Weight Evolution', fontsize=12, fontweight='bold')
ax1.legend()
ax1.set_ylim(0, 1)

# (B) Schematic of the dual threshold system
ax2 = axes[1]
n_contestants = 10
immunity_count = dual_threshold_system.get_immunity_count(n_contestants)

# Build illustrative data
positions = np.arange(1, n_contestants + 1)
colors = ['green' if p <= immunity_count else 'orange' for p in positions]

ax2.barh(positions, np.ones(n_contestants) * 10, color=colors, alpha=0.7)
ax2.axhline(y=immunity_count + 0.5, color='red', linestyle='--', linewidth=2, label='Immunity Threshold')
ax2.set_xlabel('Zone', fontsize=11)
ax2.set_ylabel('Judge Rank', fontsize=11)
ax2.set_title('(B) Dual Threshold: Immunity Zone', fontsize=12, fontweight='bold')
ax2.set_yticks(positions)
ax2.legend()

# Add zone labels
ax2.text(5, immunity_count/2, 'IMMUNE\n(Judge Top 30%)', ha='center', va='center', fontsize=10, fontweight='bold')
ax2.text(5, (immunity_count + n_contestants)/2, 'AT RISK\n(Fan Vote Decides)', ha='center', va='center', fontsize=10)

plt.tight_layout()
plt.show()

---
# Part III: Statistical Analysis
---

## 12. Statistical Tests: Significance of Differences Between Systems

In [None]:
from scipy.stats import f_oneway, kruskal, ttest_ind

# Per-system fairness scores for the ANOVA
borda_fairness = eval_df[eval_df['System']=='Borda']['fairness'].values
elo_fairness = eval_df[eval_df['System']=='Elo']['fairness'].values
dual_fairness = eval_df[eval_df['System']=='DualThreshold']['fairness'].values

print('=' * 70)
print('Table 3: Statistical Tests for System Comparison')
print('=' * 70)

# ANOVA
f_stat, p_value = f_oneway(borda_fairness, elo_fairness, dual_fairness)
print(f'One-Way ANOVA (Fairness): F={f_stat:.4f}, p={p_value:.4f}')

# Kruskal-Wallis
h_stat, p_kw = kruskal(borda_fairness, elo_fairness, dual_fairness)
print(f'Kruskal-Wallis H-test: H={h_stat:.4f}, p={p_kw:.4f}')

# Effect size (eta-squared)
all_fairness = np.concatenate([borda_fairness, elo_fairness, dual_fairness])
grand_mean = np.mean(all_fairness)
ss_between = (len(borda_fairness)*(np.mean(borda_fairness)-grand_mean)**2 + 
              len(elo_fairness)*(np.mean(elo_fairness)-grand_mean)**2 +
              len(dual_fairness)*(np.mean(dual_fairness)-grand_mean)**2)
ss_total = np.sum((all_fairness - grand_mean)**2)
eta_squared = ss_between / ss_total if ss_total > 0 else 0
print(f'Effect Size (eta-squared): {eta_squared:.4f}')
print('=' * 70)
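
For readers who want to check the effect-size arithmetic, the sketch below recomputes eta-squared on three small groups and cross-checks it against the F statistic from `scipy.stats.f_oneway`. The group values are hypothetical, and the function `eta_squared` is an illustrative generalization of the inline calculation above, not part of the notebook.

```python
import numpy as np
from scipy.stats import f_oneway

def eta_squared(*groups):
    """Share of total variance explained by group membership: SS_between / SS_total."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = np.sum((all_values - grand_mean) ** 2)
    return ss_between / ss_total if ss_total > 0 else 0.0

# Hypothetical fairness scores for three systems (illustrative values only)
g1 = [0.6, 0.7, 0.8]
g2 = [0.5, 0.6, 0.7]
g3 = [0.8, 0.9, 1.0]

eta2 = eta_squared(g1, g2, g3)      # SS_between = 0.14, SS_total = 0.20
f_stat, p_value = f_oneway(g1, g2, g3)
print(f'eta^2 = {eta2:.3f}, F = {f_stat:.3f}, p = {p_value:.4f}')
```

Here eta-squared is 0.7, and the same sums of squares give $F = (0.14/2)/(0.06/6) = 7$, so the two statistics are consistent views of one decomposition.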

## 13. Post-hoc Pairwise Comparisons

In [None]:
from itertools import combinations

systems_data = {
    'Borda': borda_fairness,
    'Elo': elo_fairness,
    'DualThreshold': dual_fairness
}

print('=' * 70)
print('Table 4: Pairwise Comparisons (Bonferroni Corrected)')
print('=' * 70)
print(f'{"Comparison":<25} {"t-stat":<12} {"p-value":<12} {"Cohen d":<12} {"Significant":<12}')
print('-' * 70)

alpha_corrected = 0.05 / 3  # Bonferroni correction

for (s1, d1), (s2, d2) in combinations(systems_data.items(), 2):
    t_stat, p_val = ttest_ind(d1, d2)
    
    # Cohen's d using the pooled sample standard deviation (ddof=1)
    pooled_std = np.sqrt(((len(d1)-1)*np.var(d1, ddof=1) + (len(d2)-1)*np.var(d2, ddof=1)) / (len(d1)+len(d2)-2))
    cohens_d = (np.mean(d1) - np.mean(d2)) / pooled_std if pooled_std > 0 else 0
    
    sig = 'Yes' if p_val < alpha_corrected else 'No'
    label = f'{s1} vs {s2}'
    print(f'{label:<25} {t_stat:<12.4f} {p_val:<12.4f} {cohens_d:<12.4f} {sig:<12}')

print('=' * 70)
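
The pairwise effect size can be verified on a tiny example. The sketch below computes Cohen's d with the pooled sample variance (`ddof=1`) and the Bonferroni-corrected threshold for three comparisons; the two samples are hypothetical and the function name `cohens_d` is illustrative.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d with the pooled sample standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pooled_var = (((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                  / (len(a) + len(b) - 2))
    pooled_std = np.sqrt(pooled_var)
    return (a.mean() - b.mean()) / pooled_std if pooled_std > 0 else 0.0

# Hypothetical fairness scores for two systems (illustrative values only)
d1 = [0.6, 0.7, 0.8]
d2 = [0.5, 0.6, 0.7]

d = cohens_d(d1, d2)            # mean gap 0.1 over a pooled std of 0.1
alpha_corrected = 0.05 / 3      # Bonferroni: three pairwise comparisons
print(d, alpha_corrected)
```

With only a handful of seasons per system, a large d can still fail the corrected threshold, so effect size and significance are best read together.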

## 14. Sensitivity Analysis: Parameter Effects

In [None]:
# Test how different parameter values affect system performance
sensitivity_results = []

# Borda system: vary judge_weight
for jw in [0.3, 0.4, 0.5, 0.6, 0.7]:
    test_system = WeightedBordaSystem(judge_weight=jw, fan_weight=1-jw)
    test_results = run_simulation(all_sim_data[test_seasons[0]], test_system, 'Borda')
    metrics = evaluate_system(test_results, all_sim_data[test_seasons[0]])
    sensitivity_results.append({
        'System': 'Borda',
        'Parameter': 'judge_weight',
        'Value': jw,
        'Fairness': metrics['fairness'],
        'Stability': metrics['stability']
    })

# DualThreshold: vary immunity_ratio
for ir in [0.2, 0.3, 0.4, 0.5]:
    test_system = DualThresholdSystem(immunity_ratio=ir)
    test_results = run_simulation(all_sim_data[test_seasons[0]], test_system, 'DualThreshold')
    metrics = evaluate_system(test_results, all_sim_data[test_seasons[0]])
    sensitivity_results.append({
        'System': 'DualThreshold',
        'Parameter': 'immunity_ratio',
        'Value': ir,
        'Fairness': metrics['fairness'],
        'Stability': metrics['stability']
    })

sens_df = pd.DataFrame(sensitivity_results)
print('Table 5: Sensitivity Analysis Results')
print(sens_df.round(4))

In [None]:
# Sensitivity analysis plots
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Borda sensitivity
ax1 = axes[0]
borda_sens = sens_df[sens_df['System'] == 'Borda']
ax1.plot(borda_sens['Value'], borda_sens['Fairness'], 'o-', color=SCIENTIFIC_COLORS[0], 
         linewidth=2, markersize=8, label='Fairness')
ax1.plot(borda_sens['Value'], borda_sens['Stability'], 's-', color=SCIENTIFIC_COLORS[1],
         linewidth=2, markersize=8, label='Stability')
ax1.set_xlabel('Judge Weight', fontsize=11)
ax1.set_ylabel('Score', fontsize=11)
ax1.set_title('(A) Borda System Sensitivity', fontsize=12, fontweight='bold')
ax1.legend()

# DualThreshold sensitivity
ax2 = axes[1]
dual_sens = sens_df[sens_df['System'] == 'DualThreshold']
ax2.plot(dual_sens['Value'], dual_sens['Fairness'], 'o-', color=SCIENTIFIC_COLORS[2],
         linewidth=2, markersize=8, label='Fairness')
ax2.plot(dual_sens['Value'], dual_sens['Stability'], 's-', color=SCIENTIFIC_COLORS[3],
         linewidth=2, markersize=8, label='Stability')
ax2.set_xlabel('Immunity Ratio', fontsize=11)
ax2.set_ylabel('Score', fontsize=11)
ax2.set_title('(B) Dual Threshold Sensitivity', fontsize=12, fontweight='bold')
ax2.legend()

plt.tight_layout()
plt.show()

## 15. Overall Scores and Final Recommendation

In [None]:
# Compute the overall score
final_scores = []

for system in ['Borda', 'Elo', 'DualThreshold']:
    system_data = eval_df[eval_df['System'] == system]
    
    fairness = system_data['fairness'].mean()
    stability = system_data['stability'].mean()
    entertainment = system_data['entertainment'].mean()
    
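    # Overall score (weighted average). The last term is an implementation-complexity
    # score fixed at 0.8 for every system, so it does not change the ranking.
    overall = 0.35 * fairness + 0.25 * stability + 0.25 * entertainment + 0.15 * 0.8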
    
    final_scores.append({
        'System': system,
        'Fairness': fairness,
        'Stability': stability,
        'Entertainment': entertainment,
        'Overall': overall
    })

final_df = pd.DataFrame(final_scores).sort_values('Overall', ascending=False)

print('=' * 80)
print('Table 6: Final System Ranking')
print('=' * 80)
print(final_df.round(4).to_string(index=False))
print('=' * 80)

In [None]:
# Final recommendation plot
fig, ax = plt.subplots(figsize=(10, 6))

systems = final_df['System'].values
overall_scores = final_df['Overall'].values
colors = [SCIENTIFIC_COLORS[i] for i in range(len(systems))]

bars = ax.barh(systems, overall_scores, color=colors, alpha=0.8)

# Add value labels
for bar, score in zip(bars, overall_scores):
    ax.text(score + 0.01, bar.get_y() + bar.get_height()/2, f'{score:.3f}',
            va='center', fontsize=11, fontweight='bold')

ax.set_xlabel('Overall Score', fontsize=12)
ax.set_title('Final System Recommendation Ranking', fontsize=14, fontweight='bold')
ax.set_xlim(0, 1)

# Mark the recommended system
ax.annotate('RECOMMENDED', xy=(overall_scores[0], 0), xytext=(overall_scores[0]-0.15, 0.3),
            fontsize=10, fontweight='bold', color='green',
            arrowprops=dict(arrowstyle='->', color='green'))

plt.tight_layout()
plt.show()

## 16. Conclusions and Policy Recommendations

In [None]:
best_system = final_df.iloc[0]['System']
best_score = final_df.iloc[0]['Overall']

print('=' * 80)
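print('Problem 4: Conclusions and Policy Recommendations')
print('=' * 80)

print(f'''
1. Comparison of the three proposed systems:

   Proposal A - Weighted Borda Count:
   - Pros: accounts for both rank and score gaps, so no single contestant dominates
   - Cons: relatively complex to compute; harder for viewers to follow
   - Fairness: {eval_df[eval_df["System"]=="Borda"]["fairness"].mean():.3f}

   Proposal B - Elo Dynamic Weighting:
   - Pros: adds late-season suspense and rewards consistently strong performance
   - Cons: requires maintaining rating history; newcomers may be at a disadvantage
   - Fairness: {eval_df[eval_df["System"]=="Elo"]["fairness"].mean():.3f}

   Proposal C - Dual Threshold Elimination:
   - Pros: simple, transparent rules; the judges' top contestants are always protected
   - Cons: may reduce the tension for top contestants
   - Fairness: {eval_df[eval_df["System"]=="DualThreshold"]["fairness"].mean():.3f}

2. Final recommendation: {best_system}
   - Overall score: {best_score:.4f}
   - Rationale: highest overall score across fairness, stability, and entertainment value

3. Implementation recommendations:
   - Trial the new system in a pilot season first
   - Collect viewer feedback and tune the parameters based on observed results
   - Consider a hybrid system that combines the strengths of several proposals

4. Directions for future research:
   - Add further evaluation dimensions (e.g., TV ratings, social media buzz)
   - Use reinforcement learning to optimize the parameter settings
   - Account for the effect of contestants' psychological state on performance
''')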
print('=' * 80)

## 17. Export Results

In [None]:
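# Save the analysis tables (the output file names below are placeholders)
import os
os.makedirs('figures', exist_ok=True)
eval_df.to_csv('figures/system_evaluation.csv', index=False)
final_df.to_csv('figures/final_system_ranking.csv', index=False)
print('Results saved to the figures/ directory')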