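# Position Bias Analysis (Condensed)

Analysis of GPT-4.1's position bias when judging **deepseek-r1 vs o3-mini-2025-01-31**.

## Revised classification logic

Each record holds two games with the A/B positions swapped, so an unbiased judge should flip its verdict between the two games.

1. **No position bias**: the two verdicts point in opposite directions with identical strength
   - `B>>A ↔ A>>B`, `B>A ↔ A>B`, `A=B ↔ A=B`

2. **Weak position bias**: the verdicts point in opposite directions but differ in strength
   - `B>>A → A>B`, `B>A → A>>B`, `A>>B → B>A`, `A>B → B>>A`

3. **Significant position bias**: the verdicts do not point in opposite directions
   - All remaining cases (e.g. `B>A → B>A`, `A>B → A=B`)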
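The three rules above can also be expressed through a direction and a strength per verdict, which avoids enumerating every label pair. The following is a minimal sketch, not the notebook's own implementation; `DIRECTION`, `STRENGTH`, and `classify_pair` are illustrative names, and treating unparsed verdicts (such as `None`) as significant bias is an assumption that mirrors the catch-all rule.

```python
# Sketch only: DIRECTION, STRENGTH and classify_pair are illustrative names,
# not part of the original notebook.
DIRECTION = {"A>>B": 1, "A>B": 1, "A=B": 0, "B>A": -1, "B>>A": -1}
STRENGTH = {"A>>B": 2, "A>B": 1, "A=B": 0, "B>A": 1, "B>>A": 2}


def classify_pair(score1, score2):
    """Return 'no_bias', 'weak_bias' or 'significant_bias' for one swapped pair."""
    # Assumption: anything outside the five known labels (e.g. None)
    # falls into the catch-all category, as in rule 3.
    if score1 not in DIRECTION or score2 not in DIRECTION:
        return "significant_bias"
    # Rule 3: the verdicts must point in opposite directions (a tie mirrors a tie).
    if DIRECTION[score1] != -DIRECTION[score2]:
        return "significant_bias"
    # Rule 1 vs rule 2: same strength means no bias, otherwise weak bias.
    if STRENGTH[score1] == STRENGTH[score2]:
        return "no_bias"
    return "weak_bias"


print(classify_pair("B>>A", "A>>B"))  # no_bias
print(classify_pair("B>>A", "A>B"))   # weak_bias
print(classify_pair("A>B", "A=B"))    # significant_bias
```

For the five known labels this yields the same five "no bias" pairs and four "weak bias" pairs listed above.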

In [6]:
import json

# Load the data
# file_path = './data/arena-hard-v2.0/model_judgment/gpt-4.1/deepseek-r1.jsonl'
# file_path = './data/arena-hard-v2.0/model_judgment/gpt-4.1/qwq-32b.jsonl'
# file_path = './data/arena-hard-v2.1/model_judgment/gemini-2.5/deepseek-r1.jsonl'
file_path = './data/arena-hard-v2.1/model_judgment/gemini-2.5/qwq-32b.jsonl'
with open(file_path, 'r', encoding='utf-8') as f:
    # Skip blank lines so a trailing newline does not break json.loads
    data = [json.loads(line) for line in f if line.strip()]

print(f"Number of records: {len(data)} (each contains 2 games with A/B positions swapped)")

Number of records: 748 (each contains 2 games with A/B positions swapped)


In [None]:
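# Position bias analysis function
def analyze_position_bias(data):
    """
    Classify each record's pair of verdicts into one of three categories:
    1. No position bias: opposite directions with identical strength
    2. Weak position bias: opposite directions but different strength
    3. Significant position bias: directions are not opposite
    """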
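    # Count only the records that are actually classified, so the
    # percentages stay correct if any record lacks exactly two games.
    total = 0
    no_bias = weak_bias = significant_bias = 0
    category_stats = {}

    for item in data:
        if len(item['games']) != 2:
            continue
        total += 1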
            
        score1, score2 = item['games'][0]['score'], item['games'][1]['score']
        category = item['category']
        
        # Initialize the per-category counters
        if category not in category_stats:
            category_stats[category] = {'total': 0, 'no_bias': 0, 'weak_bias': 0, 'significant_bias': 0}
        category_stats[category]['total'] += 1
        
        # Classification
        # 1. No position bias: opposite directions with identical strength
        if ((score1 == "B>>A" and score2 == "A>>B") or
            (score1 == "A>>B" and score2 == "B>>A") or
            (score1 == "B>A" and score2 == "A>B") or
            (score1 == "A>B" and score2 == "B>A") or
            (score1 == "A=B" and score2 == "A=B")):
            no_bias += 1
            category_stats[category]['no_bias'] += 1
            
        # 2. Weak position bias: opposite directions but different strength
        elif ((score1 == "B>>A" and score2 == "A>B") or
              (score1 == "A>>B" and score2 == "B>A") or
              (score1 == "B>A" and score2 == "A>>B") or
              (score1 == "A>B" and score2 == "B>>A")):
            weak_bias += 1
            category_stats[category]['weak_bias'] += 1
            
        # 3. Significant position bias: directions are not opposite
        else:
            significant_bias += 1
            category_stats[category]['significant_bias'] += 1
    
    return {
        'total': total,
        'no_bias': no_bias,
        'weak_bias': weak_bias,
        'significant_bias': significant_bias,
        'category_stats': category_stats
    }

# Run the analysis and show the results
results = analyze_position_bias(data)

# Headline numbers
total = results['total']
no_bias = results['no_bias']
weak_bias = results['weak_bias']
significant_bias = results['significant_bias']

print(f"📊 Position bias results ({total} records in total):")
print(f"✅ No position bias: {no_bias} ({no_bias/total*100:.1f}%)")
print(f"⚠️ Weak position bias: {weak_bias} ({weak_bias/total*100:.1f}%)")
print(f"❌ Significant position bias: {significant_bias} ({significant_bias/total*100:.1f}%)")
print(f"📈 Weak + significant position bias: {significant_bias + weak_bias} ({(significant_bias + weak_bias)/total*100:.1f}%)")
print(f"🎯 Acceptable judgment rate: {no_bias + weak_bias} ({(no_bias + weak_bias)/total*100:.1f}%)")

📊 Position bias results (748 records in total):
✅ No position bias: 529 (70.7%)
⚠️ Weak position bias: 78 (10.4%)
❌ Significant position bias: 141 (18.9%)
📈 Weak + significant position bias: 219 (29.3%)
🎯 Acceptable judgment rate: 607 (81.1%)
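
`analyze_position_bias` also returns `category_stats`, which the cell above does not display. A minimal sketch of how the per-category rates could be tabulated follows. `category_rows` is a hypothetical helper, and the category names and counts in `example_stats` are made-up placeholders rather than results from this dataset; in the notebook one would pass `results['category_stats']` instead.

```python
def category_rows(category_stats):
    """Turn per-category counters into (name, n, no%, weak%, significant%) rows,
    sorted by significant-bias rate, highest first."""
    rows = []
    for name, stats in category_stats.items():
        n = stats["total"]
        if n == 0:
            continue  # avoid dividing by zero for empty categories
        rows.append((
            name,
            n,
            100 * stats["no_bias"] / n,
            100 * stats["weak_bias"] / n,
            100 * stats["significant_bias"] / n,
        ))
    rows.sort(key=lambda row: row[4], reverse=True)
    return rows


# Placeholder input with the same shape as results['category_stats'];
# names and numbers are invented for illustration only.
example_stats = {
    "category_x": {"total": 10, "no_bias": 7, "weak_bias": 1, "significant_bias": 2},
    "category_y": {"total": 4, "no_bias": 1, "weak_bias": 1, "significant_bias": 2},
}

for name, n, no_pct, weak_pct, sig_pct in category_rows(example_stats):
    print(f"{name:<12} n={n:<4} no={no_pct:5.1f}%  weak={weak_pct:5.1f}%  significant={sig_pct:5.1f}%")
```

Sorting by the significant-bias rate puts the categories where the judge is least consistent at the top.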


In [8]:
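# Break down the concrete patterns behind "significant position bias"
from collections import defaultdict

# Count only the patterns classified as significant position bias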
significant_patterns = defaultdict(int)

for item in data:
    if len(item['games']) != 2:
        continue
        
    score1, score2 = item['games'][0]['score'], item['games'][1]['score']
    
    # A pair is significant position bias if it is neither no-bias nor weak-bias
    is_no_bias = ((score1 == "B>>A" and score2 == "A>>B") or
                  (score1 == "A>>B" and score2 == "B>>A") or
                  (score1 == "B>A" and score2 == "A>B") or
                  (score1 == "A>B" and score2 == "B>A") or
                  (score1 == "A=B" and score2 == "A=B"))
    
    is_weak_bias = ((score1 == "B>>A" and score2 == "A>B") or
                    (score1 == "A>>B" and score2 == "B>A") or
                    (score1 == "B>A" and score2 == "A>>B") or
                    (score1 == "A>B" and score2 == "B>>A"))
    
    if not is_no_bias and not is_weak_bias:
        pattern = f"{score1} → {score2}"
        significant_patterns[pattern] += 1

# Show every significant-position-bias pattern
print("❌ All significant position bias patterns (directions not opposite):")
print("=" * 50)

for pattern, count in sorted(significant_patterns.items(), key=lambda x: x[1], reverse=True):
    print(f"   {pattern}: {count}")

print(f"\n📊 Total significant position bias: {sum(significant_patterns.values())}")
# Divide by the actual number of records instead of a hard-coded 750
print(f"📊 Share of all records: {sum(significant_patterns.values())/len(data)*100:.1f}%")

❌ All significant position bias patterns (directions not opposite):
==================================================
   A>>B → A>>B: 73
   B>>A → B>>A: 14
   A>>B → A>B: 9
   B>A → B>A: 9
   A>B → A>>B: 8
   B>>A → B>A: 7
   A>B → A>B: 4
   B>A → B>>A: 4
   B>A → A=B: 3
   A=B → B>A: 2
   A>>B → None: 1
   None → B>A: 1
   None → A>>B: 1
   B>>A → A=B: 1
   A=B → A>B: 1
   None → A: 1
   None → B>>A: 1
   None → None: 1

📊 Total significant position bias: 141
📊 Share of all records: 18.9%
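
Because the two games in a record swap the A/B positions, a pair whose verdicts both favor the same label means the judge preferred the same slot twice. The sketch below tallies the pattern counts from the output above by slot. `FAVORS` and `slot_preference` are illustrative names, and grouping ties and unparsed verdicts (`None`, `A`) under `other` is an assumption.

```python
# Pattern counts copied from the cell output above.
significant_patterns = {
    "A>>B → A>>B": 73, "B>>A → B>>A": 14, "A>>B → A>B": 9, "B>A → B>A": 9,
    "A>B → A>>B": 8, "B>>A → B>A": 7, "A>B → A>B": 4, "B>A → B>>A": 4,
    "B>A → A=B": 3, "A=B → B>A": 2, "A>>B → None": 1, "None → B>A": 1,
    "None → A>>B": 1, "B>>A → A=B": 1, "A=B → A>B": 1, "None → A": 1,
    "None → B>>A": 1, "None → None": 1,
}

# Which slot a verdict favors; ties and unparsed labels are deliberately absent.
FAVORS = {"A>>B": "A", "A>B": "A", "B>A": "B", "B>>A": "B"}


def slot_preference(patterns):
    """Sum pattern counts by the slot that both games favored."""
    tally = {"slot_A": 0, "slot_B": 0, "other": 0}
    for pattern, count in patterns.items():
        first, second = pattern.split(" → ")
        side1, side2 = FAVORS.get(first), FAVORS.get(second)
        if side1 is not None and side1 == side2:
            tally[f"slot_{side1}"] += count
        else:
            tally["other"] += count  # ties, unparsed verdicts, mixed cases
    return tally


print(slot_preference(significant_patterns))
# {'slot_A': 94, 'slot_B': 34, 'other': 13}
```

Under these assumptions, 94 of the 141 significant cases favor slot A in both games and 34 favor slot B, which suggests the bias leans toward slot A rather than being symmetric.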
