# 1. Setup
!git clone https://github.com/technoob05/Trembling-Triads.git
%cd Trembling-Triads

# 🧪 Experiment A: The Robustness Test (Pillar 1)

## 🎯 Research Question
**"Can AI coalitions survive the Trembling Hand?"**

### Pillar 1: Coalition Stability Under Noise (3-Player Iterated Prisoner's Dilemma)

**Core Concept:**  
Test **Trembling Hand Perfection** - the ability to distinguish accidental errors from malicious intent.

**Hypothesis:**
- **Small Models (7B)**: Fragile coalitions. One accidental defection → endless retaliation (DDD Nash)
- **Medium Models (32B)**: Transitional. Emerging forgiveness mechanisms
- **Large Models (70B/120B)**: Robust. Can maintain cooperation despite noise

**Key Metrics:**
- **Coalition Entropy (H)**: How quickly does CCC → DDD when noise increases?
- **Trembling Robustness Score (R)**: Slope of cooperation curve as ε increases
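
The notebook does not define the two metrics formally, so the sketch below is only one plausible reading. It assumes H is the Shannon entropy of the joint action profiles observed across rounds, and R is the least-squares slope of cooperation rate against ε. The function names `coalition_entropy` and `robustness_slope`, and the sample numbers, are hypothetical and not taken from the project code.

```python
import math
from collections import Counter


def coalition_entropy(profiles):
    """Shannon entropy (in bits) of joint action profiles such as 'CCC' or 'DDD'.

    Assumption: H is read as outcome entropy; the project may define it differently.
    """
    counts = Counter(profiles)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def robustness_slope(noise_levels, coop_rates):
    """Ordinary least-squares slope of cooperation rate versus noise level."""
    n = len(noise_levels)
    if n < 2 or n != len(coop_rates):
        raise ValueError("need at least two (noise, rate) pairs of equal length")
    mean_x = sum(noise_levels) / n
    mean_y = sum(coop_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in noise_levels)
    if sxx == 0:
        raise ValueError("noise levels must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(noise_levels, coop_rates))
    return sxy / sxx


# Hypothetical cooperation rates, for illustration only.
slope = robustness_slope([0.0, 0.05, 0.10], [0.95, 0.85, 0.65])
entropy = coalition_entropy(["CCC", "CCC", "DDD", "DDD"])
```

Under this reading, a slope close to zero would indicate a robust coalition, while a strongly negative slope would indicate cooperation collapsing as noise grows.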

---

## 📋 Experimental Protocol

### Game Setup
- **Game Type**: 3-Player Prisoner's Dilemma (Triadic PD)
- **Payoff Structure**: CCC: 7-7-7 | CCD: 9-0-0 | DDD: 1-1-1
- **Rounds**: 100 (long enough to observe coalition dynamics)
- **Languages**: English (en), Vietnamese (vn)
- **Features**: `--reasoning` + `--meta-prompt` at rounds 1,25,50,75,100

### Noise Levels (Trembling Hand)
- **ε = 0.0**: Pure strategic play (baseline)
- **ε = 0.05**: Light accidents (5% execution error)
- **ε = 0.10**: High uncertainty (10% error)
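
To make the noise model concrete, here is a minimal sketch of a trembling-hand execution error: with probability ε the action an agent intended is replaced by the opposite one. This is an assumption about how the noise works, not the code in `triad_experiment.py`. The helper name `apply_tremble` and the `"Defect"` label are hypothetical; only `"Cooperate"` and the `is_noise` flag appear in the analysis code of this notebook.

```python
import random


def apply_tremble(intended, epsilon, rng):
    """Return (executed_action, is_noise) after a possible execution error.

    Assumption: an error flips the intended action to the other one.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        flipped = "Defect" if intended == "Cooperate" else "Cooperate"
        return flipped, True
    return intended, False


# Simulate many intended cooperations at epsilon = 0.05 with a fixed seed.
rng = random.Random(0)
trials = [apply_tremble("Cooperate", 0.05, rng) for _ in range(10_000)]
error_rate = sum(is_noise for _, is_noise in trials) / len(trials)
print(f"Observed error rate: {error_rate:.3f}")
```

Returning the `is_noise` flag alongside the action keeps accidental defections distinguishable from deliberate ones when the history is analysed later.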

In [None]:
# 2. Install dependencies
!pip install --upgrade -qqq uv
!uv pip install --system -qqq "unsloth[base] @ git+https://github.com/unslothai/unsloth" "unsloth_zoo" "transformers==4.56.2" bitsandbytes accelerate pandas openai anthropic mistralai python-dotenv

print("✓ Dependencies installed")
print("✓ Ready for Pillar 1: Robustness Test")

### 🟢 Phase 1: Baseline (0% Noise)
Establish the "Ideal Interaction" pattern across scales.

In [None]:
# ⚡ PRIORITY: Medium Model (32B) - Baseline (0% noise)
!python triad_experiment.py --game PD --models "Qwen2.5-32B" --rounds 100 --languages en,vn --noise 0.0 --reasoning --meta-prompt --meta-rounds "1,25,50,75,100" --save-incremental

print("\n📊 Phase 1 Complete: Baseline data collected")
print("Expected: High cooperation rate (~95-100%)")
print("JSON file saved incrementally (every round) - safe from crashes!")

In [None]:
# --- Super Large Models (Slow, use only if A100 is available) ---
# !python triad_experiment.py --game PD --models "/kaggle/input/gpt-oss-120b/transformers/default/1" --noise 0.0 --rounds 100 --languages en,vn

# --- Small Models ---
# !python triad_experiment.py --game PD --models Qwen2.5-14B --noise 0.0 --rounds 100 --languages en,vn
# !python triad_experiment.py --game PD --models Qwen2.5-7B --noise 0.0 --rounds 100 --languages en,vn

### ⚠️ Phase 2: The Trembling Hand (5% Noise)
Introduce light accidents. Does the coalition survive?

In [None]:
# ⚡ PRIORITY: Medium Model with 5% Noise
!python triad_experiment.py --game PD --models "Qwen2.5-32B" --rounds 100 --languages en,vn --noise 0.05 --reasoning --meta-prompt --meta-rounds "1,25,50,75,100" --save-incremental

print("\n📊 Phase 2 Complete: Light noise data")
print("Check: Does cooperation degrade? Do agents forgive accidents?")

### 🔴 Phase 3: Chaos (10% Noise)
High uncertainty. Only the most sophisticated agents should maintain cooperation here.

In [None]:
# ⚡ PRIORITY: Medium Model with 10% Noise
!python triad_experiment.py --game PD --models "Qwen2.5-32B" --rounds 100 --languages en,vn --noise 0.1 --reasoning --meta-prompt --meta-rounds "1,25,50,75,100" --save-incremental

print("\n📊 Phase 3 Complete: High chaos data")
print("Check: Coalition stability under stress?")

---

## 📊 Quick Analysis: Coalition Stability

Preview the results before the full analysis in Exp_C.
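
Beyond the overall cooperation rate, it can help to see how often the whole triad cooperates (CCC) or collapses (DDD). The sketch below assumes the same record shape the analysis cell reads: a `history` dict mapping each round to a list of agent records with a `'strategy'` field. It also assumes that any strategy other than `'Cooperate'` counts as a defection. The helper name `profile_shares` and the toy history are hypothetical.

```python
from collections import Counter


def profile_shares(history):
    """Fraction of rounds ending in each joint profile, e.g. 'CCC', 'CCD', 'DDD'.

    Letters are sorted so that player order does not matter ('DCC' -> 'CCD').
    """
    counts = Counter()
    for round_data in history.values():
        letters = sorted(
            "C" if agent["strategy"] == "Cooperate" else "D" for agent in round_data
        )
        counts["".join(letters)] += 1
    total = sum(counts.values())
    return {profile: n / total for profile, n in counts.items()} if total else {}


# Toy history with four rounds, for illustration only.
toy_history = {
    "1": [{"strategy": "Cooperate"}] * 3,
    "2": [{"strategy": "Cooperate"}, {"strategy": "Defect"}, {"strategy": "Cooperate"}],
    "3": [{"strategy": "Defect"}] * 3,
    "4": [{"strategy": "Cooperate"}] * 3,
}
shares = profile_shares(toy_history)
```

Comparing the CCC share across noise levels gives a more direct view of coalition survival than the per-action cooperation rate alone.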


In [None]:
import json
import glob
import pandas as pd

# Load all PD results
results_files = glob.glob('experiment_results_PD_*.json')
print(f"Found {len(results_files)} result files\n")

if results_files:
    summary = []
    
    for file in results_files:
        with open(file, 'r', encoding='utf-8') as f:
            data = json.load(f)
        
        for exp_name, exp_data in data.items():
            if 'ERROR' in exp_name:
                continue
            
            # Extract noise level; stop at the next underscore so trailing fields do not break float()
            noise = float(exp_name.split('Noise')[1].split('_')[0]) if 'Noise' in exp_name else 0.0
            
            # Calculate cooperation rate
            history = exp_data['history']
            total_actions = 0
            cooperations = 0
            noise_events = 0
            
            for round_data in history.values():
                for agent in round_data:
                    total_actions += 1
                    if agent['strategy'] == 'Cooperate':
                        cooperations += 1
                    if agent.get('is_noise', False):
                        noise_events += 1
            
            coop_rate = cooperations / total_actions if total_actions > 0 else 0
            
            summary.append({
                'Experiment': exp_name[:50],
                'Noise (ε)': f"{noise:.0%}",
                'Cooperation Rate': f"{coop_rate:.1%}",
                'Noise Events': noise_events,
                'Rounds': len(history)
            })
    
    df = pd.DataFrame(summary)
    print("\n📈 PILLAR 1 SUMMARY: Coalition Robustness")
    print("=" * 80)
    print(df.to_string(index=False))
    
    print("\n💡 Key Insights:")
    print("- Look for cooperation decline as noise increases")
    print("- Check if agents maintain CCC despite accidents")
    print("- Reasoning logs show if agents understand 'trembling hand'")
else:
    print("⚠️ No results yet. Run experiments above first!")


---

## ✅ Experiment A Checklist

- [ ] Phase 1 (0% noise) completed for Qwen2.5-32B
- [ ] Phase 2 (5% noise) completed
- [ ] Phase 3 (10% noise) completed
- [ ] Quick analysis shows cooperation patterns
- [ ] Result JSON files saved
- [ ] Ready for comprehensive analysis in Exp_C

**Next Step:** Run `Exp_B_Games_MultiLang.ipynb` for Pillars 2 & 3

**Expected Results:**
- Baseline (0%): ~95-100% cooperation
- Light noise (5%): ~80-90% cooperation (some forgiveness)
- High noise (10%): ~60-70% cooperation (coalition stress)

**Files Generated:**
- `experiment_results_PD_Qwen2.5-32B_en_Noise0.0_[timestamp].json`
- `experiment_results_PD_Qwen2.5-32B_vn_Noise0.0_[timestamp].json`
- `experiment_results_PD_Qwen2.5-32B_en_Noise0.05_[timestamp].json`
- `experiment_results_PD_Qwen2.5-32B_vn_Noise0.05_[timestamp].json`
- `experiment_results_PD_Qwen2.5-32B_en_Noise0.1_[timestamp].json`
- `experiment_results_PD_Qwen2.5-32B_vn_Noise0.1_[timestamp].json`
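
When grouping these files by condition, the model, language, and noise level can be read from the filename rather than from the JSON body. The sketch below assumes the naming pattern listed above holds exactly. The format of the `[timestamp]` part is not stated here, so the pattern accepts any trailing text, and the timestamp in the example is made up. The names `RESULT_NAME` and `parse_result_name` are hypothetical.

```python
import re

# Assumed pattern: experiment_results_<game>_<model>_<lang>_Noise<eps>_<timestamp>.json
RESULT_NAME = re.compile(
    r"^experiment_results_(?P<game>[^_]+)_(?P<model>.+)_(?P<lang>[a-z]{2})"
    r"_Noise(?P<noise>\d+(?:\.\d+)?)_(?P<timestamp>.+)\.json$"
)


def parse_result_name(filename):
    """Split a result filename into its fields, or return None if it does not match."""
    match = RESULT_NAME.match(filename)
    if match is None:
        return None
    info = match.groupdict()
    info["noise"] = float(info["noise"])  # e.g. '0.05' -> 0.05
    return info


# The timestamp below is a made-up placeholder.
info = parse_result_name("experiment_results_PD_Qwen2.5-32B_vn_Noise0.05_20250101-120000.json")
```

Returning `None` for non-matching names lets a caller skip unrelated files that a glob might pick up.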
