# 🤖 Project Triad: Trembling Hand & Strategic Robustness

This notebook runs the experiments for the paper **"Trembling Hands and Hidden Coalitions"**.
It benchmarks LLM agents in three 3-player games, the Public Goods Game (PGG), the Volunteer's Dilemma (VD), and the Triadic Prisoner's Dilemma (PD), to analyze social intelligence and robustness against noise.

**Repository:** [https://github.com/technoob05/Trembling-Triads](https://github.com/technoob05/Trembling-Triads)

## 1. Setup Environment
Clone the repository and install dependencies.

In [None]:
!git clone https://github.com/technoob05/Trembling-Triads.git
%cd Trembling-Triads
!pip install -r requirements.txt

## 2. Models Check (H100 Ready)
Models supported in this benchmark:
*   **Qwen2.5**: `Qwen2.5-7B`, `Qwen2.5-14B`, `Qwen2.5-32B`, `Qwen2.5-72B`
*   **Llama 3**: `Llama3-8B`, `Llama3-70B`
*   **DeepSeek R1**: `DeepSeek-R1-8B`, `DeepSeek-R1-70B`
*   **Mistral**: `Mistral-7B`

*(Note: 70B models require 2×T4 or 1×H100 GPUs.)*

## 3. Experiment A: The Scale Test (Small vs Large)
Compares Llama3-8B with Llama3-70B in the PGG with 5% noise (`--noise 0.05`).
Hypothesis: larger models will show higher 'Trembling Robustness' (forgiveness).

In [None]:
# Run Small Model
!python triad_experiment.py --game PGG --models Llama3-8B --noise 0.05 --rounds 10

# Run Large Model (Uncomment if GPU allows)
# !python triad_experiment.py --game PGG --models Llama3-70B --noise 0.05 --rounds 10
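
To make the 5% noise setting concrete, here is a minimal sketch of a trembling-hand perturbation: with probability `noise`, the executed action differs from the intended one. This is an illustration only, not the code in `triad_experiment.py`. The function name `tremble` and the action labels `"cooperate"` / `"defect"` are assumptions; the dictionary keys mirror the `intended_strategy`, `strategy`, and `is_noise` fields used in the output JSON.

```python
import random


def tremble(intended, noise, rng):
    """Flip the intended action with probability `noise` (hypothetical helper)."""
    flipped = rng.random() < noise
    if flipped:
        # Assumed binary action space; the real games may use other labels.
        executed = "defect" if intended == "cooperate" else "cooperate"
    else:
        executed = intended
    return {
        "intended_strategy": intended,
        "strategy": executed,
        "is_noise": flipped,
    }


# Simulate 1000 intended cooperations at 5% noise and measure the flip rate.
rng = random.Random(0)
moves = [tremble("cooperate", 0.05, rng) for _ in range(1000)]
flip_rate = sum(m["is_noise"] for m in moves) / len(moves)
print(f"Observed flip rate: {flip_rate:.3f}")
```

Over 10 rounds with 3 agents, a 5% rate gives about 1.5 flipped actions per game on average, so individual runs may contain no noise events at all.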

## 4. Experiment B: Volunteer's Dilemma
Testing coordination capability with Qwen2.5-14B.

In [None]:
!python triad_experiment.py --game VD --models Qwen2.5-14B --rounds 10
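
For readers unfamiliar with the game, the sketch below shows the textbook Volunteer's Dilemma payoff rule: everyone receives the benefit if at least one player volunteers, and each volunteer pays a cost. The function name and the `benefit` and `cost` values are hypothetical; the payoffs used by `triad_experiment.py` may differ.

```python
def volunteer_dilemma_payoffs(volunteers, benefit=2.0, cost=1.0):
    """Return one payoff per player for a single round.

    `volunteers` is a list of booleans, one per player.
    `benefit` and `cost` are illustrative values, not the repository's.
    """
    if not any(volunteers):
        # Nobody volunteered, so the public benefit is lost for everyone.
        return [0.0] * len(volunteers)
    # Volunteers pay the cost; free riders keep the full benefit.
    return [benefit - cost if v else benefit for v in volunteers]


print(volunteer_dilemma_payoffs([True, False, False]))   # [1.0, 2.0, 2.0]
print(volunteer_dilemma_payoffs([False, False, False]))  # [0.0, 0.0, 0.0]
```

The coordination problem is visible in the first call: the single volunteer earns less than the two free riders, yet everyone does worse if nobody volunteers.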

## 5. Experiment C: Paper Metrics Export
The output JSON contains per-agent, per-round data from which the paper metrics can be calculated:
*   **Survival Rate**: Derived from the `strategy` sequence.
*   **Trembling Robustness**: Compare `intended_strategy` with `strategy` outcomes.
*   **Payoff Gap**: Derived from `score`.

In [None]:
import os
import json
import pandas as pd

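# Load the most recently modified results file.
# Note: this considers every .json file in the working directory.
result_files = [f for f in os.listdir('.') if f.endswith('.json')]
if not result_files:
    raise FileNotFoundError("No .json result files found; run an experiment first.")
latest_file = max(result_files, key=os.path.getmtime)

with open(latest_file, 'r', encoding='utf-8') as f:
    data = json.load(f)

# Example: count noise events per game
print("Analyzing File:", latest_file)
for game_key, game_data in data.items():
    # Skip entries that record a failed run
    if "ERROR" in game_key:
        continue

    history = game_data['history']  # maps round name -> list of agent records
    noise_events = 0
    total_rounds = len(history)

    for round_name, agents in history.items():
        for agent in agents:
            # 'is_noise' marks an action that was flipped by the noise mechanism
            if agent.get('is_noise'):
                noise_events += 1

    print(f"Game: {game_key} | Total Rounds: {total_rounds} | Noise Events Triggered: {noise_events}")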
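
As a possible next step, the sketch below aggregates a `history` mapping into per-agent tremble counts and a payoff gap. It rests on several assumptions that should be checked against the real output: agents appear in the same order in every round, `score` is a per-round payoff (not a running total), and the payoff gap is taken as the difference between the highest and lowest total score. The helper name `summarize_history` and the toy data are hypothetical.

```python
from collections import defaultdict


def summarize_history(history):
    """Aggregate per-agent totals from a {round_name: [agent_record, ...]} mapping."""
    summary = defaultdict(lambda: {"total_score": 0, "trembles": 0, "moves": 0})
    for round_name, agents in history.items():
        for idx, agent in enumerate(agents):
            stats = summary[idx]  # assumes stable agent ordering across rounds
            stats["total_score"] += agent.get("score", 0)  # assumes per-round score
            stats["moves"] += 1
            # A tremble is any round where the executed action differs from intent.
            if agent.get("intended_strategy") != agent.get("strategy"):
                stats["trembles"] += 1
    return dict(summary)


# Toy data shaped like the fields named above; the values are invented.
toy_history = {
    "round_1": [
        {"intended_strategy": "C", "strategy": "C", "score": 2, "is_noise": False},
        {"intended_strategy": "C", "strategy": "D", "score": 3, "is_noise": True},
        {"intended_strategy": "D", "strategy": "D", "score": 3, "is_noise": False},
    ],
    "round_2": [
        {"intended_strategy": "C", "strategy": "C", "score": 2, "is_noise": False},
        {"intended_strategy": "C", "strategy": "C", "score": 2, "is_noise": False},
        {"intended_strategy": "D", "strategy": "D", "score": 3, "is_noise": False},
    ],
}

summary = summarize_history(toy_history)
totals = [stats["total_score"] for stats in summary.values()]
payoff_gap = max(totals) - min(totals)  # assumed definition of the payoff gap
print(summary)
print("Payoff gap:", payoff_gap)
```

With real results, `summarize_history(game_data['history'])` could be called inside a loop over `data.items()`, and the resulting rows passed to `pd.DataFrame` for tabulation.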