# Bahrami Parameter Sweep with REM Engine

This notebook demonstrates the Bahrami parameter sweep experiment using the REM (Retrieving Effectively from Memory) model.

**Experiment Design:**
- Fixed expert agent (A) with c = 0.7
- Sweep novice agent (B) ability from c = 0.1 to 0.9
- Test 5 group decision rules: CF, UW, DMC, DSS, BF
- Measure Collective Benefit Ratio = d'_team / max(d'_A, d'_B)

**Key Question:** Do groups perform better than their best individual member?
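
As a minimal sketch of the metric above, the ratio can be computed directly from the three sensitivities. The helper name `collective_benefit_ratio` and the example d' values are illustrative assumptions; they are not taken from `run_simulation` or from any simulation output.

```python
def collective_benefit_ratio(d_team: float, d_a: float, d_b: float) -> float:
    """Team sensitivity relative to the better individual (hypothetical helper)."""
    best_individual = max(d_a, d_b)
    if best_individual <= 0:
        raise ValueError("best individual d' must be positive")
    return d_team / best_individual


# Illustrative values only, not simulation results
ratio = collective_benefit_ratio(d_team=1.8, d_a=1.5, d_b=1.0)
print(f"Collective Benefit Ratio: {ratio:.2f}")
```

A ratio above 1 means the group outperforms its best member; a ratio below 1 means the group would have done better by deferring to that member.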

In [None]:
# Enable auto-reload for module changes
%load_ext autoreload
%autoreload 2

import sys
sys.path.insert(0, '../src')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import Image, display

import run_simulation

In [None]:
# Run Bahrami parameter sweep
df = run_simulation.run_bahrami_sweep()

## Part 1: Bahrami Parameter Sweep (Tim's Analysis)

Compare all 5 group decision rules across varying ability heterogeneity.

In [None]:
# Load and display results
results = pd.read_csv('../outputs/bahrami_sweep_final.csv')
print("Results Summary:")
print(f"Total rows: {len(results)}")
print(f"\nColumns: {list(results.columns)}")
print("\nFirst 10 rows:")
results.head(10)

In [None]:
# Display the Bahrami plot (5 rules)
display(Image('../outputs/bahrami_sweep_plot.png'))

---

## Part 2: Theoretical Verification (Rich's Request)

Verify that the DSS (Direct Signal Sharing) rule matches the theoretically optimal sensitivity under signal detection theory (SDT) assumptions with independent noise.

**Theoretical Prediction (Orthogonal Sum):**

$$d'_{\text{theory}} = \sqrt{(d'_A)^2 + (d'_B)^2}$$

This is the sensitivity achieved by optimal integration of two independent evidence sources, so it serves as the analytical upper bound on team performance.

**Goal:** Check if DSS (simulated via REM) recovers this SDT prediction.
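
The orthogonal-sum prediction can be sketched in a few lines. The function names below are hypothetical, and normalizing by the better individual is an assumption made here so that the value is comparable to the Collective Benefit Ratio; the way `run_simulation` actually fills the `ratio_theory` column is not shown in this notebook.

```python
import math


def dprime_theory(d_a: float, d_b: float) -> float:
    """Orthogonal sum: sqrt(d_a**2 + d_b**2)."""
    return math.hypot(d_a, d_b)


def theory_ratio(d_a: float, d_b: float) -> float:
    """Theoretical d' divided by the better individual's d' (assumed normalization)."""
    return dprime_theory(d_a, d_b) / max(d_a, d_b)


# Two equally able agents: the prediction is sqrt(2) times the individual d'
print(f"{theory_ratio(1.0, 1.0):.4f}")  # 1.4142
```

Under this formula the predicted benefit is largest when the two agents are equally sensitive and shrinks toward 1 as one agent's d' dominates.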

In [None]:
# Display Rich's verification plot
display(Image('../outputs/rich_theory_verification.png'))

In [None]:
# Quantitative comparison: DSS vs Theory
dss_data = results[results['rule'] == 'DSS'].sort_values('c_B')
print("DSS vs Theoretical Prediction:")
print("="*60)
comparison = pd.DataFrame({
    'c_B': dss_data['c_B'],
    'DSS_simulated': dss_data['collective_benefit_ratio'],
    'Theory': dss_data['ratio_theory'],
    'Difference': dss_data['collective_benefit_ratio'] - dss_data['ratio_theory']
})
print(comparison.to_string(index=False))
print(f"\nMean Absolute Difference: {comparison['Difference'].abs().mean():.4f}")

---

## Part 3: Confidence Miscalibration (Prelec Weighting)

This section explores how **confidence miscalibration** affects group decision-making using Prelec probability weighting.

### Experimental Design

**Fixed Parameters:**
- Equal memory ability: c_A = c_B = 0.7
- Agent A miscalibration: α_A = 1.2 (overconfident)

**Sweep Parameter:**
- Agent B miscalibration: α_B from 0.5 to 1.5 (step 0.1)

**Models Tested:**
1. **WCS_Miscal**: Weighted Confidence Sharing with Prelec weighting
2. **DMC_Miscal**: Defer to Max Confidence with Prelec weighting
3. **DSS**: Direct Signal Sharing (α-independent benchmark)
4. **CF**: Coin Flip (α-independent benchmark)

### Prelec Weighting Function

The Prelec function transforms objective probabilities into subjective weights:

$$w(p) = \exp(-\beta \cdot (-\ln p)^\alpha)$$

where:
- $\alpha$: Miscalibration parameter
  - $\alpha = 1$: Calibrated (w = p)
  - $\alpha > 1$: Overconfident (inflated extremes)
  - $\alpha < 1$: Underconfident (compressed extremes)
- $\beta = (\ln 2)^{1-\alpha}$: Ensures w(0.5) = 0.5
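
A minimal sketch of the weighting function defined above, using only the standard library. The name `prelec_weight` is hypothetical and this is not necessarily how `run_simulation` implements it; the input is restricted to 0 < p < 1 to keep the logarithm finite.

```python
import math


def prelec_weight(p: float, alpha: float) -> float:
    """Prelec weight w(p) = exp(-beta * (-ln p)**alpha), with beta = (ln 2)**(1 - alpha)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    beta = math.log(2.0) ** (1.0 - alpha)  # pins w(0.5) at 0.5
    return math.exp(-beta * (-math.log(p)) ** alpha)


# Compare an underconfident, a calibrated, and an overconfident agent
for alpha in (0.8, 1.0, 1.2):
    weights = [round(prelec_weight(p, alpha), 3) for p in (0.1, 0.5, 0.9)]
    print(f"alpha = {alpha}: {weights}")
```

Printing the weights at p = 0.1, 0.5, and 0.9 is a quick check that the curve passes through 0.5 for every α and that α > 1 pushes probabilities away from 0.5 while α < 1 pulls them toward it.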

### Key Question

**How does matching vs mismatching miscalibration affect group performance?**
- When both agents are overconfident (α_A = α_B = 1.2), do they perform better/worse?
- When agents have opposite biases (α_A = 1.2, α_B = 0.8), does this help or hurt?

In [None]:
# Run miscalibration parameter sweep
df_miscal = run_simulation.run_miscalibration_sweep()

In [None]:
# Load and display results
results_miscal = pd.read_csv('../outputs/miscalibration_sweep.csv')
print("Miscalibration Sweep Results Summary:")
print(f"Total rows: {len(results_miscal)}")
print(f"\nColumns: {list(results_miscal.columns)}")
print("\nFirst 10 rows:")
results_miscal.head(10)

In [None]:
# Display the miscalibration plot
display(Image('../outputs/miscalibration_plot.png'))

In [None]:
# Quantitative analysis: Examine performance at key alpha_B values
print("Performance at Key Miscalibration Levels:")
print("="*70)

key_alphas = [0.5, 0.8, 1.0, 1.2, 1.5]
for alpha_B in key_alphas:
    subset = results_miscal[np.isclose(results_miscal['alpha_B'], alpha_B)]
    if len(subset) > 0:
        print(f"\nα_B = {alpha_B:.1f}:")
        for model in ['WCS_Miscal', 'DMC_Miscal', 'DSS', 'CF']:
            row = subset[subset['model'] == model]
            if len(row) > 0:
                ratio = row['collective_benefit_ratio'].values[0]
                print(f"  {model:12s}: {ratio:.4f}")

---

## Part 4: Rich's Conflict Resolution Model

This section implements Rich Shiffrin's model for how groups resolve disagreements based on evidence strength.

### Core Question

When two agents disagree (one says "Old", one says "New"), how does the group decide who to follow?

### Model Logic

1. **Convert log-odds to Strength:**
   - Odds: $\phi = \exp(L)$
   - Scaled Odds: $\phi' = \phi^{1/11}$ (fixed power)
   - Strength: $S = \max(\phi', 1/\phi')$

2. **Identify Conflict Trials:**
   - Conflict occurs when $(L_A > 0 \text{ and } L_B < 0)$ OR $(L_A < 0 \text{ and } L_B > 0)$

3. **Compute Difference:**
   - $D = |S_A - S_B|$

4. **Decision Rule:**
   - Probability of choosing stronger agent: $P = \frac{1 + D}{2 + D}$
   - When $D = 0$: Random guess (P = 0.5)
   - As $D \to \infty$: Deterministic choice (P → 1)
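
The four steps above can be sketched for a single trial as follows. All function names are hypothetical, the log-odds inputs are made-up values, and the code is an illustration of the stated formulas rather than the implementation inside `run_simulation`.

```python
import math

SCALE_POWER = 1.0 / 11.0  # fixed power from step 1


def strength(log_odds: float) -> float:
    """Step 1: scaled odds phi' = exp(L)**(1/11), then S = max(phi', 1/phi')."""
    # exp(L * 1/11) equals exp(L)**(1/11) and avoids overflow for large L
    phi_scaled = math.exp(log_odds * SCALE_POWER)
    return max(phi_scaled, 1.0 / phi_scaled)


def is_conflict(l_a: float, l_b: float) -> bool:
    """Step 2: the agents disagree when their log-odds have opposite signs."""
    return (l_a > 0 and l_b < 0) or (l_a < 0 and l_b > 0)


def p_follow_stronger(l_a: float, l_b: float) -> float:
    """Steps 3 and 4: D = |S_A - S_B|, P = (1 + D) / (2 + D)."""
    d = abs(strength(l_a) - strength(l_b))
    return (1.0 + d) / (2.0 + d)


# Made-up trial: A says "Old" with strong evidence, B says "New" with weak evidence
l_a, l_b = 11.0, -1.0
if is_conflict(l_a, l_b):
    print(f"P(follow stronger agent) = {p_follow_stronger(l_a, l_b):.3f}")
```

Because strength depends only on the magnitude of the log-odds, two agents with equal and opposite evidence give D = 0 and the group guesses at chance.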

### Goal

Verify that the empirical probability of following the stronger agent, as a function of $D$, matches the theoretical prediction $P = \frac{1 + D}{2 + D}$.

In [None]:
# Run Rich's conflict resolution simulation
df_rich = run_simulation.run_rich_conflict_simulation()

In [None]:
# Display the conflict resolution plot
display(Image('../outputs/rich_conflict_plot.png'))