# The "Poker Face" Analysis
## Correlating Neural Decodability with Behavioral Predictability

**Authors:** Rico Bolos, Shelly Chen, Yichen Zeng

**Research Question:** Is a player who is behaviorally easy to predict (i.e., plays in patterns) also "easier to read" neurally?

**Hypothesis:** Players with high behavioral predictability (high Markov accuracy) will also have higher neural decoding accuracy, suggesting behavioral patterns are supported by distinct neural patterns.

## Prerequisites

### Required Data Files
This analysis requires output from the authors' MATLAB processing pipeline:

1. **`derivatives/markov_chain_pred.mat`** - from `step2b_markovchain.m`
   - Contains `Mean_Accuracy` matrix (31 pairs × 2 players × 100 windows)
   
2. **`derivatives/pair-XX_player-X_task-RPS_decoding.mat`** - from `step2a_decoding.m`  
   - Contains `decoding_accuracy` for each player
   - Need files for all 62 players (31 pairs × 2 players)
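
Before running the notebook, it can help to confirm that every expected file is present. The following is a minimal sketch, assuming the file naming shown above and the pair IDs used in this notebook (1 to 34, excluding 10, 23, and 24). `expected_files` and `missing_files` are hypothetical helper names, not part of the MATLAB pipeline.

```python
from pathlib import Path

# Pair IDs as used in this notebook: 1-34, excluding 10, 23, and 24
PAIR_IDS = [p for p in range(1, 35) if p not in (10, 23, 24)]


def expected_files(derivatives_dir):
    """List every file this analysis expects (hypothetical helper)."""
    root = Path(derivatives_dir)
    files = [root / "markov_chain_pred.mat"]
    for pair in PAIR_IDS:
        for player in (1, 2):
            files.append(root / f"pair-{pair:02d}_player-{player}_task-RPS_decoding.mat")
    return files


def missing_files(derivatives_dir):
    """Return the expected files that are not on disk (hypothetical helper)."""
    return [f for f in expected_files(derivatives_dir) if not f.exists()]


missing = missing_files("derivatives")
total = len(expected_files("derivatives"))
print(f"{len(missing)} of {total} expected files are missing")
for f in missing[:5]:
    print(f"  {f}")
```

An empty list from `missing_files` means the one Markov file and all 62 decoding files were found.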

### How to Get the Data
**Option A (Recommended):** Download from OSF repository  
Visit: https://doi.org/10.17605/OSF.IO/YJXKN  
Download the `derivatives/` folder and place it in this directory.

**Option B:** Run the MATLAB scripts  
Requires MATLAB with the FieldTrip and CoSMoMVPA toolboxes:
```matlab
cd scripts
step2a_decoding
step2b_markovchain
```

## 1. Setup and Imports

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.io import loadmat
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Set plotting style
sns.set_style('whitegrid')
sns.set_context('notebook', font_scale=1.2)
plt.rcParams['figure.dpi'] = 100

print("✓ Libraries imported successfully")

In [None]:
# Set paths
DATA_PATH = Path('.')
DERIVATIVES_PATH = DATA_PATH / 'derivatives'

# Check if derivatives folder exists
if not DERIVATIVES_PATH.exists():
    raise FileNotFoundError(
        f"\n{'='*70}\n"
        f"ERROR: derivatives/ folder not found!\n\n"
        f"Please download the processed data from OSF:\n"
        f"https://doi.org/10.17605/OSF.IO/YJXKN\n\n"
        f"Or run the MATLAB scripts (step2a_decoding.m and step2b_markovchain.m)\n"
        f"{'='*70}"
    )

print(f"✓ Data path set: {DERIVATIVES_PATH}")
print(f"✓ Derivatives folder exists")

## 2. Load Behavioral Predictability Data

Extract behavioral predictability scores from the Markov chain analysis (Step 2b).
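
The reduction from a pairs × players × windows array to one score per player can be illustrated on synthetic data. This is only a sketch: `toy_accuracy` is a made-up stand-in for `Mean_Accuracy`, with 3 pairs and 4 window sizes instead of 31 and 100. It shows why a row-major flatten yields scores in pair-then-player order.

```python
import numpy as np

# Synthetic stand-in for Mean_Accuracy: 3 pairs × 2 players × 4 window sizes
rng = np.random.default_rng(0)
toy_accuracy = rng.uniform(0.3, 0.5, size=(3, 2, 4))

# Average over window sizes (last axis) -> one score per pair and player
per_player = toy_accuracy.mean(axis=2)  # shape (3, 2)

# Row-major flatten: pair 1 player 1, pair 1 player 2, pair 2 player 1, ...
flat_scores = per_player.flatten()  # shape (6,)

# Labels built in the same pair-then-player order line up with flat_scores
labels = [f"pair-{pair:02d}_player-{player}" for pair in (1, 2, 3) for player in (1, 2)]

for label, score in zip(labels, flat_scores):
    print(f"{label}: {score:.3f}")
```

The alignment only holds if the first axis of the real array follows the same pair order as the IDs used to load the decoding files, which is worth verifying against the MATLAB script.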

In [None]:
# Load Markov chain data
markov_file = DERIVATIVES_PATH / 'markov_chain_pred.mat'

if not markov_file.exists():
    raise FileNotFoundError(
        f"Markov chain file not found: {markov_file}\n"
        f"Expected: derivatives/markov_chain_pred.mat"
    )

# Load the .mat file
markov_data = loadmat(markov_file)

# Extract Mean_Accuracy: (31 pairs × 2 players × 100 windows)
mean_accuracy = markov_data['Mean_Accuracy']

print(f"✓ Loaded Markov chain data")
print(f"  Shape: {mean_accuracy.shape}")
print(f"  (31 pairs × 2 players × 100 window sizes)")

In [None]:
# Extract behavioral predictability for each of the 62 players
# We'll use the mean across all window sizes for each player

num_pairs = mean_accuracy.shape[0]
num_players = num_pairs * 2  # 62 players total

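# Average across window sizes (axis 2) to get one score per pair and player
behavioral_predictability = np.mean(mean_accuracy, axis=2)  # (31 pairs × 2 players)

# Flatten in row-major order: pair 1 player 1, pair 1 player 2, pair 2 player 1, ...
# This assumes the rows of Mean_Accuracy follow the same pair order as the decoding files
behavioral_predictability = behavioral_predictability.flatten()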

print(f"✓ Extracted behavioral predictability scores")
print(f"  Number of players: {len(behavioral_predictability)}")
print(f"  Mean: {np.mean(behavioral_predictability):.4f}")
print(f"  Std: {np.std(behavioral_predictability):.4f}")
print(f"  Range: [{np.min(behavioral_predictability):.4f}, {np.max(behavioral_predictability):.4f}]")

## 3. Load Neural Decodability Data

Extract neural decoding accuracy from Step 2a ("Self Response" decoding).
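
The per-player summary used here is the peak decoding accuracy in a fixed time window. A minimal sketch of that step on a synthetic time course follows. `peak_in_window` is a hypothetical helper, the Gaussian bump is invented for illustration, and the 1/3 baseline assumes three-way chance level for rock-paper-scissors.

```python
import numpy as np


def peak_in_window(times, accuracy, t_start=0.0, t_end=0.5):
    """Return the maximum accuracy for time points in [t_start, t_end] (seconds).

    Hypothetical helper; returns NaN if no time point falls in the window.
    """
    times = np.asarray(times, dtype=float)
    accuracy = np.asarray(accuracy, dtype=float)
    mask = (times >= t_start) & (times <= t_end)
    if not mask.any():
        return np.nan
    return float(accuracy[mask].max())


# Synthetic time course: assumed chance level of 1/3 plus a bump near 0.3 s
times = np.linspace(-0.2, 1.0, 121)
accuracy = 1 / 3 + 0.1 * np.exp(-((times - 0.3) ** 2) / (2 * 0.05 ** 2))

print(f"Peak accuracy in 0-0.5 s: {peak_in_window(times, accuracy):.3f}")
```

Returning NaN for an empty window keeps a player with an unexpected time axis from raising an error, so the row can be dropped later along with other missing data.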

In [None]:
# Define pair IDs (excluding 10, 23, 24)
pair_ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]

print(f"Loading decoding data for {len(pair_ids)} pairs ({len(pair_ids)*2} players)...")

# Storage for neural decoding accuracy
neural_decodability = []
player_ids = []

# Loop through all pairs and players
for pair in pair_ids:
    for player in [1, 2]:
        # Construct filename
        decoding_file = DERIVATIVES_PATH / f'pair-{pair:02d}_player-{player}_task-RPS_decoding.mat'
        
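        if not decoding_file.exists():
            print(f"  ⚠ Warning: File not found: {decoding_file}")
            # Record the player ID as well so both lists stay the same length
            player_ids.append(f"pair-{pair:02d}_player-{player}")
            neural_decodability.append(np.nan)
            continue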
        
        # Load decoding data
        dec_data = loadmat(decoding_file)
        
        # Extract 'Self Response' decoding (index 0 in decoding_accuracy cell array)
        # The data structure is: decoding_accuracy{1} for 'Self Response'
        self_response = dec_data['decoding_accuracy'][0, 0]
        
        # Get accuracy over time (samples.accuracy)
        accuracy_over_time = self_response['samples'][0, 0]['accuracy'][0, 0].flatten()
        
        # Get time points
        time_points = self_response['a'][0, 0]['fdim'][0, 0]['values'][0, 0][0, 1].flatten()
        
        # Find indices for 0-500ms window after decision
        time_mask = (time_points >= 0) & (time_points <= 0.5)
        
        # Extract peak accuracy in this window
        peak_accuracy = np.max(accuracy_over_time[time_mask])
        
        # Store results
        neural_decodability.append(peak_accuracy)
        player_ids.append(f"pair-{pair:02d}_player-{player}")

# Convert to numpy array
neural_decodability = np.array(neural_decodability)

print(f"\n✓ Extracted neural decodability scores")
print(f"  Number of players: {len(neural_decodability)}")
print(f"  Mean: {np.nanmean(neural_decodability):.4f}")
print(f"  Std: {np.nanstd(neural_decodability):.4f}")
print(f"  Range: [{np.nanmin(neural_decodability):.4f}, {np.nanmax(neural_decodability):.4f}]")

## 4. Create Analysis DataFrame

In [None]:
# Combine data into a dataframe
df = pd.DataFrame({
    'player_id': player_ids,
    'behavioral_predictability': behavioral_predictability,
    'neural_decodability': neural_decodability
})

# Remove any rows with missing data
df_clean = df.dropna()

print(f"✓ Created analysis dataframe")
print(f"  Total players: {len(df)}")
print(f"  Complete data: {len(df_clean)}")
print(f"\nFirst few rows:")
display(df_clean.head(10))

In [None]:
# Summary statistics
print("Summary Statistics:")
print("="*60)
display(df_clean.describe())

## 5. Correlation Analysis

Calculate Pearson's correlation coefficient to test our hypothesis.
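
As a sanity check on the method, the same statistics can be computed on data with a known relationship. The sketch below uses synthetic values only (62 simulated players with a built-in positive slope). `pearson_with_ci` is a hypothetical helper that wraps `scipy.stats.pearsonr` and adds an approximate 95% confidence interval via Fisher's z-transformation.

```python
import numpy as np
from scipy import stats


def pearson_with_ci(x, y, z_crit=1.96):
    """Pearson's r, two-sided p-value, and an approximate 95% CI via Fisher's z."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r, p = stats.pearsonr(x, y)
    z = np.arctanh(r)  # Fisher z-transform of r
    se = 1.0 / np.sqrt(len(x) - 3)  # approximate standard error of z
    lo, hi = np.tanh(z - z_crit * se), np.tanh(z + z_crit * se)
    return r, p, (lo, hi)


# Synthetic data only: 62 simulated players with a built-in positive relationship
rng = np.random.default_rng(42)
x = rng.normal(size=62)
y = 0.5 * x + rng.normal(size=62)

r, p, (lo, hi) = pearson_with_ci(x, y)
print(f"r = {r:.3f}, p = {p:.4f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

With n = 62 the interval is fairly wide, which is worth keeping in mind when reading a non-significant result.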

In [None]:
# Calculate Pearson correlation
r, p_value = stats.pearsonr(
    df_clean['behavioral_predictability'],
    df_clean['neural_decodability']
)

# Calculate 95% confidence interval for r
# Using Fisher z-transformation
n = len(df_clean)
z = np.arctanh(r)
se = 1 / np.sqrt(n - 3)
ci_z = [z - 1.96 * se, z + 1.96 * se]
ci_r = [np.tanh(ci_z[0]), np.tanh(ci_z[1])]

print("\n" + "="*70)
print("CORRELATION ANALYSIS RESULTS")
print("="*70)
print(f"\nPearson's r: {r:.4f}")
print(f"p-value: {p_value:.4f}")
print(f"95% CI: [{ci_r[0]:.4f}, {ci_r[1]:.4f}]")
print(f"Sample size: n = {n}")
print(f"\nStatistical significance: {'YES' if p_value < 0.05 else 'NO'} (α = 0.05)")

# Interpret effect size (Cohen's guidelines)
abs_r = abs(r)
if abs_r < 0.1:
    effect_size = "negligible"
elif abs_r < 0.3:
    effect_size = "small"
elif abs_r < 0.5:
    effect_size = "medium"
else:
    effect_size = "large"

print(f"Effect size: {effect_size} (|r| = {abs_r:.4f})")

# Calculate coefficient of determination
r_squared = r ** 2
print(f"\nVariance explained: R² = {r_squared:.4f} ({r_squared*100:.2f}%)")
print("="*70)

## 6. Visualization

Create a scatter plot showing the relationship between behavioral predictability and neural decodability.

In [None]:
# Create figure
fig, ax = plt.subplots(figsize=(10, 8))

# Scatter plot
ax.scatter(
    df_clean['behavioral_predictability'],
    df_clean['neural_decodability'],
    alpha=0.6,
    s=100,
    edgecolors='black',
    linewidth=1.5,
    color='steelblue'
)

# Add regression line
# Use dedicated names so the Fisher z value and p-value from Section 5 are not overwritten
slope, intercept = np.polyfit(df_clean['behavioral_predictability'], df_clean['neural_decodability'], 1)
x_line = np.linspace(
    df_clean['behavioral_predictability'].min(),
    df_clean['behavioral_predictability'].max(),
    100
)
ax.plot(x_line, slope * x_line + intercept, 'r-', linewidth=2, label=f'Linear fit (r = {r:.3f})')

# Labels and title
ax.set_xlabel('Behavioral Predictability\n(Markov Chain Accuracy)', fontsize=14, fontweight='bold')
ax.set_ylabel('Neural Decodability\n(Peak Decoding Accuracy, 0-500ms)', fontsize=14, fontweight='bold')
ax.set_title(
    'The "Poker Face" Analysis:\nBehavioral Predictability vs. Neural Decodability',
    fontsize=16,
    fontweight='bold',
    pad=20
)

# Add statistics text box
stats_text = (
    f"Pearson's r = {r:.4f}\n"
    f"p = {p_value:.4f}\n"
    f"95% CI: [{ci_r[0]:.3f}, {ci_r[1]:.3f}]\n"
    f"n = {n}"
)
ax.text(
    0.05, 0.95,
    stats_text,
    transform=ax.transAxes,
    fontsize=11,
    verticalalignment='top',
    bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5)
)

# Legend
ax.legend(loc='lower right', fontsize=11)

# Grid
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Save figure
fig.savefig('poker_face_correlation.png', dpi=300, bbox_inches='tight')
print("\n✓ Figure saved as 'poker_face_correlation.png'")

## 7. Interpretation

Interpret the results in the context of our hypothesis.

In [None]:
print("\n" + "="*70)
print("INTERPRETATION")
print("="*70)

print("\nHypothesis: Players with high behavioral predictability will have")
print("higher neural decoding accuracy.\n")

if p_value < 0.05:
    if r > 0:
        print("✓ HYPOTHESIS SUPPORTED")
        print("\nWe found a significant POSITIVE correlation between behavioral")
        print("predictability and neural decodability.")
        print("\nInterpretation: Players who are 'transparent' in their behavioral")
        print("strategies (i.e., play in predictable patterns) are also more")
        print("'transparent' in their brain signals. This suggests that behavioral")
        print("patterns are supported by distinct, decodable neural patterns.")
    else:
        print("✗ HYPOTHESIS NOT SUPPORTED")
        print("\nWe found a significant NEGATIVE correlation - the opposite of our")
        print("prediction.")
        print("\nInterpretation: Players who are behaviorally predictable actually have")
        print("LESS decodable neural signals. This could suggest compensatory neural")
        print("mechanisms or that predictable behavior arises from noisier neural")
        print("processing.")
else:
    print("○ NO SIGNIFICANT CORRELATION FOUND")
    print("\nWe did not find a significant relationship between behavioral")
    print("predictability and neural decodability.")
    print("\nInterpretation: A non-significant result does not by itself establish a")
    print("dissociation between behavior and its neural representation. Behavioral")
    print("patterns may not be directly reflected in the neural signals we measured,")
    print("the sample may be too small to detect a weak relationship, or other")
    print("factors not captured in this analysis may be at play.")

print("\n" + "="*70)

## 8. Export Results

In [None]:
# Save results to CSV
df_clean.to_csv('poker_face_results.csv', index=False)
print("✓ Data saved to 'poker_face_results.csv'")

# Save summary statistics
summary_stats = {
    'Analysis': 'Poker Face - Behavioral Predictability vs Neural Decodability',
    'Sample Size': n,
    'Pearson r': r,
    'p-value': p_value,
    'CI_lower': ci_r[0],
    'CI_upper': ci_r[1],
    'R_squared': r_squared,
    'Effect_size': effect_size,
    'Significant': 'Yes' if p_value < 0.05 else 'No'
}

summary_df = pd.DataFrame([summary_stats])
summary_df.to_csv('poker_face_summary.csv', index=False)
print("✓ Summary statistics saved to 'poker_face_summary.csv'")

print("\n" + "="*70)
print("ANALYSIS COMPLETE!")
print("="*70)