# Joint-RL Real-Time Equal Distance Validation Analysis

This notebook analyzes joint-RL real-time data to validate equal distance conditions for `equal_to_both` trials in 2P3G games.

## Validation Criteria

For `equal_to_both` trials in 2P3G, we validate that each player is equally far from the two goals (the new goal and the first shared goal):
1. Player 1's distance to new goal == Player 1's distance to first shared goal
2. Player 2's distance to new goal == Player 2's distance to first shared goal  
3. Joint distance sum is equal (sum of distances to new goal == sum of distances to first shared goal)

**Note**: Player positions are taken at the step when the new goal is presented (`newGoalPresentedTime`).
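
As a minimal sketch of the three criteria, the snippet below evaluates them on made-up grid positions. The helper `check_equal_to_both` and the coordinates are illustrative assumptions, not part of the notebook or the data set. The second case shows that criterion 3 can hold even when criteria 1 and 2 both fail.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two (x, y) grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def check_equal_to_both(p1, p2, shared_goal, new_goal):
    """Return (criterion 1, criterion 2, criterion 3) for one trial (hypothetical helper)."""
    p1_equal = manhattan(p1, new_goal) == manhattan(p1, shared_goal)
    p2_equal = manhattan(p2, new_goal) == manhattan(p2, shared_goal)
    joint_equal = (
        manhattan(p1, new_goal) + manhattan(p2, new_goal)
        == manhattan(p1, shared_goal) + manhattan(p2, shared_goal)
    )
    return p1_equal, p2_equal, joint_equal

# Made-up positions, chosen only to illustrate the criteria.
flags_pass = check_equal_to_both(p1=(0, 0), p2=(4, 4), shared_goal=(0, 4), new_goal=(4, 0))
flags_joint_only = check_equal_to_both(p1=(0, 0), p2=(3, 0), shared_goal=(1, 0), new_goal=(2, 0))

print(flags_pass)        # all three criteria hold
print(flags_joint_only)  # only the joint sum is equal
```

Criteria 1 and 2 together imply criterion 3, but not the other way around, so the joint check can pass on trials where neither player is individually equidistant.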


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
from scipy import stats
import ast
warnings.filterwarnings('ignore')

# Set publication standards
plt.rcParams.update({
    'font.size': 12,
    'axes.titlesize': 14,
    'axes.labelsize': 12,
    'xtick.labelsize': 11,
    'ytick.labelsize': 11,
    'legend.fontsize': 11,
    'figure.titlesize': 16,
    'font.family': 'Arial'
})

# Define consistent color scheme
CONSISTENT_COLORS = {
    'individual': '#2E86C1',  # Blue
    'joint': '#E74C3C',       # Red/coral
    'success': '#28B463',     # Green
    'failure': '#F39C12',     # Orange
    'human': '#3498DB',       # Light blue
    'ai': '#9B59B6',          # Purple
    'equal': '#2ECC71',       # Green for equal distances
    'unequal': '#E67E22'      # Orange for unequal distances
}


## Load Joint-RL Real-Time Data


In [2]:
# Load joint-RL real-time data
data_dir = Path('human-RLs/joint-rl-realTime')
excel_files = [f for f in data_dir.glob('*.xlsx') if not f.name.startswith('~$')]

def load_and_process_file(file_path):
    """Load Excel file and add metadata"""
    try:
        df = pd.read_excel(file_path)
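        # Some file names have a space after 'experiment_data_'; strip it so
        # the same participant is not counted under two different IDs
        participant_id = file_path.name.split('_')[2].strip()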
        df = df.assign(participantID=participant_id)
        df['source_file'] = file_path.name
        # Mark as joint-rl-realTime for consistency
        df['rlAgentType'] = 'joint-rl-realTime'
        return df
    except Exception as e:
        print(f"Error loading {file_path.name}: {e}")
        return None

# Load all files
print(f"Found {len(excel_files)} joint-rl-realTime files")
data_frames = []
for file in excel_files:
    df = load_and_process_file(file)
    if df is not None:
        data_frames.append(df)
        print(f'✅ Loaded: {file.name} ({len(df)} trials)')

if data_frames:
    combined_data = pd.concat(data_frames, ignore_index=True)
    print(f"\n📊 Total data loaded: {len(combined_data)} trials from {combined_data['participantID'].nunique()} participants")
else:
    print("❌ No data loaded")
    combined_data = pd.DataFrame()

# Focus on 2P3G experiments only
if not combined_data.empty:
    data_2p3g = combined_data[combined_data['experimentType'] == '2P3G'].copy()
    print(f"\n🎯 2P3G trials: {len(data_2p3g)}")
    print(f"RL Agent types: {data_2p3g['rlAgentType'].value_counts()}")
    print(f"Distance conditions: {data_2p3g['distanceCondition'].value_counts()}")
else:
    data_2p3g = pd.DataFrame()


Found 15 joint-rl-realTime files
✅ Loaded: experiment_data_ 67630f751df95f8fb373e275_2025-09-30T19-23-45-667Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 672bf480b98fd6d326fd1f69_2025-09-30T19-19-16-244Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 66469ef793a8127e49ca1992_2025-09-30T19-30-14-966Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 667178d77f8b6ed980c256ff_2025-09-30T19-20-29-183Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 6751acc5dc78128951a34f1f_2025-09-30T19-12-21-369Z.xlsx (35 trials)
✅ Loaded: experiment_data_67630f751df95f8fb373e275_2025-09-30T19-39-50-935Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 6480bfd6d433e2e158870605_2025-09-30T19-11-55-119Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 57adbbe8bcf54e000152816b_2025-09-30T19-03-22-396Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 663139cbb8723d403f4e7fc8_2025-09-30T19-45-01-702Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 60f43c3a0fd9723feba1af38_2025-09-30T19-46-45-503Z.xlsx (35 trials)
✅ Loaded: experiment_data_ 5
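
The file list above contains the ID `67630f751df95f8fb373e275` twice, once with a space after `experiment_data_` and once without. The sketch below, using a hypothetical helper `participant_id_from_name`, shows how the untrimmed token would be counted as two participants while the trimmed token is counted as one. It assumes the ID is always the third underscore-separated token, as in the names shown.

```python
def participant_id_from_name(file_name):
    """Take the third '_'-separated token and trim surrounding whitespace."""
    return file_name.split('_')[2].strip()

# Two file names copied from the load log above.
names = [
    'experiment_data_ 67630f751df95f8fb373e275_2025-09-30T19-23-45-667Z.xlsx',
    'experiment_data_67630f751df95f8fb373e275_2025-09-30T19-39-50-935Z.xlsx',
]

raw_ids = {name.split('_')[2] for name in names}          # keeps the leading space
clean_ids = {participant_id_from_name(name) for name in names}

print(len(raw_ids), len(clean_ids))
```

Here the raw set has two entries and the cleaned set has one, which is the difference that would show up in `combined_data['participantID'].nunique()`.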

## Helper Functions for Position and Distance Calculations


In [3]:
def parse_point(s):
    """Parse a point from string representation"""
    if pd.isna(s) or s is None:
        return None
    if isinstance(s, (list, tuple)) and len(s) >= 2:
        return tuple(s[:2])
    if isinstance(s, str):
        try:
            s_clean = s.replace('null', 'None')
            parsed = ast.literal_eval(s_clean)
            if isinstance(parsed, (list, tuple)) and len(parsed) >= 2:
                return tuple(parsed[:2])
        except (ValueError, SyntaxError, TypeError):
            # Malformed strings fall through and return None
            pass
    return None

def parse_positions(s):
    """Parse a sequence of positions from string representation"""
    if pd.isna(s) or s is None:
        return []
    if isinstance(s, str):
        try:
            s_clean = s.replace('null', 'None')
            parsed = ast.literal_eval(s_clean)
            if isinstance(parsed, list):
                return [tuple(p) if isinstance(p, (list, tuple)) else p for p in parsed if p is not None]
        except (ValueError, SyntaxError, TypeError):
            # Malformed strings fall through and return an empty list
            pass
    return []

def manhattan_distance(a, b):
    """Calculate Manhattan distance between two points"""
    if a is None or b is None:
        return np.nan
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def get_player_pos_at_new_goal_moment(row, player_num):
    """Get player position at the moment when new goal is presented"""
    step_val = row.get('newGoalPresentedTime')
    if pd.isna(step_val) or step_val == '':
        return None

    step = int(float(step_val))

    # Get trajectory for the specified player
    traj_col = f'player{player_num}Trajectory'
    traj = parse_positions(row.get(traj_col))

    if not traj:
        # Fallback to initial position if available
        init_col = 'initPlayerGrid' if player_num == 1 else 'initAIGrid'
        return parse_point(row.get(init_col))

    if player_num == 1:
        # Human player: use direct step index
        if step < len(traj):
            return traj[step]
        return traj[-1]  # Step is past the end: clamp to the last recorded position
    else:
        # AI player: use odd indices (AI moves after human)
        odd_positions = [traj[i] for i in range(1, len(traj), 2)]
        if odd_positions:
            idx = min(step, len(odd_positions) - 1)
            return odd_positions[idx]
        # Fallback: clamp to trajectory bounds
        idx = max(0, min(len(traj) - 1, 2 * step - 1))
        return traj[idx]

def get_first_shared_goal_position(row):
    """Get the position of the first detected shared goal"""
    first_shared_idx = row.get('firstDetectedSharedGoal')
    if pd.isna(first_shared_idx):
        return None

    try:
        first_shared_idx = int(first_shared_idx)
        t1 = parse_point(row.get('target1'))
        t2 = parse_point(row.get('target2'))

        if first_shared_idx == 0 and t1 is not None:
            return t1
        elif first_shared_idx == 1 and t2 is not None:
            return t2
    except (ValueError, TypeError):
        # Non-numeric index: treat the shared goal as unknown
        pass

    return None

print("✅ Helper functions defined")


✅ Helper functions defined
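
The two indexing branches in `get_player_pos_at_new_goal_moment` can return very different positions for the same step. The sketch below reproduces both on a made-up trajectory; the function names `pos_at_step_direct` and `pos_at_step_odd` are hypothetical. Whether the recorded `player2Trajectory` is actually interleaved with the human's moves is an assumption of the code above that this notebook does not verify.

```python
def pos_at_step_direct(traj, step):
    """Player-1 style lookup: index directly, clamping to the last entry."""
    return traj[min(step, len(traj) - 1)]

def pos_at_step_odd(traj, step):
    """Player-2 style lookup: index only the odd entries, clamping to the last one."""
    odd = traj[1::2]
    if not odd:
        # Same fallback as the helper above: clamp into the full trajectory
        return traj[max(0, min(len(traj) - 1, 2 * step - 1))]
    return odd[min(step, len(odd) - 1)]

# Made-up trajectory: one cell per entry along a column.
toy = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]

print(pos_at_step_direct(toy, 2))  # third entry of the full trajectory
print(pos_at_step_odd(toy, 2))     # third odd-indexed entry
```

If a trajectory stores one entry per step rather than interleaved moves, the odd-index lookup lands further along the path than the direct lookup, which would shift every distance computed from it.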


## Compute Distances for Equal Distance Validation


In [4]:
def compute_equal_distance_validation(df):
    """Compute distances for equal distance validation"""
    if df.empty:
        return df

    df = df.copy()

    # Initialize lists to store computed values
    player1_pos_at_new = []
    player2_pos_at_new = []
    first_shared_goal_positions = []
    new_goal_positions = []

    # Distance calculations
    player1_dist_to_new = []
    player1_dist_to_first_shared = []
    player2_dist_to_new = []
    player2_dist_to_first_shared = []

    for _, row in df.iterrows():
        # Get positions at new goal moment
        p1_pos = get_player_pos_at_new_goal_moment(row, 1)
        p2_pos = get_player_pos_at_new_goal_moment(row, 2)

        # Get goal positions
        first_shared_pos = get_first_shared_goal_position(row)
        new_goal_pos = parse_point(row.get('newGoalPosition'))

        # Store positions
        player1_pos_at_new.append(p1_pos)
        player2_pos_at_new.append(p2_pos)
        first_shared_goal_positions.append(first_shared_pos)
        new_goal_positions.append(new_goal_pos)

        # Calculate distances
        player1_dist_to_new.append(manhattan_distance(p1_pos, new_goal_pos))
        player1_dist_to_first_shared.append(manhattan_distance(p1_pos, first_shared_pos))
        player2_dist_to_new.append(manhattan_distance(p2_pos, new_goal_pos))
        player2_dist_to_first_shared.append(manhattan_distance(p2_pos, first_shared_pos))

    # Add computed values to dataframe
    df = df.assign(
        player1_pos_at_new_goal=player1_pos_at_new,
        player2_pos_at_new_goal=player2_pos_at_new,
        first_shared_goal_pos=first_shared_goal_positions,
        new_goal_pos=new_goal_positions,
        player1_dist_to_new_goal=player1_dist_to_new,
        player1_dist_to_first_shared=player1_dist_to_first_shared,
        player2_dist_to_new_goal=player2_dist_to_new,
        player2_dist_to_first_shared=player2_dist_to_first_shared
    )

    # Calculate distance differences for validation
    df['player1_distance_diff'] = df['player1_dist_to_new_goal'] - df['player1_dist_to_first_shared']
    df['player2_distance_diff'] = df['player2_dist_to_new_goal'] - df['player2_dist_to_first_shared']

    # Calculate joint distance sums
    df['joint_dist_to_new_goal'] = df['player1_dist_to_new_goal'] + df['player2_dist_to_new_goal']
    df['joint_dist_to_first_shared'] = df['player1_dist_to_first_shared'] + df['player2_dist_to_first_shared']
    df['joint_distance_diff'] = df['joint_dist_to_new_goal'] - df['joint_dist_to_first_shared']

    return df

# Apply distance calculations to 2P3G data
if not data_2p3g.empty:
    # Filter for trials where new goal was presented
    data_with_new_goal = data_2p3g[data_2p3g['newGoalPresented'] == True].copy()
    print(f"📊 Trials with new goal presented: {len(data_with_new_goal)}")

    # Compute distances
    data_with_distances = compute_equal_distance_validation(data_with_new_goal)
    print(f"✅ Distance calculations completed")

    # Show data summary
    print(f"\n📈 Data Summary:")
    print(f"Total trials with distances: {len(data_with_distances)}")
    print(f"Distance conditions: {data_with_distances['distanceCondition'].value_counts()}")
else:
    data_with_distances = pd.DataFrame()
    print("❌ No 2P3G data available")


📊 Trials with new goal presented: 130
✅ Distance calculations completed

📈 Data Summary:
Total trials with distances: 130
Distance conditions: distanceCondition
equal_to_both        45
closer_to_player1    45
closer_to_player2    40
Name: count, dtype: int64


## Validate Equal-to-Both Trials


In [5]:
# Filter for equal_to_both trials
if not data_with_distances.empty:
    equal_to_both_trials = data_with_distances[
        data_with_distances['distanceCondition'] == 'equal_to_both'
    ].copy()

    print(f"🎯 Equal-to-both trials: {len(equal_to_both_trials)}")

    if len(equal_to_both_trials) > 0:
        print("\n=== EQUAL DISTANCE VALIDATION ===\n")

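        # Define validation criteria (allowing small floating point errors).
        # Note: a NaN difference (missing position or goal) compares as False
        # below, so such trials are counted as unequal rather than excluded.
        tolerance = 1e-10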

        # Validation 1: Player 1's distance equality
        player1_equal = np.abs(equal_to_both_trials['player1_distance_diff']) <= tolerance
        player1_equal_count = player1_equal.sum()
        player1_equal_pct = (player1_equal_count / len(equal_to_both_trials)) * 100

        print(f"1️⃣ Player 1 Distance Equality:")
        print(f"   ✅ Equal distances: {player1_equal_count}/{len(equal_to_both_trials)} ({player1_equal_pct:.1f}%)")
        print(f"   ❌ Unequal distances: {len(equal_to_both_trials) - player1_equal_count}")

        # Validation 2: Player 2's distance equality
        player2_equal = np.abs(equal_to_both_trials['player2_distance_diff']) <= tolerance
        player2_equal_count = player2_equal.sum()
        player2_equal_pct = (player2_equal_count / len(equal_to_both_trials)) * 100

        print(f"\n2️⃣ Player 2 Distance Equality:")
        print(f"   ✅ Equal distances: {player2_equal_count}/{len(equal_to_both_trials)} ({player2_equal_pct:.1f}%)")
        print(f"   ❌ Unequal distances: {len(equal_to_both_trials) - player2_equal_count}")

        # Validation 3: Joint distance sum equality
        joint_equal = np.abs(equal_to_both_trials['joint_distance_diff']) <= tolerance
        joint_equal_count = joint_equal.sum()
        joint_equal_pct = (joint_equal_count / len(equal_to_both_trials)) * 100

        print(f"\n3️⃣ Joint Distance Sum Equality:")
        print(f"   ✅ Equal sums: {joint_equal_count}/{len(equal_to_both_trials)} ({joint_equal_pct:.1f}%)")
        print(f"   ❌ Unequal sums: {len(equal_to_both_trials) - joint_equal_count}")

        # Overall validation: All three criteria must be met
        all_criteria_met = player1_equal & player2_equal & joint_equal
        all_criteria_count = all_criteria_met.sum()
        all_criteria_pct = (all_criteria_count / len(equal_to_both_trials)) * 100

        print(f"\n🏆 OVERALL VALIDATION:")
        print(f"   ✅ Perfect equality (all criteria): {all_criteria_count}/{len(equal_to_both_trials)} ({all_criteria_pct:.1f}%)")
        print(f"   ❌ Failed validation: {len(equal_to_both_trials) - all_criteria_count}")

        # Add validation flags to dataframe
        equal_to_both_trials['player1_distances_equal'] = player1_equal
        equal_to_both_trials['player2_distances_equal'] = player2_equal
        equal_to_both_trials['joint_distances_equal'] = joint_equal
        equal_to_both_trials['perfect_equality'] = all_criteria_met

    else:
        print("❌ No equal_to_both trials found")
        equal_to_both_trials = pd.DataFrame()
else:
    print("❌ No data with distances available")
    equal_to_both_trials = pd.DataFrame()


🎯 Equal-to-both trials: 45

=== EQUAL DISTANCE VALIDATION ===

1️⃣ Player 1 Distance Equality:
   ✅ Equal distances: 1/45 (2.2%)
   ❌ Unequal distances: 44

2️⃣ Player 2 Distance Equality:
   ✅ Equal distances: 0/45 (0.0%)
   ❌ Unequal distances: 45

3️⃣ Joint Distance Sum Equality:
   ✅ Equal sums: 10/45 (22.2%)
   ❌ Unequal sums: 35

🏆 OVERALL VALIDATION:
   ✅ Perfect equality (all criteria): 0/45 (0.0%)
   ❌ Failed validation: 45
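
In the validation above, `np.abs(diff) <= tolerance` evaluates to `False` when `diff` is NaN, so a trial with a missing position or goal is reported as unequal. A minimal sketch of a three-way classification that separates those cases is shown below. The helper `classify_diff` and the sample values are illustrative assumptions, not results from the data.

```python
import numpy as np
import pandas as pd

def classify_diff(diff, tolerance=1e-10):
    """Label each distance difference as 'equal', 'unequal', or 'missing' (NaN)."""
    diff = pd.Series(diff, dtype='float64')
    labels = pd.Series('unequal', index=diff.index)
    labels[diff.abs() <= tolerance] = 'equal'
    labels[diff.isna()] = 'missing'
    return labels

# Made-up differences: one equal, two unequal, one missing.
toy_diff = [0.0, 2.0, np.nan, -1.0]
labels = classify_diff(toy_diff)

print(labels.tolist())
```

Applied to a column such as `player1_distance_diff`, the `'missing'` count would show how many of the failed trials are failures of position or goal reconstruction rather than genuine distance mismatches.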


## Export Validation Results


In [6]:
if not equal_to_both_trials.empty:
    # Create summary dataset for export
    export_columns = [
        'participantID', 'trialIndex', 'experimentType', 'rlAgentType', 'distanceCondition',
        'newGoalPresentedTime', 'firstDetectedSharedGoal',
        'player1_pos_at_new_goal', 'player2_pos_at_new_goal',
        'first_shared_goal_pos', 'new_goal_pos',
        'player1_dist_to_new_goal', 'player1_dist_to_first_shared', 'player1_distance_diff',
        'player2_dist_to_new_goal', 'player2_dist_to_first_shared', 'player2_distance_diff',
        'joint_dist_to_new_goal', 'joint_dist_to_first_shared', 'joint_distance_diff',
        'player1_distances_equal', 'player2_distances_equal', 'joint_distances_equal', 'perfect_equality'
    ]

    # Filter columns that exist in the dataframe
    available_columns = [col for col in export_columns if col in equal_to_both_trials.columns]

    export_data = equal_to_both_trials[available_columns].copy()

    # Save detailed results
    output_file = 'joint_rl_realtime_equal_distance_validation_results.csv'
    export_data.to_csv(output_file, index=False)
    print(f"\n💾 Detailed validation results saved to: {output_file}")
    print(f"   Columns: {len(available_columns)}")
    print(f"   Rows: {len(export_data)}")

    # Create summary statistics
    summary_stats = {
        'total_equal_to_both_trials': len(equal_to_both_trials),
        'player1_equality_success': equal_to_both_trials['player1_distances_equal'].sum(),
        'player1_equality_rate': (equal_to_both_trials['player1_distances_equal'].sum() / len(equal_to_both_trials)) * 100,
        'player2_equality_success': equal_to_both_trials['player2_distances_equal'].sum(),
        'player2_equality_rate': (equal_to_both_trials['player2_distances_equal'].sum() / len(equal_to_both_trials)) * 100,
        'joint_equality_success': equal_to_both_trials['joint_distances_equal'].sum(),
        'joint_equality_rate': (equal_to_both_trials['joint_distances_equal'].sum() / len(equal_to_both_trials)) * 100,
        'perfect_equality_success': equal_to_both_trials['perfect_equality'].sum(),
        'perfect_equality_rate': (equal_to_both_trials['perfect_equality'].sum() / len(equal_to_both_trials)) * 100,
        'player1_distance_diff_mean': equal_to_both_trials['player1_distance_diff'].mean(),
        'player1_distance_diff_std': equal_to_both_trials['player1_distance_diff'].std(),
        'player2_distance_diff_mean': equal_to_both_trials['player2_distance_diff'].mean(),
        'player2_distance_diff_std': equal_to_both_trials['player2_distance_diff'].std(),
        'joint_distance_diff_mean': equal_to_both_trials['joint_distance_diff'].mean(),
        'joint_distance_diff_std': equal_to_both_trials['joint_distance_diff'].std()
    }

    summary_df = pd.DataFrame([summary_stats])
    summary_file = 'joint_rl_realtime_equal_distance_validation_summary.csv'
    summary_df.to_csv(summary_file, index=False)
    print(f"\n📊 Summary statistics saved to: {summary_file}")

    # Perfect equality trials for further analysis
    perfect_trials = equal_to_both_trials[equal_to_both_trials['perfect_equality']]
    if len(perfect_trials) > 0:
        perfect_file = 'joint_rl_realtime_perfect_equality_trials.csv'
        perfect_trials[available_columns].to_csv(perfect_file, index=False)
        print(f"\n✅ Perfect equality trials saved to: {perfect_file}")
        print(f"   These {len(perfect_trials)} trials can be used for further analysis.")

    print("\n🎉 Analysis complete! All validation results have been exported.")

else:
    print("❌ No data available for export")



💾 Detailed validation results saved to: joint_rl_realtime_equal_distance_validation_results.csv
   Columns: 24
   Rows: 45

📊 Summary statistics saved to: joint_rl_realtime_equal_distance_validation_summary.csv

🎉 Analysis complete! All validation results have been exported.


## Conclusion

This notebook validates the equal distance conditions for `equal_to_both` trials in joint-RL real-time 2P3G experiments. The analysis checks three key criteria:

1. **Player 1 Distance Equality**: Player 1's distance to the new goal should equal their distance to the first shared goal
2. **Player 2 Distance Equality**: Player 2's distance to the new goal should equal their distance to the first shared goal  
3. **Joint Distance Sum Equality**: The sum of both players' distances to the new goal should equal the sum of their distances to the first shared goal

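### Key Findings:
- Of the 130 2P3G trials in which a new goal was presented, 45 are labeled `equal_to_both`.
- Player 1's two distances are equal in 1 of 45 trials (2.2%), Player 2's in 0 of 45 (0.0%), and the joint distance sums in 10 of 45 (22.2%).
- No trial meets all three criteria (0 of 45), so this run yields no "perfect equality" trials for further analysis. The notebook does not establish whether the mismatch reflects the trial design or the way positions and goals are reconstructed here.
- Player positions are taken at the step when the new goal is presented (`newGoalPresentedTime`).
- Distances are Manhattan distances, as specified in the game mechanics.
- Results are exported for downstream analysis and validation.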

### Files Generated:
- `joint_rl_realtime_equal_distance_validation_results.csv`: Detailed results for all `equal_to_both` trials (45 rows, 24 columns)
- `joint_rl_realtime_equal_distance_validation_summary.csv`: Summary statistics
- `joint_rl_realtime_perfect_equality_trials.csv`: Trials that pass all validation criteria. This file is written only when at least one trial passes, so it was not produced in this run (0 of 45 passed).
