# Gait Dynamics × Mental Scores Analysis

This notebook analyzes the relationship between **gait dynamics** and **mental scores** at the subject level, using simulated data with known ground-truth associations.

## Data Structure

```
Gait Dynamics: [n_subjects, n_timepoints, n_body_factors]
    - Each subject has time-series gait data
    - Time: gait cycle (0-100%)
    - Body factors: joint angles, stride parameters, etc.

Mental Scores: [n_subjects, n_mental_vars]
    - Each subject has multiple mental scores
    - Variables: wellbeing, anxiety, stress, fatigue, etc.
    - These are TIME-INVARIANT (one value per subject)
```
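
Because mental scores have one row per subject, the gait tensor has to be reduced to one row per subject before the two can be paired. A minimal sketch with placeholder arrays (random values, used only to illustrate the shapes); time-averaging is the reduction named in Section 3, and flattening is shown as an alternative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_body_factors, n_mental_vars = 50, 100, 15, 4

# Placeholder arrays with the documented shapes (random values, illustration only)
gait = rng.standard_normal((n_subjects, n_timepoints, n_body_factors))
mental = rng.standard_normal((n_subjects, n_mental_vars))

# Option 1: average over the gait cycle -> one value per subject and body factor
gait_mean = gait.mean(axis=1)

# Option 2: keep the full trajectory by flattening time and body factor together
gait_flat = gait.reshape(n_subjects, -1)

print(gait_mean.shape, gait_flat.shape, mental.shape)
```

Time-averaging keeps the feature count small (15 here) but discards when in the gait cycle a difference occurs; flattening keeps that information at the cost of 1500 columns for 50 subjects.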

## Goal

Find which gait patterns are associated with which mental variables:
- Does higher anxiety correlate with trunk instability?
- Does better wellbeing correlate with longer strides?
- Can we predict mental scores from gait patterns?

In [None]:
# Setup
import sys
from pathlib import Path

# Make the project root importable (assumes the notebook runs from a subdirectory of it)
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from src.dpca import MultiVariateMentalDPCA, ContinuousScoreDPCA

%matplotlib inline
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 12
print("Setup complete!")

## 1. Generate Sample Data

We simulate 50 subjects, each with:
- Gait dynamics over 100 timepoints (gait cycle)
- 15 body factors (joint angles, stride parameters, etc.)
- 4 mental scores (wellbeing, anxiety, stress, fatigue)

In [None]:
# Parameters
np.random.seed(42)
n_subjects = 50
n_timepoints = 100
n_body_factors = 15
n_mental_vars = 4

# Labels
gait_labels = [
    'hip_flexion', 'hip_abduction', 'knee_flexion', 'ankle_dorsiflexion',
    'pelvis_tilt', 'pelvis_obliquity', 'trunk_flexion', 'trunk_rotation',
    'stride_length', 'step_width', 'cadence', 'grf_vertical', 
    'grf_anterior', 'grf_lateral', 'com_velocity'
]
mental_labels = ['wellbeing', 'anxiety', 'stress', 'fatigue']

print("=== Data Structure ===")
print(f"Subjects: {n_subjects}")
print(f"Gait: {n_timepoints} timepoints × {n_body_factors} body factors")
print(f"Mental: {n_mental_vars} variables")
print(f"\nGait factors: {gait_labels}")
print(f"Mental variables: {mental_labels}")

In [None]:
# Generate mental scores: [n_subjects, n_mental_vars]
# Create correlated mental scores (e.g., high anxiety often comes with high stress)
latent = np.random.randn(n_subjects)  # Latent mental health factor

mental_scores = np.zeros((n_subjects, n_mental_vars))
mental_scores[:, 0] = 5 + latent * 1.5 + np.random.randn(n_subjects) * 0.5   # wellbeing (+)
mental_scores[:, 1] = 5 - latent * 1.2 + np.random.randn(n_subjects) * 0.5   # anxiety (-)
mental_scores[:, 2] = 5 - latent * 1.0 + np.random.randn(n_subjects) * 0.5   # stress (-)
mental_scores[:, 3] = 5 - latent * 0.8 + np.random.randn(n_subjects) * 0.5   # fatigue (-)
mental_scores = np.clip(mental_scores, 1, 10)  # Scale 1-10

print(f"Mental Scores shape: {mental_scores.shape}")
print("\nExample (first 5 subjects):")
print(f"{'Subject':<10} | " + " | ".join(f'{m:>10}' for m in mental_labels))
print("-" * 65)
for i in range(5):
    print(f"Subject {i:<3} | " + " | ".join(f'{s:>10.2f}' for s in mental_scores[i]))
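
The correlations among the four scores follow directly from their loadings on `latent`. As a sanity check, the expected correlation matrix before clipping can be worked out from the coefficients used above. This is a sketch: `loadings` and `noise_sd` are names introduced here that simply restate those coefficients, and `latent` is taken to have unit variance.

```python
import numpy as np

# Coefficients on `latent` for wellbeing, anxiety, stress, fatigue (from the cell above)
loadings = np.array([1.5, -1.2, -1.0, -0.8])
noise_sd = 0.5  # standard deviation of the independent noise added to each score

# With a unit-variance latent factor: cov_ij = a_i * a_j + noise_sd**2 * (i == j)
cov = np.outer(loadings, loadings) + noise_sd**2 * np.eye(len(loadings))
sd = np.sqrt(np.diag(cov))
expected_corr = cov / np.outer(sd, sd)

print(np.round(expected_corr, 2))
```

For wellbeing and anxiety this gives about -0.88. Clipping to the 1-10 range and sampling variation with 50 subjects will move the empirical values somewhat.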

In [None]:
# Generate gait dynamics: [n_subjects, n_timepoints, n_body_factors]
# Define which gait features are affected by which mental variables
gait_mental_effects = {
    # wellbeing affects stride-related features positively
    0: {'wellbeing': 0.3},   # hip_flexion
    2: {'wellbeing': 0.3},   # knee_flexion
    8: {'wellbeing': 0.4},   # stride_length
    14: {'wellbeing': 0.3},  # com_velocity
    
    # anxiety affects trunk stability negatively
    4: {'anxiety': -0.3},    # pelvis_tilt
    5: {'anxiety': -0.25},   # pelvis_obliquity
    6: {'anxiety': -0.3},    # trunk_flexion
    7: {'anxiety': -0.25},   # trunk_rotation
    
    # fatigue affects movement intensity
    10: {'fatigue': -0.2},   # cadence
    11: {'fatigue': -0.15},  # grf_vertical
}

t = np.linspace(0, 2*np.pi, n_timepoints)
gait_data = np.zeros((n_subjects, n_timepoints, n_body_factors))

for subj in range(n_subjects):
    for f in range(n_body_factors):
        # Base gait pattern (common to all subjects)
        base_pattern = np.sin(t + f * 0.3) + 0.5 * np.cos(2*t + f * 0.2)
        
        # Mental-dependent effect (constant offset based on subject's mental scores)
        mental_effect = 0
        if f in gait_mental_effects:
            for mental_var, weight in gait_mental_effects[f].items():
                m_idx = mental_labels.index(mental_var)
                mental_effect += weight * (mental_scores[subj, m_idx] - 5)  # Center around 5
        
        # Individual variation + noise
        individual = np.random.randn() * 0.1
        noise = np.random.randn(n_timepoints) * 0.1
        
        gait_data[subj, :, f] = base_pattern + mental_effect + individual + noise

print(f"Gait Data shape: {gait_data.shape}")
print("  = [n_subjects, n_timepoints, n_body_factors]")
print("\nGait-Mental Associations (ground truth):")
for f, effects in gait_mental_effects.items():
    for mental_var, weight in effects.items():
        direction = "↑" if weight > 0 else "↓"
        print(f"  {gait_labels[f]:20s} {direction} when {mental_var} ↑")
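
The ground-truth dictionary is equivalent to a dense weight matrix, which makes it easy to compare against an estimated `[n_body_factors, n_mental_vars]` matrix later. A minimal sketch, where `W`, `scores`, and `offsets` are names introduced here and `scores` holds placeholder values:

```python
import numpy as np

mental_labels = ['wellbeing', 'anxiety', 'stress', 'fatigue']
n_body_factors = 15
gait_mental_effects = {
    0: {'wellbeing': 0.3}, 2: {'wellbeing': 0.3}, 8: {'wellbeing': 0.4}, 14: {'wellbeing': 0.3},
    4: {'anxiety': -0.3}, 5: {'anxiety': -0.25}, 6: {'anxiety': -0.3}, 7: {'anxiety': -0.25},
    10: {'fatigue': -0.2}, 11: {'fatigue': -0.15},
}

# Dense weight matrix W[f, m]: planted effect of mental variable m on body factor f
W = np.zeros((n_body_factors, len(mental_labels)))
for f, effects in gait_mental_effects.items():
    for mental_var, weight in effects.items():
        W[f, mental_labels.index(mental_var)] = weight

# Offsets for all subjects at once: (scores - 5) @ W.T -> [n_subjects, n_body_factors]
rng = np.random.default_rng(0)
scores = np.clip(5 + rng.standard_normal((50, 4)), 1, 10)  # placeholder scores
offsets = (scores - 5) @ W.T

print(W.shape, offsets.shape)
print("Body factors with no planted effect:", np.flatnonzero(~W.any(axis=1)).tolist())
```

Written this way, it is easy to see that the stress column of `W` is all zeros and that five body factors carry no planted effect, so any association the analysis reports for them comes from correlation among the mental scores or from noise.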

## 2. Visualize Data

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
time_axis = np.linspace(0, 100, n_timepoints)

# 1. Mental score distributions
ax1 = axes[0, 0]
for m_idx, mental_var in enumerate(mental_labels):
    ax1.hist(mental_scores[:, m_idx], bins=10, alpha=0.6, label=mental_var, edgecolor='white')
ax1.set_xlabel('Score')
ax1.set_ylabel('Count')
ax1.set_title('Mental Score Distributions', fontweight='bold')
ax1.legend()

# 2. Mental score correlations
ax2 = axes[0, 1]
corr_matrix = np.corrcoef(mental_scores.T)
sns.heatmap(corr_matrix, xticklabels=mental_labels, yticklabels=mental_labels,
            cmap='RdBu_r', center=0, annot=True, fmt='.2f', ax=ax2)
ax2.set_title('Mental Score Correlations', fontweight='bold')

# 3. Stride-length trajectories for the top vs. bottom wellbeing quartile (mean ± 1 SD)
ax3 = axes[1, 0]
stride_idx = gait_labels.index('stride_length')
wellbeing = mental_scores[:, 0]
groups = [
    ('High Wellbeing', wellbeing > np.percentile(wellbeing, 75), 'green'),
    ('Low Wellbeing', wellbeing < np.percentile(wellbeing, 25), 'red'),
]
for label, mask, color in groups:
    group_mean = gait_data[mask, :, stride_idx].mean(axis=0)
    group_std = gait_data[mask, :, stride_idx].std(axis=0)
    ax3.plot(time_axis, group_mean, color=color, lw=2, label=label)
    ax3.fill_between(time_axis, group_mean - group_std, group_mean + group_std,
                     color=color, alpha=0.2)
ax3.set_xlabel('Gait Cycle (%)')
ax3.set_ylabel('Stride Length')
ax3.set_title('Stride Length by Wellbeing Level', fontweight='bold')
ax3.legend()

# 4. Wellbeing vs time-averaged stride length
ax4 = axes[1, 1]
stride_mean = gait_data[:, :, 8].mean(axis=1)
ax4.scatter(mental_scores[:, 0], stride_mean, c=mental_scores[:, 0], cmap='RdYlGn', 
            s=80, edgecolors='black', alpha=0.8)
z = np.polyfit(mental_scores[:, 0], stride_mean, 1)
ax4.plot(mental_scores[:, 0], np.poly1d(z)(mental_scores[:, 0]), 'k--', lw=2)
corr = np.corrcoef(mental_scores[:, 0], stride_mean)[0, 1]
ax4.set_xlabel('Wellbeing Score')
ax4.set_ylabel('Mean Stride Length')
ax4.set_title(f'Wellbeing vs Stride (r = {corr:.3f})', fontweight='bold')

plt.tight_layout()
plt.show()

## 3. Analysis: Find Gait-Mental Associations

We use `MultiVariateMentalDPCA` to find which gait patterns are associated with which mental variables.

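The method has two steps:
1. Extract latent gait features by running PCA on the time-averaged gait data (one value per subject and body factor)
2. Relate those latent gait factors to the mental variables with CCA or PLS, selected through the `method` argument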
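
The two steps can be sketched with NumPy alone. This is not the code in `src.dpca`, which is not shown in this notebook; `pca_scores` and `canonical_correlations` are hypothetical helpers, the toy data plants a single link, and the CCA step uses the standard QR + SVD formulation, which assumes both blocks have full column rank.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def canonical_correlations(A, B):
    """Canonical correlations between two blocks (QR + SVD on centered data)."""
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Toy data: one planted link between mental variable 0 and body factor 8
rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_body_factors = 50, 100, 15
mental = rng.standard_normal((n_subjects, 4))
gait = 0.1 * rng.standard_normal((n_subjects, n_timepoints, n_body_factors))
gait[:, :, 8] += 0.4 * mental[:, [0]]          # constant offset over the gait cycle

gait_mean = gait.mean(axis=1)                   # step 1a: time-average -> (50, 15)
gait_latent = pca_scores(gait_mean, 10)         # step 1b: PCA -> (50, 10)
corrs = canonical_correlations(gait_latent, mental)  # step 2: CCA -> 4 values

print(gait_latent.shape, np.round(corrs, 3))
```

The number of canonical correlations equals the smaller block width, here the 4 mental variables, and they come out sorted from largest to smallest.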

In [None]:
# Fit MultiVariateMentalDPCA
model = MultiVariateMentalDPCA(
    n_gait_components=10,
    method='cca'  # or 'pls'
)

model.fit(
    gait_data,      # [n_subjects, n_timepoints, n_body_factors]
    mental_scores,  # [n_subjects, n_mental_vars]
    gait_labels=gait_labels,
    mental_labels=mental_labels
)

print("=== Model Fitted ===")
print(f"Method: {model.method.upper()}")
print(f"Gait latent components: {model.gait_latent_.shape[1]}")
print(f"Mental variables: {len(mental_labels)}")
print("\nCanonical correlations:")
for i, corr in enumerate(model.correlations_):
    bar = "█" * int(abs(corr) * 20)
    print(f"  Component {i+1}: {corr:.3f} {bar}")
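
With 50 subjects, 10 gait components, and 4 mental variables, in-sample canonical correlations are biased upward, so a large first value is not by itself evidence of an association. One common check is a permutation test: shuffle the subject order of the mental scores, recompute, and compare. The sketch below runs on toy data with a NumPy-only helper (`first_canonical_corr`, a hypothetical name) rather than on `model`, because how the fitted model should be refit under permutation is not shown in this notebook.

```python
import numpy as np

def first_canonical_corr(A, B):
    """Largest canonical correlation between two blocks (QR + SVD on centered data)."""
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))
    return float(np.linalg.svd(Qa.T @ Qb, compute_uv=False)[0])

rng = np.random.default_rng(0)
n_subjects = 50
mental = rng.standard_normal((n_subjects, 4))
gait_latent = rng.standard_normal((n_subjects, 10))
gait_latent[:, 0] += 2.0 * mental[:, 0]         # planted association (toy data)

observed = first_canonical_corr(gait_latent, mental)

# Null distribution: permuting the rows of `mental` breaks the subject pairing
n_perm = 500
null = np.array([
    first_canonical_corr(gait_latent, mental[rng.permutation(n_subjects)])
    for _ in range(n_perm)
])
p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)

print(f"observed={observed:.3f}, null mean={null.mean():.3f}, p={p_value:.4f}")
```

The null mean shows how large the first canonical correlation gets from chance alone at this sample size, which is the baseline the observed value should be read against.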

## 4. Results: Gait-Mental Associations

In [None]:
# Get associations between mental variables and gait features
associations = model.get_mental_gait_associations()

print("=== Mental Variable → Gait Feature Associations ===\n")
for mental_var, gait_assoc in associations.items():
    print(f"{mental_var.upper()}:")
    for gait_feat, corr in gait_assoc.items():
        direction = "↑" if corr > 0 else "↓"
        bar = "█" * int(abs(corr) * 30)
        print(f"  {gait_feat:22s}: {corr:+.3f} {bar} {direction}")
    print()

In [None]:
# Heatmap of Gait-Mental correlations
plt.figure(figsize=(10, 12))
sns.heatmap(
    model.feature_correlations_,
    xticklabels=mental_labels,
    yticklabels=gait_labels,
    cmap='RdBu_r',
    center=0,
    annot=True,
    fmt='.2f',
    vmin=-1, vmax=1
)
plt.title('Gait Feature × Mental Variable Correlations', fontsize=14, fontweight='bold')
plt.xlabel('Mental Variables')
plt.ylabel('Gait Body Factors')
plt.tight_layout()
plt.show()

print("\nInterpretation:")
print("  Red (+): Higher mental score → Higher gait feature value")
print("  Blue (-): Higher mental score → Lower gait feature value")
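
A matrix with this layout, `[n_body_factors, n_mental_vars]`, can be obtained by correlating each time-averaged body factor with each mental variable. The sketch below shows that computation on toy data. Whether `feature_correlations_` is computed exactly this way is an assumption, since the implementation is not shown here; `cross_correlation` is a hypothetical helper.

```python
import numpy as np

def cross_correlation(gait_mean, mental):
    """Pearson correlation of every gait column with every mental column."""
    Zg = (gait_mean - gait_mean.mean(axis=0)) / gait_mean.std(axis=0)
    Zm = (mental - mental.mean(axis=0)) / mental.std(axis=0)
    return Zg.T @ Zm / len(Zg)                  # (n_body_factors, n_mental_vars)

rng = np.random.default_rng(0)
mental = rng.standard_normal((50, 4))
gait_mean = rng.standard_normal((50, 15))
gait_mean[:, 8] += 0.8 * mental[:, 0]           # planted: factor 8 rises with variable 0

R = cross_correlation(gait_mean, mental)
print(R.shape, round(float(R[8, 0]), 2))
```

Each entry is a marginal correlation, so it does not separate the effect of one mental variable from that of another variable it is correlated with.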

## 5. Summary

### Data Structure

| Data | Shape | Description |
|------|-------|-------------|
| Gait Dynamics | `[n_subjects, n_timepoints, n_body_factors]` | Time-varying gait patterns |
| Mental Scores | `[n_subjects, n_mental_vars]` | Time-invariant mental scores |

### Key Findings

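On this simulated data, the analysis is expected to recover the associations planted in `gait_mental_effects`:
- **Wellbeing** → higher values of stride-related features (hip flexion, knee flexion, stride length, COM velocity)
- **Anxiety** → lower values of trunk and pelvis features (pelvis tilt, pelvis obliquity, trunk flexion, trunk rotation)
- **Fatigue** → lower values of movement intensity features (cadence, vertical ground reaction force)

Because all four mental scores are driven by one shared latent factor, they are strongly correlated with each other. A gait feature tied to one mental variable will therefore also correlate with the others, including stress, which has no planted effect.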

### Method: `MultiVariateMentalDPCA`

```python
from src.dpca import MultiVariateMentalDPCA

model = MultiVariateMentalDPCA(n_gait_components=10, method='cca')
model.fit(gait_data, mental_scores, gait_labels=gait_labels, mental_labels=mental_labels)

# Get associations
associations = model.get_mental_gait_associations()

# Correlation heatmap
model.feature_correlations_  # [n_body_factors, n_mental_vars]
```

### Applications

- **Mental health monitoring**: Detect mental state changes from gait patterns
- **Clinical assessment**: Objective markers for anxiety, depression, fatigue
- **Rehabilitation**: Track recovery through gait analysis
- **Research**: Understand mind-body connections