# Part 6: Summary and Extensions

**Duration**: ~5 minutes

**Objective**: Consolidate findings and suggest advanced analyses

In this notebook, we'll:
- Recap key findings from all analysis steps
- Compare GLM (interpretable) vs Encoding (predictive) approaches
- Visualize side-by-side brain maps
- Discuss interpretation and neural correlates
- List extensions for future work
- Provide resources and references

In [None]:
# Import libraries
import sys
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import nibabel as nib
from nilearn import plotting
import warnings
warnings.filterwarnings('ignore')

# Add scripts directory to path
scripts_dir = Path('..') / 'scripts'
sys.path.insert(0, str(scripts_dir))

from utils import get_derivatives_path

# Set plotting style
sns.set_style('white')
plt.rcParams['figure.figsize'] = (14, 8)

print("Imports complete!")

In [None]:
# Define subject and session
SUBJECT = 'sub-01'
SESSION = 'ses-010'

# Get paths
derivatives_path = get_derivatives_path()
glm_dir = derivatives_path / 'glm_tutorial' / SUBJECT / SESSION / 'func'
encoding_dir = derivatives_path / 'encoding' / SUBJECT / SESSION

print("Tutorial Analysis Summary")
print(f"Subject: {SUBJECT}")
print(f"Session: {SESSION}")
print("\nDerivatives locations:")
print(f"  GLM: {glm_dir}")
print(f"  Encoding: {encoding_dir}")

## 1. Recap: Tutorial Pipeline

### Part 1: Dataset Exploration
- ✅ Explored BIDS organization and experimental protocol
- ✅ Analyzed behavioral annotations (actions, game events)
- ✅ Visualized event timelines and frequencies
- ✅ Examined replay data structure

**Key insight**: Rich behavioral annotations enable detailed fMRI modeling

### Part 2: Session-Level GLM
- ✅ Prepared confounds (motion, WM/CSF, global signal)
- ✅ Built design matrices for multiple models
- ✅ Fitted run-level GLMs with SPM HRF and AR(1) noise
- ✅ Aggregated to session level using fixed-effects

**Key models**:
- **Movement**: LEFT vs RIGHT (motor lateralization)
- **Game events**: Reward (powerup) vs Punishment (life lost)

### Part 3: Brain Visualization
- ✅ Applied statistical thresholding (FDR, cluster correction)
- ✅ Created surface projections on fsaverage
- ✅ Generated glass brain and slice displays
- ✅ Produced interactive HTML reports

**Key findings**:
- Motor cortex shows contralateral activation for directional movement
- Striatum responds to reward (powerup collection)
- Insula/ACC responds to punishment (life loss)

### Part 4: RL Agent
- ✅ Explained PPO architecture (4-layer CNN + actor-critic)
- ✅ Loaded/simulated model activations
- ✅ Applied HRF convolution for fMRI timing
- ✅ Performed PCA dimensionality reduction (50 components/layer)
- ✅ Visualized variance explained per layer

**Key insight**: CNN layers form a hierarchy from visual features → strategic representations
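
The HRF convolution and PCA steps recapped above can be sketched in a few lines of numpy. This is a minimal illustration on random toy activations, not the tutorial's actual implementation: the TR of 1.49 s, the helper names (`double_gamma_hrf`, `convolve_with_hrf`, `pca_reduce`), and the use of 10 components (instead of the 50 per layer used in Part 4) are assumptions made for the example.

```python
import math
import numpy as np

def double_gamma_hrf(tr, duration=32.0):
    """Double-gamma HRF sampled every `tr` seconds (positive peak, late undershoot)."""
    t = np.arange(0.0, duration, tr)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)          # gamma pdf, shape 6
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)  # gamma pdf, shape 16
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()

def convolve_with_hrf(activations, hrf):
    """Convolve each unit's time course (column) with the HRF, keeping the original length."""
    n_trs = activations.shape[0]
    return np.apply_along_axis(lambda x: np.convolve(x, hrf)[:n_trs], 0, activations)

def pca_reduce(data, n_components):
    """Project centered data onto its first principal components via SVD."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained_ratio = s ** 2 / np.sum(s ** 2)
    return centered @ vt[:n_components].T, explained_ratio[:n_components]

# Toy stand-in for one layer's activations: 200 TRs x 64 units (assumed already at TR rate)
rng = np.random.default_rng(0)
activations = rng.standard_normal((200, 64))

hrf = double_gamma_hrf(tr=1.49)  # TR value is an assumption
convolved = convolve_with_hrf(activations, hrf)
components, explained_ratio = pca_reduce(convolved, n_components=10)

print("Reduced shape:", components.shape)
print("Variance explained by kept components:", round(float(explained_ratio.sum()), 3))
```

Convolving before the PCA means the components describe variance in the predicted BOLD-like signal rather than in the raw activations.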

### Part 5: Brain Encoding
- ✅ Prepared BOLD data (deconfounding, standardization)
- ✅ Fitted ridge regression models per layer
- ✅ Evaluated with train/test split
- ✅ Compared layer performance
- ✅ Created R² brain maps

**Key findings**:
- Intermediate layers (conv3/conv4) typically perform best overall
- Early layers best predict visual cortex
- Middle layers best predict motor/parietal regions
- Late layers best predict frontal/executive areas
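
As a reminder of what the encoding step computes, here is a minimal numpy sketch of ridge regression with a contiguous train/test split and a per-voxel R². The data are synthetic (half of the toy "voxels" carry signal, half are pure noise), and the helper names and `alpha=10.0` are assumptions chosen for illustration rather than the values used in Part 5.

```python
import numpy as np

def fit_ridge(X, Y, alpha):
    """Closed-form ridge weights (X'X + alpha*I)^-1 X'Y, one column per voxel."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def r2_per_voxel(Y_true, Y_pred):
    """Coefficient of determination computed separately for each voxel (column)."""
    ss_res = np.sum((Y_true - Y_pred) ** 2, axis=0)
    ss_tot = np.sum((Y_true - Y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic data: 300 TRs, 20 features (e.g. PCA components), 50 voxels
rng = np.random.default_rng(0)
n_trs, n_features, n_voxels = 300, 20, 50
X = rng.standard_normal((n_trs, n_features))
true_weights = rng.standard_normal((n_features, n_voxels))
true_weights[:, 25:] = 0.0  # the last 25 voxels are unrelated to the features
Y = X @ true_weights + rng.standard_normal((n_trs, n_voxels))

# Contiguous split (no shuffling) to respect temporal autocorrelation
split = int(0.8 * n_trs)
X_train, X_test = X[:split], X[split:]
Y_train, Y_test = Y[:split], Y[split:]

# Standardize features and center targets using training statistics only
x_mean, x_std = X_train.mean(axis=0), X_train.std(axis=0)
y_mean = Y_train.mean(axis=0)
X_train_z = (X_train - x_mean) / x_std
X_test_z = (X_test - x_mean) / x_std

weights = fit_ridge(X_train_z, Y_train - y_mean, alpha=10.0)
r2 = r2_per_voxel(Y_test, X_test_z @ weights + y_mean)

print("Mean test R² (signal voxels):", round(float(r2[:25].mean()), 3))
print("Mean test R² (noise voxels): ", round(float(r2[25:].mean()), 3))
```

Test-set R² can be slightly negative for voxels with no signal, which is why R² maps are usually thresholded above zero before display.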

## 2. Comparing GLM vs Encoding Approaches

### GLM (Hypothesis-Driven, Interpretable)
**Strengths**:
- Clear interpretation: Each contrast tests a specific hypothesis
- Statistical inference: p-values, confidence intervals
- Event-level analysis: What happens when player presses LEFT?
- Established methods: Standard in fMRI literature

**Limitations**:
- Hand-crafted regressors: Requires prior knowledge
- Linear assumptions: May miss complex patterns
- Limited to annotated events: Can't capture unlabeled processes

### Encoding (Data-Driven, Predictive)
**Strengths**:
- Learned features: Captures complex, non-linear patterns
- Predictive power: Can predict brain responses to held-out (unseen) gameplay
- Hierarchical representations: Models visual → semantic processing
- No manual annotation needed: Works from raw pixels

**Limitations**:
- Less interpretable: What does "conv3 feature 27" mean?
- Requires training data: Need frames or pre-trained models
- Computational cost: Large feature spaces, cross-validation
- Indirect inference: Hard to isolate specific cognitive processes

### Complementary Insights
The two approaches provide **complementary perspectives**:
- **GLM**: "Which brain regions respond to LEFT button presses?"
- **Encoding**: "How much of brain activity can be explained by RL agent representations?"

**Overlap supports both approaches**: Motor cortex activation in the LEFT-RIGHT GLM contrast coincides with the motor regions that middle layers predict well in the encoding analysis.
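
One way to put a number on this overlap is a Dice coefficient between the thresholded GLM map and the thresholded R² map. The sketch below uses synthetic 3D arrays with a shared "motor" blob; with real data the arrays would come from `get_fdata()` on two images that share the same voxel grid. The thresholds mirror the display thresholds used in this notebook (2.5 and 0.01), and `dice_overlap` is a hypothetical helper.

```python
import numpy as np

def dice_overlap(mask_a, mask_b):
    """Dice coefficient between two boolean masks (1 = identical, 0 = disjoint)."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    total = mask_a.sum() + mask_b.sum()
    return 2.0 * intersection / total if total > 0 else 0.0

# Synthetic stand-ins for a GLM statistic map and an encoding R² map
rng = np.random.default_rng(0)
shape = (10, 10, 10)
glm_map = rng.standard_normal(shape)
r2_map = 0.002 * np.abs(rng.standard_normal(shape))

# Add a shared region that is strong in both maps
glm_map[2:5, 2:5, 2:5] += 5.0
r2_map[2:5, 2:5, 2:5] += 0.05

glm_mask = np.abs(glm_map) > 2.5  # same threshold as the glass-brain display
enc_mask = r2_map > 0.01          # same threshold as the R² display

dice = dice_overlap(glm_mask, enc_mask)
print(f"Suprathreshold voxels: GLM={int(glm_mask.sum())}, encoding={int(enc_mask.sum())}")
print(f"Dice overlap: {dice:.2f}")
```

The Dice value depends strongly on both thresholds, so it is best read as a descriptive summary rather than a statistical test.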

## 3. Side-by-Side Visualization

Let's compare GLM results with encoding maps.

In [None]:
# Load GLM contrast maps
glm_maps = {}

# Movement contrast
left_right_pattern = f"{SUBJECT}_{SESSION}_task-mario_model-movement_contrast-LEFT-RIGHT_stat-effect.nii.gz"
left_right_path = glm_dir / left_right_pattern

if left_right_path.exists():
    glm_maps['LEFT-RIGHT'] = nib.load(left_right_path)
    print(f"✓ Loaded GLM: {left_right_pattern}")
else:
    print(f"✗ GLM map not found: {left_right_pattern}")

# Reward-Punishment contrast
reward_pattern = f"{SUBJECT}_{SESSION}_task-mario_model-game_events_contrast-Reward-Punishment_stat-effect.nii.gz"
reward_path = glm_dir / reward_pattern

if reward_path.exists():
    glm_maps['Reward-Punishment'] = nib.load(reward_path)
    print(f"✓ Loaded GLM: {reward_pattern}")
else:
    print(f"✗ GLM map not found: {reward_pattern}")

In [None]:
# Load encoding R² maps
encoding_maps = {}

for layer in ['conv1', 'conv2', 'conv3', 'conv4', 'linear']:
    r2_path = encoding_dir / f'{SUBJECT}_{SESSION}_layer-{layer}_r2.nii.gz'
    if r2_path.exists():
        encoding_maps[layer] = nib.load(r2_path)
        print(f"✓ Loaded encoding: {layer}")

if len(encoding_maps) == 0:
    print("\n⚠️  No encoding maps found.")

In [None]:
# Create comparison figure: GLM vs Encoding
if 'LEFT-RIGHT' in glm_maps and len(encoding_maps) > 0:
    fig = plt.figure(figsize=(16, 12))
    
    # GLM: LEFT-RIGHT
    ax1 = plt.subplot(3, 1, 1)
    plotting.plot_glass_brain(
        glm_maps['LEFT-RIGHT'],
        threshold=2.5,
        colorbar=True,
        plot_abs=False,
        cmap='cold_hot',
        title='GLM: LEFT - RIGHT Movement (Motor Cortex)',
        display_mode='lyrz',
        axes=ax1
    )
    
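    # Encoding: show conv3 (an intermediate layer) when it was loaded, otherwise
    # fall back to the first available layer. This is a fixed choice, not a
    # data-driven selection of the best-performing layer.
    best_layer = 'conv3' if 'conv3' in encoding_maps else next(iter(encoding_maps))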
    ax2 = plt.subplot(3, 1, 2)
    plotting.plot_glass_brain(
        encoding_maps[best_layer],
        threshold=0.01,
        colorbar=True,
        cmap='hot',
        vmax=0.2,
        title=f'Encoding: {best_layer.upper()} Layer (R² Map)',
        display_mode='lyrz',
        axes=ax2
    )
    
    # GLM: Reward-Punishment (if available)
    if 'Reward-Punishment' in glm_maps:
        ax3 = plt.subplot(3, 1, 3)
        plotting.plot_glass_brain(
            glm_maps['Reward-Punishment'],
            threshold=2.5,
            colorbar=True,
            plot_abs=False,
            cmap='cold_hot',
            title='GLM: Reward - Punishment (Striatum & Insula)',
            display_mode='lyrz',
            axes=ax3
        )
    
    plt.suptitle(f'GLM vs Encoding Comparison - {SUBJECT} {SESSION}',
                 fontsize=16, fontweight='bold', y=0.995)
    plt.tight_layout()
    
    # Save
    fig_path = derivatives_path / 'glm_vs_encoding_comparison.png'
    plt.savefig(fig_path, dpi=150, bbox_inches='tight')
    print(f"\n✓ Saved comparison figure: {fig_path}")
    
    plt.show()
else:
    print("Cannot create comparison - missing maps.")

## 4. Key Takeaways

### Finding 1: Motor Cortex Lateralization
**From GLM**: LEFT button presses activate right motor cortex, RIGHT presses activate left motor cortex

**From Encoding**: Middle CNN layers (conv3/conv4) predict motor cortex activity

**Interpretation**: Motor cortex encodes directional actions. The RL agent learns spatial movement representations that align with brain motor control.

### Finding 2: Reward System Activation
**From GLM**: Powerup collection activates ventral striatum and vmPFC

**From Encoding**: Later layers (linear) show stronger encoding in frontal regions

**Interpretation**: Reward processing involves both immediate hedonic response (striatum) and value computation (vmPFC). The RL agent's value head may capture these representations.

### Finding 3: Visual Hierarchy
**From GLM**: Visual cortex responds during gameplay (implicit in baseline)

**From Encoding**: Early CNN layers (conv1/conv2) best predict visual cortex (V1/V2)

**Interpretation**: Both biological and artificial visual systems use hierarchical feature processing. Low-level features (edges) → high-level concepts (enemies, obstacles).

### Finding 4: Strategic Representations
**From Encoding**: Late layers (conv4/linear) best predict activity in prefrontal and parietal regions

**Interpretation**: High-level game strategy (where to jump, when to avoid enemies) engages executive control networks. RL agent's policy representations align with these strategic brain regions.

## 5. Methodological Insights

### What We Learned About Naturalistic fMRI
1. **Complex behavior → Rich data**: 25 minutes of gameplay produces hundreds of behavioral events
2. **Multiple timescales**: Fast actions (button presses) and slower events (level completion) coexist
3. **Confound handling is critical**: Modeling motion and button-press confounds helps prevent spurious effects
4. **Session-level analysis**: Fixed-effects aggregation improves SNR across runs
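
To see why fixed-effects aggregation improves SNR, here is a small numpy sketch of an unweighted fixed-effects combination: the session effect is the mean of the run effects and its variance is the summed run variance divided by the squared number of runs. The data are simulated and `fixed_effects` is a hypothetical helper; nilearn offers `compute_fixed_effects` in `nilearn.glm` for working with actual images.

```python
import numpy as np

def fixed_effects(effects, variances):
    """Unweighted fixed-effects combination across runs (axis 0)."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    n_runs = effects.shape[0]
    fx_effect = effects.mean(axis=0)
    fx_variance = variances.sum(axis=0) / n_runs ** 2
    fx_stat = fx_effect / np.sqrt(fx_variance)
    return fx_effect, fx_variance, fx_stat

# Simulate 4 runs with the same true effect and unit variance per run
rng = np.random.default_rng(0)
n_runs, n_voxels = 4, 1000
true_effect = 0.5
run_effects = true_effect + rng.standard_normal((n_runs, n_voxels))
run_variances = np.ones((n_runs, n_voxels))

fx_effect, fx_variance, fx_stat = fixed_effects(run_effects, run_variances)
single_run_stat = run_effects[0] / np.sqrt(run_variances[0])

print("Mean statistic, single run:   ", round(float(single_run_stat.mean()), 2))
print("Mean statistic, fixed effects:", round(float(fx_stat.mean()), 2))
```

With equal run variances the session-level variance shrinks by a factor equal to the number of runs, so the statistic grows roughly with the square root of that number.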

### What We Learned About RL-Brain Alignment
1. **Hierarchical correspondence**: CNN layers mirror brain hierarchy (visual → motor → executive)
2. **Task-relevant features**: RL agent learns representations aligned with gameplay demands
3. **Intermediate layers perform best**: Not too low-level (pixels), not too abstract (policy)
4. **Complementary to GLM**: Encoding captures variance beyond hand-coded regressors
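
The claim that intermediate layers perform best can be checked directly by ranking layers on a summary of their R² maps. The sketch below ranks layers by the mean R² of their top 5% of voxels, which is one reasonable summary among several (median or whole-brain mean are alternatives). The arrays are random toy data; in this notebook they could be obtained from `encoding_maps[layer].get_fdata()`. `rank_layers` and `top_fraction` are hypothetical names.

```python
import numpy as np

def rank_layers(r2_by_layer, top_fraction=0.05):
    """Rank layers by the mean R² of their best `top_fraction` of voxels."""
    scores = {}
    for layer, r2 in r2_by_layer.items():
        values = np.asarray(r2, dtype=float).ravel()
        values = values[np.isfinite(values)]  # drop NaNs outside the brain mask
        n_top = max(1, int(top_fraction * values.size))
        scores[layer] = float(np.sort(values)[-n_top:].mean())
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy R² volumes with a different overall scale per layer (illustrative values only)
rng = np.random.default_rng(0)
scales = {'conv1': 0.01, 'conv2': 0.02, 'conv3': 0.04, 'conv4': 0.03, 'linear': 0.015}
r2_by_layer = {layer: scale * rng.random((8, 8, 8)) for layer, scale in scales.items()}

ranking = rank_layers(r2_by_layer)
for layer, score in ranking:
    print(f"{layer:>6}: top-voxel mean R² = {score:.4f}")
```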

### Limitations & Caveats
1. **Single subject/session**: Results may not generalize (see extensions below)
2. **Simulated activations**: For demo purposes; real RL training would improve quality
3. **Linear encoding**: Ridge regression is simple; non-linear models could capture more variance
4. **Spatial resolution**: Standard fMRI (~3mm) averages over neural populations

## 6. Extensions for Future Work

### Extension 1: Multi-Subject Analysis
**Objective**: Generalize findings across participants

**Steps**:
1. Run pipeline on all subjects (sub-01 through sub-06)
2. Perform group-level statistics (mixed-effects GLM)
3. Compute inter-subject correlation (ISC)
4. Identify common vs individual-specific activations

**Expected outcome**: Consistent motor and reward activations across subjects, individual variation in strategy
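
A leave-one-out ISC (step 3) correlates each subject's time course with the average of all other subjects, voxel by voxel. The sketch below is a minimal numpy version on simulated data. It assumes the time courses are already aligned across subjects, which is not automatic for self-paced gameplay, and `leave_one_out_isc` is a hypothetical helper name.

```python
import numpy as np

def leave_one_out_isc(data):
    """ISC per subject and voxel.

    data: array of shape (n_subjects, n_timepoints, n_voxels)
    returns: array of shape (n_subjects, n_voxels) of Pearson correlations
    """
    n_subjects, _, n_voxels = data.shape
    isc = np.empty((n_subjects, n_voxels))
    for s in range(n_subjects):
        others = np.delete(data, s, axis=0).mean(axis=0)
        a = data[s] - data[s].mean(axis=0)
        b = others - others.mean(axis=0)
        isc[s] = (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return isc

# Simulated data: the first 10 voxels share a signal across subjects, the rest are noise
rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_voxels = 5, 200, 20
shared = rng.standard_normal((n_timepoints, n_voxels))
shared[:, 10:] = 0.0
data = shared[None, :, :] + rng.standard_normal((n_subjects, n_timepoints, n_voxels))

isc = leave_one_out_isc(data)
print("Mean ISC, shared-signal voxels:", round(float(isc[:, :10].mean()), 2))
print("Mean ISC, noise voxels:        ", round(float(isc[:, 10:].mean()), 2))
```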

### Extension 2: Full RL Training
**Objective**: Use a fully trained RL agent instead of simulated proxy activations

**Steps**:
1. Set up gym-retro environment for Super Mario Bros
2. Train PPO agent on 6 training levels (5M timesteps)
3. Extract activations from trained model
4. Compare encoding on in-distribution (trained) levels vs out-of-distribution (OOD) levels (w2l1, w3l1)

**Expected outcome**: Better encoding quality, ability to test generalization hypotheses

### Extension 3: Out-of-Distribution Generalization
**Objective**: How does the brain adapt to novel game levels?

**Steps**:
1. Separate sessions into training levels vs OOD levels
2. Fit encoding models on training level sessions
3. Test prediction quality on OOD level sessions
4. Identify brain regions showing adaptation

**Expected outcome**: Prefrontal cortex shows enhanced activity during OOD levels (cognitive control)

### Extension 4: MVPA Decoding
**Objective**: Decode specific actions from brain patterns

**Steps**:
1. Build classifier to predict button presses (LEFT vs RIGHT) from BOLD
2. Use cross-validation to assess decoding accuracy
3. Compare GLM-based features vs RL encoding features
4. Perform representational similarity analysis (RSA)

**Expected outcome**: Motor cortex allows reliable action decoding, RL features improve classification
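
Steps 1 and 2 can be prototyped without any dedicated library. The sketch below uses a nearest-centroid classifier with contiguous cross-validation folds on simulated trial patterns. It is a stand-in for whatever classifier you prefer (for example a linear SVM from scikit-learn); the data, the effect size, and the helper name `nearest_centroid_cv` are assumptions for illustration.

```python
import numpy as np

def nearest_centroid_cv(X, y, n_folds=5):
    """Cross-validated accuracy of a nearest-centroid classifier (contiguous folds)."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        X_train, y_train = X[train_mask], y[train_mask]
        classes = np.unique(y_train)
        # One mean pattern (centroid) per class, estimated on training trials only
        centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
        dists = ((X[test_idx][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        predictions = classes[dists.argmin(axis=1)]
        accuracies.append(np.mean(predictions == y[test_idx]))
    return np.array(accuracies)

# Simulated trial-wise patterns: 100 trials x 30 voxels, 10 voxels are direction-selective
rng = np.random.default_rng(0)
n_trials, n_voxels = 100, 30
labels = np.tile([0, 1], n_trials // 2)  # 0 = LEFT, 1 = RIGHT
pattern = np.zeros(n_voxels)
pattern[:10] = 1.0
X = rng.standard_normal((n_trials, n_voxels)) + np.outer(labels, pattern)

accuracies = nearest_centroid_cv(X, labels, n_folds=5)
print("Fold accuracies:", np.round(accuracies, 2))
print("Mean accuracy:  ", round(float(accuracies.mean()), 2), "(chance = 0.5)")
```

With real BOLD data, folds should follow run boundaries so that temporally adjacent trials never end up on both sides of the split.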

### Extension 5: Temporal Dynamics
**Objective**: Track learning and adaptation over time

**Steps**:
1. Analyze multiple sessions per subject (early vs late gameplay)
2. Fit trial-by-trial GLMs (LSS approach)
3. Model learning curves: How does brain response change with practice?
4. Correlate with behavioral performance (score, deaths, completion time)

**Expected outcome**: Shift from effortful control (prefrontal) to automatic processing (basal ganglia)
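
The LSS (least squares separate) approach in step 2 fits one small GLM per trial: the trial of interest gets its own regressor and all remaining trials are collapsed into a single nuisance regressor. Below is a minimal single-voxel numpy sketch on simulated data with trial amplitudes that decline over time. The single-gamma HRF, the TR of 1.5 s, and the helper names are simplifying assumptions, not the tutorial's settings.

```python
import math
import numpy as np

def gamma_hrf(tr, duration=30.0):
    """Simple single-gamma HRF (shape 6) sampled every `tr` seconds."""
    t = np.arange(0.0, duration, tr)
    hrf = t ** 5 * np.exp(-t) / math.gamma(6)
    return hrf / hrf.sum()

def trial_regressors(onsets_tr, n_trs, hrf):
    """One HRF-convolved regressor per trial (onsets given in TR units)."""
    regs = np.zeros((n_trs, len(onsets_tr)))
    for j, onset in enumerate(onsets_tr):
        stick = np.zeros(n_trs)
        stick[onset] = 1.0
        regs[:, j] = np.convolve(stick, hrf)[:n_trs]
    return regs

def lss_betas(bold, regs):
    """Fit one GLM per trial: [target trial, all other trials summed, intercept]."""
    n_trs, n_trials = regs.shape
    betas = np.empty(n_trials)
    for j in range(n_trials):
        others = np.delete(regs, j, axis=1).sum(axis=1)
        design = np.column_stack([regs[:, j], others, np.ones(n_trs)])
        coef, *_ = np.linalg.lstsq(design, bold, rcond=None)
        betas[j] = coef[0]
    return betas

# Simulated voxel: 19 trials whose response amplitude declines with practice
rng = np.random.default_rng(0)
n_trs, tr = 200, 1.5
onsets_tr = np.arange(5, 190, 10)
regs = trial_regressors(onsets_tr, n_trs, gamma_hrf(tr))
true_amps = np.linspace(2.0, 1.0, len(onsets_tr))
bold = regs @ true_amps + 0.02 * rng.standard_normal(n_trs)

betas = lss_betas(bold, regs)
print("Correlation with true amplitudes:", round(float(np.corrcoef(betas, true_amps)[0, 1]), 3))
```

The resulting trial-wise betas are what you would then relate to behavioral measures in steps 3 and 4.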

### Extension 6: Advanced Encoding Models
**Objective**: Improve prediction quality with better models

**Steps**:
1. Voxel-wise alpha optimization (different regularization per voxel)
2. Non-linear models: Kernel ridge, neural networks
3. Attention mechanisms: Which features matter most?
4. Compare with other DNNs: ResNet, Vision Transformer

**Expected outcome**: Better R², more nuanced understanding of brain-model correspondence
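
Step 1 (voxel-wise alpha optimization) can be sketched as a grid search in which each voxel keeps the regularization strength that scores best on a validation set. The numpy example below uses simulated data where half of the voxels carry signal and half are noise. The alpha grid, the split sizes, and the helper names are assumptions; in practice the selected alphas should be evaluated on a third, untouched test set.

```python
import numpy as np

def ridge_weights(X, Y, alpha):
    """Closed-form ridge solution, one weight column per voxel."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)

def r2_per_voxel(Y_true, Y_pred):
    ss_res = np.sum((Y_true - Y_pred) ** 2, axis=0)
    ss_tot = np.sum((Y_true - Y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

def select_alpha_per_voxel(X_train, Y_train, X_val, Y_val, alphas):
    """Return the best alpha and its validation R² for every voxel."""
    scores = np.stack([
        r2_per_voxel(Y_val, X_val @ ridge_weights(X_train, Y_train, alpha))
        for alpha in alphas
    ])  # shape: (n_alphas, n_voxels)
    best_idx = scores.argmax(axis=0)
    return np.asarray(alphas)[best_idx], scores.max(axis=0)

# Simulated data: 10 signal voxels followed by 10 pure-noise voxels
rng = np.random.default_rng(0)
n_train, n_val, n_features, n_signal, n_noise = 100, 50, 40, 10, 10
X = rng.standard_normal((n_train + n_val, n_features))
W = np.zeros((n_features, n_signal + n_noise))
W[:, :n_signal] = rng.standard_normal((n_features, n_signal))
Y = X @ W + rng.standard_normal((n_train + n_val, n_signal + n_noise))

alphas = [0.1, 1.0, 10.0, 100.0, 1000.0]
best_alphas, best_scores = select_alpha_per_voxel(
    X[:n_train], Y[:n_train], X[n_train:], Y[n_train:], alphas
)
print("Median alpha, signal voxels:", np.median(best_alphas[:n_signal]))
print("Median alpha, noise voxels: ", np.median(best_alphas[n_signal:]))
```

Noise-dominated voxels tend to select heavy regularization, while well-predicted voxels tolerate lighter shrinkage, which is the motivation for choosing alpha per voxel instead of globally.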

### Extension 7: Hyperalignment
**Objective**: Align subjects' brain spaces for group analysis

**Steps**:
1. Use hyperalignment or ICA to find common representational space
2. Fit encoding models in aligned space
3. Test cross-subject generalization: Train on sub-01, test on sub-02
4. Identify shared vs idiosyncratic representations

**Expected outcome**: Improved statistical power, better group-level inferences
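
The core operation behind hyperalignment is an orthogonal Procrustes fit: find the rotation of one subject's voxel space that best matches another subject's responses over time. The sketch below shows only this pairwise building block on simulated data. Full hyperalignment iterates it to build a common template, which is beyond this example, and `procrustes_rotation` is a hypothetical helper name.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Orthogonal matrix R minimizing ||source @ R - target|| (Frobenius norm)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

# Simulated pair of subjects: same responses, expressed in rotated voxel spaces
rng = np.random.default_rng(0)
n_timepoints, n_voxels = 150, 12
target = rng.standard_normal((n_timepoints, n_voxels))
true_rotation, _ = np.linalg.qr(rng.standard_normal((n_voxels, n_voxels)))
source = target @ true_rotation.T + 0.1 * rng.standard_normal((n_timepoints, n_voxels))

rotation = procrustes_rotation(source, target)
aligned = source @ rotation

error_before = np.linalg.norm(source - target)
error_after = np.linalg.norm(aligned - target)
print(f"Mismatch before alignment: {error_before:.1f}")
print(f"Mismatch after alignment:  {error_after:.1f}")
```

To avoid circularity, the rotation should be estimated on data that are independent of the data used to fit and test the encoding models.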

## 7. Resources and References

### Code Repositories
- **This tutorial**: `mario.tutorials/` (current repository)
- **shinobi_fmri**: Session-level GLM methodology ([GitHub link])
- **mario_generalization**: RL training and encoding models ([GitHub link])
- **CNeuromod**: Dataset documentation and tools ([cneuromod.ca](https://www.cneuromod.ca))

### CNeuromod Dataset
- **Data portal**: [https://www.cneuromod.ca/access/](https://www.cneuromod.ca/access/)
- **Publications**: 
  - Boyle et al. (2020). "The Courtois project on neural modelling"
  - Dataset papers for specific tasks (Mario, Shinobi, etc.)

### Key Methods
- **fMRIPrep**: Preprocessing pipeline ([fmriprep.org](https://fmriprep.org))
- **Nilearn**: Python neuroimaging library ([nilearn.github.io](https://nilearn.github.io))
- **PPO**: Proximal Policy Optimization (Schulman et al., 2017)
- **Ridge regression**: Standard encoding model approach (Naselaris et al., 2011)

### Relevant Papers
1. **Naturalistic fMRI**: 
   - Hasson et al. (2004). "Intersubject synchronization of cortical activity during natural vision"
   - Sonkusare et al. (2019). "Naturalistic stimuli in neuroscience"

2. **Encoding models**:
   - Naselaris et al. (2011). "Encoding and decoding in fMRI"
   - Huth et al. (2012). "A continuous semantic space describes the representation of thousands of object and action categories across the human brain"

3. **RL and neuroscience**:
   - Yamins & DiCarlo (2016). "Using goal-driven deep learning models to understand sensory cortex"
   - Mnih et al. (2015). "Human-level control through deep reinforcement learning"
   - Khaligh-Razavi & Kriegeskorte (2014). "Deep supervised, but not unsupervised, models may explain IT cortical representation"

4. **Video game fMRI**:
   - Bavelier & Green (2019). "Enhancing attentional control: lessons from action video games"
   - Cole et al. (2012). "Video game playing and brain structure changes"

### Software
- **Python**: numpy, scipy, pandas, scikit-learn, matplotlib, seaborn
- **Neuroimaging**: nibabel, nilearn, nipype
- **RL**: pytorch, stable-baselines3, gym-retro
- **Notebooks**: jupyter, jupyterlab, RISE (for presentations)

## 8. Tutorial Completion Summary

Congratulations! You've completed the Mario fMRI Tutorial.

### What You've Accomplished
✅ Explored a rich naturalistic fMRI dataset

✅ Conducted session-level GLM analysis with multiple models

✅ Created publication-quality brain visualizations

✅ Loaded (or simulated) RL agent activations and extracted representations

✅ Performed brain encoding analysis with ridge regression

✅ Compared interpretable (GLM) and predictive (encoding) approaches

### Skills Gained
- **Naturalistic fMRI analysis**: Complex behavioral paradigms, confound handling
- **GLM modeling**: Design matrices, contrasts, fixed-effects aggregation
- **Visualization**: Surface projections, glass brains, statistical thresholding
- **RL and neuroscience**: CNN architectures, hierarchical representations
- **Encoding models**: Ridge regression, cross-validation, R² interpretation
- **Python neuroimaging**: nilearn, nibabel, scikit-learn

### Output Files Generated
```
derivatives/
├── glm_tutorial/sub-01/ses-010/
│   ├── func/*.nii.gz (statistical maps)
│   ├── glm_comparison_panel.png
│   └── *_interactive.html
├── rl_agent/
│   └── sub-01_ses-010_rl_activations_pca.npz
├── encoding/sub-01/ses-010/
│   ├── sub-01_ses-010_layer-*_r2.nii.gz
│   ├── encoding_layer_comparison.png
│   ├── encoding_*_detailed.png
│   └── encoding_roi_heatmap.png
└── glm_vs_encoding_comparison.png
```

### Next Steps
1. **Explore extensions**: Try multi-subject analysis, full RL training, or MVPA
2. **Adapt to your data**: Modify pipeline for different tasks or paradigms
3. **Contribute**: Share improvements, bug fixes, or new features
4. **Publish**: Use methods from this tutorial in your research

### Getting Help
- **Issues**: GitHub issues in tutorial repository
- **CNeuromod**: Contact data support team
- **Nilearn**: Community forum and documentation
- **NeuroStars**: General neuroimaging questions

### Acknowledgments
- **CNeuromod team**: For providing the dataset
- **Participants**: For their time and data contribution
- **Open-source community**: For tools that made this possible
- **You**: For working through this tutorial!

---

**Thank you for completing the Mario fMRI Tutorial!**

We hope this tutorial has provided you with valuable skills and insights for naturalistic fMRI analysis. Good luck with your research!

In [None]:
# Final summary statistics
print("="*60)
print("TUTORIAL COMPLETE")
print("="*60)
print(f"Subject: {SUBJECT}")
print(f"Session: {SESSION}")
print(f"\nAnalysis components:")
print("  1. Dataset exploration ✓")
print("  2. Session-level GLM ✓")
print("  3. Brain visualization ✓")
print("  4. RL agent ✓")
print("  5. Brain encoding ✓")
print("  6. Summary & extensions ✓")
print("\nOutput directories:")
print(f"  GLM: {'✓' if glm_dir.exists() else '✗'}")
print(f"  Encoding: {'✓' if encoding_dir.exists() else '✗'}")
print("="*60)
print("Thank you for using the Mario fMRI Tutorial!")
print("="*60)