# 🧪 CogniSense Phase 2 Test Notebook

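This notebook exercises the Phase 2 training infrastructure end to end: the automated test suite, a short fusion training run, single-modality training, and inference from the saved checkpoint.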

**Expected Result**: All tests pass ✅

## Setup

In [None]:
%%capture
# Install dependencies
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
!pip install transformers datasets
!pip install pillow numpy pandas matplotlib seaborn
!pip install scikit-learn scipy tqdm

In [None]:
# Clone repository
import os
if not os.path.exists('AI4Alzheimers'):
    !git clone https://github.com/Arnavsharma2/AI4Alzheimers.git
    %cd AI4Alzheimers
else:
    %cd AI4Alzheimers
    !git pull

## Run Comprehensive Phase 2 Tests

In [None]:
# Run the automated test suite
!python test_phase2.py

## Manual Test: Quick Training Run

Run a short fusion training session (5 epochs, 50 samples) to verify that the pipeline works end to end.

In [None]:
# Quick training test (5 epochs, small dataset)
!python train.py --mode fusion --epochs 5 --num-samples 50 --batch-size 8 --save-dir ./test_checkpoints

## Verify Outputs

In [None]:
import json
import os

# Check that checkpoint was created
checkpoint_path = './test_checkpoints/fusion/best_model.pt'
if os.path.exists(checkpoint_path):
    print(f"✅ Checkpoint saved: {checkpoint_path}")
    
    # Check file size
    size_mb = os.path.getsize(checkpoint_path) / (1024 * 1024)
    print(f"   Size: {size_mb:.2f} MB")
else:
    print("❌ Checkpoint not found!")

# Check training history
history_path = './test_checkpoints/fusion/training_history.json'
if os.path.exists(history_path):
    print(f"\n✅ Training history saved: {history_path}")
    
    with open(history_path) as f:
        history = json.load(f)
    
    print(f"   Epochs trained: {len(history['train_loss'])}")
    print(f"   Final train loss: {history['train_loss'][-1]:.4f}")
    print(f"   Final val loss: {history['val_loss'][-1]:.4f}")
else:
    print("❌ Training history not found!")

# Check test metrics
metrics_path = './test_checkpoints/fusion/test_metrics.json'
if os.path.exists(metrics_path):
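    print(f"\n✅ Test metrics saved: {metrics_path}")
    
    with open(metrics_path) as f:
        metrics = json.load(f)
    
    print("\nTest Set Performance:")
    for key, value in metrics.items():
        # Format numeric metrics to 4 decimals; print anything else (e.g. lists) as-is
        if isinstance(value, (int, float)):
            print(f"   {key}: {value:.4f}")
        else:
            print(f"   {key}: {value}")
else:
    print("❌ Test metrics not found!")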
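
The three existence checks above follow the same pattern, so they could be folded into one helper. The sketch below is a hypothetical convenience function, not part of the repository; `find_missing_artifacts` and `EXPECTED_FILES` are names introduced here for illustration, and the file names simply mirror the ones checked in the cell above. It runs against a temporary directory so it works without a completed training run.

```python
import os
import tempfile

# Hypothetical helper: the file names mirror those checked in the cell above.
EXPECTED_FILES = ("best_model.pt", "training_history.json", "test_metrics.json")


def find_missing_artifacts(run_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from run_dir."""
    return [
        name for name in expected
        if not os.path.exists(os.path.join(run_dir, name))
    ]


# Demonstrate on a throwaway directory that contains only a dummy checkpoint.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "best_model.pt"), "wb").close()
    demo_missing = find_missing_artifacts(demo_dir)

print("Missing artifacts:", demo_missing)
```

Pointing the helper at `./test_checkpoints/fusion` after the training cell should return an empty list if all three files were written.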

## Test Individual Modality Training

In [None]:
# Test training a single modality (eye tracking)
!python train.py --mode single --modality eye --epochs 3 --num-samples 40 --batch-size 8 --save-dir ./test_checkpoints

In [None]:
# Verify eye model checkpoint
eye_checkpoint = './test_checkpoints/eye/best_model.pt'
if os.path.exists(eye_checkpoint):
    print("✅ Eye model checkpoint saved")
    
    # Load and check metrics
    with open('./test_checkpoints/eye/test_metrics.json') as f:
        eye_metrics = json.load(f)
    
    print("\nEye Tracking Model Performance:")
    print(f"  Accuracy: {eye_metrics['accuracy']:.4f}")
    print(f"  AUC: {eye_metrics['auc']:.4f}")
else:
    print("❌ Eye model checkpoint not found!")

## Visualize Training Progress

In [None]:
import matplotlib.pyplot as plt

# Plot training history
with open('./test_checkpoints/fusion/training_history.json') as f:
    history = json.load(f)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Loss plot
ax = axes[0]
epochs = range(1, len(history['train_loss']) + 1)
ax.plot(epochs, history['train_loss'], 'b-o', label='Train Loss', linewidth=2)
ax.plot(epochs, history['val_loss'], 'r-o', label='Val Loss', linewidth=2)
ax.set_xlabel('Epoch', fontsize=12)
ax.set_ylabel('Loss', fontsize=12)
ax.set_title('Training Progress - Loss', fontsize=14, fontweight='bold')
ax.legend()
ax.grid(True, alpha=0.3)

# Accuracy plot
ax = axes[1]
train_accs = [m['accuracy'] for m in history['train_metrics']]
val_accs = [m['accuracy'] for m in history['val_metrics']]
ax.plot(epochs, train_accs, 'b-o', label='Train Accuracy', linewidth=2)
ax.plot(epochs, val_accs, 'r-o', label='Val Accuracy', linewidth=2)
ax.set_xlabel('Epoch', fontsize=12)
ax.set_ylabel('Accuracy', fontsize=12)
ax.set_title('Training Progress - Accuracy', fontsize=14, fontweight='bold')
ax.set_ylim([0, 1])
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("✅ Training visualization complete!")
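
Besides eyeballing the curves, it can help to read the best epoch straight out of the history. The following is a minimal sketch that assumes the history has the `train_loss` and `val_loss` lists used in the plot above; `best_epoch` is a hypothetical helper, and the numbers in `example_history` are made up for illustration, not results from a real run.

```python
def best_epoch(val_losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    if not val_losses:
        raise ValueError("val_losses is empty")
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]


# Illustrative values only; substitute the `history` dict loaded above.
example_history = {
    "train_loss": [0.70, 0.55, 0.41, 0.33],
    "val_loss": [0.68, 0.59, 0.61, 0.64],
}

epoch, loss = best_epoch(example_history["val_loss"])
# A widening gap between final val and train loss hints at overfitting.
final_gap = example_history["val_loss"][-1] - example_history["train_loss"][-1]

print(f"Best epoch: {epoch} (val loss {loss:.4f})")
print(f"Final val - train gap: {final_gap:.4f}")
```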

## Load and Test Trained Model

In [None]:
import torch
from src.fusion.fusion_model import MultimodalFusionModel
from src.data_processing.synthetic_data_generator import (
    EyeTrackingGenerator,
    TypingDynamicsGenerator,
    ClockDrawingGenerator,
    GaitDataGenerator
)
from transformers import ViTImageProcessor

# Load trained model
model = MultimodalFusionModel(
    speech_config={'freeze_encoders': True},
    drawing_config={'freeze_encoder': True},
    fusion_type='attention'
)

# map_location='cpu' keeps loading robust if the checkpoint was saved on a GPU
checkpoint = torch.load('./test_checkpoints/fusion/best_model.pt', map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

print("✅ Trained model loaded successfully")
print(f"   Trained for {checkpoint['epoch']} epochs")
print(f"   Val metrics: {checkpoint['metrics']}")

In [None]:
# Test on new samples
eye_gen = EyeTrackingGenerator()
typing_gen = TypingDynamicsGenerator()
clock_gen = ClockDrawingGenerator()
gait_gen = GaitDataGenerator()
vit_processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")

def test_sample(is_ad=False):
    """Generate and test a sample"""
    # Generate data
    eye_data = eye_gen.generate_sequence(is_alzheimers=is_ad)
    typing_data = typing_gen.generate_sequence(is_alzheimers=is_ad)
    clock_img = clock_gen.generate_image(is_alzheimers=is_ad)
    gait_data = gait_gen.generate_sequence(is_alzheimers=is_ad)
    
    # Prepare inputs
    eye_tensor = torch.FloatTensor(eye_data).unsqueeze(0)
    typing_tensor = torch.FloatTensor(typing_data).unsqueeze(0)
    clock_processed = vit_processor(images=clock_img, return_tensors="pt")
    drawing_tensor = clock_processed['pixel_values']
    gait_tensor = torch.FloatTensor(gait_data).unsqueeze(0)
    
    # Predict
    with torch.no_grad():
        risk_score, attention_weights, _ = model(
            eye_gaze=eye_tensor,
            typing_sequence=typing_tensor,
            drawing_image=drawing_tensor,
            gait_sensor=gait_tensor,
            return_attention=True,
            return_modality_features=True
        )
    
    return risk_score.item(), attention_weights[0].cpu().numpy(), clock_img

# Test Control sample
risk_control, att_control, clock_control = test_sample(is_ad=False)
print("🔵 CONTROL Sample:")
print(f"   Risk Score: {risk_control*100:.1f}%")
print(f"   Expected: Low risk (<30%)")

# Test AD sample
risk_ad, att_ad, clock_ad = test_sample(is_ad=True)
print("\n🔴 AD Sample:")
print(f"   Risk Score: {risk_ad*100:.1f}%")
print(f"   Expected: High risk (>70%)")

print("\n✅ Model inference completed without errors!")
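
The cell above prints the expected ranges but does not compare the scores against them. A minimal sketch of such a comparison follows; `risk_band` is a hypothetical helper, and its 0.30 and 0.70 cut-offs are taken only from the "Expected" messages printed above, not from any threshold defined in the repository. After a 5-epoch run on 50 samples the model may well land in the intermediate band, so treat a mismatch as a hint rather than a failure.

```python
def risk_band(score, low=0.30, high=0.70):
    """Map a risk score in [0, 1] to 'low', 'intermediate', or 'high'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < low:
        return "low"
    if score > high:
        return "high"
    return "intermediate"


# Illustrative scores only; pass risk_control and risk_ad from the cell above.
for label, score in [("control", 0.12), ("AD", 0.85), ("borderline", 0.50)]:
    print(f"{label}: {score * 100:.1f}% -> {risk_band(score)}")
```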

## ✅ Phase 2 Test Summary

If all cells above executed successfully, Phase 2 is **fully functional**!

### What Works:
- ✅ Training infrastructure (dataset, dataloader, collate)
- ✅ Training loop (forward, backward, optimize)
- ✅ Validation and metrics computation
- ✅ Early stopping mechanism
- ✅ Model checkpointing
- ✅ Training history logging
- ✅ Both fusion and single modality training
- ✅ Model loading and inference

### Next Steps:
1. ✅ Phase 1 (Demo) - Complete
2. ✅ Phase 2 (Training) - Complete
3. ⏭️ Phase 3 (Preprocessing) - Skip (using synthetic data)
4. ⏭️ Phase 4 (Visualization) - Next
5. ⏭️ Phase 5 (Generate Results) - Priority
6. ⏭️ Phase 6 (PDF Report) - Final

Ready to proceed to the next phase!