## Setup Instructions

**BEFORE RUNNING THIS NOTEBOOK:**

1. Upload these files to a Kaggle Dataset named `fds-mental-health-scripts-v2`:
   - From your project: `scripts/train_stage1_behavioral_kaggle.py`
   - From your project: `scripts/run_two_stage_pipeline_kaggle.py`
   - From your project: `scripts/model_definitions.py`

2. Add these datasets as inputs to this notebook:
   - `student-life/dataset` (StudentLife data)
   - `fds-mental-health-scripts-v2` (your uploaded scripts)
   - `mental-health-lstm` (or wherever you uploaded `mental_health_lstm.pt`)

3. Then run the cell below to copy the scripts to the working directory.

In [None]:
# Copy scripts from your uploaded dataset
!cp /kaggle/input/fds-mental-health-scripts-v2/*.py /kaggle/working/

# Verify files are present
!ls -lh /kaggle/working/*.py

# Check current directory
!pwd
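
The `ls` output above has to be read by eye. As an optional extra, a short Python check can report exactly which of the three scripts from the setup list did not arrive. This is a minimal sketch: `missing_files` and `EXPECTED_SCRIPTS` are hypothetical names introduced here, not part of the project scripts.

```python
from pathlib import Path

# Script names taken from the setup list at the top of this notebook.
EXPECTED_SCRIPTS = [
    "train_stage1_behavioral_kaggle.py",
    "run_two_stage_pipeline_kaggle.py",
    "model_definitions.py",
]


def missing_files(directory, names=EXPECTED_SCRIPTS):
    """Return the entries of `names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]
```

After the copy cell has run, `missing_files("/kaggle/working")` should return an empty list; any name it does return points to a file that was not uploaded to the scripts dataset.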

## Step 1: Train Stage 1 Model (Behavioral Forecasting)

This model:
- Trains on StudentLife sensor data (real recordings, so the correlations between behaviors are observed rather than simulated)
- Predicts NEXT DAY's behavior (sleep, activity, screen time, social)
- Outputs mean + uncertainty for each prediction

In [None]:
# Run from /kaggle/working directory where scripts were copied
%cd /kaggle/working

# Train Stage 1 with fixed parameters
!python train_stage1_behavioral_kaggle.py
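
A `!python ...` line does not raise a Python exception when the script exits with an error, so later cells can run against a missing checkpoint. One possible alternative, sketched below, launches the script with `subprocess` and raises on a non-zero exit code. `run_script` is a hypothetical helper, not something the project provides, and it captures output instead of streaming it, which may be inconvenient for long training runs.

```python
import subprocess
import sys


def run_script(script_args, cwd=None):
    """Run a Python script with the current interpreter and raise if it fails.

    `script_args` is the script path followed by any arguments.
    Returns the captured standard output as a string.
    """
    completed = subprocess.run(
        [sys.executable, *script_args],
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"{script_args[0]} exited with code {completed.returncode}:\n"
            f"{completed.stderr}"
        )
    return completed.stdout
```

For example, `print(run_script(["train_stage1_behavioral_kaggle.py"], cwd="/kaggle/working"))` would stop the notebook with an error if training fails.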

## Step 2: Run Two-Stage Pipeline

This script:
1. Loads Stage 1 model (behavioral forecasting)
2. Loads Stage 2 model (mental health prediction from synthetic data)
3. For each student:
   - Stage 1: Predicts day 8 behavior from days 1-7
   - Stage 2: Uses predicted behavior to infer mental health
4. Tracks uncertainty propagation
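
The per-student loop above (predict day 8 from days 1-7) amounts to a sliding window over each student's daily rows. The sketch below shows that idea only; the pipeline script's actual windowing is not shown in this notebook, so the function name `make_windows`, the stride of one day, and the list-based input are all assumptions.

```python
def make_windows(daily_rows, window=7):
    """Pair each run of `window` consecutive days with the day that follows it.

    `daily_rows` is a list ordered by date, one entry per day.
    Returns a list of (history, target) tuples.
    """
    pairs = []
    for start in range(len(daily_rows) - window):
        history = daily_rows[start:start + window]  # e.g. days 1-7
        target = daily_rows[start + window]         # e.g. day 8
        pairs.append((history, target))
    return pairs
```

With ten days of data this yields three pairs: days 1-7 with day 8, days 2-8 with day 9, and days 3-9 with day 10. A student with seven days or fewer produces no pairs, since there is no following day to predict.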

## ⚠️ IMPORTANT: Upload Stage 2 Model First

Before running the next cell, you need to:

1. **Download from your project:** `models/saved/mental_health_lstm.pt`
2. **Upload to Kaggle Dataset:** Create dataset named `mental-health-lstm` 
3. **Add as input to this notebook**

**OR** modify the path in the script to point to wherever you uploaded the synthetic model.

**OR** skip Stage 2 and just analyze Stage 1 predictions (behavioral forecasting only).
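
If you are unsure where the Stage 2 checkpoint ended up, a small search under `/kaggle/input` can locate it before you run the pipeline. This is a sketch under assumptions: `find_model_file` is a hypothetical helper, and the pipeline script may expect a specific path regardless of what the search finds.

```python
from pathlib import Path


def find_model_file(filename="mental_health_lstm.pt", root="/kaggle/input"):
    """Return the first path under `root` whose file name matches, or None."""
    base = Path(root)
    if not base.is_dir():
        return None
    # Sort so the result is stable when several datasets contain the file.
    matches = sorted(base.rglob(filename))
    return matches[0] if matches else None
```

`find_model_file()` returns a `Path` you can compare with the path used in `run_two_stage_pipeline_kaggle.py`, or `None` if the dataset has not been added as an input.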

In [None]:
# Re-copy the updated pipeline script (fixed paths)
!cp /kaggle/input/fds-mental-health-scripts-v2/run_two_stage_pipeline_kaggle.py /kaggle/working/

# Ensure we're in the right directory
%cd /kaggle/working

# Run the two-stage pipeline
!python run_two_stage_pipeline_kaggle.py

## Alternative: Analyze Stage 1 Only (Skip Stage 2)

If you haven't uploaded the Stage 2 model yet, you can analyze Stage 1 behavioral forecasting on its own:

In [None]:
# Test Stage 1 predictions directly
import torch
import pandas as pd
import numpy as np
from train_stage1_behavioral_kaggle import BehavioralForecastingLSTM

# Load model
# Note: weights_only=False unpickles arbitrary objects, so only load checkpoints you trust
checkpoint = torch.load('/kaggle/working/stage1_behavioral_forecasting.pt',
                       map_location='cpu', weights_only=False)

# hidden_dim and num_layers must match the values used during training
model = BehavioralForecastingLSTM(
    input_dim=len(checkpoint['feature_cols']),
    hidden_dim=32,
    num_layers=1,
    targets=checkpoint['targets']
)
model.load_state_dict(checkpoint['model_state'])
model.eval()

print("Stage 1 Model Test")
print("="*60)
print(f"Targets: {checkpoint['targets']}")
print(f"Val Loss (MSE): {checkpoint['val_loss']:.4f}")
print("\n✓ Model loaded successfully!")
print("\nYou can now:")
print("  1. Upload Stage 2 model and run full pipeline")
print("  2. Or download Stage 1 model and use it standalone")
print("\nTo download:")
print("  Open the notebook's Output section and download")
print("  stage1_behavioral_forecasting.pt from /kaggle/working/")

## Step 3: Analyze Results

In [None]:
import json
import pandas as pd
import numpy as np

# Load results
with open('/kaggle/working/two_stage_predictions.json') as f:
    results = json.load(f)

print("Two-Stage Pipeline Results")
print("="*60)
print(f"Total predictions: {results['metadata']['total_predictions']}")
print(f"Students: {results['metadata']['num_students']}")
print(f"\nStage 1: {results['metadata']['stage1_model']}")
print(f"Stage 2: {results['metadata']['stage2_model']}")

# Sample prediction
sample = results['predictions'][0]
print("\nSample Prediction:")
print(f"Student: {sample['student_id']}")
print(f"Date: {sample['date']}")
print("\nStage 1 Behavioral Predictions:")
for k, v in sample['stage1_behavioral_predictions'].items():
    uncertainty = sample['stage1_uncertainties'][k]
    print(f"  {k}: {v:.2f} ± {uncertainty:.2f}")
print("\nStage 2 Mental Health Predictions:")
for k, v in sample['stage2_mental_health_predictions'].items():
    print(f"  {k}: {v:.2f}")
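
To look at more than one sample, it can help to flatten the nested JSON into one row per prediction. The sketch below uses only the keys read in the cell above; `flatten_predictions` and the `stage1_`/`stage2_` column prefixes are hypothetical choices, not part of the pipeline's output format.

```python
def flatten_predictions(results):
    """Turn results['predictions'] into a list of flat dicts, one per prediction."""
    rows = []
    for pred in results["predictions"]:
        row = {"student_id": pred["student_id"], "date": pred["date"]}
        # Stage 1: one column for the predicted value, one for its uncertainty.
        for name, value in pred["stage1_behavioral_predictions"].items():
            row[f"stage1_{name}"] = value
            row[f"stage1_{name}_uncertainty"] = pred["stage1_uncertainties"][name]
        # Stage 2: one column per mental health output.
        for name, value in pred["stage2_mental_health_predictions"].items():
            row[f"stage2_{name}"] = value
        rows.append(row)
    return rows
```

`pd.DataFrame(flatten_predictions(results))` then gives a table that can be grouped by `student_id` or sorted by any uncertainty column.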

## Step 4: Uncertainty Analysis

In [None]:
# Analyze uncertainty propagation
uncertainties = []
for pred in results['predictions']:
    total_uncertainty = pred['error_propagation']['stage1_total_uncertainty']
    uncertainties.append(total_uncertainty)

print("Stage 1 Uncertainty Statistics")
print("="*60)
print(f"Mean total uncertainty: {np.mean(uncertainties):.3f}")
print(f"Std: {np.std(uncertainties):.3f}")
print(f"Min: {np.min(uncertainties):.3f}")
print(f"Max: {np.max(uncertainties):.3f}")
print("\nInterpretation:")
print("  Higher uncertainty in Stage 1 → Less reliable Stage 2 inputs")
print("  → Larger errors in final mental health predictions")
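
The interpretation above can be made concrete with a small Monte Carlo experiment: sample Stage 1 outputs from a normal distribution around the predicted mean, pass each sample through Stage 2, and measure the spread of the results. This is an illustrative sketch, not how the pipeline computes `stage1_total_uncertainty`. It assumes the reported uncertainty behaves like a standard deviation, and `toy_stage2` is a made-up stand-in for the real Stage 2 model.

```python
import random
import statistics


def propagate_uncertainty(stage2_fn, mean, std, n_samples=2000, seed=0):
    """Estimate the mean and spread of stage2_fn's output for an uncertain input.

    Draws `n_samples` inputs from Normal(mean, std) and returns the
    (mean, standard deviation) of the corresponding outputs.
    """
    rng = random.Random(seed)
    outputs = [stage2_fn(rng.gauss(mean, std)) for _ in range(n_samples)]
    return statistics.mean(outputs), statistics.stdev(outputs)


def toy_stage2(sleep_hours):
    # Hypothetical stand-in for Stage 2: less sleep gives a higher score.
    return 10.0 - 0.5 * sleep_hours
```

Calling `propagate_uncertainty(toy_stage2, 7.0, 0.2)` and then the same with a standard deviation of `1.0` shows the output spread growing with the input uncertainty. For this linear toy function the output spread is about half the input spread, matching the slope of 0.5.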

## Step 5: Export Results

In [None]:
# Check what files were created
import os

print("Checking for expected output files:")
print()

files_to_check = [
    'two_stage_predictions.json',
    'stage1_behavioral_forecasting.pt'
]

for filename in files_to_check:
    filepath = f'/kaggle/working/{filename}'
    if os.path.exists(filepath):
        size_mb = os.path.getsize(filepath) / (1024 * 1024)
        print(f"  ✓ {filename} ({size_mb:.2f} MB)")
    else:
        print(f"  ✗ {filename} (not found)")

print("\n" + "="*60)
print("HOW TO DOWNLOAD:")
print("="*60)
print("\n1. Click 'Output' tab at the top of this notebook")
print("2. On the right side, you'll see 'Output Files'")
print("3. Click the download icon next to each file:")
print("   - two_stage_predictions.json")
print("   - stage1_behavioral_forecasting.pt")
print("\nOR use Kaggle's built-in file browser in the Data panel →")
print("Navigate to /kaggle/working/ and download from there.")

print("\n✓ All outputs are automatically saved to /kaggle/working/")
print("✓ They are kept as notebook output when you save a version with 'Save & Run All'")
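
If you prefer a single download, the output files can be bundled into one archive first. The sketch below uses the standard-library `zipfile` module; `bundle_outputs` and the archive name `outputs.zip` are hypothetical choices made for this example.

```python
import zipfile
from pathlib import Path


def bundle_outputs(filenames, directory="/kaggle/working", zip_name="outputs.zip"):
    """Zip the listed files that exist in `directory`; return the names added."""
    base = Path(directory)
    added = []
    with zipfile.ZipFile(base / zip_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in filenames:
            path = base / name
            if path.is_file():  # silently skip files that were never created
                archive.write(path, arcname=name)
                added.append(name)
    return added
```

`bundle_outputs(files_to_check)` writes `/kaggle/working/outputs.zip` and returns the names it included, so a missing file shows up as a shorter list.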