# AML/DAAI 2025 - Mistake Detection: Reproduce Results on Colab

This notebook helps you reproduce the results from the paper using `scripts/run.py`.

## What this notebook does:
1. **Step 2**: Feature sanity check - verifies your features are loaded correctly
2. **Step 3**: Evaluation reproduction - runs the evaluation to reproduce paper results

## Prerequisites:
You need to have:
- Pre-extracted features (Omnivore and SlowFast) in `.npz` format
- Checkpoints from the official release (`error_recognition_best` directory)
- Annotation files (should be in the repository or uploaded separately)

## Quick Start:
1. Upload your data to Google Drive (or use direct upload)
2. Configure the repository in Section 1 and the data paths in Section 2
3. Run all cells sequentially


## 1. Setup: Clone Repository & Install Dependencies


In [11]:
# ============================================
# CONFIGURE YOUR REPOSITORY
# ============================================
# Option 1: Clone from GitHub (recommended)
REPO_URL = "https://github.com/aexomir/AML_mistake_detection.git"
REPO_BRANCH = "feat/step02"  # Leave empty for default branch, or specify branch name

# Option 2: Manual upload - set REPO_URL to empty string and upload files manually
# REPO_URL = ""

REPO_DIR = "aml_repo_v2"

print(f"Repository URL: {REPO_URL if REPO_URL else 'Manual upload mode'}")
print(f"Repository branch: {REPO_BRANCH if REPO_BRANCH else 'default'}")
print(f"Repository directory: {REPO_DIR}")


Repository URL: https://github.com/aexomir/AML_mistake_detection.git
Repository branch: feat/step02
Repository directory: aml_repo_v2


In [12]:
import os
import shutil

# Remove existing directory if it exists
if os.path.exists(REPO_DIR):
    print(f"Removing existing {REPO_DIR} directory...")
    shutil.rmtree(REPO_DIR)

# Clone repository
if REPO_URL:
    print(f"Cloning repository from {REPO_URL}...")
    clone_cmd = f"git clone {REPO_URL} {REPO_DIR}"
    result = os.system(clone_cmd)

    if result != 0:
        print(f"⚠ Clone failed. Please check the URL or upload files manually.")
        os.makedirs(REPO_DIR, exist_ok=True)
    else:
        print("✓ Repository cloned successfully")

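        # Checkout specific branch if specified
        if REPO_BRANCH:
            print(f"Checking out branch: {REPO_BRANCH}")
            # `git -C` runs the command inside REPO_DIR without changing the working directory
            checkout_result = os.system(f"git -C {REPO_DIR} checkout {REPO_BRANCH}")
            if checkout_result == 0:
                print(f"✓ Switched to branch: {REPO_BRANCH}")
            else:
                print(f"⚠ Could not check out {REPO_BRANCH}; staying on the default branch.")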
else:
    print("Manual upload mode: Creating directory...")
    os.makedirs(REPO_DIR, exist_ok=True)

# Change to repository directory
if os.path.exists(REPO_DIR):
    os.chdir(REPO_DIR)
    print(f"\n✓ Changed to directory: {os.getcwd()}")
    print(f"\nRepository contents:")
    !ls -la
else:
    print(f"✗ Error: {REPO_DIR} directory not found!")


Cloning repository from https://github.com/aexomir/AML_mistake_detection.git...
✓ Repository cloned successfully
Checking out branch: feat/step02
✓ Switched to branch: feat/step02

✓ Changed to directory: /content/aml_repo/aml_repo_v2

Repository contents:
total 6020
drwxr-xr-x  8 root root    4096 Dec 15 14:16 .
drwxr-xr-x 12 root root    4096 Dec 15 14:16 ..
-rw-r--r--  1 root root 6042142 Dec 15 14:16 3_Mistake_Detection.pdf
-rw-r--r--  1 root root   18838 Dec 15 14:16 base.py
-rw-r--r--  1 root root    1661 Dec 15 14:16 constants.py
drwxr-xr-x  3 root root    4096 Dec 15 14:16 core
drwxr-xr-x  2 root root    4096 Dec 15 14:16 dataloader
-rw-r--r--  1 root root    6148 Dec 15 14:16 .DS_Store
drwxr-xr-x  2 root root    4096 Dec 15 14:16 er_annotations
drwxr-xr-x  8 root root    4096 Dec 15 14:16 .git
-rw-r--r--  1 root root      65 Dec 15 14:16 .gitignore
-rwxr-xr-x  1 root root    1904 Dec 15 14:16 install_deps.py
-rw-r--r--  1 root root   11357 Dec 15 14:16 LICENSE
drwxr-xr-x  2 ro

In [13]:
# Verify repository structure
import os

print(f"Current working directory: {os.getcwd()}")
print(f"\nChecking repository structure...")

required_items = [
    'scripts/run.py',
    'core/evaluate.py',
    'dataloader',
    'base.py',
    'constants.py'
]

missing = []
for item in required_items:
    if os.path.exists(item):
        print(f"✓ Found: {item}")
    else:
        print(f"✗ Missing: {item}")
        missing.append(item)

if missing:
    print(f"\n⚠ Warning: Some required files/directories are missing!")
    print(f"Please ensure all files are present before proceeding.")
else:
    print(f"\n✓ Repository structure looks good!")


Current working directory: /content/aml_repo/aml_repo_v2

Checking repository structure...
✓ Found: scripts/run.py
✓ Found: core/evaluate.py
✓ Found: dataloader
✓ Found: base.py
✓ Found: constants.py

✓ Repository structure looks good!


In [14]:
# Install dependencies
import os

# Colab comes with PyTorch pre-installed, so we'll work with that
# Remove PyTorch version constraints to avoid conflicts
if os.path.exists('requirements.txt'):
    !sed -i '/^torch==/d' requirements.txt 2>/dev/null || true
    !sed -i '/^torchvision==/d' requirements.txt 2>/dev/null || true

# Install torcheval (required for evaluation metrics)
!pip install -q torcheval

# Install all remaining dependencies from requirements.txt
if os.path.exists('requirements.txt'):
    !pip install -q -r requirements.txt
elif os.path.exists('requirements-cpu.txt'):
    !pip install -q -r requirements-cpu.txt

print("✓ All dependencies installed successfully")

# Verify PyTorch installation
import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")


✓ All dependencies installed successfully

PyTorch version: 2.9.0+cu126
CUDA available: True
CUDA device: Tesla T4


## 2. Load Data: Features, Checkpoints, and Annotations


In [15]:
# ============================================
# CONFIGURE DATA PATHS
# ============================================
# Option 1: From Google Drive (recommended for large files)
USE_GOOGLE_DRIVE = True  # Set to False if uploading directly

# Paths on Google Drive (update these to match your Drive structure)
OMNIVORE_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/omnivore.zip"  # Can be .zip or directory
SLOWFAST_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/slowfast.zip"  # Can be .zip or directory
CHECKPOINTS_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/error_recognition_best.zip"  # Can be .zip or directory
ANNOTATIONS_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/annotations"  # Optional if in repo

# Option 2: Direct upload - set USE_GOOGLE_DRIVE = False and upload files in next cell

print("Data paths configured:")
print(f"  Use Google Drive: {USE_GOOGLE_DRIVE}")
print(f"  Omnivore: {OMNIVORE_DRIVE_PATH}")
print(f"  SlowFast: {SLOWFAST_DRIVE_PATH}")
print(f"  Checkpoints: {CHECKPOINTS_DRIVE_PATH}")
print(f"  Annotations: {ANNOTATIONS_DRIVE_PATH}")


Data paths configured:
  Use Google Drive: True
  Omnivore: /content/drive/MyDrive/AML_mistake_detection/omnivore.zip
  SlowFast: /content/drive/MyDrive/AML_mistake_detection/slowfast.zip
  Checkpoints: /content/drive/MyDrive/AML_mistake_detection/error_recognition_best.zip
  Annotations: /content/drive/MyDrive/AML_mistake_detection/annotations


In [16]:
# Mount Google Drive if using it
if USE_GOOGLE_DRIVE:
    from google.colab import drive
    drive.mount('/content/drive')
    print("✓ Google Drive mounted")
else:
    print("⚠ Google Drive not mounted. Please upload files directly using the file browser.")


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✓ Google Drive mounted


In [17]:
# Create data directory structure
import os
os.makedirs('data/video/omnivore', exist_ok=True)
os.makedirs('data/video/slowfast', exist_ok=True)
os.makedirs('checkpoints', exist_ok=True)
os.makedirs('annotations/annotation_json', exist_ok=True)
os.makedirs('annotations/data_splits', exist_ok=True)
os.makedirs('er_annotations', exist_ok=True)

print("✓ Directory structure created")


✓ Directory structure created


In [18]:
# Load features from Google Drive or direct upload
import os
import shutil
import subprocess
import glob
import zipfile

def load_features(source_path, dest_path, feature_name):
    """Load features from source (zip file or directory) to destination."""
    if not os.path.exists(source_path):
        print(f"⚠ {feature_name}: Source path not found: {source_path}")
        return False

    print(f"Loading {feature_name} features from: {source_path}")

    # Check if it's a zip archive by inspecting the file itself, not the path string
    is_zip = os.path.isfile(source_path) and zipfile.is_zipfile(source_path)

    if is_zip:
        print(f"  Detected zip file, extracting...")
        temp_zip = f'/tmp/{feature_name.lower()}.zip'
        temp_extracted = f'/tmp/{feature_name.lower()}_extracted'

        try:
            shutil.copy(source_path, temp_zip)
            subprocess.run(['unzip', '-q', temp_zip, '-d', temp_extracted], check=True)

            # Find .npz files in extracted directory
            npz_files = glob.glob(os.path.join(temp_extracted, '**/*.npz'), recursive=True)

            if npz_files:
                # Copy all .npz files to destination
                for npz_file in npz_files:
                    shutil.copy2(npz_file, dest_path)
                print(f"  ✓ Extracted and copied {len(npz_files)} .npz files")

                # Cleanup
                shutil.rmtree(temp_extracted, ignore_errors=True)
                os.remove(temp_zip)
                return True
            else:
                print(f"  ⚠ No .npz files found in extracted zip")
                shutil.rmtree(temp_extracted, ignore_errors=True)
                os.remove(temp_zip)
                return False
        except Exception as e:
            print(f"  ✗ Error extracting {feature_name} zip: {e}")
            if os.path.exists(temp_extracted):
                shutil.rmtree(temp_extracted, ignore_errors=True)
            if os.path.exists(temp_zip):
                os.remove(temp_zip)
            return False
    else:
        # It's a directory
        print(f"  Detected directory, copying .npz files...")
        npz_files = glob.glob(os.path.join(source_path, '**/*.npz'), recursive=True)

        if npz_files:
            # Copy all .npz files to destination
            for npz_file in npz_files:
                shutil.copy2(npz_file, dest_path)
            print(f"  ✓ Copied {len(npz_files)} .npz files")
            return True
        else:
            print(f"  ⚠ No .npz files found in {source_path}")
            return False

# Load Omnivore and SlowFast features
if USE_GOOGLE_DRIVE:
    load_features(OMNIVORE_DRIVE_PATH, 'data/video/omnivore', 'Omnivore')
    load_features(SLOWFAST_DRIVE_PATH, 'data/video/slowfast', 'SlowFast')
else:
    print("⚠ Please upload features manually:")
    print("  1. Use the file browser to upload .npz files or zip files")
    print("  2. Extract/copy them to data/video/omnivore/ and data/video/slowfast/")

# Verify features
omnivore_count = len([f for f in os.listdir('data/video/omnivore') if f.endswith('.npz')]) if os.path.exists('data/video/omnivore') else 0
slowfast_count = len([f for f in os.listdir('data/video/slowfast') if f.endswith('.npz')]) if os.path.exists('data/video/slowfast') else 0
print(f"\nFeature file counts:")
print(f"  Omnivore: {omnivore_count} .npz files")
print(f"  SlowFast: {slowfast_count} .npz files")


Loading Omnivore features from: /content/drive/MyDrive/AML_mistake_detection/omnivore.zip
  Detected zip file, extracting...
  ✓ Extracted and copied 384 .npz files
Loading SlowFast features from: /content/drive/MyDrive/AML_mistake_detection/slowfast.zip
  Detected zip file, extracting...
  ✓ Extracted and copied 384 .npz files

Feature file counts:
  Omnivore: 384 .npz files
  SlowFast: 384 .npz files


In [19]:
# Load checkpoints
import os
import shutil
import subprocess
import zipfile

checkpoint_path = CHECKPOINTS_DRIVE_PATH if USE_GOOGLE_DRIVE else None

if checkpoint_path and os.path.exists(checkpoint_path):
    print(f"Loading checkpoints from: {checkpoint_path}")

    # Check if it's a zip archive by inspecting the file itself, not the path string
    is_zip = os.path.isfile(checkpoint_path) and zipfile.is_zipfile(checkpoint_path)

    if is_zip:
        print("Detected zip file, extracting...")
        shutil.copy(checkpoint_path, '/tmp/checkpoints.zip')

        try:
            subprocess.run(['unzip', '-q', '/tmp/checkpoints.zip', '-d', '/tmp/checkpoints_extracted'], check=True)

            # Find error_recognition_best directory
            extracted_base = '/tmp/checkpoints_extracted'
            extracted_path = None

            # Check common locations
            if os.path.exists(os.path.join(extracted_base, 'error_recognition_best')):
                extracted_path = os.path.join(extracted_base, 'error_recognition_best')
            elif os.path.exists(os.path.join(extracted_base, 'MLP')) or os.path.exists(os.path.join(extracted_base, 'Transformer')):
                extracted_path = extracted_base
            else:
                # Search recursively
                for root, dirs, files in os.walk(extracted_base):
                    if 'error_recognition_best' in dirs:
                        extracted_path = os.path.join(root, 'error_recognition_best')
                        break
                    if 'MLP' in dirs or 'Transformer' in dirs:
                        extracted_path = root
                        break

                if extracted_path is None:
                    extracted_path = extracted_base

            print(f"Copying from: {extracted_path}")
            shutil.copytree(extracted_path, 'checkpoints/error_recognition_best', dirs_exist_ok=True)

            # Cleanup
            shutil.rmtree('/tmp/checkpoints_extracted', ignore_errors=True)
            os.remove('/tmp/checkpoints.zip')
            print("✓ Checkpoints extracted")
        except Exception as e:
            print(f"✗ Error extracting checkpoints: {e}")
    else:
        # It's a directory
        print("Detected directory, copying...")
        if os.path.basename(checkpoint_path) == 'error_recognition_best':
            shutil.copytree(checkpoint_path, 'checkpoints/error_recognition_best', dirs_exist_ok=True)
        else:
            os.makedirs('checkpoints/error_recognition_best', exist_ok=True)
            for item in os.listdir(checkpoint_path):
                src = os.path.join(checkpoint_path, item)
                dst = os.path.join('checkpoints/error_recognition_best', item)
                if os.path.isdir(src):
                    shutil.copytree(src, dst, dirs_exist_ok=True)
                else:
                    shutil.copy2(src, dst)
        print("✓ Checkpoints copied")
else:
    print("⚠ Checkpoints not found. Please upload manually:")
    print("  1. Download from: https://utdallas.app.box.com/s/uz3s1alrzucz03sleify8kazhuc1ksl3")
    print("  2. Extract error_recognition_best directory")
    print("  3. Upload to checkpoints/error_recognition_best/")

# Verify checkpoints
if os.path.exists('checkpoints/error_recognition_best'):
    pt_files = []
    for root, dirs, files in os.walk('checkpoints/error_recognition_best'):
        pt_files.extend([os.path.join(root, f) for f in files if f.endswith('.pt')])
    print(f"\n✓ Found {len(pt_files)} checkpoint files")
    if pt_files:
        print("\nSample checkpoint files:")
        for f in pt_files[:3]:
            print(f"  {f}")
else:
    print("\n✗ Checkpoints directory not found")


Loading checkpoints from: /content/drive/MyDrive/AML_mistake_detection/error_recognition_best.zip
Detected zip file, extracting...
Copying from: /tmp/checkpoints_extracted/error_recognition_best
✓ Checkpoints extracted

✓ Found 54 checkpoint files

Sample checkpoint files:
  checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_recordings_epoch_33.pt
  checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_step_epoch_43.pt
  checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_environment_epoch_5.pt


In [20]:
# Load annotations (if not already in repository)
import os
import shutil

if USE_GOOGLE_DRIVE and os.path.exists(ANNOTATIONS_DRIVE_PATH):
    print(f"Loading annotations from: {ANNOTATIONS_DRIVE_PATH}")

    # Copy annotation_json
    annotation_json_src = os.path.join(ANNOTATIONS_DRIVE_PATH, 'annotation_json')
    if os.path.exists(annotation_json_src):
        for file in os.listdir(annotation_json_src):
            src = os.path.join(annotation_json_src, file)
            dst = os.path.join('annotations/annotation_json', file)
            if os.path.isfile(src):
                shutil.copy2(src, dst)
                print(f"  ✓ Copied {file}")

    # Copy data_splits
    data_splits_src = os.path.join(ANNOTATIONS_DRIVE_PATH, 'data_splits')
    if os.path.exists(data_splits_src):
        for file in os.listdir(data_splits_src):
            src = os.path.join(data_splits_src, file)
            dst = os.path.join('annotations/data_splits', file)
            if os.path.isfile(src):
                shutil.copy2(src, dst)
                print(f"  ✓ Copied {file}")

    # Copy er_annotations
    er_annotations_src = os.path.join(ANNOTATIONS_DRIVE_PATH, 'er_annotations')
    if os.path.exists(er_annotations_src):
        for file in os.listdir(er_annotations_src):
            src = os.path.join(er_annotations_src, file)
            dst = os.path.join('er_annotations', file)
            if os.path.isfile(src):
                shutil.copy2(src, dst)
                print(f"  ✓ Copied {file}")
else:
    print("⚠ Annotations not found in Drive. Checking repository...")

# Verify required annotation files
print("\nVerifying annotation files...")
required_files = [
    'annotations/annotation_json/step_annotations.json',
    'annotations/annotation_json/error_annotations.json',
    'er_annotations/recordings_combined_splits.json'
]

missing = []
for file in required_files:
    if os.path.exists(file):
        print(f"✓ Found: {file}")
    else:
        print(f"✗ Missing: {file}")
        missing.append(file)

if missing:
    print(f"\n⚠ Warning: {len(missing)} required annotation file(s) are missing!")
    print("Please ensure these files are available before running Step 3.")
else:
    print("\n✓ All required annotation files are present!")


Loading annotations from: /content/drive/MyDrive/AML_mistake_detection/annotations
  ✓ Copied error_category_idx.json
  ✓ Copied activity_idx_step_idx.json
  ✓ Copied error_annotations.json
  ✓ Copied step_annotations.json
  ✓ Copied complete_step_annotations.json
  ✓ Copied step_idx_description.json
  ✓ Copied recording_id_step_idx.json
  ✓ Copied environment_data_split_combined.json
  ✓ Copied person_data_split_normal.json
  ✓ Copied recordings_data_split_normal.json
  ✓ Copied person_data_split_combined.json
  ✓ Copied environment_data_split_normal.json
  ✓ Copied recordings_data_split_combined.json
  ✓ Copied recipes_data_split_normal.json
  ✓ Copied recipes_data_split_combined.json

Verifying annotation files...
✓ Found: annotations/annotation_json/step_annotations.json
✓ Found: annotations/annotation_json/error_annotations.json
✓ Found: er_annotations/recordings_combined_splits.json

✓ All required annotation files are present!


## 3. Step 2: Feature Sanity Check

This step verifies that your features are loaded correctly and can be read.
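
As a complementary check, the sketch below compares the `.npz` file names in the two feature directories, so that a recording present for one backbone but missing for the other is caught early. It assumes both backbones use identical file names per recording, which is what the example files in the Step 2 output suggest; `npz_names` and `compare_feature_dirs` are hypothetical helpers, not part of the repository.

```python
import os


def npz_names(directory):
    """Return the set of .npz file names found directly inside `directory`."""
    return {name for name in os.listdir(directory) if name.endswith(".npz")}


def compare_feature_dirs(dir_a, dir_b):
    """Return (names only in dir_a, names only in dir_b), each sorted."""
    names_a, names_b = npz_names(dir_a), npz_names(dir_b)
    return sorted(names_a - names_b), sorted(names_b - names_a)


# Paths match the directory layout created earlier in this notebook.
OMNIVORE_DIR = "data/video/omnivore"
SLOWFAST_DIR = "data/video/slowfast"

if os.path.isdir(OMNIVORE_DIR) and os.path.isdir(SLOWFAST_DIR):
    only_omnivore, only_slowfast = compare_feature_dirs(OMNIVORE_DIR, SLOWFAST_DIR)
    print(f"Only in Omnivore: {len(only_omnivore)} file(s)")
    print(f"Only in SlowFast: {len(only_slowfast)} file(s)")
else:
    print("Feature directories not found; run the data-loading cells first.")
```

Two empty lists mean every recording has features for both backbones.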


In [21]:
# Run Step 2: Feature sanity check
!python scripts/run.py step2 --features_root data


Step 2: Feature Sanity Check

Features root: /content/aml_repo/aml_repo_v2/data

--- Checking OMNIVORE ---
✓ Directory found: /content/aml_repo/aml_repo_v2/data/video/omnivore
  Found 384 .npz files
  Example files:
    - 10_16_360p.mp4_1s_1s.npz
    - 10_18_360p.mp4_1s_1s.npz
    - 10_24_360p.mp4_1s_1s.npz
    - 10_26_360p.mp4_1s_1s.npz
    - 10_31_360p.mp4_1s_1s.npz

  Loading sample file: 10_16_360p.mp4_1s_1s.npz
  Keys in file: ['arr_0']
  Shape: (974, 1024)
  Dtype: float32
  Min: -3.1500, Max: 3.1098, Mean: -0.0166

--- Checking SLOWFAST ---
✓ Directory found: /content/aml_repo/aml_repo_v2/data/video/slowfast
  Found 384 .npz files
  Example files:
    - 10_16_360p.mp4_1s_1s.npz
    - 10_18_360p.mp4_1s_1s.npz
    - 10_24_360p.mp4_1s_1s.npz
    - 10_26_360p.mp4_1s_1s.npz
    - 10_31_360p.mp4_1s_1s.npz

  Loading sample file: 10_16_360p.mp4_1s_1s.npz
  Keys in file: ['arr_0']
  Shape: (974, 400)
  Dtype: float32
  Min: -9.9709, Max: 26.4474, Mean: -0.0000

Summary:
✓ OMNIVORE: exis

## 4. Step 3: Evaluation Reproduction

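Run the evaluations below to reproduce the results from the paper. Each cell's comment states the target F1 and AUC in percent; compare them with the `f1` and `auc` values on the `test Step Level Metrics` line of the output. If your checkpoint files differ, update the `--ckpt` paths with the epoch numbers found in your checkpoints directory.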

**Note**: Use threshold 0.6 for `step` split and 0.4 for `recordings` split.
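
To illustrate why the threshold matters, here is a minimal sketch of how a decision threshold turns scores into binary error predictions and how precision, recall, and F1 follow from them. The scores and labels are invented for illustration, and the `>=` comparison is an assumption; `core/evaluate.py` may apply the threshold differently.

```python
# Thresholds quoted in the note above, keyed by split name.
SPLIT_THRESHOLDS = {"step": 0.6, "recordings": 0.4}


def binarize(scores, threshold):
    """Predict an error (1) when the score reaches the threshold (assumed >=)."""
    return [1 if score >= threshold else 0 for score in scores]


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive (error) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
scores = [0.9, 0.55, 0.45, 0.2, 0.65, 0.5]
labels = [1, 1, 0, 0, 0, 1]

for split, threshold in SPLIT_THRESHOLDS.items():
    preds = binarize(scores, threshold)
    precision, recall, f1 = precision_recall_f1(labels, preds)
    print(f"{split}: threshold={threshold} precision={precision:.2f} "
          f"recall={recall:.2f} f1={f1:.2f}")
```

On this toy data the lower threshold flags more steps as errors, which raises recall at the cost of precision.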


In [22]:
# Example: Omnivore - MLP - Step split
# This should reproduce: F1=24.26, AUC=75.74
!python scripts/run.py step3 --split step --backbone omnivore --variant MLP \
  --ckpt checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_step_epoch_43.pt \
  --threshold 0.6


Step 3: Evaluation Reproduction

Running: /usr/bin/python3 -m core.evaluate --split step --backbone omnivore --variant MLP --ckpt checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_step_epoch_43.pt --threshold 0.6

Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 42347/798: 100% 798/798 [00:11<00:00, 71.64it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.4096162736939436, 'recall': 0.2989708115404083, 'f1': 0.3456549302643129, 'accuracy': 0.6831416629277163, 'auc': np.float64(0.6541560352028618), 'pr_auc': tensor(0.3187)}
test Step Level Metrics: {'precision': 0.6607142857142857, 'recall': 0.14859437751004015, 'f1': 0.24262295081967214, 'accuracy': 0.7105263157894737, 'auc': np.float64(0.7573902166041213), 'pr_auc': tensor(0.3638)}
----------------------------------------------------------------


In [27]:
# Example: Omnivore - MLP - Recordings split
# This should reproduce: F1=55.42, AUC=63.03
# Update the epoch number in the checkpoint path
!python scripts/run.py step3 --split recordings --backbone omnivore --variant MLP \
  --ckpt checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_recordings_epoch_33.pt \
  --threshold 0.4


Step 3: Evaluation Reproduction

Running: /usr/bin/python3 -m core.evaluate --split recordings --backbone omnivore --variant MLP --ckpt checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_recordings_epoch_33.pt --threshold 0.4

Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 38340/671: 100% 671/671 [00:04<00:00, 148.32it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.3964945261528254, 'recall': 0.5688109780280797, 'f1': 0.46727266803505685, 'accuracy': 0.5735263432446531, 'auc': np.float64(0.5988330748775713), 'pr_auc': tensor(0.3673)}
test Step Level Metrics: {'precision': 0.4090909090909091, 'recall': 0.8589211618257261, 'f1': 0.5542168674698795, 'accuracy': 0.503725782414307, 'auc': np.float64(0.6302808067162018), 'pr_auc': tensor(0.4020)}
----------------------------------------------------------------


In [28]:
# Example: Omnivore - Transformer - Step split
# This should reproduce: F1=55.39, AUC=75.62
# Update the epoch number in the checkpoint path
!python scripts/run.py step3 --split step --backbone omnivore --variant Transformer \
  --ckpt checkpoints/error_recognition_best/Transformer/omnivore/error_recognition_Transformer_omnivore_step_epoch_9.pt \
  --threshold 0.6


Step 3: Evaluation Reproduction

Running: /usr/bin/python3 -m core.evaluate --split step --backbone omnivore --variant Transformer --ckpt checkpoints/error_recognition_best/Transformer/omnivore/error_recognition_Transformer_omnivore_step_epoch_9.pt --threshold 0.6

Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 42347/798: 100% 798/798 [00:05<00:00, 137.96it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.4445452483556362, 'recall': 0.6613801248523705, 'f1': 0.5317056629365887, 'accuracy': 0.6738848088412402, 'auc': np.float64(0.7461755308526944), 'pr_auc': tensor(0.3888)}
test Step Level Metrics: {'precision': 0.5155709342560554, 'recall': 0.5983935742971888, 'f1': 0.5539033457249071, 'accuracy': 0.6992481203007519, 'auc': np.float64(0.7561832027563805), 'pr_auc': tensor(0.4338)}
----------------------------------------------------------------


In [29]:
# Example: Omnivore - Transformer - Recordings split
# This should reproduce: F1=40.73, AUC=62.27
# Update the epoch number in the checkpoint path
!python scripts/run.py step3 --split recordings --backbone omnivore --variant Transformer \
  --ckpt checkpoints/error_recognition_best/Transformer/omnivore/error_recognition_Transformer_omnivore_recordings_epoch_31.pt \
  --threshold 0.4


Step 3: Evaluation Reproduction

Running: /usr/bin/python3 -m core.evaluate --split recordings --backbone omnivore --variant Transformer --ckpt checkpoints/error_recognition_best/Transformer/omnivore/error_recognition_Transformer_omnivore_recordings_epoch_31.pt --threshold 0.4

Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 38340/671: 100% 671/671 [00:04<00:00, 138.06it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.4491327720864185, 'recall': 0.35123344173871657, 'f1': 0.39419567346212053, 'accuracy': 0.645018257694314, 'auc': np.float64(0.6254427005929003), 'pr_auc': tensor(0.3711)}
test Step Level Metrics: {'precision': 0.45408163265306123, 'recall': 0.36929460580912865, 'f1': 0.4073226544622426, 'accuracy': 0.6140089418777943, 'auc': np.float64(0.6226768310334846), 'pr_auc': tensor(0.3942)}
----------------------------------------------------------------
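
The printed metrics can also be checked against the target numbers programmatically. The sketch below parses one `Step Level Metrics` line with a regular expression (the dictionary is not valid Python literal syntax because of the `np.float64(...)` and `tensor(...)` wrappers) and compares F1 and AUC, in percent, with the targets. `parse_metrics` and `matches_target` are hypothetical helpers; the sample line is copied from the Omnivore MLP step-split output above.

```python
import re

# Matches entries such as 'f1': 0.24, 'auc': np.float64(0.75) or 'pr_auc': tensor(0.36).
METRIC_PATTERN = re.compile(
    r"'(\w+)':\s*(?:np\.float64\(|tensor\()?([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\)?"
)


def parse_metrics(line):
    """Extract metric names and float values from one printed metrics line."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(line)}


def matches_target(metrics, target_f1, target_auc, tol=0.005):
    """Compare F1 and AUC (as percentages rounded to 2 decimals) with targets."""
    f1_pct = round(metrics["f1"] * 100, 2)
    auc_pct = round(metrics["auc"] * 100, 2)
    return abs(f1_pct - target_f1) <= tol and abs(auc_pct - target_auc) <= tol


sample_line = (
    "test Step Level Metrics: {'precision': 0.6607142857142857, "
    "'recall': 0.14859437751004015, 'f1': 0.24262295081967214, "
    "'accuracy': 0.7105263157894737, 'auc': np.float64(0.7573902166041213), "
    "'pr_auc': tensor(0.3638)}"
)

metrics = parse_metrics(sample_line)
print("Reproduced:", matches_target(metrics, target_f1=24.26, target_auc=75.74))
```

The same parsed values allow a quick consistency check: F1 should equal `2 * precision * recall / (precision + recall)`.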


## Additional Evaluations

You can also run evaluations for SlowFast backbone or other configurations. Make sure to update the checkpoint paths with the correct epoch numbers from your checkpoints directory.
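
One way to avoid hard-coding epoch numbers is to look the checkpoint up by its naming pattern and build the command from it. The sketch below assumes the layout `<variant>/<backbone>/error_recognition_<variant>_<backbone>_<split>_epoch_<N>.pt` seen for the Omnivore checkpoints; that SlowFast checkpoints live under a `slowfast` directory and that `--backbone slowfast` is accepted are assumptions to verify against your own checkpoint listing. `find_checkpoint` and `build_step3_command` are hypothetical helpers.

```python
import glob
import os

CHECKPOINT_ROOT = "checkpoints/error_recognition_best"


def find_checkpoint(variant, backbone, split, root=CHECKPOINT_ROOT):
    """Return the single checkpoint matching the assumed naming pattern."""
    pattern = os.path.join(
        root, variant, backbone,
        f"error_recognition_{variant}_{backbone}_{split}_epoch_*.pt",
    )
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"Expected exactly one checkpoint for {pattern}, found {len(matches)}"
        )
    return matches[0]


def build_step3_command(variant, backbone, split, threshold, root=CHECKPOINT_ROOT):
    """Assemble the argument list for `scripts/run.py step3`."""
    ckpt = find_checkpoint(variant, backbone, split, root)
    return [
        "python", "scripts/run.py", "step3",
        "--split", split,
        "--backbone", backbone,
        "--variant", variant,
        "--ckpt", ckpt,
        "--threshold", str(threshold),
    ]


try:
    # Backbone name "slowfast" is an assumption; adjust to your listing.
    print(" ".join(build_step3_command("MLP", "slowfast", "step", 0.6)))
except FileNotFoundError as err:
    print(f"Checkpoint lookup failed: {err}")
```

The printed command can be pasted into a cell after `!`, or the list can be passed to `subprocess.run`.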


In [30]:
# List available checkpoints to find correct epoch numbers
import os
import glob

checkpoint_base = 'checkpoints/error_recognition_best'
if os.path.exists(checkpoint_base):
    print("Available checkpoints:")
    for ckpt_file in sorted(glob.glob(os.path.join(checkpoint_base, '**/*.pt'), recursive=True)):
        print(f"  {ckpt_file}")
else:
    print("Checkpoints directory not found")


Available checkpoints:
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_environment_epoch_11.pt
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_person_epoch_39.pt
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_recordings_epoch_45.pt
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_step_epoch_41.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_environment_epoch_50.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_person_epoch_8.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_recordings_epoch_2.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_step_epoch_28.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_video_audio_environment_epoch_31.pt
  checkpoints/

## Troubleshooting

### Common Issues:

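1. **Missing features**: Make sure both `data/video/omnivore/` and `data/video/slowfast/` contain `.npz` files (384 each in the run shown above)
2. **Missing checkpoints**: Download them from the official Box link (printed by the checkpoint-loading cell) and extract the `error_recognition_best` directory into `checkpoints/`
3. **Missing annotations**: Ensure `annotations/annotation_json/` contains `step_annotations.json` and `error_annotations.json`, and `er_annotations/` contains `recordings_combined_splits.json`
4. **Wrong checkpoint path**: Use the checkpoint-listing cell under "Additional Evaluations" to find the available files and update the `--ckpt` paths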
5. **CUDA errors**: The code will automatically fall back to CPU if CUDA is not available
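
The first three issues can be diagnosed in one pass. The following is a minimal sketch that checks the directory layout used in this notebook and returns a list of human-readable problems; `diagnose` is a hypothetical helper, and it only tests that files exist, not that their contents are valid.

```python
import glob
import os

# Annotation files verified earlier in this notebook.
REQUIRED_ANNOTATIONS = [
    "annotations/annotation_json/step_annotations.json",
    "annotations/annotation_json/error_annotations.json",
    "er_annotations/recordings_combined_splits.json",
]


def diagnose(root="."):
    """Return a list of problems found under `root` (empty if none)."""
    problems = []

    # Issue 1: each backbone directory needs at least one .npz file.
    for backbone in ("omnivore", "slowfast"):
        feature_dir = os.path.join(root, "data", "video", backbone)
        if not glob.glob(os.path.join(feature_dir, "*.npz")):
            problems.append(f"Missing features: no .npz files in {feature_dir}")

    # Issue 2: at least one .pt file somewhere under the checkpoint directory.
    ckpt_dir = os.path.join(root, "checkpoints", "error_recognition_best")
    if not glob.glob(os.path.join(ckpt_dir, "**", "*.pt"), recursive=True):
        problems.append(f"Missing checkpoints: no .pt files under {ckpt_dir}")

    # Issue 3: every required annotation file must exist.
    for rel_path in REQUIRED_ANNOTATIONS:
        if not os.path.isfile(os.path.join(root, rel_path)):
            problems.append(f"Missing annotation: {rel_path}")

    return problems


for problem in diagnose():
    print(f"✗ {problem}")
```

Run it from the repository root; no output means none of these three issues was detected.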

### Getting Help:
- Check the README.md in the repository
- Verify all file paths are correct
- Ensure all dependencies are installed
