<a href="https://colab.research.google.com/github/aexomir/AML_mistake_detection/blob/feat%2Ferror-types/notebooks/colab_error_type_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AML/DAAI 2025 - Error Type Analysis on Colab

This notebook evaluates the pretrained mistake detection models and breaks their performance down by error type.

## What this notebook does:
1. **Setup**: Clone repository and install dependencies
2. **Load Data**: Features, checkpoints, and annotations
3. **Run Evaluations**: Evaluate models with error-type analysis
4. **View Results**: Display and download error-type analysis results

## Prerequisites:
You need to have:
- Pre-extracted features (Omnivore and SlowFast) in `.npz` format
- Checkpoints from the official release (`error_recognition_best` directory)
- Annotation files (should be in the repository or uploaded separately)
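
Before uploading, it can help to confirm what a `.npz` feature file actually contains. The following is a minimal sketch, assuming only that the files are standard NumPy archives. The helper name `summarize_npz`, the key `arr_0`, and the demo shape are illustrative and are not taken from the repository.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    summary = {}
    # np.load on an .npz returns a lazy, dict-like NpzFile; `.files` lists its keys.
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Demo on a synthetic file: the key "arr_0" and the (5, 1024) shape are
# placeholders, not the layout of the real Omnivore/SlowFast features.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_features.npz")
np.savez(demo_path, arr_0=np.zeros((5, 1024), dtype=np.float32))
print(summarize_npz(demo_path))
```

Pointing `summarize_npz` at one real feature file shows its array names and shapes without loading the whole dataset.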

## Quick Start:
1. Upload your data to Google Drive (or use direct upload)
2. Configure paths in Section 1
3. Run all cells sequentially


## 1. Setup: Clone Repository & Install Dependencies


In [1]:
# ============================================
# CONFIGURE YOUR REPOSITORY
# ============================================
# Option 1: Clone from GitHub (recommended)
REPO_URL = "https://github.com/aexomir/AML_mistake_detection.git"
REPO_BRANCH = "feat/error-types"  # Leave empty for default branch, or specify branch name

# Option 2: Manual upload - set REPO_URL to empty string and upload files manually
# REPO_URL = ""

REPO_DIR = "aml_repo_v2"

print(f"Repository URL: {REPO_URL if REPO_URL else 'Manual upload mode'}")
print(f"Repository branch: {REPO_BRANCH if REPO_BRANCH else 'default'}")
print(f"Repository directory: {REPO_DIR}")


Repository URL: https://github.com/aexomir/AML_mistake_detection.git
Repository branch: feat/error-types
Repository directory: aml_repo_v2


In [2]:
import os
import shutil

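# Return to the Colab root so that re-running this cell does not clone into the old checkout
if os.path.isdir('/content'):
    os.chdir('/content')

# Remove existing directory if it exists
if os.path.exists(REPO_DIR):
    print(f"Removing existing {REPO_DIR} directory...")
    shutil.rmtree(REPO_DIR)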

# Clone repository
if REPO_URL:
    print(f"Cloning repository from {REPO_URL}...")
    clone_cmd = f"git clone {REPO_URL} {REPO_DIR}"
    result = os.system(clone_cmd)

    if result != 0:
        print(f"⚠ Clone failed. Please check the URL or upload files manually.")
        os.makedirs(REPO_DIR, exist_ok=True)
    else:
        print("✓ Repository cloned successfully")

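        # Check out the requested branch, if any, and report whether it worked
        if REPO_BRANCH:
            print(f"Checking out branch: {REPO_BRANCH}")
            checkout_result = os.system(f"git -C {REPO_DIR} checkout {REPO_BRANCH}")
            if checkout_result == 0:
                print(f"✓ Switched to branch: {REPO_BRANCH}")
            else:
                print(f"⚠ Could not check out branch: {REPO_BRANCH}")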
else:
    print("Manual upload mode: Creating directory...")
    os.makedirs(REPO_DIR, exist_ok=True)

# Change to repository directory
if os.path.exists(REPO_DIR):
    os.chdir(REPO_DIR)
    print(f"\n✓ Changed to directory: {os.getcwd()}")
    print(f"\nRepository contents:")
    !ls -la
else:
    print(f"✗ Error: {REPO_DIR} directory not found!")


Removing existing aml_repo_v2 directory...
Cloning repository from https://github.com/aexomir/AML_mistake_detection.git...
✓ Repository cloned successfully
Checking out branch: feat/error-types
✓ Switched to branch: feat/error-types

✓ Changed to directory: /content/aml_repo_v2

Repository contents:
total 6032
drwxr-xr-x 8 root root    4096 Dec 15 20:41 .
drwxr-xr-x 1 root root    4096 Dec 15 20:41 ..
-rw-r--r-- 1 root root 6042142 Dec 15 20:41 3_Mistake_Detection.pdf
-rw-r--r-- 1 root root   28923 Dec 15 20:41 base.py
-rw-r--r-- 1 root root    1940 Dec 15 20:41 constants.py
drwxr-xr-x 3 root root    4096 Dec 15 20:41 core
drwxr-xr-x 2 root root    4096 Dec 15 20:41 dataloader
-rw-r--r-- 1 root root    6148 Dec 15 20:41 .DS_Store
drwxr-xr-x 2 root root    4096 Dec 15 20:41 er_annotations
drwxr-xr-x 8 root root    4096 Dec 15 20:41 .git
-rw-r--r-- 1 root root      65 Dec 15 20:41 .gitignore
-rwxr-xr-x 1 root root    1904 Dec 15 20:41 install_deps.py
-rw-r--r-- 1 root root   11357 Dec 15

In [3]:
# Verify repository structure
import os

print(f"Current working directory: {os.getcwd()}")
print(f"\nChecking repository structure...")

required_items = [
    'core/evaluate.py',
    'dataloader',
    'base.py',
    'constants.py'
]

missing = []
for item in required_items:
    if os.path.exists(item):
        print(f"✓ Found: {item}")
    else:
        print(f"✗ Missing: {item}")
        missing.append(item)

if missing:
    print(f"\n⚠ Warning: Some required files/directories are missing!")
    print(f"Please ensure all files are present before proceeding.")
else:
    print(f"\n✓ Repository structure looks good!")


Current working directory: /content/aml_repo_v2

Checking repository structure...
✓ Found: core/evaluate.py
✓ Found: dataloader
✓ Found: base.py
✓ Found: constants.py

✓ Repository structure looks good!


In [4]:
# Install dependencies
# Colab comes with PyTorch pre-installed, so we'll work with that
# Remove PyTorch version constraints to avoid conflicts
if os.path.exists('requirements.txt'):
    !sed -i '/^torch==/d' requirements.txt 2>/dev/null || true
    !sed -i '/^torchvision==/d' requirements.txt 2>/dev/null || true

# Install torcheval (required for evaluation metrics)
!pip install -q torcheval

# Install all remaining dependencies from requirements.txt
if os.path.exists('requirements.txt'):
    !pip install -q -r requirements.txt
elif os.path.exists('requirements-cpu.txt'):
    !pip install -q -r requirements-cpu.txt

print("✓ All dependencies installed successfully")

# Verify PyTorch installation
import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")


✓ All dependencies installed successfully

PyTorch version: 2.9.0+cu126
CUDA available: True
CUDA device: Tesla T4


## 2. Load Data: Features, Checkpoints, and Annotations


In [5]:
# ============================================
# CONFIGURE DATA PATHS
# ============================================
# Option 1: From Google Drive (recommended for large files)
USE_GOOGLE_DRIVE = True  # Set to False if uploading directly

# Paths on Google Drive (update these to match your Drive structure)
OMNIVORE_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/omnivore.zip"  # Can be .zip or directory
SLOWFAST_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/slowfast.zip"  # Can be .zip or directory
CHECKPOINTS_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/error_recognition_best.zip"  # Can be .zip or directory
ANNOTATIONS_DRIVE_PATH = "/content/drive/MyDrive/AML_mistake_detection/annotations"  # Optional if in repo

# Option 2: Direct upload - set USE_GOOGLE_DRIVE = False and upload files in next cell

print("Data paths configured:")
print(f"  Use Google Drive: {USE_GOOGLE_DRIVE}")
print(f"  Omnivore: {OMNIVORE_DRIVE_PATH}")
print(f"  SlowFast: {SLOWFAST_DRIVE_PATH}")
print(f"  Checkpoints: {CHECKPOINTS_DRIVE_PATH}")
print(f"  Annotations: {ANNOTATIONS_DRIVE_PATH}")


Data paths configured:
  Use Google Drive: True
  Omnivore: /content/drive/MyDrive/AML_mistake_detection/omnivore.zip
  SlowFast: /content/drive/MyDrive/AML_mistake_detection/slowfast.zip
  Checkpoints: /content/drive/MyDrive/AML_mistake_detection/error_recognition_best.zip
  Annotations: /content/drive/MyDrive/AML_mistake_detection/annotations


In [6]:
# Mount Google Drive if using it
if USE_GOOGLE_DRIVE:
    from google.colab import drive
    drive.mount('/content/drive')
    print("✓ Google Drive mounted")
else:
    print("⚠ Google Drive not mounted. Please upload files directly using the file browser.")


Mounted at /content/drive
✓ Google Drive mounted


In [7]:
# Create data directory structure
import os
os.makedirs('data/video/omnivore', exist_ok=True)
os.makedirs('data/video/slowfast', exist_ok=True)
os.makedirs('checkpoints', exist_ok=True)
os.makedirs('annotations/annotation_json', exist_ok=True)
os.makedirs('annotations/data_splits', exist_ok=True)
os.makedirs('er_annotations', exist_ok=True)
os.makedirs('results', exist_ok=True)

print("✓ Directory structure created")


✓ Directory structure created


In [8]:
# Load features from Google Drive or direct upload
import os
import shutil
import subprocess
import glob

def load_features(source_path, dest_path, feature_name):
    """Load features from source (zip file or directory) to destination."""
    if not os.path.exists(source_path):
        print(f"⚠ {feature_name}: Source path not found: {source_path}")
        return False

    print(f"Loading {feature_name} features from: {source_path}")

    # Check if it's a zip file (by content, not by name)
    import zipfile
    is_zip = os.path.isfile(source_path) and zipfile.is_zipfile(source_path)

    if is_zip:
        print(f"  Detected zip file, extracting...")
        temp_zip = f'/tmp/{feature_name.lower()}.zip'
        temp_extracted = f'/tmp/{feature_name.lower()}_extracted'

        try:
            shutil.copy(source_path, temp_zip)
            subprocess.run(['unzip', '-q', temp_zip, '-d', temp_extracted], check=True)

            # Find .npz files in extracted directory
            npz_files = glob.glob(os.path.join(temp_extracted, '**/*.npz'), recursive=True)

            if npz_files:
                # Copy all .npz files to destination
                for npz_file in npz_files:
                    shutil.copy2(npz_file, dest_path)
                print(f"  ✓ Extracted and copied {len(npz_files)} .npz files")

                # Cleanup
                shutil.rmtree(temp_extracted, ignore_errors=True)
                os.remove(temp_zip)
                return True
            else:
                print(f"  ⚠ No .npz files found in extracted zip")
                shutil.rmtree(temp_extracted, ignore_errors=True)
                os.remove(temp_zip)
                return False
        except Exception as e:
            print(f"  ✗ Error extracting {feature_name} zip: {e}")
            if os.path.exists(temp_extracted):
                shutil.rmtree(temp_extracted, ignore_errors=True)
            if os.path.exists(temp_zip):
                os.remove(temp_zip)
            return False
    else:
        # It's a directory
        print(f"  Detected directory, copying .npz files...")
        npz_files = glob.glob(os.path.join(source_path, '**/*.npz'), recursive=True)

        if npz_files:
            # Copy all .npz files to destination
            for npz_file in npz_files:
                shutil.copy2(npz_file, dest_path)
            print(f"  ✓ Copied {len(npz_files)} .npz files")
            return True
        else:
            print(f"  ⚠ No .npz files found in {source_path}")
            return False

# Load Omnivore and SlowFast features
if USE_GOOGLE_DRIVE:
    load_features(OMNIVORE_DRIVE_PATH, 'data/video/omnivore', 'Omnivore')
    load_features(SLOWFAST_DRIVE_PATH, 'data/video/slowfast', 'SlowFast')
else:
    print("⚠ Please upload features manually:")
    print("  1. Use the file browser to upload .npz files or zip files")
    print("  2. Extract/copy them to data/video/omnivore/ and data/video/slowfast/")

# Verify features
omnivore_count = len([f for f in os.listdir('data/video/omnivore') if f.endswith('.npz')]) if os.path.exists('data/video/omnivore') else 0
slowfast_count = len([f for f in os.listdir('data/video/slowfast') if f.endswith('.npz')]) if os.path.exists('data/video/slowfast') else 0
print(f"\nFeature file counts:")
print(f"  Omnivore: {omnivore_count} .npz files")
print(f"  SlowFast: {slowfast_count} .npz files")


Loading Omnivore features from: /content/drive/MyDrive/AML_mistake_detection/omnivore.zip
  Detected zip file, extracting...
  ✓ Extracted and copied 384 .npz files
Loading SlowFast features from: /content/drive/MyDrive/AML_mistake_detection/slowfast.zip
  Detected zip file, extracting...
  ✓ Extracted and copied 384 .npz files

Feature file counts:
  Omnivore: 384 .npz files
  SlowFast: 384 .npz files


In [9]:
# Load checkpoints
import os
import shutil
import subprocess

checkpoint_path = CHECKPOINTS_DRIVE_PATH if USE_GOOGLE_DRIVE else None

if checkpoint_path and os.path.exists(checkpoint_path):
    print(f"Loading checkpoints from: {checkpoint_path}")

    # Check if it's a zip file (by content, not by name)
    import zipfile
    is_zip = os.path.isfile(checkpoint_path) and zipfile.is_zipfile(checkpoint_path)

    if is_zip:
        print("Detected zip file, extracting...")
        shutil.copy(checkpoint_path, '/tmp/checkpoints.zip')

        try:
            subprocess.run(['unzip', '-q', '/tmp/checkpoints.zip', '-d', '/tmp/checkpoints_extracted'], check=True)

            # Find error_recognition_best directory
            extracted_base = '/tmp/checkpoints_extracted'
            extracted_path = None

            # Check common locations
            if os.path.exists(os.path.join(extracted_base, 'error_recognition_best')):
                extracted_path = os.path.join(extracted_base, 'error_recognition_best')
            elif os.path.exists(os.path.join(extracted_base, 'MLP')) or os.path.exists(os.path.join(extracted_base, 'Transformer')):
                extracted_path = extracted_base
            else:
                # Search recursively
                for root, dirs, files in os.walk(extracted_base):
                    if 'error_recognition_best' in dirs:
                        extracted_path = os.path.join(root, 'error_recognition_best')
                        break
                    if 'MLP' in dirs or 'Transformer' in dirs:
                        extracted_path = root
                        break

                if extracted_path is None:
                    extracted_path = extracted_base

            print(f"Copying from: {extracted_path}")
            shutil.copytree(extracted_path, 'checkpoints/error_recognition_best', dirs_exist_ok=True)

            # Cleanup
            shutil.rmtree('/tmp/checkpoints_extracted', ignore_errors=True)
            os.remove('/tmp/checkpoints.zip')
            print("✓ Checkpoints extracted")
        except Exception as e:
            print(f"✗ Error extracting checkpoints: {e}")
    else:
        # It's a directory
        print("Detected directory, copying...")
        if os.path.basename(checkpoint_path) == 'error_recognition_best':
            shutil.copytree(checkpoint_path, 'checkpoints/error_recognition_best', dirs_exist_ok=True)
        else:
            os.makedirs('checkpoints/error_recognition_best', exist_ok=True)
            for item in os.listdir(checkpoint_path):
                src = os.path.join(checkpoint_path, item)
                dst = os.path.join('checkpoints/error_recognition_best', item)
                if os.path.isdir(src):
                    shutil.copytree(src, dst, dirs_exist_ok=True)
                else:
                    shutil.copy2(src, dst)
        print("✓ Checkpoints copied")
else:
    print("⚠ Checkpoints not found. Please upload manually:")
    print("  1. Download from: https://utdallas.app.box.com/s/uz3s1alrzucz03sleify8kazhuc1ksl3")
    print("  2. Extract error_recognition_best directory")
    print("  3. Upload to checkpoints/error_recognition_best/")

# Verify checkpoints
if os.path.exists('checkpoints/error_recognition_best'):
    pt_files = []
    for root, dirs, files in os.walk('checkpoints/error_recognition_best'):
        pt_files.extend([os.path.join(root, f) for f in files if f.endswith('.pt')])
    print(f"\n✓ Found {len(pt_files)} checkpoint files")
    if pt_files:
        print("\nSample checkpoint files:")
        for f in pt_files[:3]:
            print(f"  {f}")
else:
    print("\n✗ Checkpoints directory not found")


Loading checkpoints from: /content/drive/MyDrive/AML_mistake_detection/error_recognition_best.zip
Detected zip file, extracting...
Copying from: /tmp/checkpoints_extracted/error_recognition_best
✓ Checkpoints extracted

✓ Found 54 checkpoint files

Sample checkpoint files:
  checkpoints/error_recognition_best/Transformer/imagebind/error_recognition_Transformer_imagebind_audio_recordings_epoch_31.pt
  checkpoints/error_recognition_best/Transformer/imagebind/error_recognition_Transformer_imagebind_audio_step_epoch_44.pt
  checkpoints/error_recognition_best/Transformer/imagebind/error_recognition_Transformer_imagebind_video_audio_step_epoch_9.pt


In [10]:
# Load annotations (if not already in repository)
import os
import shutil

if USE_GOOGLE_DRIVE and os.path.exists(ANNOTATIONS_DRIVE_PATH):
    print(f"Loading annotations from: {ANNOTATIONS_DRIVE_PATH}")

    # Copy annotation_json
    annotation_json_src = os.path.join(ANNOTATIONS_DRIVE_PATH, 'annotation_json')
    if os.path.exists(annotation_json_src):
        for file in os.listdir(annotation_json_src):
            src = os.path.join(annotation_json_src, file)
            dst = os.path.join('annotations/annotation_json', file)
            if os.path.isfile(src):
                shutil.copy2(src, dst)
                print(f"  ✓ Copied {file}")

    # Copy data_splits
    data_splits_src = os.path.join(ANNOTATIONS_DRIVE_PATH, 'data_splits')
    if os.path.exists(data_splits_src):
        for file in os.listdir(data_splits_src):
            src = os.path.join(data_splits_src, file)
            dst = os.path.join('annotations/data_splits', file)
            if os.path.isfile(src):
                shutil.copy2(src, dst)
                print(f"  ✓ Copied {file}")

    # Copy er_annotations
    er_annotations_src = os.path.join(ANNOTATIONS_DRIVE_PATH, 'er_annotations')
    if os.path.exists(er_annotations_src):
        for file in os.listdir(er_annotations_src):
            src = os.path.join(er_annotations_src, file)
            dst = os.path.join('er_annotations', file)
            if os.path.isfile(src):
                shutil.copy2(src, dst)
                print(f"  ✓ Copied {file}")
else:
    print("⚠ Annotations not loaded from Drive. Checking repository...")

# Verify required annotation files
print("\nVerifying annotation files...")
required_files = [
    'annotations/annotation_json/step_annotations.json',
    'annotations/annotation_json/error_annotations.json',
    'er_annotations/recordings_combined_splits.json'
]

missing = []
for file in required_files:
    if os.path.exists(file):
        print(f"✓ Found: {file}")
    else:
        print(f"✗ Missing: {file}")
        missing.append(file)

if missing:
    print(f"\n⚠ Warning: {len(missing)} required annotation file(s) are missing!")
    print("Please ensure these files are available before running evaluations.")
else:
    print("\n✓ All required annotation files are present!")


Loading annotations from: /content/drive/MyDrive/AML_mistake_detection/annotations
  ✓ Copied error_category_idx.json
  ✓ Copied activity_idx_step_idx.json
  ✓ Copied error_annotations.json
  ✓ Copied step_annotations.json
  ✓ Copied complete_step_annotations.json
  ✓ Copied step_idx_description.json
  ✓ Copied recording_id_step_idx.json
  ✓ Copied environment_data_split_combined.json
  ✓ Copied person_data_split_normal.json
  ✓ Copied recordings_data_split_normal.json
  ✓ Copied person_data_split_combined.json
  ✓ Copied environment_data_split_normal.json
  ✓ Copied recordings_data_split_combined.json
  ✓ Copied recipes_data_split_normal.json
  ✓ Copied recipes_data_split_combined.json

Verifying annotation files...
✓ Found: annotations/annotation_json/step_annotations.json
✓ Found: annotations/annotation_json/error_annotations.json
✓ Found: er_annotations/recordings_combined_splits.json

✓ All required annotation files are present!


## 3. Run Error-Type Analysis Evaluations

Run evaluations with error-type analysis. The results will include:
- Standard metrics (sub-step and step level)
- **Error-type analysis** showing performance per error category

**Note**: Use threshold 0.6 for `step` split and 0.4 for `recordings` split.
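
The threshold rule above can be captured in a small helper so that the split and threshold stay in sync. This is a sketch, not part of the repository: `SPLIT_THRESHOLDS` and `build_eval_command` are hypothetical names, and the flags are the ones passed to `core.evaluate` in the cells of this section.

```python
import shlex

# Thresholds from the note above; other splits are not covered by this notebook.
SPLIT_THRESHOLDS = {"step": 0.6, "recordings": 0.4}


def build_eval_command(split, backbone, variant, ckpt):
    """Assemble the core.evaluate argument list for one configuration."""
    if split not in SPLIT_THRESHOLDS:
        raise ValueError(f"No threshold known for split {split!r}")
    return [
        "python", "-m", "core.evaluate",
        "--split", split,
        "--backbone", backbone,
        "--variant", variant,
        "--ckpt", ckpt,
        "--threshold", str(SPLIT_THRESHOLDS[split]),
    ]


cmd = build_eval_command(
    split="recordings",
    backbone="omnivore",
    variant="MLP",
    ckpt="checkpoints/error_recognition_best/MLP/omnivore/"
         "error_recognition_MLP_omnivore_recordings_epoch_33.pt",
)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which also surfaces a non-zero exit code as an exception.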


In [11]:
# Example 1: Omnivore - MLP - Step split
# This will show error-type analysis in the output
!python -m core.evaluate \
    --split step \
    --backbone omnivore \
    --variant MLP \
    --ckpt checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_step_epoch_43.pt \
    --threshold 0.6


Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 42347/798: 100% 798/798 [00:10<00:00, 76.45it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.4096162736939436, 'recall': 0.2989708115404083, 'f1': 0.3456549302643129, 'accuracy': 0.6831416629277163, 'auc': np.float64(0.6541560352028618), 'pr_auc': tensor(0.3187)}
test Step Level Metrics: {'precision': 0.6607142857142857, 'recall': 0.14859437751004015, 'f1': 0.24262295081967214, 'accuracy': 0.7105263157894737, 'auc': np.float64(0.7573902166041213), 'pr_auc': tensor(0.3638)}
----------------------------------------------------------------
----------------------------------------------------------------
test Error Type Analysis:
----------------------------------------------------------------
Error Type                Count    Precision    Recall       F1           Accuracy     AUC         
-----------------------------------

In [12]:
# Example 2: Omnivore - Transformer - Step split
# This will show error-type analysis in the output
!python -m core.evaluate \
    --split step \
    --backbone omnivore \
    --variant Transformer \
    --ckpt checkpoints/error_recognition_best/Transformer/omnivore/error_recognition_Transformer_omnivore_step_epoch_9.pt \
    --threshold 0.6


Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 42347/798: 100% 798/798 [00:05<00:00, 133.97it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.4445452483556362, 'recall': 0.6613801248523705, 'f1': 0.5317056629365887, 'accuracy': 0.6738848088412402, 'auc': np.float64(0.7461755308526944), 'pr_auc': tensor(0.3888)}
test Step Level Metrics: {'precision': 0.5155709342560554, 'recall': 0.5983935742971888, 'f1': 0.5539033457249071, 'accuracy': 0.6992481203007519, 'auc': np.float64(0.7561832027563805), 'pr_auc': tensor(0.4338)}
----------------------------------------------------------------
----------------------------------------------------------------
test Error Type Analysis:
----------------------------------------------------------------
Error Type                Count    Precision    Recall       F1           Accuracy     AUC         
------------------------------------

In [13]:
# Example 3: Omnivore - MLP - Recordings split
# This will show error-type analysis in the output
!python -m core.evaluate \
    --split recordings \
    --backbone omnivore \
    --variant MLP \
    --ckpt checkpoints/error_recognition_best/MLP/omnivore/error_recognition_MLP_omnivore_recordings_epoch_33.pt \
    --threshold 0.4


Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 38340/671: 100% 671/671 [00:04<00:00, 150.57it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.3964945261528254, 'recall': 0.5688109780280797, 'f1': 0.46727266803505685, 'accuracy': 0.5735263432446531, 'auc': np.float64(0.5988330748775713), 'pr_auc': tensor(0.3673)}
test Step Level Metrics: {'precision': 0.4090909090909091, 'recall': 0.8589211618257261, 'f1': 0.5542168674698795, 'accuracy': 0.503725782414307, 'auc': np.float64(0.6302808067162018), 'pr_auc': tensor(0.4020)}
----------------------------------------------------------------
----------------------------------------------------------------
test Error Type Analysis:
----------------------------------------------------------------
Error Type                Count    Precision    Recall       F1           Accuracy     AUC         
------------------------------------

In [14]:
# Example 4: SlowFast - MLP - Step split
# This will show error-type analysis in the output
!python -m core.evaluate \
    --split step \
    --backbone slowfast \
    --variant MLP \
    --ckpt checkpoints/error_recognition_best/MLP/slowfast/error_recognition_MLP_slowfast_step_epoch_15.pt \
    --threshold 0.6


Loaded annotations...... 
Loading recording ids from recordings_combined_splits.json
test Progress: 42347/798: 100% 798/798 [00:05<00:00, 155.78it/s]
----------------------------------------------------------------
test Sub Step Level Metrics: {'precision': 0.3910761154855643, 'recall': 0.03770879028176143, 'f1': 0.06878510425482803, 'accuracy': 0.7141946300800529, 'auc': np.float64(0.5777348914133424), 'pr_auc': tensor(0.2841)}
test Step Level Metrics: {'precision': 0.31917631917631917, 'recall': 0.9959839357429718, 'f1': 0.4834307992202729, 'accuracy': 0.3358395989974937, 'auc': np.float64(0.6309610024798646), 'pr_auc': tensor(0.3191)}
----------------------------------------------------------------
----------------------------------------------------------------
test Error Type Analysis:
----------------------------------------------------------------
Error Type                Count    Precision    Recall       F1           Accuracy     AUC         
---------------------------------

## 4. View and Download Results

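The error-type analysis results are saved as CSV files under `results/error_recognition/error_type_analysis/`. In the run shown here there is one file per threshold, and rows from runs that share a threshold appear in the same file. The cells below list, view, and download these files.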


In [15]:
# List all error-type analysis results
import os
import glob

results_dir = 'results/error_recognition/error_type_analysis'
if os.path.exists(results_dir):
    csv_files = glob.glob(os.path.join(results_dir, '*.csv'))
    if csv_files:
        print(f"Found {len(csv_files)} error-type analysis CSV file(s):\n")
        for csv_file in sorted(csv_files):
            print(f"  {csv_file}")
    else:
        print("No error-type analysis CSV files found.")
        print("\n⚠ Note: If you just ran an evaluation, make sure:")
        print("  1. The evaluation completed successfully")
        print("  2. Error-type analysis was computed (check the evaluation output)")
        print("  3. The dataset contains error category annotations")
else:
    print(f"⚠ Results directory not found: {results_dir}")
    print("\nThis means no evaluations with error-type analysis have been run yet.")
    print("Please run the evaluation cells in Section 3 first.")
    print("\nThe results will be saved to:")
    print(f"  {results_dir}/")


Found 2 error-type analysis CSV file(s):

  results/error_recognition/error_type_analysis/error_type_analysis_step_True_substep_True_threshold_0.4.csv
  results/error_recognition/error_type_analysis/error_type_analysis_step_True_substep_True_threshold_0.6.csv


In [16]:
# View the latest error-type analysis results
import pandas as pd
import os
import glob

results_dir = 'results/error_recognition/error_type_analysis'
if os.path.exists(results_dir):
    csv_files = glob.glob(os.path.join(results_dir, '*.csv'))
    if csv_files:
        # Get the most recent file
        latest_file = max(csv_files, key=os.path.getmtime)
        print(f"Viewing: {latest_file}\n")

        # Read and display
        df = pd.read_csv(latest_file)
        print(df.to_string(index=False))
    else:
        print("No CSV files found.")
        print("\n⚠ Note: If you just ran an evaluation, make sure:")
        print("  1. The evaluation completed successfully")
        print("  2. Error-type analysis was computed (check the evaluation output)")
        print("  3. The dataset contains error category annotations")
else:
    print(f"⚠ Results directory not found: {results_dir}")
    print("\nThis means no evaluations with error-type analysis have been run yet.")
    print("Please run the evaluation cells in Section 3 first.")


Viewing: results/error_recognition/error_type_analysis/error_type_analysis_step_True_substep_True_threshold_0.6.csv

Split Backbone     Variant Modality        Error Type  Count  Precision  Recall     F1  Accuracy   AUC  PR AUC
 step omnivore         MLP    video Measurement Error     42     100.00    9.52  17.39      9.52  0.00  100.00
 step omnivore         MLP    video          No Error    626      40.62   16.88  23.85     86.74 76.91   17.08
 step omnivore         MLP    video Preparation Error     49     100.00   16.33  28.07     16.33  0.00  100.00
 step omnivore         MLP    video   Technique Error     62     100.00   22.58  36.84     22.58  0.00  100.00
 step omnivore         MLP    video Temperature Error      8       0.00    0.00   0.00      0.00  0.00  100.00
 step omnivore         MLP    video      Timing Error     34     100.00    5.88  11.11      5.88  0.00  100.00
 step omnivore Transformer    video Measurement Error     42     100.00   64.29  78.26     64.29  0.00  10
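
To compare models per error type, the CSV rows can be regrouped by error type and variant. The sketch below uses only the standard library and assumes the CSV header matches the column names printed above (`Variant`, `Error Type`, `Recall`). `recall_by_error_type` is a hypothetical helper, and the sample rows are copied from the output above, reduced to the columns it reads.

```python
import csv
import io
from collections import defaultdict


def recall_by_error_type(csv_file):
    """Return {error_type: {variant: recall}} from an error-type analysis CSV."""
    table = defaultdict(dict)
    for row in csv.DictReader(csv_file):
        table[row["Error Type"]][row["Variant"]] = float(row["Recall"])
    return dict(table)


# Sample rows copied from the output above, reduced to the columns used here.
sample = io.StringIO(
    "Variant,Error Type,Count,Recall\n"
    "MLP,Measurement Error,42,9.52\n"
    "MLP,Technique Error,62,22.58\n"
    "Transformer,Measurement Error,42,64.29\n"
)
pivot = recall_by_error_type(sample)
for error_type, recalls in sorted(pivot.items()):
    print(error_type, recalls)
```

For a real file, pass `open(latest_file, newline="")` instead of `sample`. If one file mixes several backbones or splits, later rows overwrite earlier ones for the same variant, so the key would need to include those columns as well.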

In [17]:
# Download results to your local machine
from google.colab import files
import shutil
import os
import glob

results_dir = 'results/error_recognition/error_type_analysis'
if os.path.exists(results_dir):
    csv_files = glob.glob(os.path.join(results_dir, '*.csv'))
    if csv_files:
        # Create a zip file of all results
        zip_path = '/tmp/error_type_analysis_results.zip'
        shutil.make_archive(os.path.splitext(zip_path)[0], 'zip', results_dir)

        print(f"Found {len(csv_files)} CSV file(s) to download.")
        print("Downloading error-type analysis results...")
        files.download(zip_path)
        print("✓ Download complete!")
    else:
        print("No CSV files found to download.")
        print("\n⚠ Note: If you just ran an evaluation, make sure:")
        print("  1. The evaluation completed successfully")
        print("  2. Error-type analysis was computed (check the evaluation output)")
        print("  3. The dataset contains error category annotations")
else:
    print(f"⚠ Results directory not found: {results_dir}")
    print("\nThis means no evaluations with error-type analysis have been run yet.")
    print("Please run the evaluation cells in Section 3 first.")


Found 2 CSV file(s) to download.
Downloading error-type analysis results...


✓ Download complete!


## 5. List Available Checkpoints

Use this cell to find the correct checkpoint paths for different configurations.


In [18]:
# List available checkpoints to find correct epoch numbers
import os
import glob

checkpoint_base = 'checkpoints/error_recognition_best'
if os.path.exists(checkpoint_base):
    print("Available checkpoints:")
    for ckpt_file in sorted(glob.glob(os.path.join(checkpoint_base, '**/*.pt'), recursive=True)):
        print(f"  {ckpt_file}")
else:
    print("Checkpoints directory not found")


Available checkpoints:
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_environment_epoch_11.pt
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_person_epoch_39.pt
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_recordings_epoch_45.pt
  checkpoints/error_recognition_best/MLP/3dresnet/error_recognition_MLP_3dresnet_step_epoch_41.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_environment_epoch_50.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_person_epoch_8.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_recordings_epoch_2.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_audio_step_epoch_28.pt
  checkpoints/error_recognition_best/MLP/imagebind/error_recognition_MLP_imagebind_video_audio_environment_epoch_31.pt
  checkpoints/
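
The checkpoint filenames listed above appear to follow the pattern `error_recognition_<variant>_<backbone>_<split>_epoch_<N>.pt`. The sketch below parses that pattern so a configuration can be looked up without copying epoch numbers by hand. The pattern is inferred from the listing, not documented by the repository, and `CKPT_PATTERN` and `parse_checkpoint` are hypothetical names.

```python
import re
from pathlib import Path

# Split names are the four seen in the listing above. The backbone group also
# absorbs any modality suffix, e.g. "imagebind_video_audio".
CKPT_PATTERN = re.compile(
    r"error_recognition_(?P<variant>[A-Za-z]+)_(?P<backbone>.+)"
    r"_(?P<split>environment|person|recordings|step)_epoch_(?P<epoch>\d+)\.pt"
)


def parse_checkpoint(path):
    """Return variant, backbone, split and epoch for a checkpoint path, or None."""
    match = CKPT_PATTERN.fullmatch(Path(path).name)
    if match is None:
        return None
    info = match.groupdict()
    info["epoch"] = int(info["epoch"])
    return info


example = parse_checkpoint(
    "checkpoints/error_recognition_best/MLP/3dresnet/"
    "error_recognition_MLP_3dresnet_step_epoch_41.pt"
)
print(example)
```

Filtering the output of `glob.glob(..., recursive=True)` through `parse_checkpoint` gives a list of dictionaries that can be searched by `variant`, `backbone`, and `split`.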

## Troubleshooting

### Common Issues:

1. **Missing features**: Make sure you have uploaded both Omnivore and SlowFast features
2. **Missing checkpoints**: Download from the official Box link and extract properly
3. **Missing annotations**: Ensure annotation files are in the correct directories
4. **Wrong checkpoint path**: Use the cell above to list available checkpoints and update paths
5. **CUDA errors**: The code will automatically fall back to CPU if CUDA is not available
6. **No error-type analysis output**: Make sure you're using the latest code with error-type analysis support

### Expected Output:

When running evaluations, you should see:
1. Standard metrics (sub-step and step level)
2. **Error Type Analysis** table showing:
   - Error Type (Technique Error, Preparation Error, etc.)
   - Count (number of steps with that error type)
   - Precision, Recall, F1, Accuracy, AUC for each error type
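
In the sample output in Section 4, the rows for actual error types show Accuracy equal to Recall, and Precision of 100 whenever Recall is above zero. One reading consistent with that pattern is that each such row is computed only over steps that carry that error, so every ground-truth label is positive. The repository's implementation is not shown in this notebook, so treat the sketch below as an illustration of that arithmetic only. `binary_metrics` is a hypothetical helper, and the 4-of-42 split is chosen to mirror the Measurement Error row for the MLP.

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall and accuracy for 0/1 labels, with 0.0 for empty denominators."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs),
    }


# 42 steps that all carry an error (label 1), of which 4 are flagged by the model.
y_true = [1] * 42
y_pred = [1] * 4 + [0] * 38
metrics = binary_metrics(y_true, y_pred)
print({name: round(100 * value, 2) for name, value in metrics.items()})
# {'precision': 100.0, 'recall': 9.52, 'accuracy': 9.52}
```

With no negative labels there can be no false positives, so Recall is the informative column for these rows. ROC AUC is undefined when only one class is present, which may be why those rows report 0.00.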

### Getting Help:
- Check the README.md in the repository
- Verify all file paths are correct
- Ensure all dependencies are installed
