# UCF-Crime Anomaly Detection - Colab Setup

This notebook sets up and runs the MIL Ranking Loss re-implementation on Google Colab.

## 1. Check GPU

In [None]:
!nvidia-smi

# Verify PyTorch can use GPU
import torch
print(f"\nPyTorch CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")

## 2. Mount Google Drive

**Prerequisites:**
1. Upload `features.zip` and `annotations.zip` to Google Drive
2. Arrange them in Google Drive as follows:
```
MyDrive/
└── Colab Notebooks/
    └── data_distribution/
        ├── features.zip
        └── annotations.zip
```

In [None]:
from google.colab import drive
drive.mount('/content/drive')
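
Once Drive is mounted, it can help to confirm that both archives are visible before continuing. The following is a minimal sketch that assumes the folder layout from the prerequisites; `missing_archives` is a hypothetical helper and not part of the repository.

```python
from pathlib import Path

# Assumed location, taken from the prerequisites; adjust to your upload path.
DRIVE_DATA_PATH = "/content/drive/MyDrive/Colab Notebooks/data_distribution"
REQUIRED_ARCHIVES = ("features.zip", "annotations.zip")


def missing_archives(base_dir, names=REQUIRED_ARCHIVES):
    """Return the archive names that are not present as files in base_dir."""
    base = Path(base_dir)
    return [name for name in names if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_archives(DRIVE_DATA_PATH)
    if missing:
        print(f"Missing in {DRIVE_DATA_PATH}: {', '.join(missing)}")
    else:
        print("Both archives found")
```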

## 3. Clone Repository

In [None]:
!git clone https://github.com/KwonPodo/MILRankingLoss_Sultani2018_ReImplementation.git
%cd MILRankingLoss_Sultani2018_ReImplementation

## 4. Install Packages

**Note:** This step installs from `requirements-colab.txt`, a minimal package list, to avoid conflicts with packages already installed on Colab.

In [None]:
# Install minimal packages for Colab (avoid conflicts)
!pip install -r requirements-colab.txt -q

# Check installed package versions
import torch
import numpy as np
print(f"PyTorch: {torch.__version__}")
print(f"NumPy: {np.__version__}")

## 5. Extract Data

Copy `features.zip` from Google Drive and extract it into `data/`.

In [None]:
import os

# Verify working directory
%cd /content/MILRankingLoss_Sultani2018_ReImplementation
!pwd

# Create data directory
!mkdir -p data

# Google Drive path
DRIVE_DATA_PATH = '/content/drive/MyDrive/Colab Notebooks/data_distribution'

# Extract features
!cp "{DRIVE_DATA_PATH}/features.zip" data/
!unzip -q data/features.zip -d data/
!rm data/features.zip

print("Features extracted")
!ls -lh data/
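
The copy, unzip, and delete steps can also be done from Python, reading the archive directly from Drive. This is a sketch using the standard `zipfile` module; `extract_archive` is a hypothetical helper, and the path in the usage comment is assumed from the cell above.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Top-level names give a quick sanity check of the archive layout.
        top_level = {name.split("/", 1)[0] for name in archive.namelist() if name}
    return sorted(top_level)


# Example (path assumed from DRIVE_DATA_PATH above):
# extract_archive(
#     "/content/drive/MyDrive/Colab Notebooks/data_distribution/features.zip", "data"
# )
```

The returned names show at a glance whether the archive unpacked into the directory the later steps expect.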

## 6. Extract Annotations

Copy `annotations.zip` from Google Drive and extract it into `data/`. This cell reuses `DRIVE_DATA_PATH` from step 5, so run that cell first.

In [None]:
# Extract annotations from Google Drive
!cp "{DRIVE_DATA_PATH}/annotations.zip" data/
!unzip -q data/annotations.zip -d data/
!rm data/annotations.zip

print("Annotations extracted")
!ls -lh data/annotations/

## 7. Verify Dataset

In [None]:
# Verify working directory
%cd /content/MILRankingLoss_Sultani2018_ReImplementation

# Check feature categories
!ls data/features/

# Check sample counts
!echo "Train samples:"
!wc -l data/annotations/train_set.txt
!echo "Test samples:"
!wc -l data/annotations/test_set.txt
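
Note that `wc -l` counts newline characters, so a file whose last line has no trailing newline is reported one short. A small Python sketch that counts non-empty lines instead, assuming each split file lists one sample per line (the file format is not confirmed here); `count_samples` is a hypothetical helper.

```python
from pathlib import Path


def count_samples(list_path):
    """Count non-empty lines, assuming one sample per line in the split file."""
    with Path(list_path).open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    for split in ("train_set.txt", "test_set.txt"):
        path = Path("data/annotations") / split
        if path.exists():
            print(f"{split}: {count_samples(path)} samples")
        else:
            print(f"{split}: not found")
```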

## 8. Test Dataset Loading

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation
!PYTHONPATH=/content/MILRankingLoss_Sultani2018_ReImplementation:$PYTHONPATH python scripts/test_dataset.py
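
This step and the ones after it prefix each command with the same `PYTHONPATH` assignment. Commands started with `!` normally inherit the notebook's environment, so setting the variable once from Python should have the same effect. The sketch below assumes that inheritance holds in your runtime; `prepend_pythonpath` is a hypothetical helper.

```python
import os

REPO_DIR = "/content/MILRankingLoss_Sultani2018_ReImplementation"


def prepend_pythonpath(path, env=os.environ):
    """Put path at the front of env['PYTHONPATH'] unless it is already listed."""
    current = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if path not in current:
        current.insert(0, path)
    env["PYTHONPATH"] = os.pathsep.join(current)
    return env["PYTHONPATH"]


if __name__ == "__main__":
    print(prepend_pythonpath(REPO_DIR))
```

After running this, a plain `!python scripts/test_dataset.py` from the repository root should resolve the project's modules in the same way.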

## 9. Test Model

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation
!PYTHONPATH=/content/MILRankingLoss_Sultani2018_ReImplementation:$PYTHONPATH python scripts/test_model.py

## 10. Start Training

### Option 1: Train without WandB

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation
!PYTHONPATH=/content/MILRankingLoss_Sultani2018_ReImplementation:$PYTHONPATH python train.py --config configs/default.yaml --no-wandb

### Option 2: Train with WandB

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation

# Login to WandB (requires API key)
!wandb login

# Start training
!PYTHONPATH=/content/MILRankingLoss_Sultani2018_ReImplementation:$PYTHONPATH python train.py --config configs/default.yaml

## 11. Evaluate

Evaluate the best checkpoint using the temporal anomaly annotations for the test videos.

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation
!PYTHONPATH=/content/MILRankingLoss_Sultani2018_ReImplementation:$PYTHONPATH python evaluate.py \
    --config configs/default.yaml \
    --checkpoint checkpoints/best_model.pth \
    --temporal-annotation data/annotations/Temporal_Anomaly_Annotation_for_Testing_Videos.txt

## 12. View Results

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation

# Display ROC curve
from IPython.display import Image, display
import os

if os.path.exists('results/roc_curve.png'):
    display(Image('results/roc_curve.png'))
    
# Print evaluation results
if os.path.exists('results/evaluation_summary.txt'):
    !cat results/evaluation_summary.txt

## 13. (Optional) Save Results to Google Drive

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation

# Back up checkpoints and results to Drive
DRIVE_DATA_PATH = '/content/drive/MyDrive/Colab Notebooks/data_distribution'
!mkdir -p "{DRIVE_DATA_PATH}/results"
!cp -r checkpoints "{DRIVE_DATA_PATH}/"
!cp -r results "{DRIVE_DATA_PATH}/"

print("Results saved to Google Drive")

---

## Troubleshooting

### CUDA Out of Memory
Reduce the batch size: change `batch_size: 60` to `batch_size: 40` in `configs/default.yaml`.
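
The edit can also be made from a notebook cell. This sketch rewrites the value with a regular expression so that indentation and comments in the YAML file are left alone; it assumes the key appears on a line of the form `batch_size: <integer>`, and `set_batch_size` is a hypothetical helper.

```python
import re
from pathlib import Path


def set_batch_size(config_text, new_size):
    """Return (updated_text, count) with every `batch_size: N` value set to new_size."""
    pattern = re.compile(r"^([ \t]*batch_size[ \t]*:[ \t]*)\d+", flags=re.MULTILINE)
    return pattern.subn(lambda m: f"{m.group(1)}{new_size}", config_text)


if __name__ == "__main__":
    config_path = Path("configs/default.yaml")
    if config_path.exists():
        updated, count = set_batch_size(config_path.read_text(), 40)
        config_path.write_text(updated)
        print(f"Updated {count} batch_size line(s)")
```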

### Drive Path Error
Edit the `DRIVE_DATA_PATH` variable in step 5 (and in step 13, which redefines it) to match your upload path.
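
If you are unsure where the archives ended up, you can search the mounted Drive for a folder that contains both of them. This is a minimal sketch; `find_archive_dirs` is a hypothetical helper, and the search root assumes the default mount point used in step 2.

```python
import os


def find_archive_dirs(root, required=("features.zip", "annotations.zip")):
    """Return directories under root that contain every file named in required."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if all(name in filenames for name in required):
            matches.append(dirpath)
    return sorted(matches)


if __name__ == "__main__":
    # Walking a large Drive can be slow; narrow the root if you know the parent folder.
    for candidate in find_archive_dirs("/content/drive/MyDrive"):
        print(candidate)
```

Any printed path is a candidate value for `DRIVE_DATA_PATH`.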

### Session Timeout
Periodically back up checkpoints to Drive using the cell in step 13.
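
The same backup can be wrapped in a function that is safe to call repeatedly. This sketch uses `shutil.copytree` with `dirs_exist_ok=True` (Python 3.8+) so that later backups update the earlier copy; `backup_dirs` is a hypothetical helper, and the directory names are taken from step 13.

```python
import shutil
from pathlib import Path


def backup_dirs(dest_root, sources=("checkpoints", "results")):
    """Copy each existing source directory into dest_root; return the names copied."""
    dest_root = Path(dest_root)
    copied = []
    for source in map(Path, sources):
        if source.is_dir():
            # dirs_exist_ok lets repeated backups write into the same folder.
            shutil.copytree(source, dest_root / source.name, dirs_exist_ok=True)
            copied.append(source.name)
    return copied


# Example, run from the repository root (destination assumed from step 13):
# backup_dirs("/content/drive/MyDrive/Colab Notebooks/data_distribution")
```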

---

## Estimated Time

- Data extraction: 1-2 min
- Training (100 epochs): 2-3 hours (T4 GPU)
- Evaluation: 5 min

**Works with Colab free tier**

## (Optional) Check Disk Usage

In [None]:
%cd /content/MILRankingLoss_Sultani2018_ReImplementation

# Check extracted data size
!echo "Data directory size:"
!du -sh data/

!printf "\nFeatures size:\n"
!du -sh data/features/

!printf "\nAnnotations size:\n"
!du -sh data/annotations/

!printf "\nTotal disk usage:\n"
!df -h /content

%cd /content/MILRankingLoss_Sultani2018_ReImplementation

# Project directory size only
!echo "Project directory total size:"
!du -sh /content/MILRankingLoss_Sultani2018_ReImplementation

!printf "\nBreakdown:\n"
!du -sh /content/MILRankingLoss_Sultani2018_ReImplementation/data
!du -sh /content/MILRankingLoss_Sultani2018_ReImplementation/checkpoints 2>/dev/null || echo "checkpoints: (not created yet)"

!printf "\nColab VM total disk (all users/files):\n"
!df -h /content | grep -v Filesystem

!printf "\nEstimated project usage:\n"
!echo "- Code: ~10MB"
!echo "- Data: 3.7GB"
!echo "- Checkpoints: ~100MB (after training)"
!echo "- Total: ~3.8GB"
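
To compare free space against the estimate before extracting, the standard library's `shutil.disk_usage` reports the same figure as `df`. This is a minimal sketch; `free_gb` and `has_room` are hypothetical helpers, and the 3.8 GB threshold in the usage comment is the estimate printed above, not a measured requirement.

```python
import shutil


def free_gb(path="/content"):
    """Return the free disk space at path in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def has_room(required_gb, path="/content"):
    """Return True if at least required_gb GiB are free at path."""
    return free_gb(path) >= required_gb


# Example on Colab:
# print(f"Free: {free_gb():.1f} GiB, enough for the project: {has_room(3.8)}")
```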