### Pruner

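Downsamples each class folder to a fixed number of `.mp4` videos per split (by default 40 for Train, 5 for Test, 5 for Validation). Files are sorted by name, the first `limit` are kept, and the rest are permanently deleted.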
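Since the deletion is permanent, it can help to preview what would be removed before running the pruner. The following is a minimal sketch of such a dry run, not part of the original notebook: `plan_pruning` is a hypothetical helper name, and it assumes the same selection rule as the pruner (sort the `.mp4` files by name, keep the first `limit`).

```python
from pathlib import Path


def plan_pruning(root_dir, limits):
    """Hypothetical dry-run helper: report files that would be deleted.

    `limits` maps a split folder name (e.g. "Train") to the number of
    .mp4 files to keep per class folder. Nothing is deleted here.
    """
    root = Path(root_dir)
    plan = {}
    for split, limit in limits.items():
        split_path = root / split
        if not split_path.is_dir():
            continue  # Skip splits that are not present on disk
        for class_dir in sorted(split_path.iterdir()):
            if not class_dir.is_dir():
                continue
            # Same rule as the pruner: sort by name, keep the first `limit`
            videos = sorted(
                f for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() == ".mp4"
            )
            extra = videos[limit:]
            if extra:
                plan[(split, class_dir.name)] = extra
    return plan
```

Calling `plan_pruning(root_dir, {"Train": 40, "Test": 5, "Validation": 5})` returns a dictionary keyed by `(split, class_name)`, which can be inspected or printed before deciding to delete anything.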
  

---

In [1]:
from pathlib import Path

def prune_dataset(root_dir, train_limit=40, test_limit=5, val_limit=5):
    root = Path(root_dir)

    # Define folder limits
    limits = {
        "Train": train_limit,
        "Test": test_limit,
        "Validation": val_limit
    }

    for split, limit in limits.items():
        split_path = root / split
        if not split_path.exists():
            continue  # Skip missing splits like "Validation"

        print(f"\nProcessing {split} (limit={limit}) ...")

        # Each class folder
        for class_dir in split_path.iterdir():
            if not class_dir.is_dir():
                continue

            # List only .mp4 files
            mp4_files = sorted(f for f in class_dir.iterdir() if f.is_file() and f.suffix.lower() == ".mp4")

            # Check if pruning is needed
            if len(mp4_files) > limit:
                to_delete = mp4_files[limit:]
                print(f" - {class_dir.name}: keeping {limit}, deleting {len(to_delete)} extra files")

                for f in to_delete:
                    try:
                        f.unlink()
                    except OSError as e:
                        print(f"Error deleting {f}: {e}")

            else:
                print(f" - {class_dir.name}: {len(mp4_files)} files (no deletion needed)")


prune_dataset(r"C:\Users\rayaa\Downloads\ucf_crime_v2\ucf_crime_v2")



Processing Train (limit=40) ...
 - Abuse: 40 files (no deletion needed)
 - Arrest: 40 files (no deletion needed)
 - Arson: 40 files (no deletion needed)
 - Assault: 40 files (no deletion needed)
 - Burglary: keeping 40, deleting 40 extra files
 - Explosion: 40 files (no deletion needed)
 - Fighting: 40 files (no deletion needed)
 - NormalVideos: keeping 40, deleting 130 extra files
 - RoadAccidents: keeping 40, deleting 80 extra files
 - Robbery: keeping 40, deleting 80 extra files
 - Shooting: 40 files (no deletion needed)
 - Shoplifting: 40 files (no deletion needed)
 - Stealing: keeping 40, deleting 40 extra files
 - Vandalism: 40 files (no deletion needed)

Processing Test (limit=5) ...
 - Abuse: 5 files (no deletion needed)
 - Arrest: 5 files (no deletion needed)
 - Arson: 5 files (no deletion needed)
 - Assault: 5 files (no deletion needed)
 - Burglary: keeping 5, deleting 5 extra files
 - Explosion: 5 files (no deletion needed)
 - Fighting: 5 files (no deletion needed)
 - Norma