# HMDB51 Frame Extraction & Validation Notebook

This notebook **extracts uniformly spaced video frames** from the HMDB51 dataset and checks data quality before training any deep learning model.

## Here's why I extract frames:

### 1. Deep Learning Models Can't Read Video Files Directly
- PyTorch models (like 3D CNNs) need **tensors**, not media files.
- You can't just slap an `.avi` into `model(input)` and expect magic.
- So, we break videos down into their individual frames (images).

### 2. Tensors Must Have Specific Shapes
- For 3D CNNs, each input clip should be a 5D tensor like this: `[Batch, Channels, Time, Height, Width]`
- Extracting frames lets us build this structure from clips manually, cleanly and precisely.
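
As a rough illustration of that layout, here is a minimal sketch that stacks 16 frames into a `[Batch, Channels, Time, Height, Width]` array. It uses NumPy only, and the 240x320 frame size and the zero-filled frames are placeholders I made up for the example, not values taken from the dataset.

```python
import numpy as np

# Stand-ins for 16 decoded frames. OpenCV returns each frame as an
# (H, W, 3) uint8 array; the 240x320 size here is a hypothetical example.
num_frames, height, width = 16, 240, 320
frames = [np.zeros((height, width, 3), dtype=np.uint8) for _ in range(num_frames)]

clip = np.stack(frames, axis=0)          # (T, H, W, C)
clip = clip.transpose(3, 0, 1, 2)        # (C, T, H, W)
clip = clip.astype(np.float32) / 255.0   # scale pixel values to [0, 1]
batch = clip[np.newaxis, ...]            # (1, C, T, H, W)

print(batch.shape)  # (1, 3, 16, 240, 320)
```

From here, `torch.from_numpy(batch)` would give a tensor with the same shape. Note that frames read with OpenCV are in BGR channel order, so a model that expects RGB would need the channel axis reversed first.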

### 3. Preprocessing & Augmentation Becomes Easier
- Once we've got frames, we can apply all sorts of data augmentation: random crop, flip, color jitter, temporal jitter, etc.
- This makes my model more robust.
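
One detail worth sketching: for video, spatial augmentations are usually sampled once per clip and applied to every frame, so the motion stays coherent. The helper below, `augment_clip`, is a hypothetical name of mine and only covers random crop and horizontal flip; the 112x112 crop size is an assumption for the example.

```python
import numpy as np

def augment_clip(clip, crop_size, rng):
    """Apply one random crop and one random horizontal flip to every frame.

    clip: array of shape (T, H, W, C); crop_size: (crop_h, crop_w).
    """
    _, h, w, _ = clip.shape
    crop_h, crop_w = crop_size
    # Sample the crop offset once so all frames share the same window.
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    out = clip[:, top:top + crop_h, left:left + crop_w, :]
    # Flip the whole clip (or not), never individual frames.
    if rng.random() < 0.5:
        out = out[:, :, ::-1, :]  # reverse the width axis
    return out

rng = np.random.default_rng(0)
clip = np.zeros((16, 240, 320, 3), dtype=np.uint8)  # placeholder clip
augmented = augment_clip(clip, (112, 112), rng)
print(augmented.shape)  # (16, 112, 112, 3)
```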

### 4. Speed & Flexibility During Training
- Loading pre-extracted frames is much faster than decoding videos on the fly (I will still try on-the-fly decoding in one of the models, just to see how it works).
- This matters especially in environments like **Google Colab**, where real-time video decoding can get properly laggy and eat into my very limited GPU time.
---

## Objectives

| Step   | Description                                                                 |
|--------|-----------------------------------------------------------------------------|
| Step 1 | Extract **16 evenly spaced frames** per video from the dataset      |
| Step 2 | Clean and **skip corrupted or too-short videos**                            |
| Step 3 | Validate that every video folder contains **16 readable images**     |
| Step 4 | Detect any frames with **strange dimensions** (e.g., blank or undersized)   |
| Step 5 | Report all issues for further cleaning or reprocessing                      |

---

## Extract 16 Valid Frames from Each Video

This chunk is the **core frame extraction logic**. It performs the following steps:

1. **Reads each video** using OpenCV.
2. **Filters out unreadable videos** and skips those with fewer than 16 valid frames.
3. **Uniformly samples exactly 16 frames** using `np.linspace` to spread them across the full duration.
4. **Saves frames** into a dedicated folder for each video using the format `0000.jpg`, `0001.jpg`, etc.
5. **Iterates through each class** folder and video file within the dataset.

If the video is too short or cannot be read, it's safely skipped with a warning message (luckily we didn't face this issue).

**Configurable parameter:**
- `target_num_frames`: how many frames to extract (set to 16).
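
To make the sampling step concrete, here is a small sketch of what `np.linspace` returns for a few clip lengths. `sample_indices` is a hypothetical wrapper I added for the illustration; it makes the same `np.linspace` call as the extraction code.

```python
import numpy as np

def sample_indices(total_frames, target_num_frames=16):
    """Return evenly spaced frame indices from 0 to total_frames - 1."""
    return np.linspace(0, total_frames - 1, target_num_frames, dtype=int)

# A 31-frame clip: every second frame is picked (0, 2, 4, ..., 30).
print(sample_indices(31))

# A 16-frame clip: every frame is picked exactly once (0, 1, ..., 15).
print(sample_indices(16))

# An 8-frame clip: still 16 indices, so some frames would be repeated.
short = sample_indices(8)
print(len(short), len(set(short.tolist())))
```

The last case is why clips with fewer than 16 readable frames are skipped rather than sampled: the indices would contain duplicates.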



In [None]:
import os
import cv2
import numpy as np
from pathlib import Path
from tqdm import tqdm

# Config
input_video_dir = "/Users/alesarabandi/Downloads/DEEPLEARING/videos"
output_frame_dir = "/Users/alesarabandi/Downloads/DEEPLEARING/frames"
target_num_frames = 16
min_required_frames = 16

def extract_exact_16_valid_frames(video_path, output_dir, target_num_frames=16):
    cap = cv2.VideoCapture(str(video_path))
    if not cap.isOpened():
        print(f"Failed to open {video_path}")
        return

    frames = []
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frames.append(frame)

    cap.release()

    total_valid = len(frames)
    if total_valid < target_num_frames:
        print(f"Skipping {video_path.name} (only {total_valid} valid frames)")
        return

    # uniformly sample 16 indices from the valid ones
    indices = np.linspace(0, total_valid - 1, target_num_frames, dtype=int)
    os.makedirs(output_dir, exist_ok=True)

    written = 0
    for i, idx in enumerate(indices):
        out_path = os.path.join(output_dir, f"{i:04d}.jpg")
        # cv2.imwrite returns False when the file could not be written
        if cv2.imwrite(out_path, frames[idx]):
            written += 1

    if written != target_num_frames:
        print(f"{video_path.name}: Wrote {written} frames instead of {target_num_frames}")

# Walk through class folders and process videos
input_dir = Path(input_video_dir)
output_dir = Path(output_frame_dir)

for class_dir in tqdm(sorted(input_dir.iterdir()), desc="Processing classes"):
    if class_dir.is_dir():
        for video_path in sorted(class_dir.glob("*.avi")):
            video_name = video_path.stem
            output_subdir = output_dir / class_dir.name / video_name
            extract_exact_16_valid_frames(video_path, output_subdir, target_num_frames)


📂 Processing classes: 100%|██████████| 52/52 [05:08<00:00,  5.94s/it]


## Validate Frame Count Per Video

This chunk performs **data integrity checks** on the extracted frames to ensure consistency across the dataset.

### What it does:
- Walks through each class and each video folder in the `frames` directory.
- Counts how many `.jpg` frame files exist per video.
- Increments a counter if a video has exactly **16 frames**.
- Flags and stores the path if a video has **fewer or more** than 16 frames.
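
The class-then-video traversal described here is also needed by the later checks, so it could be factored into a small generator. This is only a sketch of that idea: `iter_video_dirs` and `find_frame_count_problems` are hypothetical names, and the demo builds a throwaway directory tree with empty `.jpg` files instead of touching the real `frames` folder.

```python
import tempfile
from pathlib import Path

def iter_video_dirs(frame_root):
    """Yield every <class>/<video> directory under frame_root, in sorted order."""
    for class_dir in sorted(Path(frame_root).iterdir()):
        if not class_dir.is_dir():
            continue
        for video_dir in sorted(class_dir.iterdir()):
            if video_dir.is_dir():
                yield video_dir

def find_frame_count_problems(frame_root, expected=16):
    """Return (video_dir, frame_count) for folders whose .jpg count differs from expected."""
    problems = []
    for video_dir in iter_video_dirs(frame_root):
        count = sum(1 for _ in video_dir.glob("*.jpg"))
        if count != expected:
            problems.append((video_dir, count))
    return problems

# Demo on a temporary tree: one complete folder and one incomplete folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, n in [("good_clip", 16), ("short_clip", 3)]:
        video_dir = root / "some_class" / name
        video_dir.mkdir(parents=True)
        for i in range(n):
            (video_dir / f"{i:04d}.jpg").touch()
    problems = [(p.name, c) for p, c in find_frame_count_problems(root)]

print(problems)  # [('short_clip', 3)]
```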

### Output:
- Summary of how many videos are correct vs problematic.
- A list of video directories with issues and how many frames they contain.

> **Why this matters:** My models expect a fixed input shape, so ensuring that every video yields exactly 16 frames is **crucial for consistent training and inference**. The extraction cell printed no warnings about corrupted or too-short videos, but I want to double-check here.



In [None]:
import os
from pathlib import Path

frame_root_dir = Path("/Users/alesarabandi/Downloads/DEEPLEARING/frames")


total_videos = 0
valid_16 = 0
problems = []

for class_dir in frame_root_dir.iterdir():
    if class_dir.is_dir():
        for video_dir in class_dir.iterdir():
            if video_dir.is_dir():
                frame_count = len(list(video_dir.glob("*.jpg")))
                total_videos += 1
                if frame_count == 16:
                    valid_16 += 1
                else:
                    problems.append((video_dir, frame_count))

print(f"✅ Videos with exactly 16 frames: {valid_16}/{total_videos}")
if problems:
    print(f"Problems found in {len(problems)} videos:")
    for path, count in problems:
        print(f"{path}: {count} frames")
else:
    print("🎉 All videos have exactly 16 frames.")


✅ Videos with exactly 16 frames: 6766/6766
🎉 All videos have exactly 16 frames.


## Check for Corrupted or Unreadable Frame Files

This chunk ensures **every extracted frame image is valid and usable** for training.

### What it does:
- Iterates through all `.jpg` frame files inside the extracted video folders.
- Uses `cv2.imread()` to try reading each image.
- Flags the frame if:
  - It **fails to load** (`None`), or
  - The image has **zero size** (`img.size == 0`).

### Output:
- A list of all corrupted or unreadable frames, if any exist.
- A success message if all frames are valid and properly loaded.

> **Why this matters:** Even if a video has 16 frames, corrupted or unreadable images can silently break training or evaluation. This check ensures **data reliability** before moving forward.
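
One caveat I am adding as an assumption rather than something tested in this notebook: depending on the decoder, a JPEG that was cut off mid-write may still load as a partially filled image instead of `None`. A cheap complementary check is to look for the JPEG start-of-image (`FF D8`) and end-of-image (`FF D9`) markers. The sketch below uses only the standard library; `has_jpeg_markers` is a hypothetical helper, the demo files contain dummy bytes rather than real image data, and it assumes files have no trailing padding after the end marker.

```python
import tempfile
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker

def has_jpeg_markers(path):
    """Return True if the file starts with SOI and ends with EOI."""
    data = Path(path).read_bytes()
    return len(data) >= 4 and data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI)

# Demo with dummy payloads: one "complete" file and one truncated file.
with tempfile.TemporaryDirectory() as tmp:
    complete = Path(tmp) / "complete.jpg"
    truncated = Path(tmp) / "truncated.jpg"
    complete.write_bytes(JPEG_SOI + b"payload" + JPEG_EOI)
    truncated.write_bytes(JPEG_SOI + b"payl")
    results = (has_jpeg_markers(complete), has_jpeg_markers(truncated))

print(results)  # (True, False)
```

This only inspects the markers and does not decode anything, so it complements the `cv2.imread()` check rather than replacing it.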



In [None]:
import cv2

corrupted = []

for class_dir in frame_root_dir.iterdir():
    if class_dir.is_dir():
        for video_dir in class_dir.iterdir():
            if video_dir.is_dir():
                for frame_path in video_dir.glob("*.jpg"):
                    img = cv2.imread(str(frame_path))
                    if img is None or img.size == 0:
                        corrupted.append(frame_path)

if corrupted:
    print(f"Found {len(corrupted)} corrupted/unreadable frames:")
    for path in corrupted:
        print(f" {path}")
else:
    print("✅ All frame images are readable and non-empty.")


✅ All frame images are readable and non-empty.


## Detect Frames with Unusual Dimensions

This chunk checks if any extracted frame has **unexpectedly small dimensions**, which might indicate:
- Cropping issues
- Encoding errors
- Or videos with very low resolution

### What it does:
- Iterates through all `.jpg` frames.
- Loads each image using OpenCV.
- Flags any frame where **height or width is less than 100 pixels**.

### Output:
- A list of all frames with unusual (too small) dimensions.
- A success message if all frames are of acceptable size.

> Frames with weird sizes might break my model input pipeline or degrade performance. This check catches **undersized frames** (under 100 pixels on either side) before they reach training.
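
A fixed pixel threshold only catches frames that are too small. To see whether resolutions are actually uniform, one option is to count how often each `(width, height)` pair occurs. The sketch below is an illustration under my own assumptions: `summarize_resolutions` is a hypothetical helper, and the arrays are synthetic stand-ins for what `cv2.imread` would return.

```python
from collections import Counter

import numpy as np

def summarize_resolutions(images):
    """Count how often each (width, height) pair occurs in an iterable of images."""
    counts = Counter()
    for img in images:
        h, w = img.shape[:2]
        counts[(w, h)] += 1
    return counts

# Synthetic stand-ins: three 320x240 frames and one 120x80 outlier.
images = [np.zeros((240, 320, 3), dtype=np.uint8) for _ in range(3)]
images.append(np.zeros((80, 120, 3), dtype=np.uint8))

counts = summarize_resolutions(images)
print(counts.most_common())  # [((320, 240), 3), ((120, 80), 1)]
```

Any resolution that shows up only a handful of times is a good candidate for manual inspection.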


In [None]:
weird_dims = []

for class_dir in frame_root_dir.iterdir():
    if class_dir.is_dir():
        for video_dir in class_dir.iterdir():
            if video_dir.is_dir():
                for frame_path in video_dir.glob("*.jpg"):
                    img = cv2.imread(str(frame_path))
                    if img is not None:
                        h, w = img.shape[:2]
                        if h < 100 or w < 100:
                            weird_dims.append((frame_path, (w, h)))

if weird_dims:
    print(f"\n🔍 Found {len(weird_dims)} frames with unusual size:")
    for path, dim in weird_dims:
        print(f"{path}: {dim}")
else:
    print("📏 All frames have normal dimensions.")


📏 All frames have normal dimensions.
