
# Video Chunker by Frame Count (OpenCV)

This notebook splits a video into consecutive **frame-count–based** chunks and saves them into a subfolder named after the source video.

**Example:** `user/xyz/documents/videos/example.mp4` → outputs into `user/xyz/documents/videos/example/` as `example_chunk01.mp4`, `example_chunk02.mp4`, …



## Overview

**What this does**
- Creates an output subdirectory with the **base filename** of your video.
- Splits the video into consecutive chunks, each containing a user-defined number of **frames**.
- Writes a final remainder chunk if the video length isn't divisible by your chunk size.
- Names chunks as: `<video_stem>_chunkXX.mp4` (e.g., `example_chunk01.mp4`).

**Why frames (not time)?**  
Splitting by frame count gives every chunk (except possibly the last) exactly the same number of frames, which makes splits deterministic for annotation, ML, or analysis workflows.
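
As a rough illustration of the splitting rule described above, the sketch below computes how many frames each chunk would hold for a given total. The helper name `plan_chunk_sizes` and the 2,500-frame total are hypothetical and are not part of the notebook's own code.

```python
from typing import List

def plan_chunk_sizes(total_frames: int, chunk_num_frame: int) -> List[int]:
    """Return the number of frames each chunk would contain, in order."""
    if total_frames < 0:
        raise ValueError("total_frames must be non-negative.")
    if chunk_num_frame <= 0:
        raise ValueError("chunk_num_frame must be a positive integer.")
    # Full chunks first, then one shorter remainder chunk if needed.
    full_chunks, remainder = divmod(total_frames, chunk_num_frame)
    sizes = [chunk_num_frame] * full_chunks
    if remainder:
        sizes.append(remainder)
    return sizes

# Hypothetical example: a 2,500-frame video split into 1,000-frame chunks.
print(plan_chunk_sizes(2500, 1000))  # [1000, 1000, 500]
```

When the total is an exact multiple of the chunk size, there is no remainder chunk, so `plan_chunk_sizes(2000, 1000)` gives two chunks of 1,000 frames.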

---

## Requirements

- Python 3.8+
- [OpenCV](https://pypi.org/project/opencv-python/) (`cv2`) installed with a working backend (FFmpeg/GStreamer depending on OS).
- Sufficient disk space to write chunked files.

> **Tip (macOS/Linux):** If your OpenCV lacks codecs, install/enable FFmpeg. On macOS, `brew install ffmpeg` can help; on Linux, install via your package manager.
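
One way to check which video backends your OpenCV build reports is to scan the text returned by `cv2.getBuildInformation()`. The sketch below assumes the summary lists backends as lines such as `FFMPEG: YES`. That layout is meant for human readers and could differ between builds, so treat the result as a hint. The helper name `backend_enabled` is hypothetical.

```python
import re

def backend_enabled(build_info: str, backend: str = "FFMPEG") -> bool:
    """Return True if `backend` is reported as YES in the build summary text."""
    # Assumption: the summary contains a line like "    FFMPEG:      YES (...)".
    pattern = rf"^\s*{re.escape(backend)}:\s+YES\b"
    return re.search(pattern, build_info, flags=re.MULTILINE) is not None

try:
    import cv2
    info = cv2.getBuildInformation()
    print("OpenCV version:", cv2.__version__)
    print("FFmpeg enabled:", backend_enabled(info, "FFMPEG"))
    print("GStreamer enabled:", backend_enabled(info, "GStreamer"))
except ImportError:
    print("OpenCV is not installed in this environment.")
```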



## Install OpenCV (if needed)

Uncomment and run the cell below **only if** OpenCV is not already installed. If you are on a restricted network where this cell cannot download packages, install OpenCV on your local machine and run the notebook there.


In [None]:

#!pip install --upgrade opencv-python

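# If you need an FFmpeg-enabled backend, make sure FFmpeg is installed system-wide
# (e.g., `brew install ffmpeg` on macOS, or your package manager on Linux).
# Note: `pip install ffmpeg` does NOT install the FFmpeg binaries.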




## 1) Set Your Inputs

- `input_path`: Absolute or relative path to your video file.
- `chunk_num_frame`: Number of frames per chunk (positive integer).
- `codec`: FourCC code for output (default `mp4v`; try `avc1` or `H264` if available on your system for smaller files).
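
A FourCC is four ASCII characters packed into one 32-bit integer, which is why `codec` must be exactly four characters long. The sketch below reproduces that packing in plain Python (first character in the lowest byte) so you can inspect a code without OpenCV. The helper names `fourcc_to_int` and `int_to_fourcc` are hypothetical.

```python
def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into a 32-bit integer, first char lowest."""
    if len(code) != 4:
        raise ValueError("A FourCC must be exactly four characters long.")
    return sum((ord(ch) & 0xFF) << (8 * i) for i, ch in enumerate(code))

def int_to_fourcc(value: int) -> str:
    """Unpack a 32-bit integer back into its four characters."""
    return "".join(chr((value >> (8 * i)) & 0xFF) for i in range(4))

packed = fourcc_to_int("mp4v")
print(hex(packed))            # 0x7634706d
print(int_to_fourcc(packed))  # mp4v
```

The decoding direction can also be handy for inspecting a source file: `int_to_fourcc(int(cap.get(cv2.CAP_PROP_FOURCC)))` should show the codec tag that OpenCV reports for an opened capture.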


In [None]:

from pathlib import Path

# >>> EDIT THESE <<<
input_path = Path("/Users/souvikmandal/Documents/example.mp4")
chunk_num_frame = 1000
codec = "mp4v"   # alternatives: "avc1", "H264" (requires proper system codecs)

# No edits needed below
input_path = input_path.expanduser().resolve()
input_path



## 2) Core Function

The function below reads frames sequentially and writes each chunk with the same resolution and FPS as the source video (falling back to 30 FPS if the source FPS cannot be read).


In [None]:

import sys
from pathlib import Path  # repeated here so this cell also runs on its own

import cv2

def split_video_by_frames(input_path: Path, chunk_num_frame: int, codec: str = "mp4v") -> Path:
    """Split a video into consecutive chunks by frame count.

    Args:
        input_path: Path to the input video.
        chunk_num_frame: Number of frames per chunk (must be > 0).
        codec: FourCC for output encoding (e.g., 'mp4v', 'avc1', 'H264').

    Returns:
        Path to the output directory where chunks are saved.
    """
    if not input_path.exists() or not input_path.is_file():
        raise FileNotFoundError(f"Input file not found: {input_path}")
    if chunk_num_frame <= 0:
        raise ValueError("chunk_num_frame must be a positive integer.")

    stem = input_path.stem                      # e.g., "example"
    parent_dir = input_path.parent              # e.g., user/xyz/documents/videos
    output_dir = parent_dir / stem              # e.g., user/xyz/documents/videos/example
    output_dir.mkdir(parents=True, exist_ok=True)

    cap = cv2.VideoCapture(str(input_path))
    if not cap.isOpened():
        raise RuntimeError(f"Could not open video: {input_path}")

    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    if fps <= 0 or width <= 0 or height <= 0:
        print("[WARN] Could not read video metadata reliably. Proceeding with defaults if possible.", file=sys.stderr)

    fourcc = cv2.VideoWriter_fourcc(*codec)
    chunk_idx = 1
    frames_in_current_chunk = 0
    total_frames = 0
    writer = None

    def start_new_writer(index: int):
        nonlocal writer, frames_in_current_chunk
        out_name = f"{stem}_chunk{index:02d}.mp4"
        out_path = output_dir / out_name
        writer = cv2.VideoWriter(str(out_path), fourcc, fps if fps > 0 else 30.0, (width, height))
        if not writer.isOpened():
            cap.release()
            raise RuntimeError(f"Could not open writer for: {out_path}")
        frames_in_current_chunk = 0
        print(f"[INFO] Writing: {out_path}")

    # Initialize writer for the first chunk
    start_new_writer(chunk_idx)

    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # end of video

            writer.write(frame)
            frames_in_current_chunk += 1
            total_frames += 1

            if frames_in_current_chunk >= chunk_num_frame:
                writer.release()
                chunk_idx += 1
                start_new_writer(chunk_idx)
    finally:
        if writer is not None:
            writer.release()
            if frames_in_current_chunk == 0:
                # The last writer was opened but received no frames (this happens
                # when the frame count is an exact multiple of chunk_num_frame).
                # The container may already hold header bytes, so remove the file
                # regardless of its size.
                empty_out = output_dir / f"{stem}_chunk{chunk_idx:02d}.mp4"
                try:
                    empty_out.unlink(missing_ok=True)
                except OSError:
                    pass
        cap.release()

    print("\n[SUMMARY]")
    print(f"  Input video: {input_path}")
    print(f"  Output dir : {output_dir}")
    print(f"  Total frames processed: {total_frames}")
    return output_dir
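
The rollover logic in `split_video_by_frames` can be easier to follow on a plain Python iterable. The sketch below is a simplified stand-in, with list items playing the role of frames and a hypothetical helper named `iter_chunks`. Unlike the real function, it buffers each chunk in memory instead of streaming frames to a writer, so it is meant only to illustrate the chunk boundaries.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def iter_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `chunk_size` items, never an empty one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer.")
    current: List[T] = []
    for item in items:
        current.append(item)
        if len(current) >= chunk_size:
            yield current      # chunk is full: hand it over and start a new one
            current = []
    if current:
        yield current          # final, shorter remainder chunk

print(list(iter_chunks(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
print(list(iter_chunks(range(6), 3)))  # [[0, 1, 2], [3, 4, 5]]
```

Because a chunk is only yielded once it holds at least one item, an input whose length is an exact multiple of the chunk size produces no trailing empty chunk.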



## 3) Run the Splitter

Run the cell below to split your video using the parameters defined earlier.


In [None]:

out_dir = split_video_by_frames(input_path, chunk_num_frame, codec)
out_dir



## 4) Verify Outputs

List the chunked files to confirm.


In [None]:

sorted(out_dir.glob("*.mp4"))
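
Alphabetical sorting puts `example_chunk100.mp4` before `example_chunk11.mp4`, so videos with more than 99 chunks will list out of order. The sketch below sorts by the numeric index instead and adds a frame counter for spot-checking chunk lengths. The helper names (`chunk_index`, `sort_chunks`, `count_frames`) are hypothetical, and `count_frames` decodes the whole file, which is slow but avoids relying on container metadata.

```python
import re
from pathlib import Path
from typing import Iterable, List

_CHUNK_RE = re.compile(r"_chunk(\d+)$")

def chunk_index(path: Path) -> int:
    """Extract the numeric index from a name like 'example_chunk07.mp4'."""
    match = _CHUNK_RE.search(path.stem)
    if match is None:
        raise ValueError(f"Not a chunk file name: {path.name}")
    return int(match.group(1))

def sort_chunks(paths: Iterable[Path]) -> List[Path]:
    """Sort chunk files by numeric index rather than alphabetically."""
    return sorted(paths, key=chunk_index)

def count_frames(path: Path) -> int:
    """Count frames by decoding the file (OpenCV is imported lazily)."""
    import cv2
    cap = cv2.VideoCapture(str(path))
    try:
        n = 0
        while cap.read()[0]:  # read() returns (ok, frame)
            n += 1
        return n
    finally:
        cap.release()

names = [Path("example_chunk100.mp4"), Path("example_chunk11.mp4"), Path("example_chunk02.mp4")]
print([p.name for p in sort_chunks(names)])
# ['example_chunk02.mp4', 'example_chunk11.mp4', 'example_chunk100.mp4']
```

In the notebook, something like `for p in sort_chunks(out_dir.glob("*.mp4")): print(p.name, count_frames(p))` would print each chunk with its decoded frame count. Every chunk except possibly the last should report `chunk_num_frame` frames.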
