# Video Frame Extraction for Violence Detection Dataset

This notebook extracts frames from videos to build a training set for a violence detection model. It reads the videos in `video/Violence` and `video/NonViolence`, keeps every third frame, and saves each kept frame as a JPEG in `./data/Violence` or `./data/NonViolence`.

## Import Required Libraries

In [1]:
import cv2
import os
import glob
from tqdm import tqdm

## Create Directory Structure

In [2]:
# Create data directories if they don't exist
os.makedirs('./data/Violence', exist_ok=True)
os.makedirs('./data/NonViolence', exist_ok=True)

# Check if directories are created
print(f"Violence directory exists: {os.path.exists('./data/Violence')}")
print(f"NonViolence directory exists: {os.path.exists('./data/NonViolence')}")

Violence directory exists: True
NonViolence directory exists: True


## Extract Frames from Violence Videos

In [3]:
# Path to violence videos
PATH_violence = 'video/Violence'

# Check if directory exists
if not os.path.exists(PATH_violence):
    print(f"Warning: Directory {PATH_violence} does not exist!")
    video_files = []
else:
    # Collect every entry in the folder and report how many were found
    video_files = glob.glob(os.path.join(PATH_violence, '*'))
    print(f"Found {len(video_files)} violence videos")

Found 1000 violence videos
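
The `'/*'` pattern matches every entry in the folder, so a stray file such as `.DS_Store` or a README would be counted as a video too. A minimal sketch of filtering by extension follows. The helper name `list_videos` and the extension set `VIDEO_EXTS` are assumptions for illustration, not part of the original notebook, and should be adjusted to the dataset's actual formats.

```python
import os

# Assumed set of video extensions; adjust to match the dataset.
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}

def list_videos(folder, exts=VIDEO_EXTS):
    """Return a sorted list of files in `folder` whose extension is in `exts`."""
    if not os.path.isdir(folder):
        return []
    paths = []
    for name in os.listdir(folder):
        full = os.path.join(folder, name)
        ext = os.path.splitext(name)[1].lower()  # case-insensitive match
        if os.path.isfile(full) and ext in exts:
            paths.append(full)
    return sorted(paths)
```

Sorting the result makes the processing order reproducible across runs, which `glob` alone does not guarantee.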


In [4]:
# Process Violence videos
violence_frame_count = 0
for path in tqdm(video_files, desc="Processing violence videos"):
    # splitext drops only the extension, so dots inside the name are preserved
    fname = os.path.splitext(os.path.basename(path))[0]
    vidcap = cv2.VideoCapture(path)
    success, image = vidcap.read()
    count = 0
    extracted = 0

    # Extract every 3rd frame (matching mobilenetv2_tsm.py frame sampling)
    while success:
        if count % 3 == 0:
            # cv2.imwrite returns False on failure; count only frames actually saved
            if cv2.imwrite(f"./data/Violence/{fname}-{count:04d}.jpg", image):
                extracted += 1
        success, image = vidcap.read()
        count += 1
    violence_frame_count += extracted
    vidcap.release()

# Print information about extracted frames
print(f"Total frames extracted from violence videos: {violence_frame_count}")

Processing violence videos: 100%|██████████| 1000/1000 [12:20<00:00,  1.35it/s] 

Total frames extracted from violence videos: 53371
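
Because the loop keeps frames 0, 3, 6, and so on, a video with N readable frames should yield ceil(N / 3) images. The short sketch below expresses that as a sanity check. `expected_saved` is a hypothetical helper name, and it assumes every kept frame is written successfully.

```python
def expected_saved(total_frames, step=3):
    """Count the indices in range(total_frames) that are multiples of `step`."""
    # Integer ceiling division: frames 0, step, 2*step, ... are kept.
    return (total_frames + step - 1) // step

# Cross-check against a direct enumeration of the kept indices.
for n in (0, 1, 9, 10, 160):
    assert expected_saved(n) == len(range(0, n, 3))
```

Comparing this value with the number of files written for a given video is one way to spot clips that stopped decoding early.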





## Extract Frames from Non-Violence Videos

In [5]:
# Path to non-violence videos
PATH_nonviolence = 'video/NonViolence'

# Check if directory exists
if not os.path.exists(PATH_nonviolence):
    print(f"Warning: Directory {PATH_nonviolence} does not exist!")
    nonviolence_video_files = []
else:
    # Collect every entry in the folder and report how many were found
    nonviolence_video_files = glob.glob(os.path.join(PATH_nonviolence, '*'))
    print(f"Found {len(nonviolence_video_files)} non-violence videos")

Found 1000 non-violence videos


In [6]:
# Process NonViolence videos
nonviolence_frame_count = 0
for path in tqdm(nonviolence_video_files, desc="Processing non-violence videos"):
    # splitext drops only the extension, so dots inside the name are preserved
    fname = os.path.splitext(os.path.basename(path))[0]
    vidcap = cv2.VideoCapture(path)
    success, image = vidcap.read()
    count = 0
    extracted = 0

    # Extract every 3rd frame (matching mobilenetv2_tsm.py frame sampling)
    while success:
        if count % 3 == 0:
            # cv2.imwrite returns False on failure; count only frames actually saved
            if cv2.imwrite(f"./data/NonViolence/{fname}-{count:04d}.jpg", image):
                extracted += 1
        success, image = vidcap.read()
        count += 1

    nonviolence_frame_count += extracted
    vidcap.release()

# Print information about extracted frames
print(f"Total frames extracted from non-violence videos: {nonviolence_frame_count}")

Processing non-violence videos: 100%|██████████| 1000/1000 [06:30<00:00,  2.56it/s]

Total frames extracted from non-violence videos: 42911
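
The two extraction loops differ only in their input list and output folder, so they could share one helper. The sketch below is one possible refactoring, not the notebook's actual code: `frame_name`, `save_every_nth`, and `read_frames` are hypothetical names, and the sampling logic is kept separate from OpenCV so it can be checked without video files.

```python
import os

def frame_name(video_path, index):
    """Build the '<video stem>-<zero-padded index>.jpg' name used in this notebook."""
    stem = os.path.splitext(os.path.basename(video_path))[0]
    return f"{stem}-{index:04d}.jpg"

def save_every_nth(frames, video_path, out_dir, save, step=3):
    """Pass every `step`-th frame to `save(path, frame)`; return how many were saved.

    `frames` is any iterable of frames. `save` should return True on success,
    which is how cv2.imwrite behaves.
    """
    saved = 0
    for index, frame in enumerate(frames):
        if index % step == 0:
            target = os.path.join(out_dir, frame_name(video_path, index))
            if save(target, frame):
                saved += 1
    return saved

def read_frames(video_path):
    """Yield frames from a video until OpenCV reports that no more can be read."""
    import cv2  # imported lazily so the helpers above work without OpenCV installed
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield frame
    finally:
        cap.release()  # release the capture even if the consumer stops early
```

Under these assumptions, one video would be processed with `save_every_nth(read_frames(path), path, './data/Violence', cv2.imwrite)`, and the same call with `'./data/NonViolence'` would cover the second class.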





## Summary

In [7]:
# Count the JPEG files on disk as a cross-check against the totals printed above
violence_frames = len(glob.glob('./data/Violence/*.jpg'))
nonviolence_frames = len(glob.glob('./data/NonViolence/*.jpg'))

print(f"Total extracted frames:\n- Violence: {violence_frames}\n- NonViolence: {nonviolence_frames}")

Total extracted frames:
- Violence: 53371
- NonViolence: 42911
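
The two classes end up with different frame counts even though each has 1000 videos, which may matter when training. As a rough illustration, the share of each class can be computed from the totals printed above. The variable names here are illustrative only.

```python
# Frame counts copied from the summary output above.
frame_counts = {"Violence": 53371, "NonViolence": 42911}

total = sum(frame_counts.values())
# Fraction of all extracted frames that belongs to each class.
shares = {label: n / total for label, n in frame_counts.items()}

for label, share in shares.items():
    print(f"{label}: {frame_counts[label]} frames ({share:.1%} of {total})")
```

With these totals the split is roughly 55% violence frames to 45% non-violence frames.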
