Rain-Poon/SubtitleExtraction

Subtitle Extraction Tool

A tool for extracting hardcoded subtitles from YouTube videos using computer vision and OCR techniques.

Overview

This application allows you to extract hardcoded subtitles (text embedded directly in video frames) from YouTube videos. It works by:

  1. Downloading the video from YouTube
  2. Extracting unique frames at specified intervals
  3. Providing a GUI to select and extract text from subtitle regions
  4. Using OCR (Optical Character Recognition) to convert the visual text to editable text
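
As a rough illustration of step 2, the sketch below computes which frame indices would be captured for a given interval. It is not taken from frame_analyzer.py; the function name `sample_frame_indices` and its parameters are hypothetical, and it assumes the video's frame rate and frame count are already known.

```python
from typing import List


def sample_frame_indices(fps: float, total_frames: int, interval_s: float = 1.0) -> List[int]:
    """Return frame indices spaced roughly `interval_s` seconds apart.

    Hypothetical helper for illustration only; the real tool may sample differently.
    """
    if fps <= 0 or interval_s <= 0:
        raise ValueError("fps and interval_s must be positive")

    indices: List[int] = []
    k = 0
    while True:
        # Timestamp k * interval_s seconds, converted to the nearest frame index.
        idx = round(k * interval_s * fps)
        if idx >= total_frames:
            break
        # Skip repeats, which can occur when the interval is shorter than one frame.
        if not indices or idx != indices[-1]:
            indices.append(idx)
        k += 1
    return indices
```

For a 3-second clip at 30 fps, an interval of 1.0 selects frames 0, 30, and 60, while an interval of 0.5 selects six frames.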

Features

  • Download videos directly from YouTube URLs
  • Extract frames at customizable intervals
  • Interactive GUI for frame selection and text region identification
  • OCR processing to extract text from selected regions
  • Copy extracted text to clipboard

Requirements

  • Python 3.6+
  • PyQt5
  • OpenCV
  • NumPy
  • Tesseract OCR
  • yt-dlp

Installation

  1. Clone this repository:
git clone
cd SubtitleExtraction
  2. Install the required Python packages:
pip install -r requirements.txt

Usage

Basic Usage

To download a video and extract frames:

python main.py https://www.youtube.com/watch?v=VIDEO_ID

Command Line Options

  • url: YouTube video URL (optional if using existing frames)
  • -d, --download-dir: Directory to save downloaded videos (default: 'downloads')
  • -f, --frame-dir: Directory to save extracted frames (default: 'frames')
  • -i, --interval: Frame capture interval in seconds (default: 1.0)
  • --use-existing: Use existing frames without downloading a new video
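
The options above could be declared with argparse roughly as follows. This is a sketch reconstructed from the option list, not the actual contents of main.py; the function names `build_parser` and `parse_args`, the help strings, and the check that a URL is required unless --use-existing is given are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Extract hardcoded subtitles from YouTube videos."
    )
    # The URL is positional but optional so that --use-existing can run without it.
    parser.add_argument("url", nargs="?", default=None,
                        help="YouTube video URL (optional if using existing frames)")
    parser.add_argument("-d", "--download-dir", default="downloads",
                        help="Directory to save downloaded videos")
    parser.add_argument("-f", "--frame-dir", default="frames",
                        help="Directory to save extracted frames")
    parser.add_argument("-i", "--interval", type=float, default=1.0,
                        help="Frame capture interval in seconds")
    parser.add_argument("--use-existing", action="store_true",
                        help="Use existing frames without downloading a new video")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and apply an assumed validation rule."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: a URL is mandatory unless existing frames are reused.
    if args.url is None and not args.use_existing:
        parser.error("a URL is required unless --use-existing is given")
    return args
```

Calling `parse_args(["--use-existing"])` yields a namespace with `url` set to `None` and the default directories and interval.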

Examples

Download a video and extract frames every 1 second:

python main.py https://www.youtube.com/watch?v=VIDEO_ID

Download a video and extract frames every 0.5 seconds:

python main.py https://www.youtube.com/watch?v=VIDEO_ID -i 0.5

Use existing frames without downloading a new video:

python main.py --use-existing

GUI Instructions

  1. The application will open with a GUI showing extracted frames.
  2. The left sidebar displays thumbnails of all extracted frames.
  3. Click on a thumbnail to select and display that frame in the main view.
  4. In the main view, click and drag to select a region containing subtitles.
  5. Click "Extract Text" to perform OCR on the selected region.
  6. The extracted text will be displayed in a popup window.
  7. Click "Copy Text" to copy the extracted text to your clipboard.
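
A click-and-drag selection (step 4) can start at any corner and may run past the edge of the frame, so it usually has to be normalized before cropping. The sketch below shows one way to do that. The function name `drag_to_crop_box` is hypothetical, and it assumes the drag coordinates are already in image pixels rather than scaled widget coordinates; gui_processor.py may handle this differently.

```python
from typing import Tuple


def drag_to_crop_box(
    start: Tuple[int, int],
    end: Tuple[int, int],
    image_size: Tuple[int, int],
) -> Tuple[int, int, int, int]:
    """Turn a drag gesture into a (left, top, right, bottom) box inside the image.

    Hypothetical helper: `start` and `end` are (x, y) points of the drag,
    `image_size` is (width, height) in pixels.
    """
    width, height = image_size

    # Order the coordinates so the box is valid whichever direction the drag went.
    left, right = sorted((start[0], end[0]))
    top, bottom = sorted((start[1], end[1]))

    # Clamp the box to the image bounds.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)

    if right <= left or bottom <= top:
        raise ValueError("selection is empty or lies outside the image")
    return left, top, right, bottom
```

With an OpenCV or NumPy image, the resulting box would typically be applied as `image[top:bottom, left:right]` before the region is passed to OCR.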

Project Structure

  • main.py: Main entry point for the application
  • video_downloader.py: Handles downloading videos from YouTube
  • frame_analyzer.py: Extracts unique frames from videos
  • ocr_processor.py: Processes images to extract text
  • gui_processor.py: Provides the graphical user interface
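
The README does not say how frame_analyzer.py decides that a frame is "unique". One common approach, shown here purely as an assumption, is to keep a frame only when it differs enough from the last kept frame. The sketch uses plain Python sequences of grayscale values to stay dependency-free; the names `mean_abs_diff` and `keep_unique_frames` and the default threshold are hypothetical, and a real implementation would more likely operate on NumPy arrays.

```python
from typing import Iterable, List, Optional, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute difference between two equally sized grayscale frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def keep_unique_frames(frames: Iterable[Sequence[int]], threshold: float = 10.0) -> List[int]:
    """Return indices of frames that differ from the last kept frame by more than `threshold`."""
    kept: List[int] = []
    last: Optional[Sequence[int]] = None
    for i, frame in enumerate(frames):
        # Always keep the first frame; afterwards compare against the last kept one.
        if last is None or mean_abs_diff(frame, last) > threshold:
            kept.append(i)
            last = frame
    return kept
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents a slow fade from slipping through as a long run of near-duplicates.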

License

[Your License Here]

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Notes

Hugging Face models are cached in ~/.cache/huggingface/hub by default.
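
The cache location can move if certain environment variables are set. The sketch below mirrors the lookup order described in the huggingface_hub documentation as I understand it (HF_HUB_CACHE, then HF_HOME, then XDG_CACHE_HOME); the function name `resolve_hf_hub_cache` is hypothetical and this is not the library's own code.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_hf_hub_cache(env: Optional[Mapping[str, str]] = None) -> Path:
    """Approximate where Hugging Face Hub models are cached (illustrative only)."""
    env = os.environ if env is None else env

    # An explicit hub cache directory takes precedence.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()

    # Otherwise the hub cache lives under the Hugging Face home directory.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"

    # Fall back to the generic cache root, ~/.cache unless XDG_CACHE_HOME is set.
    cache_root = env.get("XDG_CACHE_HOME") or "~/.cache"
    return Path(cache_root).expanduser() / "huggingface" / "hub"
```

Passing an explicit mapping, for example `resolve_hf_hub_cache({"HF_HOME": "/data/hf"})`, makes it easy to check a configuration without changing the real environment.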
