A tool for extracting hardcoded subtitles from YouTube videos using computer vision and OCR techniques.
This application allows you to extract hardcoded subtitles (text embedded directly in video frames) from YouTube videos. It works by:
- Downloading the video from YouTube
- Extracting unique frames at specified intervals
- Providing a GUI to select and extract text from subtitle regions
- Using OCR (Optical Character Recognition) to convert the visual text to editable text
- Download videos directly from YouTube URLs
- Extract frames at customizable intervals
- Interactive GUI for frame selection and text region identification
- OCR processing to extract text from selected regions
- Copy extracted text to clipboard
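As a rough illustration of what "frames at customizable intervals" means, the sketch below computes which frame numbers would be sampled for a given frame rate and interval. The function name `frame_indices` and its parameters are hypothetical and are not taken from `frame_analyzer.py`; the real module may select frames differently, for example by also discarding near-duplicate frames.

```python
from typing import List


def frame_indices(fps: float, total_frames: int, interval_s: float = 1.0) -> List[int]:
    """Return the frame numbers to sample, one every `interval_s` seconds.

    Hypothetical helper for illustration only.
    """
    if fps <= 0 or interval_s <= 0:
        raise ValueError("fps and interval_s must be positive")

    indices: List[int] = []
    k = 0
    while True:
        # Timestamp k * interval_s, converted to the nearest frame number.
        idx = round(k * interval_s * fps)
        if idx >= total_frames:
            break
        # Very short intervals can round to the same frame twice; keep one copy.
        if not indices or idx != indices[-1]:
            indices.append(idx)
        k += 1
    return indices


if __name__ == "__main__":
    # A 3-second clip at 30 fps, sampled every 0.5 seconds.
    print(frame_indices(fps=30, total_frames=90, interval_s=0.5))
```

With OpenCV, each index could then be read by seeking with `cv2.CAP_PROP_POS_FRAMES` before calling `read()` on the capture object, though seeking accuracy depends on the video codec.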
- Python 3.6+
- PyQt5
- OpenCV
- NumPy
- Tesseract OCR
- yt-dlp
- Clone this repository:
git clone
cd SubtitleExtraction
- Install the required Python packages:
pip install -r requirements.txt
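After installing, a quick check like the one below can confirm that the requirements listed earlier are visible to Python. This is a minimal sketch, not part of the project: `missing_dependencies` is a hypothetical helper, and it assumes the usual import names (`cv2` for OpenCV, `yt_dlp` for yt-dlp) and that the Tesseract executable is named `tesseract` and is on your `PATH`.

```python
import importlib.util
import shutil
from typing import List

# Display name -> import name (assumed standard import names).
PYTHON_MODULES = {
    "PyQt5": "PyQt5",
    "OpenCV": "cv2",
    "NumPy": "numpy",
    "yt-dlp": "yt_dlp",
}


def missing_dependencies() -> List[str]:
    """Return the display names of requirements that cannot be found."""
    missing = [
        name
        for name, module in PYTHON_MODULES.items()
        if importlib.util.find_spec(module) is None
    ]
    # Tesseract is a separate program, not a Python package,
    # so look for its executable instead of importing it.
    if shutil.which("tesseract") is None:
        missing.append("Tesseract OCR")
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All requirements found.")
```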
To download a video and extract frames:
python main.py https://www.youtube.com/watch?v=VIDEO_ID
- url: YouTube video URL (optional if using existing frames)
- -d, --download-dir: Directory to save downloaded videos (default: 'downloads')
- -f, --frame-dir: Directory to save extracted frames (default: 'frames')
- -i, --interval: Frame capture interval in seconds (default: 1.0)
- --use-existing: Use existing frames without downloading a new video
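For readers who want to see how these options fit together, here is one way they could be declared with `argparse`. This is a sketch based only on the option list above. The function name `build_parser` and the help strings are assumptions, and the actual `main.py` may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Extract hardcoded subtitles from YouTube videos."
    )
    # Positional URL is optional so that --use-existing can be run on its own.
    parser.add_argument("url", nargs="?", default=None,
                        help="YouTube video URL (optional if using existing frames)")
    parser.add_argument("-d", "--download-dir", default="downloads",
                        help="Directory to save downloaded videos")
    parser.add_argument("-f", "--frame-dir", default="frames",
                        help="Directory to save extracted frames")
    parser.add_argument("-i", "--interval", type=float, default=1.0,
                        help="Frame capture interval in seconds")
    parser.add_argument("--use-existing", action="store_true",
                        help="Use existing frames without downloading a new video")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the example does not depend on sys.argv.
    args = build_parser().parse_args(
        ["https://www.youtube.com/watch?v=VIDEO_ID", "-i", "0.5"]
    )
    print(args.url, args.download_dir, args.frame_dir, args.interval, args.use_existing)
```

Making `url` optional with `nargs="?"` is what lets `python main.py --use-existing` run without a URL.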
Download a video and extract frames every 1 second:
python main.py https://www.youtube.com/watch?v=VIDEO_ID
Download a video and extract frames every 0.5 seconds:
python main.py https://www.youtube.com/watch?v=VIDEO_ID -i 0.5
Use existing frames without downloading a new video:
python main.py --use-existing
- The application will open with a GUI showing extracted frames.
- The left sidebar displays thumbnails of all extracted frames.
- Click on a thumbnail to select and display that frame in the main view.
- In the main view, click and drag to select a region containing subtitles.
- Click "Extract Text" to perform OCR on the selected region.
- The extracted text will be displayed in a popup window.
- Click "Copy Text" to copy the extracted text to your clipboard.
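The click-and-drag step produces two points, and the drag can go in any direction. Before cropping a frame for OCR, those points need to be turned into a well-formed rectangle that stays inside the image. The sketch below shows one way to do that. `normalize_selection` is a hypothetical helper, not a function from `gui_processor.py`.

```python
from typing import Tuple

Point = Tuple[int, int]


def normalize_selection(start: Point, end: Point,
                        frame_width: int, frame_height: int) -> Tuple[int, int, int, int]:
    """Convert a drag from `start` to `end` into (x, y, width, height).

    The result is independent of drag direction and clamped to the frame.
    Hypothetical helper for illustration only.
    """
    # Order the corners so (x0, y0) is top-left and (x1, y1) is bottom-right.
    x0, x1 = sorted((start[0], end[0]))
    y0, y1 = sorted((start[1], end[1]))

    # Clamp to the frame so a drag that leaves the image still yields a valid crop.
    x0 = max(0, min(x0, frame_width))
    x1 = max(0, min(x1, frame_width))
    y0 = max(0, min(y0, frame_height))
    y1 = max(0, min(y1, frame_height))

    return x0, y0, x1 - x0, y1 - y0


if __name__ == "__main__":
    # Dragging from bottom-right to top-left gives the same box as the reverse.
    print(normalize_selection((300, 400), (100, 350), 640, 480))
```

Given a frame loaded as a NumPy array, the region would then be `frame[y:y + h, x:x + w]`. If the project uses the `pytesseract` binding (an assumption; this README only names Tesseract OCR), that crop could be passed to `pytesseract.image_to_string`.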
- main.py: Main entry point for the application
- video_downloader.py: Handles downloading videos from YouTube
- frame_analyzer.py: Extracts unique frames from videos
- ocr_processor.py: Processes images to extract text
- gui_processor.py: Provides the graphical user interface
[Your License Here]
Contributions are welcome! Please feel free to submit a Pull Request.
Hugging Face models are stored in ~/.cache/huggingface/hub