Skip to content

roybaer/burnt-in-subtitle-extractor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

Burnt-in subtitle extractor

This project provides a set of basic extraction tools for burnt-in subtitles, i.e. subtitles that are part of the picture itself.

Tools

The burnt-in subtitle extractor consists of four different tools:

  • locsubtitles locates subtitles within the picture, creates mask files and writes their names to an initial textual subtitle file along with timing information
  • getsubtitles applies those masks to the picture and writes the averaged out graphical subtitles to a new set of image files
  • remsubtitles deletes the masked out areas from the input frames and blends in the surrounding colors
  • ocrsubtitles runs the extracted graphical subtitles through an external OCR and outputs the final textual subtitle file

Examples

The following examples will in part rely on default parameters. You will typically have to provide parameters tuned to your input file. Type e.g. getsubtitles --help for details.

> ffmpeg -i test.flv -f yuv4mpegpipe - | ./locsubtitles > test.sub
> ffmpeg -i test.flv -f yuv4mpegpipe - | ./remsubtitles -s test.sub | ffplay -

How it works

This program relies on several patterns in how subtitles are typically added to videos:

  • Subtitles typically have borders and a defined text and border color
  • Subtitles are usually located in the same part of the picture and do not move
    • Analysis can be limited to a specified region
  • This program further relies on the presence of a frame without subtitles between frames with different subtitles

The algorithm then does the following:

  • For every frame
    • Create a text mask that includes all pixels that have the text color
    • Expand that text mask in all four directions by the stroke width of the text border
    • Create a border mask that includes all pixels that have the border color
    • Expand that border mask in all four directions by the stroke width of the text
    • Calculate the intersection of both masks
    • If the number of pixels in the resulting mask is above a specified threshold
      • (A subtitle has been found)
      • If the previous frame did not have subtitles
        • Mark the start position
        • Otherwise: Set the subtitle mask to the intersection of both frames' subtitle masks
      • Otherwise (no subtitle)
        • If the previous frame had subtitles
          • Write the previous (i.e. the non-empty) subtitle mask to an image file
          • Write start and end time information and the name of the mask file to the subtitle file
  • When the end of the video has been reached, output any pending information and exit

The OCR component uses the Tesseract engine, which has to be installed on the system. The target language is currently hard-coded to Dutch. This can be changed in the wrapper script.

Prerequisites

The program has been tested with the following setup, only:

  • GNU/Linux operating system
  • GNU C++ compiler + Boost libraries
  • Bash shell + Tesseract OCR
  • Y4M video streams produced by ffmpeg

Other setups might work.

About

Set of basic extraction tools for burnt-in subtitles, i.e. subtitles that are part of the picture itself

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published