Commit
1 parent b4a293d, commit 660724d
Showing 1 changed file with 11 additions and 0 deletions.
@@ -0,0 +1,11 @@
# extracting-chinese-subs
This repository contains code to extract Chinese hard subs (subtitles burned into the video frames) from the TV series 他来了请闭眼 (*Love Me If You Dare*). For further information, see [this post on my blog](http://www.kerrickstaley.com/2017/05/29/extracting-chinese-subs-part-1).

To get started, install OpenCV, NumPy, Tesseract, the `chi_sim` (Simplified Chinese) data pack for Tesseract, and PyOCR. The following commands will work on Arch Linux:

```
sudo pacman -S opencv python-numpy tesseract tesseract-data-chi_sim
sudo pip install pyocr
```
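
If you want to confirm the install before running anything, a small check along these lines may help. It is only a sketch and is not part of this repository: the function name `check_dependencies` is made up for illustration, and it assumes the Python modules are importable as `cv2`, `numpy`, and `pyocr` and that the `tesseract` binary is on your `PATH`.

```python
import importlib.util
import shutil
import subprocess


def check_dependencies():
    """Report which prerequisites from the install step appear to be present.

    Hypothetical helper for illustration; it is not part of main.py.
    """
    status = {
        # find_spec returns None when a top-level module is not importable.
        "cv2": importlib.util.find_spec("cv2") is not None,
        "numpy": importlib.util.find_spec("numpy") is not None,
        "pyocr": importlib.util.find_spec("pyocr") is not None,
        "tesseract": shutil.which("tesseract") is not None,
        "chi_sim": False,
    }
    if status["tesseract"]:
        # `tesseract --list-langs` prints the installed language packs. Some
        # versions write the list to stderr, so both streams are checked.
        proc = subprocess.run(
            ["tesseract", "--list-langs"],
            capture_output=True,
            text=True,
        )
        status["chi_sim"] = "chi_sim" in (proc.stdout + proc.stderr).split()
    return status


if __name__ == "__main__":
    for name, ok in check_dependencies().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Any entry reported as `MISSING` points at the package to reinstall; a missing `chi_sim` entry usually means the Tesseract data pack was not installed.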

Then try running `./main.py --test-all` to test the extraction algorithm on all test cases. To run it on a video file, you'll need to track down a 1280x720 video of one of the 他来了请闭眼 episodes with white hard subs at the bottom, similar to the test frames.
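
To make the "white hard subs at the bottom" requirement concrete, here is a rough NumPy sketch that marks near-white pixels in the bottom band of a 1280x720 frame. It is an illustration only and is not the algorithm used in `main.py`: the function name `white_subtitle_mask`, the 25% band height, and the brightness threshold of 220 are all assumptions chosen for the example.

```python
import numpy as np

FRAME_WIDTH, FRAME_HEIGHT = 1280, 720  # resolution stated in this README


def white_subtitle_mask(frame, band_fraction=0.25, threshold=220):
    """Return a boolean mask of near-white pixels in the bottom band of a frame.

    band_fraction and threshold are illustrative guesses, not values taken
    from main.py.
    """
    height, width = frame.shape[:2]
    if (width, height) != (FRAME_WIDTH, FRAME_HEIGHT):
        raise ValueError(f"expected a 1280x720 frame, got {width}x{height}")
    band_top = height - int(height * band_fraction)
    band = frame[band_top:, :, :]
    # A pixel counts as "white" only if every colour channel is bright.
    return (band >= threshold).all(axis=2)


# Synthetic frame: black background with a white block standing in for
# subtitle text near the bottom edge.
frame = np.zeros((FRAME_HEIGHT, FRAME_WIDTH, 3), dtype=np.uint8)
frame[650:690, 400:880] = 255
mask = white_subtitle_mask(frame)
print(mask.shape, int(mask.sum()))
```

A video whose subtitles are a different colour, sit elsewhere in the frame, or use a different resolution would produce an empty or misleading mask here, which is why the input requirements above are so specific.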