dataset with annotated text locations in a news broadcast
Matlab
IoU.m v1 Oct 23, 2015
README.md Update README.md Oct 23, 2015
adnotari_ocr.mat v1 Oct 23, 2015
convert_output.m v1 Oct 23, 2015
import_ocr_result.m v1 Oct 23, 2015
ocr_eval.m v1 Oct 23, 2015
ocr_pr.m v1 Oct 23, 2015
results_captioncapture.txt v1 Oct 23, 2015
vis_ocr.m v1 Oct 23, 2015

README.md

News broadcast text localization dataset

Dataset description

The set contains 4225 images extracted from video news broadcasts. The images contain both text added by the news service and naturally occurring text. All images are 748×432 pixels.

Download

The dataset can be downloaded at: https://www.dropbox.com/s/l076cfy18nmgvgc/euronews_frames.7z?dl=0

Evaluation

The provided Matlab code computes precision and recall scores for evaluating OCR text localization performance, using the common PASCAL IoU (intersection over union) threshold of 0.5. As a demo, a set of results is included in results_captioncapture.txt.
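To make the metric concrete, here is a minimal sketch of IoU-based precision and recall for a single image. It is not the code in IoU.m or ocr_eval.m. The `[x y w h]` box layout, the greedy one-to-one matching, the strict `>` comparison against the threshold, and all variable names and example boxes are assumptions made for illustration.

```matlab
% Sketch only: boxes are ASSUMED to be rows of [x y w h]; check the
% annotation format used by the repository before reusing this.
inter_w = @(a, b) max(0, min(a(1)+a(3), b(1)+b(3)) - max(a(1), b(1)));
inter_h = @(a, b) max(0, min(a(2)+a(4), b(2)+b(4)) - max(a(2), b(2)));
% IoU = intersection area / union area
iou = @(a, b) (inter_w(a, b) * inter_h(a, b)) / ...
      (a(3)*a(4) + b(3)*b(4) - inter_w(a, b) * inter_h(a, b));

gt  = [10 10 100 20; 200 50 80 30];     % hypothetical ground-truth boxes
det = [12 11 100 20; 400 300 50 20];    % hypothetical detections
thr = 0.5;                              % IoU threshold

matched = false(size(gt, 1), 1);        % each ground-truth box matches at most once
tp = 0;                                 % true positives
for i = 1:size(det, 1)
    best = 0;
    best_j = 0;
    for j = 1:size(gt, 1)
        if ~matched(j)
            v = iou(det(i, :), gt(j, :));
            if v > best
                best = v;
                best_j = j;
            end
        end
    end
    if best > thr                       % assumed strict comparison
        matched(best_j) = true;
        tp = tp + 1;
    end
end

precision = tp / size(det, 1);          % correct detections / all detections
recall    = tp / size(gt, 1);           % correct detections / all ground truth
fprintf('precision = %.2f, recall = %.2f\n', precision, recall);
% prints "precision = 0.50, recall = 0.50"
```

In this example the first detection overlaps the first ground-truth box with an IoU of about 0.87 and counts as correct, while the second detection overlaps nothing, so one of two detections is correct and one of two ground-truth boxes is found. The sketch assumes at least one detection and one ground-truth box; an empty set would need a guard against division by zero.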

Instructions

  1. Download the dataset and the provided code.
  2. Modify the dataset, annotation, and OCR output paths as needed.
  3. Run ocr_eval.m.