Total Text Dataset - ICDAR 2017. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
Annotation_tools Annotation tools Aug 24, 2018
Dataset Update README.md May 30, 2018
Evaluation_Protocol Update README.md May 30, 2018
Groundtruth Update README.md May 30, 2018
ICDAR17.gif Add files via upload Aug 25, 2018
LICENSE Create LICENSE May 15, 2018
README.md Update README.md Aug 25, 2018
ttstatistics.png Add files via upload Nov 4, 2017

README.md

Total-Text-Dataset

Updated on August 24, 2018 (Newly added annotation tool folder.)

Updated on May 15, 2018 (Added groundtruth in '.txt' format.)

Updated on May 14, 2018 (Added feature - 'Do not care' candidates filtering is now available in the latest python scripts.)

Updated on April 03, 2018 (Added pixel level groundtruth)

Updated on November 04, 2017 (Added text level groundtruth)

Released on October 27, 2017

Description

In order to facilitate a new text detection research, we introduce the Total-Text dataset (ICDAR-17 paper) (presentation slides), which is more comprehensive than the existing text datasets. The Total-Text consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.

Citation

If you find this dataset useful for your research, please cite

@inproceedings{CK2017,
  author    = {Chee Kheng Ch’ng and
               Chee Seng Chan},
  title     = {Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition},
  booktitle = {14th IAPR International Conference on Document Analysis and Recognition {ICDAR}},
  pages     = {935--942},
  year      = {2017},
  doi       = {10.1109/ICDAR.2017.157},
}

Feedback

Suggestions and opinions of this dataset (both positive and negative) are greatly welcome. Please contact the authors by sending email to chngcheekheng at gmail.comor cs.chan at um.edu.my.

License and Copyright

The project is open source under BSD-3 license (see the LICENSE file). Codes can be used freely only for academic purpose.

Copyright 2018, Center of Image and Signal Processing, Faculty of Computer Science and Information Technology, University of Malaya.