A curated list of resources dedicated to scene text localization and recognition. Any suggestions and pull requests are welcome.
-
[2015-arxiv] CRNN: An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
paper
translation
-
[2016-ECCV] CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network
paper
translation
-
[2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network
paper
code
-
[2018-AAAI] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
paper
code
-
[2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector
paper
code
-
[2018-CVPR] Rotation-Sensitive Regression for Oriented Scene Text Detection
paper
-
[2018-CVPR] Single Shot Text Spotter with Explicit Alignment and Attention
paper
-
[2018-CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation
paper
-
[2018-arxiv] FOTS: Fast Oriented Text Spotting with a Unified Network
paper
-
[2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation
paper
-
[2017-ICCV] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework
paper
code
-
[2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector
paper
code
-
[2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments
paper
code
- Natural scene text detection implemented in TensorFlow, with CRNN + CTC implemented in Keras/PyTorch for variable-length Chinese OCR
- Training SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
- Train TextBoxes++: A Single-Shot Oriented Scene Text Detector
- Train CTPN
- SynthText: generating synthetic text in images
- SynthText_Chinese_version
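Several of the projects above pair a CRNN with CTC so that the output text can be shorter than the number of frames the network predicts. As a rough illustration only (not taken from any of the linked repositories), the sketch below shows greedy CTC decoding: collapse consecutive repeated labels, then drop the blank symbol. The function name `ctc_greedy_decode`, the toy `ALPHABET`, and the choice of index 0 as the blank are all assumptions made for this example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding of per-frame argmax labels.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove the blank label.
    `blank=0` is an assumption; real models may use another index.
    """
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


# Hypothetical alphabet for illustration: index 0 is the CTC blank.
ALPHABET = "-abc"

# Per-frame argmax labels, as a recognizer might emit them.
frames = [1, 1, 0, 1, 2, 2, 0, 0, 3]

decoded = ctc_greedy_decode(frames)
text = "".join(ALPHABET[i] for i in decoded)
print(text)
```

The blank between the two runs of label 1 is what allows a genuinely doubled character to survive the collapse step.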
-
COCO-Text (Computer Vision Group, Cornell)
2016
- 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
- Task: text location and recognition
COCO-Text API
-
Synthetic Word Dataset (Oxford, VGG)
2014
- 9 million images covering 90k English words
- Task: text recognition, segmentation
download
-
IIIT 5K-Words
2012
- 5,000 images from scene text and born-digital sources (2,000 training and 3,000 testing images)
- Each image is a cropped word image of scene text with case-insensitive labels
- Task: text recognition
download
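Because the labels here are case-insensitive, recognition results on this kind of dataset are usually compared after normalizing case. The sketch below is one plausible way to compute word-level accuracy under that assumption; the function name `word_accuracy` and the sample strings are hypothetical and do not come from the dataset or its tooling.

```python
def word_accuracy(predictions, ground_truths):
    """Fraction of words recognized exactly, ignoring letter case.

    Both arguments are equal-length sequences of strings.
    Returns 0.0 for empty input (an arbitrary convention for this sketch).
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0
    correct = sum(
        pred.lower() == truth.lower()
        for pred, truth in zip(predictions, ground_truths)
    )
    return correct / len(ground_truths)


# Made-up example: two of the three words match once case is ignored.
accuracy = word_accuracy(["Hello", "WORLD", "text"], ["hello", "world", "test"])
print(f"{accuracy:.3f}")
```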
-
StanfordSynth (Stanford, AI Group)
2012
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
download
-
MSRA Text Detection 500 Database (MSRA-TD500)
2012
- 500 natural images (resolutions vary from 1296x864 to 1920x1280)
- Chinese, English or mixture of both
- Task: text detection
-
- 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
- Only word level bounding boxes are provided with case-insensitive labels
- Task: text location
-
KAIST Scene_Text Database
2010
- 3000 images of indoor and outdoor scenes containing text
- Korean, English (Number), and Mixed (Korean + English + Number)
- Task: text location, segmentation and recognition
-
Chars74k
2009
- Over 74K images from natural images, as well as a set of synthetically generated characters
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
-
ICDAR Benchmark Datasets
Dataset | Description | Competition Paper |
---|---|---|
ICDAR 2015 | 1000 training images and 500 testing images | paper |
ICDAR 2013 | 229 training images and 233 testing images | paper |
ICDAR 2011 | 229 training images and 255 testing images | paper |
ICDAR 2005 | 1001 training images and 489 testing images | paper |
ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper |
-
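AlexNet
ImageNet Classification with Deep Convolutional Neural Networks (Chinese translation, Chinese-English bilingual version)
-
VGG
Very Deep Convolutional Networks for Large-Scale Image Recognition (Chinese translation, Chinese-English bilingual version)
-
ResNet
Deep Residual Learning for Image Recognition (Chinese translation, Chinese-English bilingual version)
-
GoogLeNet
Going Deeper With Convolutions (Chinese translation, Chinese-English bilingual version)
-
BN-GoogLeNet
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Chinese translation, Chinese-English bilingual version)
-
Inception-v3
Rethinking the Inception Architecture for Computer Vision (Chinese translation, Chinese-English bilingual version)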
-
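YOLO
You Only Look Once: Unified, Real-Time Object Detection (Chinese translation, Chinese-English bilingual version)
-
YOLO9000
YOLO9000: Better, Faster, Stronger (Chinese translation, Chinese-English bilingual version)
-
Deformable-ConvNets
Deformable Convolutional Networks (Chinese translation, Chinese-English bilingual version)
-
Faster R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (Chinese translation, Chinese-English bilingual version)
-
R-FCN
R-FCN: Object Detection via Region-based Fully Convolutional Networks (Chinese translation, Chinese-English bilingual version)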