Text detection from Natural Images using CTPN

This repo offers several solutions to scene text detection. The main one is based on CTPN (Connectionist Text Proposal Network) and is implemented in TensorFlow. The original paper can be found here.

It also includes alternative, older approaches, such as:
- CRNN Model
- DenseNet-OCR

Please note: I have to reconstruct this repo. It is written against older dependencies and tooling, such as:

- tensorflow-gpu==1.4.0 (to be upgraded to the latest stable version, tensorflow-gpu==1.12.0, with cuda==9.0 and the compatible cuDNN version==7.1.4).
- The model is saved in HDF5, which is not a well-optimized format for model serving / serialization. I'll refactor the code for TF 2.0 and save the model in the protobuf (.pb) serialization format.
- The model was trained on a GTX 1070 GPU. I'm planning to retrain it on a TPU. It is entirely possible to run the program on CPU only, but it is extremely slow because the CPU implementation is not optimized.
- I'll reconstruct this repo to serve the model with TensorFlow Serving, together with quantization and pruning.
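
Since the repo depends on a specific tensorflow-gpu version, it can help to compare what is installed against the pinned version before running anything. The snippet below is a minimal sketch using only the Python standard library (Python 3.8+). The `PINNED` dict and the helper names are illustrative and are not part of this repo.

```python
import re
from importlib import metadata

# Version pinned in the note above; the dict itself is a hypothetical helper.
PINNED = {"tensorflow-gpu": "1.4.0"}


def parse_version(text):
    """Turn '1.12.0' into (1, 12, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match:
            parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check(pinned=PINNED):
    """Map each package name to (installed, wanted, matches)."""
    report = {}
    for package, wanted in pinned.items():
        found = installed_version(package)
        matches = found is not None and parse_version(found) == parse_version(wanted)
        report[package] = (found, wanted, matches)
    return report


if __name__ == "__main__":
    for package, (found, wanted, ok) in check().items():
        print(f"{package}: installed={found} pinned={wanted} match={ok}")
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.4.0" sorts after "1.12.0", while as tuples (1, 4, 0) correctly comes before (1, 12, 0).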

Architecture of CTPN-Model, NVIDIA-GPU

CTPN Model Architecture	NVIDIA-GPU Architecture

Please refer to CTPN Architecture explained	Please refer to NVIDIA-GPU
VGGNet -> BLSTM (256D) -> FC (512D) -> output layer (MLP) for text/non-text scores	Check the CUDA version with `nvcc --version`, the cuDNN version with `cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2`, and the tensorflow-gpu version with `pip freeze | grep tensorflow-gpu`.
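
To make the pipeline above more concrete, here is a small shape walk-through in plain Python. It does not build or run a network. The details that are not stated in this README (a VGG16 backbone with total stride 16 and 512 channels, a 3x3 sliding window, 128 hidden units per LSTM direction, k = 10 anchors, and the three output heads) follow the CTPN paper as I understand it, and the function name `ctpn_shapes` is hypothetical. Treat it as a sketch under those assumptions, not as a description of this repo's code.

```python
def ctpn_shapes(height, width, k=10):
    """List (stage, shape) pairs for one image flowing through a CTPN-style model.

    Assumes floor rounding in the pooling layers, so the feature map is
    (height // 16, width // 16).
    """
    stride = 16       # four 2x2 poolings in VGG16 before conv5
    channels = 512    # conv5 output channels
    fh, fw = height // stride, width // stride
    return [
        ("input image", (height, width, 3)),
        ("VGG16 conv5 feature map", (fh, fw, channels)),
        ("3x3 sliding window", (fh, fw, 3 * 3 * channels)),
        # Each feature-map row is read as a sequence of length fw.
        ("BLSTM, 2 x 128 hidden units", (fh, fw, 256)),
        ("FC", (fh, fw, 512)),
        ("text/non-text scores", (fh, fw, 2 * k)),
        ("vertical coordinates", (fh, fw, 2 * k)),
        ("side-refinement offsets", (fh, fw, k)),
    ]


if __name__ == "__main__":
    for stage, shape in ctpn_shapes(608, 912):
        print(f"{stage:30s} {shape}")
```

Each position of the final maps corresponds to a 16-pixel-wide vertical strip of the input, which is why the model produces many narrow proposals that are later connected into text lines.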

Multiple AI Model Approaches

CTPN Model
DenseNet-OCR
CRNN

The CTPN model is the most recent of the three approaches and is the recommended one, because:

- CTPN is the latest and the most effective of these approaches for scene text detection.
- Even though most datasets are based on English, CTPN also performs well at localizing Chinese text.

Codeset for these approaches

CTPN Model for Scene Text Detection
DenseNet-OCR Model for Scene Text Detection
CRNN Model for Scene Text Detection

2 older AI approaches

Overview of the two older approaches:

CRNN: VGGNet + BLSTM + BLSTM + CTC

DenseNet-OCR: DenseNet + CTC
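
Both older models end in a CTC layer, which turns per-frame class scores into a string. The sketch below shows greedy (best-path) CTC decoding in plain Python: take the best class per frame, collapse consecutive repeats, then drop blanks. The function name, the `labels` layout, and the choice of index 0 as the blank are assumptions made for illustration. Frameworks differ on the blank index, and this is not the repo's decoding code.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Greedy CTC decoding.

    frame_scores: one list of class scores per time step.
    labels: one symbol per class index; labels[blank] is never emitted.
    """
    decoded = []
    previous = None
    for scores in frame_scores:
        # Best class for this frame.
        index = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the class changes and is not the blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


if __name__ == "__main__":
    labels = ["-", "a", "b"]  # index 0 is the blank
    frames = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a (repeat, collapsed)
        [0.9, 0.05, 0.05],  # blank
        [0.1, 0.8, 0.1],    # a (new, because a blank separates it)
        [0.1, 0.2, 0.7],    # b
        [0.1, 0.1, 0.8],    # b (repeat, collapsed)
        [0.9, 0.05, 0.05],  # blank
    ]
    print(ctc_greedy_decode(frames, labels))  # prints "aab"
```

The blank between the second and fourth frames is what lets the decoder output a doubled letter, which plain repeat-collapsing alone could not represent.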

Result Comparison

Model Architecture	Run time on GPU	Accuracy	Model servable size
CRNN	60 ms	0.972	not recorded
DenseNet-OCR	8 ms	0.982	18.9 MB
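
The two rows above can be turned into relative numbers with a little arithmetic. The sketch below assumes that the run times are measured on the same workload and that "Accuracy" is the fraction of correct predictions, so that 1 - accuracy is the error rate. Neither is stated explicitly here, and the helper name `compare` is illustrative.

```python
# Numbers copied from the comparison table above.
RESULTS = {
    "CRNN": {"gpu_ms": 60, "accuracy": 0.972},
    "DenseNet-OCR": {"gpu_ms": 8, "accuracy": 0.982},
}


def compare(slow, fast):
    """Return (speedup, relative error reduction) of `fast` over `slow`."""
    speedup = slow["gpu_ms"] / fast["gpu_ms"]
    error_slow = 1 - slow["accuracy"]
    error_fast = 1 - fast["accuracy"]
    error_reduction = (error_slow - error_fast) / error_slow
    return speedup, error_reduction


if __name__ == "__main__":
    speedup, reduction = compare(RESULTS["CRNN"], RESULTS["DenseNet-OCR"])
    # prints "7.5x faster, 36% fewer errors"
    print(f"{speedup:.1f}x faster, {reduction:.0%} fewer errors")
```

Under those assumptions, one percentage point of accuracy corresponds to roughly a third fewer errors, because the error rate drops from 2.8% to 1.8%.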

some results

NOTICE: All the photos used below were collected from the internet. If any of them affects you, please contact me and I will delete it.

oriented text connector

The oriented text connector has been implemented. It is working, but still needs further improvement.
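
For readers unfamiliar with the idea, here is a minimal sketch of one way an oriented connector can work. It assumes the proposals have already been grouped into a single text line, fits straight lines through their top and bottom edges with least squares, and returns a slanted quadrilateral instead of an axis-aligned box. The function names and the box format `(x_min, y_min, x_max, y_max)` are hypothetical, and this is not the connector implemented in this repo.

```python
import math


def fit_line(points):
    """Least-squares fit y = slope * x + intercept through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("need proposals at two or more distinct x positions")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def oriented_text_line(proposals):
    """Return (corners, angle in degrees) for one group of proposals.

    corners are ordered top-left, top-right, bottom-right, bottom-left.
    """
    tops = [((x0 + x1) / 2, y0) for x0, y0, x1, y1 in proposals]
    bottoms = [((x0 + x1) / 2, y1) for x0, y0, x1, y1 in proposals]
    top_slope, top_b = fit_line(tops)
    bottom_slope, bottom_b = fit_line(bottoms)
    left = min(p[0] for p in proposals)
    right = max(p[2] for p in proposals)
    corners = [
        (left, top_slope * left + top_b),
        (right, top_slope * right + top_b),
        (right, bottom_slope * right + bottom_b),
        (left, bottom_slope * left + bottom_b),
    ]
    angle = math.degrees(math.atan((top_slope + bottom_slope) / 2))
    return corners, angle


if __name__ == "__main__":
    # Three 16-pixel-wide proposals climbing along a slope of 0.5.
    boxes = [(0, 14, 16, 34), (16, 22, 32, 42), (32, 30, 48, 50)]
    corners, angle = oriented_text_line(boxes)
    print(corners, round(angle, 1))
```

Fitting the top and bottom edges separately lets the two edges have slightly different slopes, which tolerates perspective distortion better than rotating a fixed-height rectangle.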

Additional source

Interactive live demo of 'Scene Text Detection': Interactive demo reference

End-to-end Chinese OCR with the CTPN model: End to End Chinese OCR, a forked repo
