
Arabic text recognition in multimedia documents, part of the IntelligencIA Project


ooza/Textar


Arabic text detection and recognition in scene and video images

This work is dedicated to Arabic text recognition in multimedia documents. It is implemented in TensorFlow and has two parts: text region localisation and text-line recognition.

Text detection based on CTPN

Arabic text detection in scene and video images, based on CTPN (Connectionist Text Proposal Network) and implemented in TensorFlow. The original paper can be found here, and the original Caffe repo can be found here. For more detail about the paper and code, see this blog. If you have any questions, check the existing issues first; if the problem persists, open a new issue.


NOTICE: Thanks to banjin-xjy, who reimplemented the original code in TensorFlow.


roadmap

  • reconstruct the repo
  • cython nms and bbox utils
  • loss function as referred in paper
  • oriented text connector
  • BLSTM

setup

The NMS (non-maximum suppression) and bbox utilities are written in Cython, so you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

This generates nms.so and bbox.so in the current folder.
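
For readers unfamiliar with what the compiled nms library is for, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is not the repository's Cython code; the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold are assumptions made for illustration only.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    # Visit boxes from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

A compiled version is typically preferred in a detector because this pairwise loop is slow in plain Python when there are many proposals.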


demo

  • follow setup to build the library
  • download the checkpoint (ckpt) file from Google Drive
  • put checkpoints_mlt/ in Textar/
  • put your images in data/demo and run the demo from the repository root; the results will be saved in data/res
python3 ./artext_detection/main/demo.py

OCR

demo

  • crop the input images based on the output detection coordinates
  • save the cropped images' names in input_file.txt
  • run the demo from the repository root
python3 ./ocra/demo.py
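
The first two steps above can be sketched as follows. This is only an illustration under stated assumptions: it assumes each detection is a quadrilateral given as eight numbers (x1, y1, ..., x4, y4) and that input_file.txt holds one image name per line; neither format is confirmed by this README. The helper names `quad_to_crop_box` and `write_input_list` are hypothetical.

```python
from pathlib import Path


def quad_to_crop_box(coords, width, height):
    """Return the axis-aligned (left, top, right, bottom) box enclosing a quad.

    coords is assumed to be [x1, y1, x2, y2, x3, y3, x4, y4]; the box is
    clamped to the image bounds given by width and height.
    """
    xs = coords[0::2]
    ys = coords[1::2]
    left = max(0, min(xs))
    top = max(0, min(ys))
    right = min(width, max(xs))
    bottom = min(height, max(ys))
    return (left, top, right, bottom)


def write_input_list(names, list_path="input_file.txt"):
    """Write one cropped-image name per line (assumed list format)."""
    Path(list_path).write_text("\n".join(names) + "\n", encoding="utf-8")


if __name__ == "__main__":
    box = quad_to_crop_box([5, 3, 50, 4, 52, 20, 4, 22], width=40, height=100)
    print(box)
    write_input_list(["crop_0.png", "crop_1.png"])
```

The resulting tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the actual cropping could be done with that library if it is installed.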

NOTICE: training for this part is a work in progress.


End-to-end fashion

To run the code end to end, from the repository root:

python3 ./run.py
