
LevinJ/CNN_LSTM_CTC_Tensorflow

 
 


CNN_LSTM_CTC_Tensorflow

Images are first processed by a CNN to extract features; the extracted features are then fed into an LSTM for character recognition.
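As a rough illustration of how CNN features can be handed to an LSTM, the sketch below reshapes a feature map so that the image width becomes the time-step axis. It uses NumPy rather than TensorFlow for brevity; the function name `feature_map_to_sequence` and the example shapes are assumptions made for this sketch and are not taken from the repository's code.

```python
import numpy as np

def feature_map_to_sequence(features):
    """Turn a CNN feature map into a sequence for an LSTM.

    features: array of shape (batch, height, width, channels).
    Returns an array of shape (batch, width, height * channels),
    so that each time step corresponds to one column of the feature map.
    """
    batch, height, width, channels = features.shape
    # Move width in front of height so that it becomes the time axis.
    by_width = np.transpose(features, (0, 2, 1, 3))
    # Flatten height and channels into one feature vector per time step.
    return by_width.reshape(batch, width, height * channels)

# Hypothetical shapes: 2 images, feature map 4 high, 16 wide, 8 channels.
features = np.zeros((2, 4, 16, 8), dtype=np.float32)
sequence = feature_map_to_sequence(features)
print(sequence.shape)  # (2, 16, 32)
```

Reading the feature map column by column matches the left-to-right order of the characters in a text line, which is why width is the natural sequence axis for this kind of OCR.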

CNN+LSTM+CTC based OCR (Optical Character Recognition) implemented using TensorFlow.

I trained a model on 80k images using this code and got 99.98% accuracy on the test dataset (20k images). The images in both datasets:

Overview

This project is based on the great work from here

The following improvements have been made:

  1. correct the time step direction
    Previously the time step direction was the channel direction, which is incorrect. It has now been corrected to the width direction. See here for more discussion on this issue.
  2. optimize training scripts
    Previously all training images were loaded into memory; now a simple image generator is used to generate training batches.
  3. metrics implementation
    Implement the character and word accuracy in TensorFlow.
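Items 2 and 3 above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's implementation: the names `batch_generator`, `word_accuracy` and `char_accuracy` are hypothetical, and the character accuracy here is a simple position-by-position comparison, whereas the TensorFlow version in this project may define it differently (for example via edit distance).

```python
import random

def batch_generator(samples, batch_size, shuffle=True):
    """Yield lists of (image_path, label) pairs, one batch at a time.

    Only the paths and labels are kept in memory; the caller loads the
    image files for the current batch. Loops forever, reshuffling the
    samples at the start of every epoch when shuffle is True.
    """
    samples = list(samples)
    while True:
        if shuffle:
            random.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

def word_accuracy(predictions, targets):
    """Fraction of predicted strings that match their target exactly."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

def char_accuracy(predictions, targets):
    """Fraction of target characters matched at the same position."""
    correct = sum(
        sum(pc == tc for pc, tc in zip(p, t))
        for p, t in zip(predictions, targets)
    )
    total = sum(len(t) for t in targets)
    return correct / total

# Hypothetical file names and labels, for illustration only.
samples = [("img_%d.png" % i, "label%d" % i) for i in range(5)]
gen = batch_generator(samples, batch_size=2, shuffle=False)
first_batch = next(gen)  # the first two (path, label) pairs

print(word_accuracy(["abc", "xyz"], ["abc", "xyy"]))  # 0.5
```

Because the generator only yields paths and labels, memory use is bounded by the batch size rather than by the size of the dataset.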

Dataset

Please see this issue about the dataset. The label file (a .txt file) is in the same folder as the images after extracting the .tar.gz file.

Prerequisites

  1. TensorFlow 1.4

  2. NumPy

Train the model

python ./train_model.py

Inference

python ./eval_model.py
