An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- The paper proposes a model that combines a Convolutional Neural Network with a Recurrent Neural Network.
- Deep Convolutional Neural Networks (DCNNs) operate on inputs and outputs of fixed dimensions and cannot process sequences of arbitrary length.
- Recurrent Neural Networks improve on DCNNs in that they can handle sequences, but they require a processing step that turns the image into a sequence of image features.
- The model is end-to-end trainable, requires no preprocessing of the sequences, and performs well on both lexicon-free and lexicon-based text recognition.
- The model is trained on synthetic data but gives promising results when tested on real images. As a follow-up experiment, it would be interesting to train the model on a real dataset and compare the results with the current ones.
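
The notes above mention that a recurrent network needs a sequence of image features as input. A minimal sketch of one way to produce such a sequence is given below: it reads a convolutional feature map column by column, from left to right, so that a wider image yields a longer sequence. The function names `feature_map_to_sequence` and `make_dummy_map`, the `[channel][row][column]` layout, and the dummy values are assumptions made for illustration, not the paper's code.

```python
from typing import List

# Assumed layout for this sketch: feature_map[channel][row][column]
FeatureMap = List[List[List[float]]]


def feature_map_to_sequence(feature_map: FeatureMap) -> List[List[float]]:
    """Turn a feature map into a left-to-right sequence of column vectors."""
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    sequence = []
    for x in range(width):
        # Concatenate column x of every channel into one feature vector.
        frame = [feature_map[c][y][x]
                 for c in range(channels)
                 for y in range(height)]
        sequence.append(frame)
    return sequence


def make_dummy_map(channels: int, height: int, width: int) -> FeatureMap:
    """Build a hypothetical feature map whose values encode their position."""
    return [[[float(c * 100 + y * 10 + x) for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


# Two "images" of different width give sequences of different length,
# while each frame keeps the same size (channels * height).
short_seq = feature_map_to_sequence(make_dummy_map(2, 3, 4))
long_seq = feature_map_to_sequence(make_dummy_map(2, 3, 7))
print(len(short_seq), len(long_seq))  # 4 7
print(len(short_seq[0]))              # 6
```

Because the sequence length follows the feature map width, the recurrent layers can consume inputs of varying width without any per-character segmentation.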