Task for offline handwriting #18

Closed
TomJasuroae opened this issue Jun 14, 2015 · 2 comments

Comments

@TomJasuroae

@tmbdev I would like to know whether clstm is suitable for offline English handwriting recognition. I am new to this area and have a few questions for discussion.

  1. Feature extraction. Deep learning models such as CNNs can extract feature maps from raw images. What feature extraction method does clstm use? Can I replace that input stage with a deep network? For example, by sliding a window from left to right over a handwritten text line to obtain a sequence of features.
  2. The clstm examples seem to use only a 1-layer LSTM. Are there other examples that show more complex network structures, such as more layers?
  3. What is your opinion on how to approach the offline English handwriting problem? Thanks.

It seems that the clstm network maps characters to labels. Would it therefore be better to apply some normalization to the handwritten words, so that they segment more cleanly and are easier to recognize?

@tmbdev
Owner

tmbdev commented Jun 14, 2015

(1) There is no feature extraction. There has been a lot of work on using DNNs as preprocessing for LSTMs. For OCR, we haven't observed any improvements from it. For other applications, you have to try it; some people swear by it. There is text line normalization for OCR, and that turns out to be very important.
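To illustrate point (1): with no feature extraction, the input sequence can simply be the pixel columns of the text line image, read left to right. The sliding window from the question is then an optional variant that concatenates a few adjacent columns per time step. The following is a minimal sketch in plain Python, not CLSTM code; the names `columns_as_sequence` and `sliding_windows` and the list-of-rows image layout are assumptions made for the example.

```python
from typing import List

# Assumed layout: image[y][x], a list of rows of pixel intensities.
Image = List[List[float]]


def columns_as_sequence(image: Image) -> List[List[float]]:
    """Turn an H x W line image into a length-W sequence of H-dim vectors."""
    height = len(image)
    width = len(image[0]) if height else 0
    return [[image[y][x] for y in range(height)] for x in range(width)]


def sliding_windows(image: Image, window: int, step: int = 1) -> List[List[float]]:
    """Concatenate `window` adjacent columns into one frame, left to right."""
    cols = columns_as_sequence(image)
    frames = []
    for start in range(0, len(cols) - window + 1, step):
        # Flatten the columns covered by the window into a single vector.
        frames.append([v for col in cols[start:start + window] for v in col])
    return frames


# Tiny 3 x 4 "line image" with a vertical stroke in column 1.
line = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
seq = columns_as_sequence(line)          # one 3-dim vector per column
frames = sliding_windows(line, window=2)  # one 6-dim vector per window position
print(len(seq), len(seq[0]))
print(len(frames), len(frames[0]))
```

With `window=1` the two functions produce the same sequence, so the raw-column input is just the special case of a one-column window.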

(2) The "bidi2" network gives you a two layer network. Look in clstm_prefab.cc to see how to construct more complex networks. You can essentially just write down the network structure as nested calls to "layer(...)". Again, for OCR, deeper layers don't help.
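The idea of writing a network down as nested calls can be modeled with a toy example. This is plain Python and not the CLSTM API: the `layer` helper here is hypothetical, as are the layer kind names and the `nhidden` and `noutput` parameters. It also assumes that a two-layer bidirectional network is two stacked bidirectional blocks followed by a softmax output; check clstm_prefab.cc for the actual definitions.

```python
def layer(kind, *sublayers, **params):
    """Hypothetical helper: describe one layer and its nested sub-layers."""
    return {"kind": kind, "params": params, "sub": list(sublayers)}


def bidirectional(nhidden):
    """A forward LSTM in parallel with an LSTM run over the reversed input."""
    return layer(
        "Parallel",
        layer("LSTM", nhidden=nhidden),
        layer("Reversed", layer("LSTM", nhidden=nhidden)),
    )


def two_layer_bidi(nhidden1, nhidden2, noutput):
    """Two stacked bidirectional blocks followed by an output layer."""
    return layer(
        "Stacked",
        bidirectional(nhidden1),
        bidirectional(nhidden2),
        layer("Softmax", noutput=noutput),
    )


def count(net, kind):
    """Count how many layers of the given kind appear in the nested structure."""
    return int(net["kind"] == kind) + sum(count(s, kind) for s in net["sub"])


def render(net, indent=0):
    """Return the nested structure as indented text lines."""
    lines = [" " * indent + net["kind"] + (" " + str(net["params"]) if net["params"] else "")]
    for sub in net["sub"]:
        lines.extend(render(sub, indent + 2))
    return lines


net = two_layer_bidi(nhidden1=50, nhidden2=50, noutput=80)
print("\n".join(render(net)))
```

Going deeper in this toy model only means adding another `bidirectional(...)` argument to the stacked layer, which is the sense in which the structure can be "written down" directly.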

(3) LSTMs work fine for offline handwriting (that was one of the first applications). Most applications so far have used 2D LSTMs, which aren't implemented in CLSTM yet. Based on limited experience, I would guess that 1D LSTMs may work better than 2D LSTMs with proper preprocessing.
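As a rough illustration of what preprocessing can mean for a 1D LSTM: every column must have the same height, so each line image is brought to a fixed height before it is read column by column. The sketch below crops to the ink bounding box and rescales with nearest-neighbor sampling. It is a deliberately simplistic stand-in, not the text line normalization used for OCR in point (1); the function names, the ink threshold, and the list-of-rows image layout are all assumptions, and the image is assumed to be non-empty.

```python
def crop_to_ink(image, threshold=0.5):
    """Crop to the bounding box of pixels above `threshold` (assumed to be ink)."""
    rows = [y for y, row in enumerate(image) if any(v > threshold for v in row)]
    cols = [x for x in range(len(image[0])) if any(row[x] > threshold for row in image)]
    if not rows or not cols:
        return image  # no ink found: leave the image unchanged
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]


def resize_nearest(image, target_height):
    """Rescale to `target_height`, keeping the aspect ratio, by nearest neighbor."""
    h, w = len(image), len(image[0])
    scale = target_height / h
    target_width = max(1, round(w * scale))
    return [
        [image[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(target_width)]
        for y in range(target_height)
    ]


def normalize_line(image, target_height):
    """Crop to the ink, then bring the line to a fixed height."""
    return resize_nearest(crop_to_ink(image), target_height)


# 4 x 5 image whose ink occupies rows 1-2 and columns 1-3.
raw = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
normalized = normalize_line(raw, target_height=4)
print(len(normalized), len(normalized[0]))
```

A real handwriting pipeline would likely also need to handle slant, skew, and varying baselines, none of which this sketch attempts.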

@TomJasuroae
Author

Thanks, @tmbdev. I have already finished some experiments with more complex networks and with normalization. The results are indeed as you said: a simple network with a well-normalized dataset performs better. Would you be willing to share your experience with text line normalization, for example methods or papers? Thanks.
