Training kraken and RTL support? #36

Closed
wrznr opened this issue May 2, 2017 · 17 comments

@wrznr

wrznr commented May 2, 2017

@amitdo commented here on the specific RTL support in kraken. Since I have been unsuccessfully training OCR models for Hebrew with ocropy, I wonder if kraken could do the job. Can anyone introduce me to the details of kraken's RTL support? I could not find the related information in the documentation. Many thanks in advance!

@amitdo
Contributor

amitdo commented May 2, 2017

Since @mittagessen is using kraken for Arabic*, I believe it supports RTL out of the box, so OCR and training for an RTL script like Hebrew are basically the same as for LTR scripts.

* https://arxiv.org/abs/1703.09550

@wrznr
Author

wrznr commented May 2, 2017

This would indeed be fantastic and the published results are promising (@amitdo thanks for the link). I will have a look at ketos training and report back here.

@amitdo
Contributor

amitdo commented May 2, 2017

A list of Hebrew fonts from the Open Siddur Project
http://opensiddur.org/tools/fonts/

@wrznr
Author

wrznr commented May 2, 2017

Thanks again, @amitdo! Regarding the training process: has anyone stumbled upon the following error?

kmw@lal:~/projects/dta/ocropus/iliad$ ketos train -o DFK -R 2500 -F 2500 -N 20000 train/*/*.png
Building ground truth set ⣾

Initializing model ⣽
Traceback (most recent call last):
...
...
  File "/usr/local/lib/python2.7/dist-packages/kraken/ketos.py", line 133, in train
    rnn = models.ClstmSeqRecognizer.init_model(lineheight, hiddensize, gt_set.training_alphabet.keys())
AttributeError: 'GroundTruthContainer' object has no attribute 'training_alphabet'

wrznr changed the title from "RTL support?" to "Training kraken and RTL support?" on May 2, 2017
@mittagessen
Owner

mittagessen commented May 2, 2017

There is BiDi (and subsequently RTL) support for training and recognition; the label sequence always gets trained/recognized left-to-right and is then reordered using the (reverse) Unicode BiDi algorithm. There's also top-to-bottom support since a few days ago.

The training subcommand is only partially working because the CLSTM Python bindings are incomplete and won't allow creating a model from scratch. On the other hand, it is possible to train a BiDi model with the clstmocrtrain tool by reordering the transcription to display order and then just using the resulting model with kraken. To make this a little bit easier, I just added a reordering switch to the linegen subcommand. ketos train, by contrast, assumes everything is in correct order and will automatically convert to display order internally.

PS: display order is the code point ordering where the n-th code point is the n-th leftmost grapheme. For pure RTL texts it is just the reverse of the correct order, for mixed texts there are interleaved RTL and LTR sections.
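
As a rough illustration of the display-order definition above, here is a minimal Python sketch that covers only the pure-RTL case. It is neither kraken's implementation nor the full Unicode BiDi algorithm. The function name `to_display_order` is hypothetical, and it deliberately rejects anything other than strong RTL letters and whitespace (digits, Latin text, combining marks), because those cases need the real algorithm.

```python
import unicodedata

# Bidirectional classes accepted by this sketch (an assumption made to keep
# the example small): strong right-to-left letters and plain whitespace.
RTL_CLASSES = {"R", "AL"}
NEUTRAL_CLASSES = {"WS"}


def to_display_order(text):
    """Return `text` in display order, for pure RTL input only.

    For text made up solely of strong RTL letters and spaces, display order
    is simply the reverse of the stored (logical) order. Anything else would
    need the full Unicode BiDi algorithm, so it is rejected here.
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls not in RTL_CLASSES and cls not in NEUTRAL_CLASSES:
            raise ValueError(
                "character %r has bidi class %r; this sketch handles pure RTL text only"
                % (ch, cls)
            )
    return text[::-1]


# Hebrew "shalom" in logical order: shin, lamed, vav, final mem.
logical = "\u05e9\u05dc\u05d5\u05dd"
display = to_display_order(logical)
# In display order the final mem comes first, because it is the leftmost glyph.
assert display == "\u05dd\u05d5\u05dc\u05e9"
```

Mixed RTL/LTR lines produce the interleaved sections mentioned above and are out of scope for this sketch.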

@amitdo
Contributor

amitdo commented May 2, 2017

My answer was somewhat misleading. Sorry about that.

@mittagessen
Owner

mittagessen commented May 2, 2017

Recognition is exactly the same, training will be eventually. If you just create the model file using the clstm tools directly and train solely with ketos train it already is. Training through kraken directly is also quite a bit faster (~2x) because the whole training set is kept (preprocessed) in memory. The majority of processing time during training is actually spent running the line image normalization over and over again when using clstmocrtrain.
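
The speed argument can be illustrated with a toy Python sketch. Nothing here is kraken or CLSTM code: `normalize`, `train_uncached`, and `train_cached` are hypothetical stand-ins, and the "line images" are plain lists of pixel values. The sketch only counts how often the expensive normalization step runs under each strategy.

```python
import random


def normalize(line):
    """Stand-in for an expensive line image normalization (hypothetical)."""
    normalize.calls += 1
    return [value / 255.0 for value in line]


normalize.calls = 0  # counts how often normalization actually runs


def update_model(sample):
    """Placeholder for one training step; does nothing in this sketch."""


def train_uncached(lines, steps, rng):
    # Normalization is repeated for every sample that is drawn.
    for _ in range(steps):
        update_model(normalize(rng.choice(lines)))


def train_cached(lines, steps, rng):
    # Normalize every line once up front and keep the results in memory.
    cache = [normalize(line) for line in lines]
    for _ in range(steps):
        update_model(rng.choice(cache))


lines = [[0, 128, 255], [255, 255, 0]]

normalize.calls = 0
train_uncached(lines, 1000, random.Random(0))
uncached_calls = normalize.calls  # one normalization per training step

normalize.calls = 0
train_cached(lines, 1000, random.Random(0))
cached_calls = normalize.calls  # one normalization per training line
```

With the cached variant the normalization cost scales with the size of the training set rather than with the number of training steps, at the price of holding the preprocessed lines in memory.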

@mittagessen
Owner

@wrznr: There was a bug that should be fixed now. Apparently, it is a good idea to not just write code but even commit and push it sometimes.

@amitdo
Contributor

amitdo commented May 2, 2017

What's the reason for using clstm for training instead of doing it in Python like ocropy does?
Speed?

@mittagessen
Owner

mittagessen commented May 3, 2017

The main reasons are speed, flexibility, and potentially GPU acceleration. Recognition and training are around twice as fast, models are smaller and serialize faster, and CLSTM throws away peephole connections, which are just unneeded parameter ballast.

Real flexibility in net architectures would of course require a generic machine learning framework like tensorflow but those have a bunch of drawbacks like exorbitant model compile times and unusable serializations.

@wrznr
Author

wrznr commented May 4, 2017

Let's see if I got that correctly: 1. I initialize a model file using CLSTM. 2. I run ketos train with -l model to start the training with kraken?

@mittagessen
Owner

Yes, that is basically it. The only part that isn't possible through the python bindings is initializing the codec.

@amitdo
Contributor

amitdo commented May 4, 2017

"potentially GPU acceleration"

tmbdev/clstm#91
tmbdev/clstm#88

@wrznr
Author

wrznr commented May 4, 2017

@mittagessen Is it necessary that I use exactly the same files (resp. the same "alphabet") for steps 1 (i.e. initialization) and 2 (i.e. ketos training)? The term "codec" somehow suggests it.

@mittagessen
Owner

mittagessen commented May 4, 2017

Yes, that is the whole point.
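
Since the codec fixed at initialization has to match the characters seen during training, it can help to compare the two alphabets before starting. The following is a minimal sketch using only the Python standard library. It does not use kraken or CLSTM APIs: the helper names are hypothetical, and reading transcriptions from ocropy-style `.gt.txt` files is an assumption about how the ground truth is stored.

```python
from pathlib import Path


def read_transcriptions(paths):
    """Read one transcription per file (assumed: UTF-8 text, one line each)."""
    return [Path(p).read_text(encoding="utf-8").rstrip("\n") for p in paths]


def alphabet_of(transcriptions):
    """Collect the set of code points that occur in the transcriptions."""
    chars = set()
    for line in transcriptions:
        chars.update(line)
    return chars


def alphabet_mismatch(init_transcriptions, train_transcriptions):
    """Compare the alphabet used for initialization with the training alphabet.

    Returns two sorted lists: characters that occur in the training data but
    were not present at initialization, and characters present at
    initialization that never occur in the training data.
    """
    init = alphabet_of(init_transcriptions)
    train = alphabet_of(train_transcriptions)
    return sorted(train - init), sorted(init - train)


# Toy example with in-memory strings instead of files.
missing, unused = alphabet_mismatch(["ab", "bc"], ["abd"])
# missing == ["d"]: "d" would have no label in a codec built from "ab" and "bc".
# unused == ["c"]: "c" is in the codec but never appears in the training data.
```

Using exactly the same files for both steps, as confirmed above, makes both lists empty by construction.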

@ghost

ghost commented May 6, 2017

@mittagessen I have read the training documentation for kraken and it is still not clear to me.
Could you create a video of your clstm and pyrnn training process for some Arabic text and post it along with the example files?
We are grateful for your hard work.

@ghost

ghost commented May 8, 2017

@mittagessen Your replies regarding the training process are confusing to me. Please provide a video showing how to train a new model and how to use the kraken training interface.
