Training kraken and RTL support? #36
Since @mittagessen is using kraken for Arabic*, I believe it supports RTL out-of-the-box, so OCR and training for an RTL script like Hebrew are basically the same as for LTR scripts.
This would indeed be fantastic, and the published results are promising (@amitdo thanks for the link). I will have a look at
A list of Hebrew fonts from the Open Siddur Project
Thanks again @amitdo! Regarding the training process: did anyone stumble upon the following error?
There is BiDi (and consequently RTL) support for training and recognition; the label sequence is always trained/recognized left-to-right and then reordered using the (reverse) Unicode BiDi algorithm. There has also been top-to-bottom support as of a few days ago.

The training subcommand is only partially working because the CLSTM Python bindings are incomplete and won't allow creating a model from scratch. On the other hand, it is possible to train a BiDi model with the clstmocrtrain tool by reordering the transcription to display order and then just using the resulting model with kraken. To make this a little bit easier I just added a reordering switch to the linegen subcommand. ketos train, on the other hand, assumes everything is in correct (logical) order and converts to display order automatically.

PS: display order is the code point ordering in which the n-th code point is the n-th leftmost grapheme. For pure RTL texts it is just the reverse of logical order; for mixed texts there are interleaved RTL and LTR sections.
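To make the logical-vs-display-order distinction concrete, here is a minimal standalone sketch (not kraken's actual code path) covering just the pure-RTL case mentioned above, where display order is simply the reverse of logical order. The helper name `to_display_order` is hypothetical; mixed-direction text would need a full Unicode BiDi implementation such as the python-bidi package.

```python
import unicodedata

def to_display_order(line: str) -> str:
    """Reorder a *pure RTL* line from logical (reading) order to display order.

    Hypothetical helper for illustration only: for text containing only
    right-to-left characters (plus neutrals), display order is just the
    reverse of logical order. Mixed RTL/LTR text needs the full Unicode
    BiDi algorithm (e.g. the python-bidi package).
    """
    # Plain reversal is wrong for strongly left-to-right characters,
    # so refuse mixed-direction input instead of silently mangling it.
    if any(unicodedata.bidirectional(ch) == "L" for ch in line):
        raise ValueError("mixed-direction text needs the full BiDi algorithm")
    return line[::-1]

# Hebrew "shalom" in logical order; the result is the reverse of the
# input's code point sequence.
print(to_display_order("שלום"))
```

A transcription preprocessed this way could then be fed to a tool that assumes left-to-right label sequences, which is the trick described above for clstmocrtrain.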
My answer was somewhat misleading. Sorry about that.
Recognition is exactly the same; training will be eventually. If you just create the model file using the clstm tools directly and train solely with
@wrznr: There was a bug that should be fixed now. Apparently, it is a good idea to not just write code but even commit and push it sometimes.
What's the reason for using clstm for training instead of doing it in Python like ocropy does?
The main reasons are speed, flexibility, and potentially GPU acceleration. Recognition and training are around twice as fast, models are smaller and serialize faster, and CLSTM throws away the peephole connections, which are just unneeded parameter ballast. Real flexibility in net architectures would of course require a generic machine learning framework like TensorFlow, but those have a bunch of drawbacks, such as exorbitant model compile times and unusable serializations.
Let's see if I got that correctly: 1. I initialize a model file using CLSTM. 2. I run
Yes, that is basically it. The only part that isn't possible through the Python bindings is initializing the codec.
@mittagessen Is it necessary that I use exactly the same files (i.e. the same alphabet) for steps 1 (initialization) and 2 (ketos training)? The term "codec" somehow suggests it.
Yes, that is the whole point.
@mittagessen I have seen the training documentation for kraken and it's still not clear to me.
@mittagessen your replies regarding the training process are still confusing to me. Could you please provide a video showing how to train a new model and how to use the kraken training interface?
@amitdo commented here on the specific RTL support in kraken. Since I am unsuccessfully training OCR models for Hebrew with ocropy, I wonder if kraken could do the job. Can anyone introduce me to the details of kraken's RTL support? I could not find the related information in the documentation. Many thanks in advance!