High error rate on training with Impact on RTL language (Kur) Persian-Arabic script #151 #157

sam-kurdi · 2020-04-16T16:22:27Z

I am getting a very high character error rate (about 85%) after training with Impact.
I started the training with the following configuration:

tesseract version:

tesseract 4.0.0-beta.1
leptonica-1.75.3
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE

1. The workflow used is https://github.com/tesseract-ocr/tesstrain. The command line used to run the Makefile is: make training MODEL_NAME=krd START_MODEL=ara LANG_TYPE=RTL FINETUNETYPE=Impact
2. I added inherited.unicharset, ara.config, the kur langdata, Arabic.unicharset, and Latin.unicharset from https://github.com/tesseract-ocr/langdata_lstm.
3. I used ara.traineddata from https://github.com/tesseract-ocr/tessdata_best as the start model.
4. (1304 / 2) image lines + ground truth transcriptions

Could you please tell me whether there is any misconfiguration?
How can I improve the accuracy?

Training Log :

pc1@pc:~/Desktop/tesstrain-master$ make training MODEL_NAME=krd START_MODEL=ara LANG_TYPE=RTL FINETUNETYPE=Impact

find data/krd-ground-truth -name '*.gt.txt' | xargs cat | sort | uniq > "data/krd/all-gt"
combine_tessdata -u /home/pc1/Desktop/tesstrain-master/usr/share/tessdata/ara.traineddata data/ara/krd
Extracting tessdata components from /home/pc1/Desktop/tesstrain-master/usr/share/tessdata/ara.traineddata
Wrote data/ara/krd.config
Wrote data/ara/krd.lstm
Wrote data/ara/krd.lstm-punc-dawg
Wrote data/ara/krd.lstm-word-dawg
Wrote data/ara/krd.lstm-number-dawg
Wrote data/ara/krd.lstm-unicharset
Wrote data/ara/krd.lstm-recoder
Wrote data/ara/krd.version
Version string:4.00.00alpha:ara:synth20170629:[1,48,0,1Ct3,3,16Mp3,3Lfys64Lfx96Lrx96Lfx512O1c1]
0:config:size=545, offset=192
17:lstm:size=11582395, offset=737
18:lstm-punc-dawg:size=1986, offset=11583132
19:lstm-word-dawg:size=999442, offset=11585118
20:lstm-number-dawg:size=13250, offset=12584560
21:lstm-unicharset:size=5061, offset=12597810
22:lstm-recoder:size=769, offset=12602871
23:version:size=80, offset=12603640
unicharset_extractor --output_unicharset "data/krd/my.unicharset" --norm_mode 3 "data/krd/all-gt"
Bad box coordinates in boxfile string!
Extracting unicharset from plain text file data/krd/all-gt
Wrote unicharset file data/krd/my.unicharset
merge_unicharsets data/ara/krd.lstm-unicharset data/krd/my.unicharset "data/krd/unicharset"
Loaded unicharset of size 85 from file data/ara/krd.lstm-unicharset
Loaded unicharset of size 73 from file data/krd/my.unicharset
Wrote unicharset file data/krd/unicharset.
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/17.7.png" -t "data/krd-ground-truth/17.7.gt.txt" > "data/krd-ground-truth/17.7.box"

tesseract data/krd-ground-truth/17.7.png data/krd-ground-truth/17.7 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/65.3.png" -t "data/krd-ground-truth/65.3.gt.txt" > "data/krd-ground-truth/65.3.box"
tesseract data/krd-ground-truth/65.3.png data/krd-ground-truth/65.3 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/46.10.png" -t "data/krd-ground-truth/46.10.gt.txt" > "data/krd-ground-truth/46.10.box"
tesseract data/krd-ground-truth/46.10.png data/krd-ground-truth/46.10 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/30.3.png" -t "data/krd-ground-truth/30.3.gt.txt" > "data/krd-ground-truth/30.3.box"
tesseract data/krd-ground-truth/30.3.png data/krd-ground-truth/30.3 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/16.6.png" -t "data/krd-ground-truth/16.6.gt.txt" > "data/krd-ground-truth/16.6.box"
tesseract data/krd-ground-truth/16.6.png data/krd-ground-truth/16.6 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/24.24.png" -t "data/krd-ground-truth/24.24.gt.txt" > "data/krd-ground-truth/24.24.box"
tesseract data/krd-ground-truth/24.24.png data/krd-ground-truth/24.24 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
(REMOVED...... the log)
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/63.10.png" -t "data/krd-ground-truth/63.10.gt.txt" > "data/krd-ground-truth/63.10.box"
tesseract data/krd-ground-truth/63.10.png data/krd-ground-truth/63.10 --psm 6 lstm.train
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
find data/krd-ground-truth -name '*.lstmf' | python3 shuffle.py 0 > "data/krd/all-lstmf"
head -n 582 data/krd/all-lstmf
tail -n 65 data/krd/all-lstmf
combine_lang_model
--input_unicharset data/krd/unicharset
--script_dir data
--numbers data/krd/krd.numbers
--puncs data/krd/krd.punc
--words data/krd/krd.wordlist
--output_dir data
--pass_through_recoder --lang_is_rtl
--lang krd
Loaded unicharset of size 107 from file data/krd/unicharset
Setting unichar properties
Setting script properties
Warning: properties incomplete for index 16 = َ
Warning: properties incomplete for index 20 = ُ
Warning: properties incomplete for index 44 = ٍ
Warning: properties incomplete for index 48 = ّ
Warning: properties incomplete for index 65 = ِ
Warning: properties incomplete for index 66 = ْ
Warning: properties incomplete for index 69 = ً
Warning: properties incomplete for index 71 = ٌ
Warning: properties incomplete for index 87 = ‌
Config file is optional, continuing...
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
Reducing Trie to SquishedDawg
lstmtraining
--debug_interval 0
--traineddata data/krd/krd.traineddata
--old_traineddata /home/pc1/Desktop/tesstrain-master/usr/share/tessdata/ara.traineddata
--continue_from data/ara/krd.lstm
--model_output data/krd/checkpoints/krd
--train_listfile data/krd/list.train
--eval_listfile data/krd/list.eval
--max_iterations 20000
Loaded file data/ara/krd.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Code range changed from 85 to 107!
Num (Extended) outputs,weights in Series:
1,48,0,1:1, 0
Num (Extended) outputs,weights in Series:
C3,3:9, 0
Ft16:16, 160
Total weights = 160
[C3,3Ft16]:16, 160
Mp3,3:16, 0
Lfys64:64, 20736
Lfx96:96, 61824
Lrx96:96, 74112
Lfx512:512, 1247232
Fc107:107, 54891
Total weights = 1458955
Previous null char=2 mapped to 2
Continuing from data/ara/krd.lstm
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/38.3.lstmf
(LOG REMOVED....)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/101.2.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/57.8.lstmf

2 Percent improvement time=100, best error was 100 @ 0
At iteration 100/100/100, Mean rms=6.343%, delta=52.311%, char train=85.617%, word train=98.507%, skip ratio=0%, New best char error = 85.617 wrote checkpoint.

Loaded 1/1 pages (1-1) of document data/krd-ground-truth/28.4.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/39.1.lstmf
(LOG REMOVED....)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/45.12.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/85.3.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/12.6.lstmf

At iteration 200/200/200, Mean rms=6.631%, delta=58.281%, char train=92.803%, word train=99.254%, skip ratio=0%, New worst char error = 92.803 wrote checkpoint.

Loaded 1/1 pages (1-1) of document data/krd-ground-truth/72.3.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/38.4.lstmf
(LOG REMOVED...)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/92.5.lstmf

At iteration 300/300/300, Mean rms=6.674%, delta=60.127%, char train=95.198%, word train=99.502%, skip ratio=0%, New worst char error = 95.198 wrote checkpoint.

Loaded 1/1 pages (1-1) of document data/krd-ground-truth/89.1.lstmf
(LOG REMOVED...)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/25.3.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/88.2.lstmf

At iteration 400/400/400, Mean rms=6.706%, delta=61.546%, char train=96.399%, word train=99.627%, skip ratio=0%, New worst char error = 96.399 wrote checkpoint.

Loaded 1/1 pages (1-1) of document data/krd-ground-truth/48.8.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/29.11.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/43.5.lstmf
(LOG REMOVED...)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/46.4.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/28.13.lstmf
At iteration 500/500/500, Mean rms=6.73%, delta=62.598%, char train=97.117%, word train=99.701%, skip ratio=0%, New worst char error = 97.117 wrote checkpoint.

Loaded 1/1 pages (1-1) of document data/krd-ground-truth/26.3.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/87.5.lstmf
(LOG REMOVED)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/94.1.lstmf

At iteration 600/600/600, Mean rms=6.746%, delta=63.418%, char train=97.598%, word train=99.751%, skip ratio=0%, New worst char error = 97.598 wrote checkpoint.

At iteration 700/700/700, Mean rms=6.757%, delta=63.932%, char train=97.941%, word train=99.787%, skip ratio=0%, New worst char error = 97.941 wrote checkpoint.

At iteration 800/800/800, Mean rms=6.758%, delta=64.207%, char train=98.198%, word train=99.813%, skip ratio=0%, New worst char error = 98.198 wrote checkpoint.

At iteration 900/900/900, Mean rms=6.753%, delta=64.287%, char train=98.398%, word train=99.834%, skip ratio=0%, New worst char error = 98.398 wrote checkpoint.

At iteration 1000/1000/1000, Mean rms=6.748%, delta=64.338%, char train=98.559%, word train=99.851%, skip ratio=0%, New worst char error = 98.559 wrote checkpoint.

At iteration 1100/1100/1100, Mean rms=6.78%, delta=65.498%, char train=99.997%, word train=100%, skip ratio=0%, New worst char error = 99.997 wrote checkpoint.

Warning: LSTMTrainer deserialized an LSTMRecognizer!
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/85.4.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/97.7.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/35.8.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/42.8.lstmf
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/77.1.lstmf
At iteration 1200/1200/1200, Mean rms=6.772%, delta=65.877%, char train=99.998%, word train=100%, skip ratio=0%, New worst char error = 99.998 wrote checkpoint.

Loaded 1/1 pages (1-1) of document data/krd-ground-truth/24.15.lstmf
(LOG REMOVED...)
Loaded 1/1 pages (1-1) of document data/krd-ground-truth/53.1.lstmf

Warning: LSTMTrainer deserialized an LSTMRecognizer!
At iteration 1300/1300/1300, Mean rms=6.764%, delta=65.956%, char train=99.999%, word train=100%, skip ratio=0%, New worst char error = 99.999At iteration 1100, stage 0, Eval Char error rate=100, Word error rate=100 wrote checkpoint.

At iteration 1400/1400/1400, Mean rms=6.753%, delta=65.951%, char train=99.999%, word train=100%, skip ratio=0%, wrote checkpoint.

(LOG REMOVED....)

At iteration 19817/19900/19900, Mean rms=5.893%, delta=41.14%, char train=93.046%, word train=99.661%, skip ratio=0%, wrote checkpoint.

At iteration 19917/20000/20000, Mean rms=5.891%, delta=41.096%, char train=92.917%, word train=99.605%, skip ratio=0%, wrote checkpoint.

Finished! Error rate = 85.617
lstmtraining
--stop_training
--continue_from data/krd/checkpoints/krd_checkpoint
--traineddata data/krd/krd.traineddata
--model_output data/krd.traineddata
Loaded file data/krd/checkpoints/krd_checkpoint, unpacking...
pc1@pc:~/Desktop/tesstrain-master$
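
To see the trend in a long log like the one above at a glance, the per-checkpoint lines can be parsed with a short script. This is only a sketch: the helper name `char_errors` is made up for illustration, and the regular expression assumes the `At iteration N/N/N, ... char train=X%` layout shown in this log.

```python
import re

# Matches lines such as:
# "At iteration 100/100/100, Mean rms=6.343%, ..., char train=85.617%, ..."
LINE_RE = re.compile(r"At iteration (\d+)/\d+/\d+,.*?char train=([\d.]+)%")


def char_errors(log_text):
    """Return (iteration, char_train_error) pairs found in an lstmtraining log."""
    return [(int(m.group(1)), float(m.group(2))) for m in LINE_RE.finditer(log_text)]


if __name__ == "__main__":
    # Two lines copied from the log above.
    sample = (
        "At iteration 100/100/100, Mean rms=6.343%, delta=52.311%, char train=85.617%, "
        "word train=98.507%, skip ratio=0%, New best char error = 85.617 wrote checkpoint.\n"
        "At iteration 200/200/200, Mean rms=6.631%, delta=58.281%, char train=92.803%, "
        "word train=99.254%, skip ratio=0%, New worst char error = 92.803 wrote checkpoint.\n"
    )
    points = char_errors(sample)
    print(points)
    print("lowest char train error:", min(points, key=lambda p: p[1]))
```

In the log above, the lowest value is the one at iteration 100 (85.617), which is also the figure printed on the `Finished! Error rate` line; the character error only gets worse after that point.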

Shreeshrii · 2020-04-16T16:45:45Z

Start from script/Arabic rather than ara.

Currently your unicharset is increasing from 85 to over 100. This is not suitable for fine-tuning.
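
One way to see which characters make the unicharset grow is to compare the characters in the ground truth with the entries of the start model's unicharset. The sketch below is illustrative only: `unicharset_codepoints` and `chars_not_in_unicharset` are hypothetical helpers, and the code assumes the plain-text unicharset layout in which the first line holds the entry count and every following line begins with the unichar followed by a space. The toy data in the example is invented and its property columns are shortened.

```python
def unicharset_codepoints(unicharset_text):
    """Collect every code point that occurs in any unichar entry."""
    seen = set()
    for entry in unicharset_text.split("\n")[1:]:  # line 1 is the entry count
        if not entry.strip():
            continue
        unichar = entry.split(" ", 1)[0]
        if unichar == "NULL":  # assumption: placeholder entry for the space character
            seen.add(" ")
            continue
        seen.update(unichar)
    return seen


def chars_not_in_unicharset(ground_truth_text, unicharset_text):
    """Return the sorted ground-truth characters that the unicharset does not cover."""
    known = unicharset_codepoints(unicharset_text)
    return sorted({ch for ch in ground_truth_text
                   if not ch.isspace() and ch not in known})


if __name__ == "__main__":
    # Toy base set: space, ALEF (U+0627) and BEH (U+0628).
    base = "3\nNULL 0 Common 0\n\u0627 1 Arabic 1\n\u0628 1 Arabic 2\n"
    # Ground truth that also contains PEH (U+067E).
    missing = chars_not_in_unicharset("\u0628\u0627\u067e", base)
    print([hex(ord(ch)) for ch in missing])
```

Run against data/ara/krd.lstm-unicharset and data/krd/all-gt from the log above, a check like this would list the characters behind the jump from 85 to 107 entries.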

sam-kurdi · 2020-04-16T16:50:09Z

> Start from script/Arabic rather than ara.
> Currently your unicharset is increasing from 85 to over 100. This is not suitable for fine-tuning.

Thank you for your support.

Is the link below the correct script model for LSTM training?
https://github.com/tesseract-ocr/tessdata_best/raw/master/script/Arabic.traineddata

I am using PSM 6.
Which PSM do you recommend?

How can I solve this issue (Normalization failed for string)?

Shreeshrii · 2020-04-16T16:58:31Z

Yes, tessdata_best/script/Arabic is preferable.

For single lines, I suggest using --psm 13.

Please make sure that correct RTL processing is happening in reversal of text for box files.

Shreeshrii · 2020-04-16T16:59:37Z

See #137

sam-kurdi · 2020-04-16T17:09:05Z

> Yes, tessdata_best/script/Arabic is preferable.
>
> For single lines, I suggest using --psm 13.
>
> Please make sure that correct RTL processing is happening in reversal of text for box files.

Thank you, will do that.

I changed generate_wordstr_box.py as follows:

    # create WordStr line boxes for Indic & RTL
    for line in lines:
        line = unicodedata.normalize('NFC', line.strip())
        if args.rtl:
            # FIXME: This should not be necessary. Compare with e.g. kraken
            line = line.translate(str.maketrans("()[]{}»«><", ")(][}{«»<>"))
        if line:
            print("WordStr 0 0 %d %d 0 #%s" % (width, height, line))
            print("\t 0 0 %d %d 0" % (width, height))

Is this the correct modification?

Shreeshrii · 2020-04-17T02:10:24Z

@sam-kurdi
The text in the box file needs to be reversed using the bidi algorithm.
Regarding the reversed punctuation marks, ( ) [ ] etc, please check whether it is needed or not.

I will upload a new training for ckb that I have done and you can check whether results are as expected on real life images. It gives over 95% accuracy with lstmeval on single line images similar to those used for training.

sam-kurdi · 2020-04-17T09:01:11Z

@Shreeshrii
The above modification reverses only punctuation, parentheses, etc., and I received the same error with it.
You are right, the text in the box file needs to be reversed.
Thank you,
please let me know as soon as you have uploaded it.

Shreeshrii · 2020-04-17T17:28:23Z

Please see new PR https://github.com/tesseract-ocr/tesstrain/pull/159/commits

stweil · 2020-04-17T17:56:33Z

My first experience with training Arabic handwriting is documented here. The training is still running. I used the old generate_wordstr_box.py with an added line = bidi.algorithm.get_display(line).
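
For readers who want to try the same idea, here is a minimal sketch of producing the two WordStr box lines with an optional reordering step. It is not the actual generate_wordstr_box.py: the function `wordstr_box_lines` and its `reorder` parameter are made up for illustration, and the image width and height are assumed to be known already.

```python
import unicodedata


def wordstr_box_lines(text, width, height, reorder=None):
    """Return the WordStr box lines for one ground-truth text line.

    `reorder` is an optional callable applied to the normalized line,
    for example get_display from the python-bidi package (assumed to be
    installed separately):

        from bidi.algorithm import get_display
        wordstr_box_lines(gt_text, width, height, reorder=get_display)
    """
    line = unicodedata.normalize("NFC", text.strip())
    if not line:
        return []  # nothing to emit for an empty ground-truth line
    if reorder is not None:
        line = reorder(line)
    return [
        "WordStr 0 0 %d %d 0 #%s" % (width, height, line),
        "\t 0 0 %d %d 0" % (width, height),
    ]


if __name__ == "__main__":
    # Without a reorder function the text is written in logical order.
    for box_line in wordstr_box_lines("  abc  ", 100, 20):
        print(box_line)
```

Keeping the reordering as a parameter makes it easy to compare box files generated with and without the bidi step on the same ground truth.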

Shreeshrii · 2020-04-18T13:05:59Z

After one epoch, the CER is at about 46 %. With sufficient training (200 epochs, about 32 hours), the CER falls below 5 %.

@stweil How is the EPOCH defined? Are you using a custom version of Makefile?

stweil · 2020-04-18T13:40:30Z

1 epoch = 1 iteration over all training data. It is commonly used for training of neural networks, but up to now not for Tesseract training.

Yes, this is currently a local custom version of Makefile which calculates MAX_ITERATIONS from EPOCHS:

@@ -49,8 +51,15 @@ TESSDATA_REPO = _best
 # Ground truth directory. Default: $(GROUND_TRUTH_DIR)
 GROUND_TRUTH_DIR := $(OUTPUT_DIR)-ground-truth

+# Epochs. Default: $(EPOCHS)
+EPOCHS :=
+
 # Max iterations. Default: $(MAX_ITERATIONS)
+ifeq ($(EPOCHS),)
 MAX_ITERATIONS := 10000
+else
+MAX_ITERATIONS := $(shell echo $$(($(EPOCHS) * $$(wc -l < $(OUTPUT_DIR)/list.train))))
+endif

 # Debug Interval. Default:  $(DEBUG_INTERVAL)
 DEBUG_INTERVAL := 0
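
The arithmetic behind the Makefile change can be written out as a small sketch. The function name `max_iterations` is hypothetical, and the code assumes list.train holds one .lstmf path per line, so counting non-empty lines gives the same number as `wc -l` on a normally terminated file.

```python
def max_iterations(epochs, train_list_text):
    """Return epochs multiplied by the number of entries in list.train."""
    n_lines = sum(1 for line in train_list_text.split("\n") if line.strip())
    return epochs * n_lines


if __name__ == "__main__":
    # Toy list with three training lines (paths are placeholders).
    train_list = "a.lstmf\nb.lstmf\nc.lstmf\n"
    print(max_iterations(200, train_list))
```

Read the other way round, the first run in this thread used 582 training lines (`head -n 582`) and 20000 iterations, which is roughly 34 passes over the training data.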

stweil · 2020-04-18T13:48:47Z

I updated https://github.com/tesseract-ocr/tesstrain/wiki/Arabic-Handwriting#training to explain what epochs means in the context of that training.

Shreeshrii · 2020-04-18T14:22:19Z

@stweil Thanks. Calculating MAX_ITERATIONS from EPOCHS is a good addition.

Since you are testing for RTL, it will be interesting to see tesseract results for https://github.com/OpenITI/OCR_GS_Data - maybe you can do a run for those too. I had tried a test earlier, but I changed too many things for it to be a valid comparison to their results.

Shreeshrii · 2020-04-18T14:54:28Z

> I used the old generate_wordstr_box.py with an added line = bidi.algorithm.get_display(line).

@stweil Please check that your custom Makefile is using generate_wordstr_box.py. The Makefile currently in tesstrain master is using generate_line_box.py for RTL.

PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/krd-ground-truth/17.7.png" -t "data/krd-ground-truth/17.7.gt.txt" > "data/krd-ground-truth/17.7.box"

stweil · 2020-04-18T15:08:50Z

I had called generate_wordstr_box.py manually before running make.

sam-kurdi · 2020-04-18T16:05:09Z

> Please see new PR https://github.com/tesseract-ocr/tesstrain/pull/159/commits

Thank you very much, that helps a lot.

I have tested with the modified Makefile version and it works fine; the final error rate with my own data set is 2.33.

@Shreeshrii @stweil
The main problem in the recognized images is that the zero-width non-joiner (ZWNJ) and Western Arabic numerals are not handled properly. Eastern Arabic numerals, however, are handled properly.
Do you have any suggestions for solving this issue with the new model?

Is it mandatory that the image lines and the corresponding ground truth use the same font?
Could you also describe how you generated the provided dataset?

Shreeshrii · 2020-04-18T16:37:19Z

> Is it mandatory that the image lines and the corresponding ground truth use the same font?

Ground truth should be in Unicode text format and can be rendered in any Unicode font. So the font does not really matter for the ground truth, as long as it is not a legacy non-Unicode font.

The test dataset was extracted from synthetic training data generated using Unicode text and fonts. I think rtltest.tgz has images in Unikurd-Jino font.

Shreeshrii · 2020-04-18T16:38:47Z

Is ZWNJ being used in certain character combinations?

sam-kurdi · 2020-04-18T16:39:49Z

> Is ZWNJ being used in certain character combinations?

Yes, it is used in many gt files.
Also, Western Arabic numerals are not handled properly, whereas Eastern Arabic numerals are.

Shreeshrii · 2020-04-18T16:43:21Z

> the final error rate with my own data set is 2.33

Your training data has a very limited number of lines. Try with more training data and include more samples of the characters that are in error.
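
To find out which characters are under-represented, the ground-truth lines can be scanned with a simple counter. This is a minimal sketch; `char_counts` is a hypothetical helper, and the choice of characters to check (ZWNJ, U+200C, and the digits 0-9) is taken from the problems reported in this thread.

```python
from collections import Counter


def char_counts(gt_lines, chars_of_interest):
    """Count how often each character of interest occurs in the ground truth."""
    counts = Counter()
    for line in gt_lines:
        counts.update(ch for ch in line if ch in chars_of_interest)
    # Report zero explicitly for characters that never occur.
    return {ch: counts[ch] for ch in chars_of_interest}


if __name__ == "__main__":
    # Toy ground truth; real input would be the contents of the *.gt.txt files.
    lines = ["a\u200cb 12", "3"]
    for ch, n in char_counts(lines, "\u200c0123456789").items():
        print("U+%04X" % ord(ch), n)
```

Characters with a count of zero or only a handful of occurrences are candidates for additional training lines.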

sam-kurdi · 2020-04-18T16:48:25Z

> the final error rate with my own data set is 2.33
>
> Your training data has a very limited number of lines. Try with more training data and include more samples of the characters that are in error.

I will prepare more training data. What about ZWNJ and WAN?

sam-kurdi · 2020-04-19T15:57:12Z

@Shreeshrii
error rate = 4.007 after training with rtltest-ground-truth.

Shreeshrii · 2020-04-19T16:25:00Z

> I have tested with the modified Makefile version and it works fine; the final error rate with my own data set is 2.33.

> error rate = 4.007 after training with rtltest-ground-truth.

Yes, the error rate will depend on the number of iterations as well as the number of lines of training data.

How many lines of text are there in your training set?

sam-kurdi · 2020-04-19T16:30:55Z

> I have tested with the modified Makefile version and it works fine; the final error rate with my own data set is 2.33.
>
> error rate = 4.007 after training with rtltest-ground-truth.
>
> Yes, the error rate will depend on the number of iterations as well as the number of lines of training data.
>
> How many lines of text are there in your training set?

550 image lines.

sam-kurdi · 2020-04-20T19:18:34Z

@Shreeshrii @stweil @theraysmith
Do you have any suggestions for training/fine-tuning the Tesseract LSTM OCR for new fonts with the Makefile, using the tesstrain improvements for RTL?

Shreeshrii · 2020-04-21T03:42:30Z

> Western Arabic numerals are not handled properly. Eastern Arabic numerals, however, are handled properly.

Clarify what you mean by WAN - is it 0-9 or Farsi numbers?

EAN I assume is numbers in Arabic script?

sam-kurdi · 2020-04-21T13:13:10Z

> Western Arabic numerals are not handled properly. Eastern Arabic numerals, however, are handled properly.
>
> Clarify what you mean by WAN - is it 0-9 or Farsi numbers?
>
> EAN I assume is numbers in Arabic script?

| Western Arabic Numerals (WAN) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Eastern Arabic Numerals (EAN) | ٠ | ١ | ٢ | ٣ | ٤ | ٥ | ٦ | ٧ | ٨ | ٩ | ١٠ |
| Persian numeral system | ۰ | ۱ | ۲ | ۳ | ۴ | ۵ | ۶ | ۷ | ۸ | ۹ | ۱۰ |
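
The three numeral systems occupy separate Unicode ranges: 0-9 at U+0030 to U+0039, Eastern Arabic digits at U+0660 to U+0669, and Persian digits at U+06F0 to U+06F9. The sketch below shows how they could be told apart or mapped onto each other when checking ground truth or OCR output. The helper names `numeral_system` and `to_western` are made up for illustration.

```python
import unicodedata


def numeral_system(ch):
    """Classify a single character as one of the three digit systems, or None."""
    if "0" <= ch <= "9":
        return "western"
    if "\u0660" <= ch <= "\u0669":
        return "eastern_arabic"
    if "\u06f0" <= ch <= "\u06f9":
        return "persian"
    return None


def to_western(text):
    """Replace Eastern Arabic and Persian digits by their 0-9 equivalents."""
    return "".join(
        str(unicodedata.digit(ch)) if numeral_system(ch) else ch
        for ch in text
    )


if __name__ == "__main__":
    eastern_ten = "\u0661\u0660"   # Eastern Arabic digits one, zero
    persian_ten = "\u06f1\u06f0"   # Persian digits one, zero
    print(to_western(eastern_ten), to_western(persian_ten))
    print(numeral_system("5"), numeral_system("\u0665"), numeral_system("\u06f5"))
```

Because the digits are distinct code points, a model only learns the systems that actually appear in its training lines, so each system that should be recognized needs its own samples in the ground truth.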

stale · 2020-05-21T13:49:59Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

sam-kurdi changed the title ~~High error rate on training RTL language (Kur) Persian-Arabic script #151~~ High error rate on training with Impact on RTL language (Kur) Persian-Arabic script #151 Apr 16, 2020

This was referenced Apr 16, 2020

high error rate on training RTL Persian-Arabic script #151

Closed

Word started with a combiner:0x200c , Normalization failed for string #158

Closed

stale bot added the stale Issues which require input by the reporter which is not provided label May 21, 2020

stale bot closed this as completed May 28, 2020

Shreeshrii mentioned this issue Dec 25, 2020

From stweil's custom makefile - calculate MAX_ITERATIONS from EPOCHS #223

Closed
