Early stopping frequency not honored #146

Closed
mikegerber opened this issue Dec 9, 2019 · 2 comments

@mikegerber
Contributor

I am using Calamari 1.0.1:

calamari-cross-fold-train \
  --files \
  "$TMPDIR/*/*/*.png" \
  --best_models_dir "$outdir" \
  --early_stopping_frequency=0.25 \
  --early_stopping_nbest=5 \
  --batch_size=128 \
  --n_folds=5 \
  --max_parallel_models=1 \
  --display=0.01 \
  2>&1 | tee "$outdir/train.$(date -Iminutes).log"

I would expect early stopping to be checked every 0.25 epochs; however, it is currently checked roughly every 0.05 epochs. This was not the case with Calamari 0.3.5.

Log excerpt:

Early stopping: 100%|██████████| 490/490 [03:01<00:00,  2.69it/s]
FOLD 0 | Found better model with accuracy of 90.495146%
FOLD 0 | Storing checkpoint to '/home/mike.gerber/devel/qurator-mono-repo/calamari-models/train-calamari-gt4histocr/data/calamari-models/GT4HistOCR/0.ckpt'
FOLD 0 | #0.320388: loss=16.61115993 ler=0.43991868 dt=3.12265495s
FOLD 0 |   PRED: '‪Zweck das Konfirmationsbiüchlein wieder einmal zur‬'
FOLD 0 |   TRUE: '‪Zweck das Konfirmationsbüchlein wieder einmal zur‬'
FOLD 0 | #0.330097: loss=15.82271417 ler=0.42927558 dt=3.11410474s
FOLD 0 |   PRED: '‪Die lUfer faſen den Jubel der Ströme nicht,‬'
FOLD 0 |   TRUE: '‪Die Ufer faſſen den Jubel der Stroͤme nicht,‬'
FOLD 0 | #0.339806: loss=15.27738711 ler=0.41900858 dt=3.09278003s
FOLD 0 |   PRED: '‪voruͤber, Studenten in Schnͤrrbͤcken und ſilber—‬'
FOLD 0 |   TRUE: '‪voruͤber, Studenten in Schnuͤrroͤcken und ſilber—‬'
FOLD 0 | #0.349515: loss=14.70154876 ler=0.40963159 dt=3.09945018s
FOLD 0 |   PRED: '‪nur l bis 1 Perſonen intereſſnren, grade dieſen‬'
FOLD 0 |   TRUE: '‪nur 50 bis 100 Perſonen intereſſiren, grade dieſen‬'
FOLD 0 | #0.359223: loss=14.19648499 ler=0.40086900 dt=3.10238453s
FOLD 0 |   PRED: '‪immer verging. Wenn Jhr es hören wollt, ſo will ich‬'
FOLD 0 |   TRUE: '‪immer verging. Wenn Jhr es hören wollt, ſo will ich‬'
FOLD 0 | #0.368932: loss=13.71040510 ler=0.39230464 dt=1.28325271s
FOLD 0 |   PRED: '‪hend die noch mehr fͤr mich gerhan, vielleicht‬'
FOLD 0 |   TRUE: '‪hend) die noch mehr fuͤr mich gethan, vielleicht‬'
FOLD 0 | Storing checkpoint to '/tmp/calamaribnc4tmc8/fold_0/model_00000732.ckpt'
FOLD 0 | Checking early stopping model
Early stopping: 100%|██████████| 490/490 [03:03<00:00,  2.67it/s]
FOLD 0 | Found better model with accuracy of 92.196010%
FOLD 0 | Storing checkpoint to '/home/mike.gerber/devel/qurator-mono-repo/calamari-models/train-calamari-gt4histocr/data/calamari-models/GT4HistOCR/0.ckpt'
FOLD 0 | #0.378641: loss=13.24382450 ler=0.38367690 dt=3.12489480s
FOLD 0 |   PRED: '‪uns einer Wirklichkeit bewußt werden, in ihrem eigenen‬'
FOLD 0 |   TRUE: '‪uns einer Wirklichkeit bewußt werden, in ihrem eigenen‬'
FOLD 0 | #0.388350: loss=12.86139683 ler=0.37567462 dt=3.13790299s
FOLD 0 |   PRED: '‪endlich vorüber, und er mußte wieder nah‬'
FOLD 0 |   TRUE: '‪endlich vorüber, und er mußte wieder nach‬'
FOLD 0 | #0.398058: loss=12.63035151 ler=0.36839195 dt=3.12761085s
FOLD 0 |   PRED: '‪der Schud des Böſen iſt das Bewußtſein, mit der poſitien‬'
FOLD 0 |   TRUE: '‪der Schuld des Böſen iſt das Bewußtſein, mit der poſitiven‬'
FOLD 0 | #0.407767: loss=12.44438200 ler=0.36114099 dt=3.11889884s
FOLD 0 |   PRED: '‪on dannen kamen wir ym die kyrch zů den heylgen engeln genant‬'
FOLD 0 |   TRUE: '‪¶ Von dannen kamen wir yn die kyrch zů den heyligen engeln genant ·‬'
FOLD 0 | #0.417476: loss=12.18084080 ler=0.35440814 dt=3.10386337s
FOLD 0 |   PRED: '‪Sie ſchadet ſonſt der Geſundheit.‬'
FOLD 0 |   TRUE: '‪Sie ſchadet ſonſt der Geſundheit.‬'
FOLD 0 | #0.427184: loss=11.85094854 ler=0.34774322 dt=1.25803710s
FOLD 0 |   PRED: '‪tiſſin iſt meiner, eines grauſamen Todes ge⸗‬'
FOLD 0 |   TRUE: '‪tiſſin iſt meiner, eines grauſamen Todes ge—‬'
FOLD 0 | Storing checkpoint to '/tmp/calamaribnc4tmc8/fold_0/model_00000854.ckpt'
FOLD 0 | Checking early stopping model
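
The interval between the two checks can be read off the excerpt itself. The following is a minimal sketch, assuming that the number in the temporary checkpoint file name is the global training step (the log does not state this explicitly). The helper name `checkpoint_steps` is made up for illustration and is not part of Calamari.

```python
import re

# The two temporary-checkpoint lines copied from the log excerpt above.
log_lines = [
    "FOLD 0 | Storing checkpoint to '/tmp/calamaribnc4tmc8/fold_0/model_00000732.ckpt'",
    "FOLD 0 | Storing checkpoint to '/tmp/calamaribnc4tmc8/fold_0/model_00000854.ckpt'",
]


def checkpoint_steps(lines):
    """Return the number embedded in each 'model_XXXXXXXX.ckpt' file name.

    Assumption: this number is the global training step.
    """
    pattern = re.compile(r"model_(\d+)\.ckpt")
    steps = []
    for line in lines:
        match = pattern.search(line)
        if match:
            steps.append(int(match.group(1)))
    return steps


steps = checkpoint_steps(log_lines)
step_interval = steps[1] - steps[0]

# Epoch counters of the last progress lines printed before each check.
epoch_interval = 0.427184 - 0.368932

print(step_interval)             # 122
print(round(epoch_interval, 3))  # 0.058
```

Under that assumption, consecutive checks are 122 steps apart, or about 0.058 epochs, instead of the requested 0.25 epochs.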
@mikegerber
Contributor Author

I suspect the problem is here: https://github.com/Calamari-OCR/calamari/blob/master/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py#L320. The callback parameter steps_per_epoch is calculated from the validation set, and I think it should simply be the same value as the one two lines above. But please check for yourself; I might not understand the code correctly.
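
To illustrate why this would shorten the interval, here is a minimal sketch. It is not Calamari's actual code: `early_stopping_interval` is a hypothetical helper, and it assumes that a fractional frequency is converted to steps by multiplying with `steps_per_epoch` and truncating. The value 490 is taken from the "Early stopping" progress bar in the log and is assumed to be the number of validation steps; 2100 is only a rough estimate of the training steps per epoch, inferred from the epoch counters in the log.

```python
def early_stopping_interval(frequency, steps_per_epoch):
    """Hypothetical helper: number of training steps between two checks.

    Assumes `frequency` is a fraction of an epoch.
    """
    return max(1, int(frequency * steps_per_epoch))


FREQUENCY = 0.25              # --early_stopping_frequency from the command
VAL_STEPS_PER_EPOCH = 490     # assumption: taken from the "490/490" progress bar
TRAIN_STEPS_PER_EPOCH = 2100  # assumption: rough estimate inferred from the log

# Suspected bug: the interval is derived from the validation set size.
observed = early_stopping_interval(FREQUENCY, VAL_STEPS_PER_EPOCH)

# Expected behaviour: the interval is derived from the training set size.
expected = early_stopping_interval(FREQUENCY, TRAIN_STEPS_PER_EPOCH)

print(observed)  # 122
print(expected)  # 525
print(round(observed / TRAIN_STEPS_PER_EPOCH, 3))  # 0.058
```

With these assumed numbers, the validation-based interval of 122 steps matches the gap between the temporary checkpoints model_00000732.ckpt and model_00000854.ckpt in the log, and corresponds to about 0.058 training epochs rather than 0.25.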

@ChWick
Member

ChWick commented Dec 9, 2019

You are absolutely right. I already fixed this 5 days ago but forgot to push it (4837402).

@ChWick ChWick closed this as completed Dec 9, 2019
mikegerber added a commit to qurator-spk/train-calamari-gt4histocr that referenced this issue Nov 25, 2020