How to achieve the published results using the pre-trained model and binarize.py? #17

Closed
Ladyliu47 opened this issue Apr 16, 2021 · 3 comments

Comments

@Ladyliu47

Thank you for the great work in document image binarization~

Is "model_weights_dibco_6_256x256_s96_aug_m205_f64_k5_s2_se3_e200_b32_esp.h5" corresponds to the model trained on the dibco series database except h-dibco2016?

When I run binarize.py on the H-DIBCO 2016 dataset (http://vc.ee.duth.gr/h-dibco2016/benchmark/), the evaluation results are not as good as those reported in your publication. The average FM over the ten images is 86.39965. How can I reach the FM of 91.65 reported for the SAE method?

After loading the pre-trained model, do I need any other steps to reproduce the published results?

By the way, the following are the evaluation results on H-DIBCO 2016 for the other pre-trained models:
DIBCO2016_Dataset.csv
What puzzles me is that some of the evaluation results are better than the published ones and some are worse, even though I did not do any additional training; I only specified the pre-trained model in binarize.py. For the evaluation, I use os.popen() to call the command line and run BinEvalWeights.exe and DIBCO_metrics.exe. It is a very simple process. Did I overlook any step when using the pre-trained model?
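Roughly, the evaluation loop looks like the sketch below (the weight-file naming, the argument order of the DIBCO tools, and the output format being parsed are assumptions for illustration, not taken from the tools' documentation):

```python
import os
import re

GT_DIR = "gt"            # ground-truth images (assumed layout)
OUT_DIR = "binarized"    # output images written by binarize.py (assumed layout)

fmeasures = []
for name in sorted(os.listdir(GT_DIR)):
    gt = os.path.join(GT_DIR, name)
    result = os.path.join(OUT_DIR, name)
    base = os.path.splitext(gt)[0]
    # BinEvalWeights.exe produces the recall/precision weight files from the GT
    os.popen(f"BinEvalWeights.exe {gt}").read()
    # Argument order assumed: GT, binarized result, recall weights, precision weights
    out = os.popen(f"DIBCO_metrics.exe {gt} {result} {base}_RWeights.dat {base}_PWeights.dat").read()
    # Pull the F-measure out of the tool's text output (output format assumed)
    m = re.search(r"F-?[Mm]easure\s*[:=]?\s*([\d.]+)", out)
    if m:
        fmeasures.append(float(m.group(1)))

print("average FM over", len(fmeasures), "images:", sum(fmeasures) / len(fmeasures))
```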

Thank you again and I am looking forward to your help sincerely~

@ajgallego
Owner

ajgallego commented May 1, 2021

Dear @Ladyliu47, thank you for your interest in this project.

Yes, the model you are referring to is the one trained on the DIBCO series datasets except H-DIBCO 2016.

That difference in the results could have several causes. What threshold are you using? Try different thresholds to see whether the result improves. Are you sure the test partition is the same? Some DIBCO folders included both handwritten and printed images. Also, the evaluation tool I used was the one from H-DIBCO 2013.
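For example, a threshold sweep over the network's output could be scripted roughly as follows (a minimal sketch; the file name and the assumption that the output is saved as a grey-level probability map, with values near 0 meaning ink, are illustrative and not taken from binarize.py):

```python
import numpy as np
from PIL import Image

# Per-pixel output of the model for one test image, assumed to be saved as a
# grey-level map where values close to 0 mean foreground (ink)
probs = np.asarray(Image.open("prob_map.png").convert("L"), dtype=np.float32) / 255.0

for t in np.arange(0.1, 1.0, 0.1):
    # Pixels below the threshold become ink (0), the rest background (255);
    # flip the comparison if the map encodes the opposite polarity
    binary = np.where(probs < t, 0, 255).astype(np.uint8)
    Image.fromarray(binary).save(f"binarized_t{t:.1f}.png")
```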

If you still cannot improve the result, I would have to review the model. The process to use it is as simple as you say.

Kind regards

@Ladyliu47
Author

Dear @ajgallego, thanks for your response and suggestions.

The threshold I used for the evaluation was 0.5. Following your suggestion, the evaluation results for different thresholds are shown below. Although the results differ, none of them reaches the F-measure of 91.65.

The test partition is the 10 original images of H-DIBCO 2016 (http://vc.ee.duth.gr/h-dibco2016/benchmark/DIBCO2016_dataset-original.zip). I did not find any printed images for DIBCO 2016; it seems that the 2016 dataset contains only handwritten images. Did you use printed images from DIBCO 2016 during testing? If so, could you please tell me where to download them? Thank you very much.

The tool I used for the evaluation was the one from H-DIBCO 2016 (http://vc.ee.duth.gr/h-dibco2016/benchmark/DIBCO_metrics.zip). Due to MATLAB version differences, I cannot run the 2013 evaluation tool. By the way, I tested the 2013 sample images with the 2016 evaluation tool and obtained the same results as reported for 2013, so I think there is no difference between the evaluation tools from different years.

Last but not least, have you ever evaluated the images from DIBCO 2017 (https://vc.ee.duth.gr/dibco2017/benchmark/)? What were your results?

Thank you again for your help and I am looking forward to your reply.

| Threshold | 0 | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| F-measure | 86.87473 | 86.6786 | 86.66247 | 86.6474 | 86.63688 | 86.62508 | 86.61825 | 86.61082 | 86.59814 | 86.58317 |
| Fps | 90.39805 | 90.87379 | 90.89219 | 90.90235 | 90.91124 | 90.91526 | 90.92552 | 90.93636 | 90.9465 | 90.96493 |
| PSNR | 17.65381 | 17.63626 | 17.63381 | 17.63129 | 17.62909 | 17.62643 | 17.62559 | 17.62415 | 17.62146 | 17.61902 |
| DRD | 4.88334 | 4.86222 | 4.86404 | 4.86617 | 4.86759 | 4.86992 | 4.8703 | 4.87095 | 4.87247 | 4.87385 |
| Recall | 86.2434 | 84.87905 | 84.80306 | 84.74897 | 84.70734 | 84.66843 | 84.6339 | 84.59439 | 84.54435 | 84.4734 |
| Precision | 88.39292 | 89.42374 | 89.47259 | 89.50066 | 89.52428 | 89.54259 | 89.56579 | 89.59255 | 89.6205 | 89.66643 |
| Rps | 94.62979 | 94.15385 | 94.12175 | 94.10299 | 94.08874 | 94.07118 | 94.0588 | 94.04371 | 94.02554 | 94.00041 |
| Pps | 86.91282 | 88.13689 | 88.19566 | 88.22944 | 88.25749 | 88.27917 | 88.30724 | 88.33895 | 88.37199 | 88.42613 |

The attached file contains the results for each test image at the different thresholds:
DIBCO2016_threshold.csv
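If it helps, the per-threshold averages in the table above can be recomputed from this file with a few lines of pandas (a sketch that assumes a column named "threshold" and one column per metric; adjust it to the actual header of DIBCO2016_threshold.csv):

```python
import pandas as pd

df = pd.read_csv("DIBCO2016_threshold.csv")
# Mean of each metric over the ten test images, grouped by threshold
print(df.groupby("threshold").mean(numeric_only=True))
```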

@ajgallego
Owner

Dear @Ladyliu47, sorry for the delay. I did not receive notification of this comment.

I don't know why you don't get the same results. It may be due to some difference in the data used, or perhaps in the published weights file. During experimentation I ran hundreds of tests with different parameters, so there may be an error in the published weights file and it may not be the one that gave me the best result for that dataset.

I have not tested on DIBCO 2017. I have not continued working on this project, as I have focused on other research tasks; sorry.
