Question about Differentiable OCR-Guided Finetuning in the paper #2

shallweiwei · 2024-05-17T09:59:41Z

This paper is really excellent, but I have a small question.
This is the first time I have seen CTC loss used to make a model recover characters more accurately. However, the paper only touches on this briefly in Figure 6 and Table 4. Does Table 4 report only CER because this loss might hurt the model's PSNR?
Also, because the model's input is a 128×128 patch rather than the whole image, I think some words or letters may be cut off at patch borders, which could cause misrecognition and affect training.

Giordano-Cicchetti · 2024-05-20T15:05:11Z

Our work is also based on https://arxiv.org/abs/2105.07983.
Experimental results showed that the CTC loss improves the character recognition performance of a commercial OCR system.
We noticed that this loss function introduces small artifacts in the images for a few samples. This is reflected slightly in some pixel-level metrics (mainly PSNR). However, we believe this loss drives the model to produce images that are better for OCR rather than visually excellent ones, which (slightly) improves recognition accuracy.
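
For readers unfamiliar with CTC, here is a minimal pure-Python sketch of the quantity a CTC loss minimizes: the negative log-probability that a sequence of per-timestep character distributions collapses to the target text. This is not the authors' implementation. The function names, the input layout (a list of per-timestep log-probability lists), and the choice of index 0 as the blank are all assumptions made for illustration. In actual finetuning one would use a differentiable framework version (for example `torch.nn.CTCLoss`) on the output of a text recognizer.

```python
import math


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """CTC negative log-likelihood via the forward algorithm.

    log_probs: list of T lists, each holding C log-probabilities (hypothetical layout).
    target:    list of label indices, without blanks.
    """
    # Extended target: a blank before, between, and after every label.
    ext = [blank]
    for c in target:
        ext += [c, blank]
    S, T = len(ext), len(log_probs)

    # alpha[s] = log-probability of all partial alignments ending in ext[s].
    alpha = [-math.inf] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, T):
        new = [-math.inf] * S
        for s in range(S):
            cands = [alpha[s]]                  # stay on the same symbol
            if s >= 1:
                cands.append(alpha[s - 1])      # advance by one symbol
            # Skip over a blank only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])
            new[s] = logsumexp(*cands) + log_probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank.
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total


# Two timesteps, two classes (0 = blank, 1 = a single character).
uniform = [[math.log(0.5), math.log(0.5)] for _ in range(2)]
confident = [[math.log(0.1), math.log(0.9)] for _ in range(2)]
loss_uniform = ctc_neg_log_likelihood(uniform, [1])
loss_confident = ctc_neg_log_likelihood(confident, [1])
```

In this toy case the loss is lower when the per-timestep distributions put more mass on the correct character, which is the signal that pushes a restoration model toward outputs that are easier to recognize, independently of pixel-level fidelity.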

For completeness, here is the full table, which we did not include in the paper.

Thank you for your comment.

Model	PSNR	SSIM	LPIPS	DISTS	CER
NAF-DPM w/o finetuning	34.377	0.9944	0.004668	0.022891	1.55
NAF-DPM w/ finetuning	34.066	0.9956	0.003934	0.013801	1.40
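
As a reference for the CER column, here is a minimal sketch of how character error rate is commonly computed: the Levenshtein edit distance between the OCR output and the ground-truth text, divided by the ground-truth length. The thread does not state the exact definition or whether the values are percentages, so the percentage scaling and the function names below are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute (free if characters match)
            ))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent (assumed convention)."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


example = cer("kitten", "sitting")  # 3 edits over 6 reference characters
```

Because CER is normalized by the reference length, it can exceed 100 when the OCR output contains many spurious characters.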

shallweiwei · 2024-05-21T03:15:58Z

Thank you very much for your answer!

shallweiwei changed the title ~~Questions about related to Differentiable OCR-Guided Finetuning in the paper~~ Question about Differentiable OCR-Guided Finetuning in the paper May 18, 2024

shallweiwei closed this as completed May 21, 2024
