Question about Differentiable OCR-Guided Finetuning in the paper #2

Closed
shallweiwei opened this issue May 17, 2024 · 2 comments

Comments

@shallweiwei

This paper is really excellent! I have a small question, though.
This is the first time I have seen CTC loss used to help a model recover more accurate characters. However, the paper only touches on it briefly, in Figure 6 and Table 4. Does Table 4 report only CER because this loss may hurt the model's PSNR?
Also, because the model's input is a 128×128 patch rather than the whole image, I think some words or letters may be cut off at patch borders, which could cause misrecognition and affect training.
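
To make the patch concern concrete, here is a minimal sketch that checks whether a word's bounding box would be split by a patch boundary. It assumes a regular, non-overlapping 128×128 grid and hypothetical word boxes given as `(x0, y0, x1, y1)` pixel coordinates; the paper's actual patch sampling may differ (for example, random crops), so treat this only as an illustration of the question.

```python
PATCH = 128  # patch side length mentioned in the question


def is_cut_by_grid(box, patch=PATCH):
    """Return True if a word box spans more than one cell of a regular patch grid.

    box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    crosses_x = x0 // patch != (x1 - 1) // patch
    crosses_y = y0 // patch != (y1 - 1) // patch
    return crosses_x or crosses_y


# Hypothetical word boxes, made up for illustration only.
words = {
    "inside": (10, 20, 90, 40),    # fully inside the first patch
    "split": (100, 20, 180, 40),   # crosses the vertical boundary at x = 128
    "lower": (10, 120, 60, 140),   # crosses the horizontal boundary at y = 128
}
cut = {name: is_cut_by_grid(box) for name, box in words.items()}
```

A check like this could be used to estimate how often training patches contain truncated words, which is the situation the question is worried about.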

@shallweiwei shallweiwei changed the title Questions about related to Differentiable OCR-Guided Finetuning in the paper Question about Differentiable OCR-Guided Finetuning in the paper May 18, 2024
@Giordano-Cicchetti
Collaborator

Our work also builds on https://arxiv.org/abs/2105.07983.
Our experiments showed that the CTC loss improves the character recognition performance of a commercial OCR system.
We noticed that this loss introduces small artifacts in a few samples, which shows up slightly in some pixel-level metrics (mainly PSNR). However, our view is that the loss drives the model to produce images that are better for OCR rather than visually excellent ones, which (slightly) improves recognition accuracy.
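
For readers unfamiliar with the CTC objective discussed above, the following is a minimal pure-Python sketch of the CTC negative log-likelihood for a single sequence, computed with the standard forward algorithm. It is not the training code from this repository: the function name, the toy inputs, and the choice of class 0 as the blank are assumptions for illustration. In practice a framework implementation such as PyTorch's `torch.nn.CTCLoss` would be used so that gradients can flow back to the restoration model.

```python
import math

NEG_INF = float("-inf")


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """CTC loss for one sequence.

    log_probs: list of T rows, each holding C per-class log-probabilities.
    target: list of label indices (no blanks).
    """
    # Extended target with blanks interleaved: b, l1, b, l2, ..., b
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    # alpha[s] = log-probability of all prefixes ending at ext[s] at time t
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]
            if s >= 1:
                terms.append(alpha[s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            total = logsumexp(terms)
            if total != NEG_INF:
                new[s] = total + log_probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank.
    final = logsumexp(alpha[-2:]) if S > 1 else alpha[-1]
    return -final


# Toy example: 2 time steps, 2 classes (0 = blank, 1 = "a"), uniform predictions.
uniform = [[math.log(0.5), math.log(0.5)] for _ in range(2)]
loss = ctc_neg_log_likelihood(uniform, target=[1])
```

In the toy example, three alignments ("a" then blank, blank then "a", and "a" twice) decode to the target, each with probability 0.25, so the loss equals `-log(0.75)`.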

For completeness, here is the full table, which we did not include in the paper.

Thank you for your comment.

| Model | PSNR | SSIM | LPIPS | DISTS | CER |
|---|---|---|---|---|---|
| NAF-DPM w/o finetuning | 34.377 | 0.9944 | 0.004668 | 0.022891 | 1.55 |
| NAF-DPM w/ finetuning | 34.066 | 0.9956 | 0.003934 | 0.013801 | 1.40 |
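
As a reference for the CER column, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein edit distance between the OCR output and the ground-truth text, divided by the length of the ground truth. The thread does not state the exact CER definition or scaling used for the table (for example, whether values are percentages), so the function names and the example strings below are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate as a fraction of the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical OCR output with one substituted character out of eight.
example = cer("document", "docunent")
```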

@shallweiwei
Author

Thank you very much for your answer!
