This paper is really excellent! But I have a small question.
This is the first time I have seen CTC loss used to make a model recover more accurate characters. However, only Figure 6 and Table 4 in the paper touch on this. Does Table 4 report only CER because this loss may hurt the model's PSNR?
Because the model input is a 128×128 patch rather than the whole image, I think some words or letters may be cut off at the patch border, which could cause recognition errors and affect training.
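To make the patch concern concrete, here is a minimal sketch of how one could check which words a crop cuts through. It assumes word-level bounding boxes are available as `(x0, y0, x1, y1)` pixel tuples; the function name `split_boxes_by_patch` and the box format are hypothetical and not taken from the paper or its code.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, x1/y1 exclusive.
Box = Tuple[int, int, int, int]


def split_boxes_by_patch(
    boxes: List[Box], patch_x: int, patch_y: int, patch_size: int = 128
) -> Tuple[List[Box], List[Box]]:
    """Return (fully_inside, cut) word boxes for a square patch.

    Boxes that do not overlap the patch at all are ignored.
    """
    px1, py1 = patch_x + patch_size, patch_y + patch_size
    inside, cut = [], []
    for box in boxes:
        x0, y0, x1, y1 = box
        overlaps = x0 < px1 and x1 > patch_x and y0 < py1 and y1 > patch_y
        if not overlaps:
            continue
        contained = (
            x0 >= patch_x and y0 >= patch_y and x1 <= px1 and y1 <= py1
        )
        (inside if contained else cut).append(box)
    return inside, cut


words = [(10, 10, 60, 30), (100, 50, 160, 70), (200, 200, 250, 220)]
inside, cut = split_boxes_by_patch(words, patch_x=0, patch_y=0)
print("fully inside:", inside)
print("cut by border:", cut)
```

With boxes like these, a training pipeline could in principle skip or down-weight the recognition loss for words in the `cut` list, though whether that helps is an open question here.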
shallweiwei changed the title from "Questions about related to Differentiable OCR-Guided Finetuning in the paper" to "Question about Differentiable OCR-Guided Finetuning in the paper" on May 18, 2024
Our work also builds on https://arxiv.org/abs/2105.07983.
Experimental results showed that the CTC loss improves the performance of a commercial OCR system on the character recognition task.
We noticed that this loss function introduces small artifacts in a few samples, which is reflected slightly in some pixel-level metrics (mainly PSNR). However, our view is that this loss drives the model to produce images that are better for OCR rather than visually excellent, which (slightly) improves recognition accuracy.
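For readers unfamiliar with the recognition metric discussed here, CER is commonly computed as the character-level edit distance between the OCR output and the reference text, divided by the reference length. The sketch below follows that common definition; it is an assumption that the paper's evaluation uses the same normalization, and the function names are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete a reference character
                    curr[j - 1] + 1,     # insert a hypothesis character
                    prev[j - 1] + cost,  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


print(cer("hello", "hallo"))
```

Unlike PSNR, this metric does not look at pixels at all, which is why a restored image can score slightly worse on PSNR and still be easier for an OCR engine to read.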
For completeness, we include the full table that we left out of the paper.