During text recognition training, are images with an aspect ratio greater than 10 or a text length greater than 25 discarded outright? #5017

CharlesWu123 · 2021-12-22T09:12:30Z

https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/doc/doc_ch/FAQ.md#15
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/doc/doc_ch/FAQ.md#210
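Hello. From these two FAQ entries I understand that training images with an aspect ratio greater than 10, as well as those whose text length exceeds 25, are discarded. Where exactly does this happen in the code?
For document-image training, most text-line images have an aspect ratio greater than 10 and a text length above 25. Given that a pretrained model is available, how should these be handled?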

littletomatodonkey · 2021-12-22T11:58:27Z

It is implemented here:

PaddleOCR/ppocr/data/imaug/label_ops.py

Line 142 in 95c670f

if len(text) == 0 or len(text) > self.max_text_len:

You can make the image shape larger, for example [3, 32, 640], and then increase the max_text_length parameter as well.
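For readers who want to see the effect of the two settings side by side, here is a minimal standalone sketch. It is not the PaddleOCR source: `filter_label` and `input_aspect_ratio` are hypothetical helpers, the condition inside `filter_label` simply mirrors the line quoted above, the default of 25 comes from the issue title, and `[3, 32, 320]` is only an assumed baseline shape chosen because it gives the ratio of 10 mentioned in the question.

```python
from typing import Optional, Sequence


def filter_label(text: str, max_text_len: int = 25) -> Optional[str]:
    """Return the label if it is usable, otherwise None (sample is skipped).

    Hypothetical helper: mirrors the quoted condition, nothing more.
    """
    if len(text) == 0 or len(text) > max_text_len:
        return None
    return text


def input_aspect_ratio(image_shape: Sequence[int]) -> float:
    """Width / height of a [C, H, W] input shape (hypothetical helper)."""
    _, height, width = image_shape
    return width / height


# Three example labels: a normal one, an empty one, and a 40-character one.
labels = ["invoice", "", "x" * 40]

# With the limit of 25, the empty and the 40-character labels are rejected.
kept_default = [t for t in labels if filter_label(t) is not None]

# Raising the limit (50 is an arbitrary illustrative value) keeps the long label.
kept_larger = [t for t in labels if filter_label(t, max_text_len=50) is not None]

# Assumed baseline shape versus the shape suggested in the reply.
ratio_baseline = input_aspect_ratio([3, 32, 320])
ratio_suggested = input_aspect_ratio([3, 32, 640])
```

Under these assumptions, doubling the input width doubles the aspect ratio the input can represent, which is why the reply pairs a wider shape with a larger max_text_length.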

CharlesWu123 · 2021-12-22T12:16:10Z

@littletomatodonkey OK, thanks.

surdldz · 2022-01-17T09:52:27Z

Quoting @littletomatodonkey: "You can make the image shape larger, for example [3, 32, 640], and then increase the max_text_length parameter as well."

Will enlarging the shape make inference slower?

paddle-bot-old · 2022-04-19T06:34:19Z

Since you haven't replied for more than 3 months, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.
It is recommended to pull and try the latest code first.

paddle-bot-old bot assigned littletomatodonkey Dec 22, 2021

paddle-bot-old bot closed this as completed Apr 19, 2022
