CRNN on a custom dataset: data-dependent loss overflow (loss: 65504) #610

panxua · 2023-11-15T03:47:27Z

Symptom:
The loss overflows, and the overflow is tied to specific data samples.
Screenshot: (image not preserved)

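Status: Resolved
Cause:

1. When "label length > max_text_len", data processing silently sets the label to empty, with no warning.
2. When "label length + repeat markers > pred_seq_len" (one extra step is needed between each pair of adjacent identical characters), CTCLoss overflows, again with no warning.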

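Details: link
Solution:
Compute the maximum label length over the dataset and set seq_max_len accordingly;
compute the maximum of "label length + repeat markers" and set pred_seq_len accordingly;
then update the width in img_shape for training, evaluation, and prediction so that it matches 4 x pred_seq_len.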
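The statistics described above could be gathered with something like the following. This is only a minimal sketch: the function names (`count_adjacent_repeats`, `min_ctc_steps`, `dataset_length_stats`) are hypothetical helpers, not part of MindOCR, and the factor of 4 between image width and pred_seq_len is taken from the description above rather than checked against a particular model configuration.

```python
from typing import Dict, Iterable


def count_adjacent_repeats(text: str) -> int:
    """Count positions where a character equals the one right before it."""
    return sum(1 for prev, cur in zip(text, text[1:]) if prev == cur)


def min_ctc_steps(text: str) -> int:
    """Smallest number of time steps CTC needs to emit `text`.

    Each adjacent repeated character needs one extra blank step in between.
    """
    return len(text) + count_adjacent_repeats(text)


def dataset_length_stats(labels: Iterable[str], width_per_step: int = 4) -> Dict[str, int]:
    """Summarise the lengths needed to cover every label in the dataset.

    `width_per_step` is an assumption: image width divided by pred_seq_len.
    """
    labels = list(labels)
    max_label_len = max((len(t) for t in labels), default=0)
    max_ctc_len = max((min_ctc_steps(t) for t in labels), default=0)
    return {
        "max_label_len": max_label_len,        # lower bound for the max text length
        "min_pred_seq_len": max_ctc_len,       # lower bound for pred_seq_len
        "min_img_width": width_per_step * max_ctc_len,
    }


# Toy labels for illustration only; in practice read them from the label file.
labels = ["hello", "book", "aaa"]
stats = dataset_length_stats(labels)
print(stats)
```

Running it over the real label file gives lower bounds for the two length settings and for the image width; any label that exceeds the configured values is a candidate for the silent emptying or the overflow described under Cause.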
Suggestion:
Warn users about this in the training_recognition_custom_dataset tutorial:
https://github.com/mindspore-lab/mindocr/blob/main/docs/en/tutorials/training_recognition_custom_dataset.md
https://github.com/mindspore-lab/mindocr/blob/main/docs/cn/tutorials/training_recognition_custom_dataset.md

zhtmike · 2023-11-15T04:59:46Z

Hello, we provide two additional options to solve the problem you mentioned. For reason 1, you can add filter_max_len: True to your configuration file to filter out these problematic cases. For reason 2, you can add both filter_max_len: True and extra_count_if_repeat: True to filter out the cases it causes. For details, you can check configs/rec/svtr/svtr_tiny.yaml. :)
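To make the effect of such filtering concrete, here is a rough standalone sketch. It is not MindOCR's implementation: `filter_samples` and its `max_len` parameter are hypothetical, and the `extra_count_if_repeat` argument only borrows the option's name to illustrate the idea of counting adjacent repeated characters as extra length.

```python
from typing import List, Tuple

Sample = Tuple[str, str]  # (image path, label text)


def filter_samples(
    samples: List[Sample],
    max_len: int,
    extra_count_if_repeat: bool = False,
) -> List[Sample]:
    """Keep only samples whose (optionally repeat-adjusted) label length fits.

    Assumption: empty labels are dropped too, since they carry no target.
    """
    kept = []
    for path, text in samples:
        length = len(text)
        if extra_count_if_repeat:
            # One extra step for every adjacent pair of identical characters.
            length += sum(1 for a, b in zip(text, text[1:]) if a == b)
        if 0 < length <= max_len:
            kept.append((path, text))
    return kept


# Toy samples for illustration only.
samples = [("a.png", "hello"), ("b.png", "aaaa"), ("c.png", "toolong!"), ("d.png", "")]
plain = filter_samples(samples, max_len=6)
strict = filter_samples(samples, max_len=6, extra_count_if_repeat=True)
```

With repeat counting enabled, a short label such as "aaaa" is dropped even though its raw length fits, which is the reason 2 case.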

panshaowu assigned Bourn3z Feb 5, 2024
