Loss of accuracy in recognizing when exporting a trained model #11551

Open
Markitosik opened this issue Jan 29, 2024 · 10 comments
Labels: inference (this is an inference phase issue), needs investigation (this issue needs investigation to either narrow down or clarify), triaged (this issue has been looked at and triaged)

Comments

@Markitosik

Markitosik commented Jan 29, 2024

System Environment: Windows 11, Python 3.9.9

Version: Paddle 2.7+ PaddleOCR

I trained a recognition model using the configuration file config.yml and obtained excellent results on my dataset. However, after exporting the model for inference, the accuracy significantly decreases.
Command code:
python PaddleOCR/tools/export_model.py -c config.yml -o Global.pretrained_model="output/rec/mark/best_accuracy" Global.save_inference_dir=./mark
Configuration file:
Global:
  use_gpu: true
  epoch_num: 250
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/mark1/
  save_epoch_step: 3

  eval_batch_step: [0, 2000]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./mark
  use_visualdl: TRue
  infer_img: doc/imgs_words_en/word_10.png

  character_dict_path: ../Practice1/ppocr/utils/dict1.txt
  character_type: EN
  max_text_length: 25
  infer_mode: True
  use_space_char: False
  save_res_path: ./output/rec/dict1_0.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.0005
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ../Practice1/dataset/train/images/
    label_file_list: ['../Practice1/dataset/train/train_annotation.txt']
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
    use_shared_memory: False

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ../Practice1/dataset/val/images/
    label_file_list: ['../Practice1/dataset/val/val_annotation.txt']
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 4
    use_shared_memory: False

What am I doing wrong?
@tink2123
Collaborator

What command did you use to predict?
It is recommended that you use this command:

python tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_dict_path="../Practice1/ppocr/utils/dict1.txt"

@Markitosik
Author

An error occurs when executing this command

(venv) PS C:\Studies\Practice> python PaddleOCR/tools/infer/predict_rec.py --image_dir="C:\Users\Mark\Downloads\dataset2\oi_9988.jpg" --rec_model_dir="C:\Studies\Practice\mark2.0" --rec_image_shape="3, 32, 100" --rec_char_dict_path="../Practice1/ppocr/utils/dict1.txt"

[2024/01/30 22:27:02] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
[2024/01/30 22:27:02] ppocr INFO: Traceback (most recent call last):
  File "C:\Studies\Practice\PaddleOCR\tools\infer\predict_rec.py", line 662, in main
    rec_res, _ = text_recognizer(img_list)
  File "C:\Studies\Practice\PaddleOCR\tools\infer\predict_rec.py", line 628, in __call__
    rec_result = self.postprocess_op(preds)
  File "C:\Studies\Practice\PaddleOCR\ppocr\postprocess\rec_postprocess.py", line 121, in __call__
    text = self.decode(preds_idx, preds_prob, is_remove_duplicate=True)
  File "C:\Studies\Practice\PaddleOCR\ppocr\postprocess\rec_postprocess.py", line 83, in decode
    char_list = [
  File "C:\Studies\Practice\PaddleOCR\ppocr\postprocess\rec_postprocess.py", line 84, in <listcomp>
    self.character[text_id]
IndexError: list index out of range

[2024/01/30 22:27:02] ppocr INFO: list index out of range

What could be the reason for this?

@Markitosik
Author

I have achieved ~98% accuracy during training. I want to use my trained model in code, for example, instead of typing the command in the terminal every time.
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR(rec_model_dir='{your_rec_model_dir}', rec_char_dict_path='{your_rec_char_dict_path}')
img_path = 'img_12.jpg'
result = ocr.ocr(img_path)

But when I run this code with the exported model on the same dataset I previously used for validation, I am getting only 5% accuracy.

Why does this happen?

@tink2123
Collaborator

> An error occurs when executing this command [...] IndexError: list index out of range
>
> What could be the reason for this?

Please confirm that the dictionary used in both the training configuration and the export configuration is ../Practice1/ppocr/utils/dict1.txt.
The dictionary must be the same in both.
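
One way to check this is to compare the character lists that each dictionary implies. The sketch below assumes the CTC decoder builds its list as a blank token first, then one entry per dictionary line, then a space when use_space_char is enabled. `build_ctc_charset` and `same_charset` are hypothetical helper names, not PaddleOCR functions.

```python
from pathlib import Path


def build_ctc_charset(dict_path, use_space_char=False):
    """Return the character list a CTC decoder is assumed to use.

    Assumed layout: index 0 is the blank token, followed by one entry per
    dictionary line, plus a trailing space if use_space_char is set.
    """
    chars = Path(dict_path).read_text(encoding="utf-8").splitlines()
    if use_space_char:
        chars.append(" ")
    return ["blank"] + chars


def same_charset(train_dict, infer_dict, train_space=False, infer_space=False):
    """True when training and inference would decode with identical lists."""
    return build_ctc_charset(train_dict, train_space) == build_ctc_charset(
        infer_dict, infer_space
    )
```

If the length of this list at inference time differs from the number of classes the model was trained with, that would point to a mismatched dictionary or use_space_char setting.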

@tink2123
Collaborator

> I have achieved ~98% accuracy during training. [...] I am getting only 5% accuracy.
>
> Why does it happen like this?

The 98% accuracy you mention was measured on the text recognition evaluation set. To get the same behavior at prediction time, you need to run text recognition only, without detection (det=False):

from paddleocr import PaddleOCR

ocr = PaddleOCR()  # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_words/ch/word_1.jpg'
result = ocr.ocr(img_path, det=False)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

Refer to the introduction in the documentation: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/whl_en.md
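
To compare the exported model with the evaluation accuracy, the same validation labels can be scored against recognition-only predictions. This is a minimal sketch, assuming a label file with one `image_path<TAB>text` pair per line and exact string match as the metric. `recognize` is a hypothetical callable that returns the predicted text for one image path, for example a wrapper around `ocr.ocr(path, det=False)`.

```python
def read_labels(label_file):
    """Parse lines of the assumed form 'image_path<TAB>text' into pairs."""
    pairs = []
    with open(label_file, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            path, _, text = line.partition("\t")
            pairs.append((path, text))
    return pairs


def exact_match_accuracy(pairs, recognize):
    """Fraction of images whose predicted text equals the label exactly."""
    if not pairs:
        return 0.0
    correct = sum(1 for path, text in pairs if recognize(path) == text)
    return correct / len(pairs)
```

A large gap between this number and the accuracy reported during training would suggest that preprocessing or decoding differs between the two paths, not that the weights changed.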

@Markitosik
Author

Yes, I use the dictionary at this path (../Practice1/ppocr/utils/dict1.txt) throughout model training.

@tink2123
Collaborator

> Yes, I use a dictionary located at this path (../Practice1/ppocr/utils/dict1.txt) throughout the entire model training

You could print text_id and len(self.character) at line 84 of C:\Studies\Practice\PaddleOCR\ppocr\postprocess\rec_postprocess.py to find the cause of the IndexError.
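
The same check can be run outside the decoder. This is a minimal sketch, assuming `preds_idx` is a list of per-image index sequences (as passed to `decode` in the traceback) and `character` is the decoder's character list. `find_out_of_range_ids` is a hypothetical helper, not part of PaddleOCR.

```python
def find_out_of_range_ids(preds_idx, character):
    """Collect predicted indices that fall outside 0..len(character)-1.

    Returns a sorted list of the offending ids; an empty list means every
    index is within range of the character list.
    """
    limit = len(character)
    bad = {int(i) for seq in preds_idx for i in seq if not 0 <= int(i) < limit}
    return sorted(bad)


if __name__ == "__main__":
    character = ["blank", "a", "b", "c"]      # 4 classes known to the decoder
    preds_idx = [[0, 1, 1, 2], [3, 5, 0, 7]]  # 5 and 7 exceed the list
    print(find_out_of_range_ids(preds_idx, character), len(character))
```

If the offending ids are greater than or equal to len(character), the model was likely exported with more classes than the dictionary passed at inference provides.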

@Markitosik
Author

Markitosik commented Jan 31, 2024

python PaddleOCR/tools/infer/predict_rec.py --image_dir="C:\Users\Mark\Downloads\dataset2\oi_9988.jpg" --rec_model_dir="C:\Studies\Practice\mark2.0" --rec_image_shape="3, 32, 100" --rec_char_dict_path="../Practice1/ppocr/utils/dict1.txt"

I figured out how to run the above command, and it gives the correct answer, but there is still a problem: the same model gives different results on the same file.
When I use the command, I get the correct answer.
But when I use the model in my program, it gives the wrong answer.
Command:

python /tools/infer/predict_rec.py --image_dir="id_1000_value_176_881.jpg" --rec_model_dir="/mark1.4" --rec_image_shape="3, 32, 100" --rec_char_dict_path="/ppocr/utils/dict1.txt"

Code:

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_gpu=True, rec_model_dir='/mark1.4',
                rec_char_dict_path='ppocr/utils/dict1.txt')

result = ocr.ocr('id_1000_value_176_881.jpg', cls=False, det=False)
answer = str(result[0][0][0])
print(answer)

@tink2123
Collaborator

tink2123 commented Feb 2, 2024

All parameters need to be consistent between the command and the code. Try passing rec_image_shape as well:

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_gpu=True, rec_model_dir='/mark1.4', rec_image_shape="3, 32, 100",
                rec_char_dict_path='ppocr/utils/dict1.txt')

result = ocr.ocr('id_1000_value_176_881.jpg', cls=False, det=False)
answer = str(result[0][0][0])
print(answer)
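
One way to keep the command line and the Python call from drifting apart is to define the recognition settings once and derive both from them. This sketch uses only the parameter names that appear in this thread, with values taken from the code above. `REC_PARAMS` and `to_cli_args` are hypothetical names.

```python
REC_PARAMS = {
    "rec_model_dir": "/mark1.4",
    "rec_image_shape": "3, 32, 100",
    "rec_char_dict_path": "ppocr/utils/dict1.txt",
}


def to_cli_args(params):
    """Render the shared settings as --key="value" flags for predict_rec.py."""
    return [f'--{key}="{value}"' for key, value in params.items()]


if __name__ == "__main__":
    # The same dict could be passed to the Python API as PaddleOCR(**REC_PARAMS).
    print("python tools/infer/predict_rec.py", " ".join(to_cli_args(REC_PARAMS)))
```

Any setting that is left out of the shared dict falls back to a default on each side, and those defaults are not guaranteed to match the training configuration.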

@Markitosik
Author

Even when using my output model in this code, nothing changes in the results.

@jzhang533 added the labels triaged, needs investigation, and inference on Apr 10, 2024