No checkpoint found #8

Open
seacloud-0420 opened this issue Jan 2, 2021 · 23 comments

Comments

@seacloud-0420

Hi,
I ran into a problem while testing the model (python ./tools/test_net.py --config-file=configs/arpn/e2e_rrpn_R_50_C4_1x_test_AFPN.yaml):
2020-12-31 13:35:14,374 maskrcnn_benchmark.utils.checkpoint INFO: No checkpoint found. Initializing model from scratch
2020-12-31 13:35:14,374 maskrcnn_benchmark.data.build WARNING: When using more than one image per GPU you may encounter an out-of-memory (OOM) error if your GPU does not have sufficient memory. If this happens, you can reduce SOLVER.IMS_PER_BATCH (for training) or TEST.IMS_PER_BATCH (for inference). For training, you must also adjust the learning rate and schedule length according to the linear scaling rule. See for example: https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14
My GPU is a TITAN RTX with 24 GB of memory. Could you tell me what causes this?

@mjq11302010044
Owner

@heyun1994 Sorry, tools/test_net.py is not for testing RRPN++. Please see Readme.md for further information.

@seacloud-0420
Author

Hi @mjq11302010044

Thank you for your response.
I have a few more problems with your project. When I trained the model (python tools/train_net.py --config-file=configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_train_AFPN_RT_LERB_Spotter.yaml), I got this error:

FileNotFoundError: [Errno 2] No such file or directory: '../hard_space1/models/ICDAR13_15_17_Trial_AFPN_LERB_RT_Spotter_recg_refine_v1/model_0220000.pth'

So I tested the model instead (python ./demo/rrpn_e2e_infer.py --config-file=configs/arpn/e2e_rrpn_R_50_C4_1x_test_AFPN.yaml), and ran into another error:

Traceback (most recent call last):
  File "/home/lhy/RRPN/./demo/rrpn_e2e_infer.py", line 93, in <module>
    alphabet = open(cfg.MODEL.ROI_REC_HEAD.ALPHABET).readlines()[0] + '-'
FileNotFoundError: [Errno 2] No such file or directory: ''
Thank you for your help. If it is convenient, can I add WeChat to communicate? My WeChat is 18860461318
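
Both errors above come from a config value that does not point to an existing file; in the second case the value is an empty string. A minimal pre-flight check could report such values before the script reaches open(). This is only a sketch: the function name check_config_paths and the plain dict standing in for the config object are assumptions for illustration and are not part of the repository.

```python
import os


def check_config_paths(paths):
    """Return a list of readable problems for a {name: path} mapping.

    `paths` is a plain dict used as a stand-in for values read from the
    config (hypothetical; the real project uses its own cfg object).
    """
    problems = []
    for name, path in paths.items():
        if not path:
            # An empty string is what yields "No such file or directory: ''".
            problems.append(f"{name} is empty")
        elif not os.path.isfile(path):
            problems.append(f"{name} points to a missing file: {path}")
    return problems


if __name__ == "__main__":
    # Hypothetical values, chosen only to trigger both kinds of message.
    example = {
        "MODEL.WEIGHT": "./no_such_dir/model.pth",
        "MODEL.ROI_REC_HEAD.ALPHABET": "",
    }
    for problem in check_config_paths(example):
        print(problem)
```

Running such a check right after the config is loaded would make it clear which key needs to be edited, instead of failing later inside open().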

@ShaheenPerveen

@mjq11302010044 @heyun1994
I am also encountering the same issue. Were you able to get the above checkpoint file?

@seacloud-0420
Author

@ShaheenPerveen
I can't get the above checkpoint file '../hard_space1/models/ICDAR13_15_17_Trial_AFPN_LERB_RT_Spotter_recg_refine_v1/model_0220000.pth'.

@ShaheenPerveen

@heyun1994
I am also unable to find the checkpoint below:
FileNotFoundError: [Errno 2] No such file or directory: '../hard_space1/models/ICDAR13_15_17_Trial_AFPN_LERB_RT_Spotter_recg_refine_v1_ft/model_0040000.pth'

@seacloud-0420
Author

@ShaheenPerveen
I'm very confused, too. Did you use $RRPN_ROOT/demo/ICDAR19_eval_script.py or $RRPN_ROOT/demo/rrpn_e2e_infer.py to test your images?

@ShaheenPerveen

@heyun1994 I am trying to use $RRPN_ROOT/demo/rrpn_e2e_infer.py to test my images. What about you?

@seacloud-0420
Author

@ShaheenPerveen
So am I. Which config file are you using?

@ShaheenPerveen

@heyun1994 I am using the "./configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml" config file.

@mjq11302010044
Owner

@ShaheenPerveen @heyun1994 You can download the trained file from my Google Drive link and replace the checkpoint path in the config file. After that it should be good to go.

@ShaheenPerveen

ShaheenPerveen commented Jan 5, 2021

Thank you @mjq11302010044. I was able to download and use the pretrained file, but now I am encountering a new error. Please help.

  File "rrpn_e2e_infer.py", line 99, in <module>
    alphabet = open(cfg.MODEL.ROI_REC_HEAD.ALPHABET).readlines()[0] + '-'
FileNotFoundError: [Errno 2] No such file or directory: './data_cache/alphabet_IC13_IC15_Syn800K_pro.txt'

@seacloud-0420
Author

@mjq11302010044
Thank you for your response.
I downloaded your pretrained file, but it still doesn't work. So I removed the checkpoint path and trained the model, and the training output showed that the characters could not be recognized. I'm very confused; please help me.

lstr: |-----------------------------------| |ODY|
lstr: |-----------------------------------| |SALE|
lstr: |-----------------------------------| |EXIT|
lstr: |-----------------------------------| |UNI|
lstr: |-----------------------------------| |QLO|
2021-01-06 19:04:48,593 maskrcnn_benchmark.trainer INFO: eta: 0:00:08 iter: 99910 loss: 0.7328 (0.8009) loss_classifier: 0.0095 (0.0351) loss_box_reg: 0.0003 (0.0081) loss_rec: 0.5761 (0.6323) loss_objectness: 0.0986 (0.0982) loss_rpn_box_reg: 0.0148 (0.0271) time: 0.0990 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:49,552 maskrcnn_benchmark.trainer INFO: eta: 0:00:07 iter: 99920 loss: 0.7068 (0.8009) loss_classifier: 0.0096 (0.0351) loss_box_reg: 0.0003 (0.0081) loss_rec: 0.5728 (0.6323) loss_objectness: 0.0982 (0.0982) loss_rpn_box_reg: 0.0146 (0.0271) time: 0.0966 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:50,518 maskrcnn_benchmark.trainer INFO: eta: 0:00:06 iter: 99930 loss: 0.7440 (0.8009) loss_classifier: 0.0144 (0.0351) loss_box_reg: 0.0004 (0.0081) loss_rec: 0.5727 (0.6323) loss_objectness: 0.0978 (0.0982) loss_rpn_box_reg: 0.0204 (0.0271) time: 0.0944 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:51,517 maskrcnn_benchmark.trainer INFO: eta: 0:00:05 iter: 99940 loss: 0.7572 (0.8009) loss_classifier: 0.0223 (0.0351) loss_box_reg: 0.0007 (0.0081) loss_rec: 0.5852 (0.6323) loss_objectness: 0.0977 (0.0982) loss_rpn_box_reg: 0.0200 (0.0271) time: 0.0944 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:52,480 maskrcnn_benchmark.trainer INFO: eta: 0:00:04 iter: 99950 loss: 0.7461 (0.8009) loss_classifier: 0.0172 (0.0351) loss_box_reg: 0.0005 (0.0081) loss_rec: 0.6162 (0.6323) loss_objectness: 0.0984 (0.0982) loss_rpn_box_reg: 0.0132 (0.0271) time: 0.0914 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:53,507 maskrcnn_benchmark.trainer INFO: eta: 0:00:03 iter: 99960 loss: 0.7461 (0.8009) loss_classifier: 0.0202 (0.0351) loss_box_reg: 0.0004 (0.0081) loss_rec: 0.5790 (0.6323) loss_objectness: 0.0984 (0.0982) loss_rpn_box_reg: 0.0132 (0.0271) time: 0.0954 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:54,481 maskrcnn_benchmark.trainer INFO: eta: 0:00:02 iter: 99970 loss: 0.7845 (0.8009) loss_classifier: 0.0382 (0.0351) loss_box_reg: 0.0004 (0.0081) loss_rec: 0.5273 (0.6323) loss_objectness: 0.0979 (0.0982) loss_rpn_box_reg: 0.0188 (0.0271) time: 0.0970 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:55,457 maskrcnn_benchmark.trainer INFO: eta: 0:00:01 iter: 99980 loss: 0.7160 (0.8009) loss_classifier: 0.0195 (0.0351) loss_box_reg: 0.0003 (0.0081) loss_rec: 0.5273 (0.6323) loss_objectness: 0.0978 (0.0982) loss_rpn_box_reg: 0.0135 (0.0271) time: 0.0970 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:56,422 maskrcnn_benchmark.trainer INFO: eta: 0:00:00 iter: 99990 loss: 0.7500 (0.8009) loss_classifier: 0.0157 (0.0351) loss_box_reg: 0.0002 (0.0081) loss_rec: 0.5962 (0.6323) loss_objectness: 0.0991 (0.0982) loss_rpn_box_reg: 0.0125 (0.0271) time: 0.0929 (0.0990) data: 0.0021 (0.0048) lr: 0.000002 max mem: 1701
2021-01-06 19:04:57,333 maskrcnn_benchmark.trainer INFO: eta: 0:00:00 iter: 100000 loss: 0.7628 (0.8009) loss_classifier: 0.0127 (0.0351) loss_box_reg: 0.0003 (0.0081) loss_rec: 0.6319 (0.6323) loss_objectness: 0.0988 (0.0982) loss_rpn_box_reg: 0.0157 (0.0271) time: 0.0901 (0.0990) data: 0.0020 (0.0048) lr: 0.000000 max mem: 1701
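
In the lstr lines above, the left field consists only of '-' while the right field holds a word. The inference script appends '-' to the alphabet, which suggests that '-' acts as a CTC-style blank and that the left field is the raw per-step prediction. Neither point is confirmed in this thread, so treat both as assumptions. Under them, greedy decoding collapses repeated characters and drops blanks, so an all-blank sequence decodes to an empty string. The function name ctc_greedy_collapse is hypothetical.

```python
def ctc_greedy_collapse(raw, blank="-"):
    """Collapse a per-timestep character string as greedy CTC decoding does.

    Assumption: `blank` is the CTC blank symbol ('-' here, by analogy with
    the character the inference script appends to the alphabet).
    """
    decoded = []
    previous = None
    for ch in raw:
        # Keep a character only if it differs from the previous timestep
        # and is not the blank symbol.
        if ch != previous and ch != blank:
            decoded.append(ch)
        previous = ch
    return "".join(decoded)


if __name__ == "__main__":
    # An all-blank prediction collapses to the empty string.
    print(repr(ctc_greedy_collapse("-" * 10)))
    # Hypothetical raw prediction that would collapse to a word.
    print(repr(ctc_greedy_collapse("SS-AA--LL-E")))
```

If this reading is right, an all-blank left field means the recognition branch is predicting nothing for those regions, which would match the empty recognition results reported here.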

@seacloud-0420
Author

@ShaheenPerveen

Sorry to disturb you. Did the project run successfully after you added the checkpoint file?

@ShaheenPerveen

@heyun1994 I am not training the model; I am using the pre-trained model to test on my own images. I downloaded model_IC15_89.pth from https://drive.google.com/file/d/1nv-ZjbYBj8ePZRa_fAhbHvzm7HqSxPWK/view?usp=sharing and changed the WEIGHT path in configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml to the location of this checkpoint:

MODEL:
  META_ARCHITECTURE: "ARPN"
#   WEIGHT: "../hard_space1/models/ICDAR13_15_17_Trial_AFPN_LERB_RT_Spotter_recg_refine_v1_ft/model_0040000.pth"
  WEIGHT: "./checkpoint/model_IC15_89.pth"
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 64
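
Paths such as ./checkpoint/model_IC15_89.pth and ./data_cache/alphabet_IC13_IC15_Syn800K_pro.txt are relative, so Python resolves them against the current working directory rather than against the config file or the script. The tracebacks in this thread show the script being started both as ./demo/rrpn_e2e_infer.py and as rrpn_e2e_infer.py, and the same relative path would then resolve to different locations. The sketch below illustrates this; the helper name resolve_from and the directory /home/user/RRPN are hypothetical.

```python
import os


def resolve_from(working_dir, relative_path):
    """Return the path that `relative_path` denotes when the process is
    started from `working_dir` (hypothetical helper for illustration)."""
    return os.path.normpath(os.path.join(working_dir, relative_path))


if __name__ == "__main__":
    weight = "./checkpoint/model_IC15_89.pth"
    # Started from the repository root (assumed to be /home/user/RRPN).
    print(resolve_from("/home/user/RRPN", weight))
    # Started from inside the demo directory.
    print(resolve_from("/home/user/RRPN/demo", weight))
```

Using absolute paths in the config, or always launching from the same directory, avoids this ambiguity.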

@seacloud-0420
Author

@ShaheenPerveen
Thank you. What is the result of testing on your images with the pre-trained model? Can the character content be recognized? I used the pre-trained model to test the ICDAR15 data and the character content was not recognized.

@ShaheenPerveen

Hi @heyun1994
I was able to load the model using the pre-trained checkpoint but after the model loads, I am encountering the following error:

  File "rrpn_e2e_infer.py", line 99, in <module>
    alphabet = open(cfg.MODEL.ROI_REC_HEAD.ALPHABET).readlines()[0] + '-'
FileNotFoundError: [Errno 2] No such file or directory: './data_cache/alphabet_IC13_IC15_Syn800K_pro.txt'

I am unable to find a source for './data_cache/alphabet_IC13_IC15_Syn800K_pro.txt'.
How did you run the testing? Could you please help me with it? Where did you obtain the alphabet file?
If you have the file, could you share it with me via Google Drive or similar?

@ShaheenPerveen

@heyun1994 I am unable to run the training properly, and we can't use WeChat. Would it be possible for you to upload the file as a GitHub gist, or as a file in one of your GitHub repositories that I can clone?

@seacloud-0420
Author

@ShaheenPerveen

@heyun1994 Thank you for sharing the file. Let me run the model on my images and get back to you with the output I get.

@ShaheenPerveen

ShaheenPerveen commented Jan 12, 2021

@heyun1994 I was able to test using the pretrained model and the alphabet file you provided, but I am facing the same problem. The output file the model generates for each image looks like this:

270,558,331,487,347,500,287,572
236,534,286,479,300,492,251,547
285,570,343,499,358,511,300,583
398,471,441,420,456,433,413,483
394,320,448,256,461,268,407,331

When I check the images saved in the vis folder, I can see some words enclosed in bounding boxes, but no recognised words appear in the result file. I am not sure how to interpret this.
@mjq11302010044 Could you please help us figure this out? Why are we unable to see any recognised words/characters, and what exactly does the above output mean?

@ShaheenPerveen

Hey @mjq11302010044, could you please help me understand the output?
I have gone through the RRPN++ paper and I am a little confused about the output I obtained with the pre-trained ICDAR15 model from https://drive.google.com/file/d/1nv-ZjbYBj8ePZRa_fAhbHvzm7HqSxPWK/view?usp=sharing.
I was expecting the output to contain 5 numbers, i.e. R(l, t, r, b, θ), as described for the detection head in the RRPN++ paper: https://arxiv.org/pdf/2009.13118.pdf
However, the output contains one line of 8 numbers for each anchor predicted in the image. Is the output similar to the QUAD geometry described in the EAST paper: https://arxiv.org/pdf/1704.03155.pdf?
Please clarify.
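
For reference, a line of eight numbers can be read as four (x, y) corner points, and a rotated-rectangle description can be recovered from them. The sketch below assumes the corners are listed in order around the box (x1,y1,x2,y2,x3,y3,x4,y4), which this thread does not confirm. It returns a centre, two side lengths, and the angle of the first edge, not the paper's (l, t, r, b, θ) encoding. The function name quad_to_rotated_rect is hypothetical.

```python
import math


def quad_to_rotated_rect(line):
    """Convert 'x1,y1,x2,y2,x3,y3,x4,y4' to (cx, cy, side_a, side_b, angle_deg).

    Assumes the four corners are given in order around the box.
    side_a is the length of edge p1->p2, side_b that of edge p2->p3, and
    angle_deg is the direction of edge p1->p2 in image coordinates.
    """
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != 8:
        raise ValueError("expected 8 comma-separated numbers")
    points = list(zip(values[0::2], values[1::2]))
    # The centre is the mean of the four corners.
    cx = sum(x for x, _ in points) / 4.0
    cy = sum(y for _, y in points) / 4.0
    (x1, y1), (x2, y2), (x3, y3) = points[0], points[1], points[2]
    side_a = math.hypot(x2 - x1, y2 - y1)
    side_b = math.hypot(x3 - x2, y3 - y2)
    angle_deg = math.degrees(math.atan2(y2 - y1, x2 - x1))
    return cx, cy, side_a, side_b, angle_deg


if __name__ == "__main__":
    # First line of the output file quoted earlier in this thread.
    print(quad_to_rotated_rect("270,558,331,487,347,500,287,572"))
```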

@ShaheenPerveen

Hey @mjq11302010044, I was able to understand the output of the model: it is similar to the QUAD geometry described in the EAST paper.
However, I still don't understand why I am getting only coordinates as output, with no character content for the detected bounding boxes. I assume the detection head of the model is working fine but the recognition head is not. Could you please help me figure out what the reason could be?
@heyun1994 You mentioned above that you were encountering the same issue, i.e. no character recognition. Were you able to resolve it or figure out the reason?

@seacloud-0420
Author

@ShaheenPerveen
Hello, my problem has not been resolved.
