
ValueError: bg_num_rois = 0 and fg_num_rois = 0, this should not happen! #111

Closed
artuncF opened this issue Mar 31, 2018 · 21 comments

@artuncF

artuncF commented Mar 31, 2018

When I try to train the network with my own dataset (actually https://github.com/udacity/self-driving-car/tree/master/annotations), this error occurs after 900 iterations in the first epoch. Can you point out the source of the problem?

@artuncF
Author

artuncF commented Mar 31, 2018

I should also mention that my repo is up to date; I saw the commit related to this error, but it didn't solve the problem.

@artuncF
Author

artuncF commented Apr 1, 2018

Your last commit didn't solve the problem. I've already said that my repo is up-to-date.

@WillSuen

WillSuen commented Apr 4, 2018

I have the same issue on the pascal_voc_0712 dataset (at iteration 5000 of epoch 1) after trying to implement R-FCN with this repo. Does anyone know a possible reason?

@jwyang
Owner

jwyang commented Apr 4, 2018

Hi @WillSuen, are you using the most recent code?

@WillSuen

WillSuen commented Apr 4, 2018

@jwyang Yes, I just tried with the most recent code, but still got the same issue. The training process goes well, with the loss decreasing each iteration, and then at iteration 5000 it raises this ValueError: bg_num_rois = 0 and fg_num_rois = 0, this should not happen!

@WillSuen

WillSuen commented Apr 4, 2018

I have changed several parameters in the cfg:

           'RPN_MIN_SIZE': 8, -> 0
           'RPN_POST_NMS_TOP_N': 2000, -> 300
           'RPN_PRE_NMS_TOP_N': 12000, -> 6000

Could these parameters be the reason? I'll change them back and see what happens.
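For reference, restoring the defaults would look roughly like this (the cfg import path and attribute names are my assumption of the repo's usual lib/model/utils/config.py layout):

    # Rough sketch (import path and cfg keys are assumptions): restore the
    # default RPN proposal settings before starting training.
    from model.utils.config import cfg

    cfg.TRAIN.RPN_MIN_SIZE = 8            # drop proposals smaller than 8 px
    cfg.TRAIN.RPN_PRE_NMS_TOP_N = 12000   # proposals kept before NMS
    cfg.TRAIN.RPN_POST_NMS_TOP_N = 2000   # proposals kept after NMS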

@artuncF
Author

artuncF commented Apr 4, 2018

I didn't change any parameters, but I still got the error.

@WillSuen

WillSuen commented Apr 5, 2018

It seems there is something wrong with the dataset I used. For the data batch where the error occurs, I printed out gt_boxes and found that there are no ground-truth boxes:

(0 ,.,.) = 
     0.0000     0.0000     0.0000     0.0000     0.0000
     0.0000     0.0000     0.0000     0.0000     0.0000
     0.0000     0.0000     0.0000     0.0000     0.0000
     ... (all 20 rows of the gt_boxes tensor are zero)
This is weird, because I downloaded Pascal VOC from the official website. I'll double-check the dataset.
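One quick way to confirm this is to iterate over the dataloader once and report every batch with no ground-truth boxes. This is just a sketch; the 4-tuple batch layout (im_data, im_info, gt_boxes, num_boxes) is my assumption of what the training script unpacks:

    # Sketch: scan the training dataloader for batches whose ground-truth
    # boxes are all zero. `dataloader` is the torch.utils.data.DataLoader
    # built in the training script; the batch layout is an assumption.
    empty_batches = []
    for step, (im_data, im_info, gt_boxes, num_boxes) in enumerate(dataloader):
        if num_boxes.sum().item() == 0 or gt_boxes.abs().sum().item() == 0:
            empty_batches.append(step)
    print("Batches with no ground-truth boxes:", empty_batches)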

@WillSuen

WillSuen commented Apr 9, 2018

I finally fixed the error I was hitting: I had changed the path of the config.py file, so the program used the wrong .pkl cache file when loading the dataset. Hope this helps.

@artuncF
Author

artuncF commented Apr 12, 2018

I figured out that rpn_loss_box and rpn_class_box are returning NaN. Do you have any idea why that is happening? @jwyang
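To narrow it down, something like this right after the forward pass would stop at the first NaN so the offending batch can be inspected (purely a debugging sketch; the loss and batch variable names are assumptions):

    # Debugging sketch: halt on the first NaN RPN loss. Variable names
    # (rpn_loss_cls, rpn_loss_box, gt_boxes, step) are assumptions.
    import torch

    if torch.isnan(rpn_loss_cls).any() or torch.isnan(rpn_loss_box).any():
        print("NaN RPN loss at step", step)
        print("gt_boxes:", gt_boxes)
        raise RuntimeError("stopping for inspection")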

@artuncF
Author

artuncF commented Apr 12, 2018

And I am pretty sure my ground-truth boxes are correct; I think there is a problem with the ROIs and the RPN.

@jwyang
Owner

jwyang commented Apr 12, 2018

@artuncF I thought the new commit had already solved this issue, but it turns out it hasn't. I will check again.

@CodeJjang

Try verifying that the MAX_NUM_GT_BOXES parameter matches your dataset; I had to calculate it for my dataset and then set it to the correct number. It's one of the small things that helped me get training working. Let us know whether it helps.
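A rough way to compute that number for a Pascal-VOC-style dataset (the annotation folder path here is a placeholder; adapt the parsing to your own format):

    # Sketch: find the maximum number of objects in any single image so that
    # MAX_NUM_GT_BOXES can be set to at least that value.
    import os
    import xml.etree.ElementTree as ET

    ann_dir = "data/VOCdevkit2007/VOC2007/Annotations"  # placeholder path
    max_boxes = 0
    for fname in os.listdir(ann_dir):
        if fname.endswith(".xml"):
            tree = ET.parse(os.path.join(ann_dir, fname))
            max_boxes = max(max_boxes, len(tree.findall("object")))
    print("Set MAX_NUM_GT_BOXES to at least", max_boxes)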

@ahmed-shariff
Contributor

@artuncF have you tried lowering the learning rate?
Also, something I missed when training on my own dataset: remember to have the 0th index represent the background class.
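For example, the class list handed to the dataset wrapper should look roughly like this (the category names here are placeholders, not from the repo):

    # Illustrative only: index 0 must be the background class; the remaining
    # names are placeholders for your own categories.
    classes = ('__background__',  # always index 0
               'car', 'truck', 'pedestrian', 'traffic_light')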

@artuncF
Author

artuncF commented Apr 29, 2018

I solved the problem. It was related to my annotation files; the network is now training correctly.
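For anyone hitting the same thing, a rough sanity check over the loaded roidb looks like this (the combined_roidb import and the roidb entry keys follow what I believe the repo's trainval_net.py uses, so treat them as assumptions):

    # Sketch: flag roidb entries with no boxes or with degenerate boxes, which
    # can leave a batch without any usable ground truth. The import path and
    # the entry keys ('boxes', 'image') are assumptions.
    from roi_data_layer.roidb import combined_roidb

    imdb, roidb, ratio_list, ratio_index = combined_roidb("voc_2007_trainval")
    for entry in roidb:
        boxes = entry["boxes"]
        if boxes.shape[0] == 0:
            print("no boxes:", entry["image"])
        elif ((boxes[:, 2] <= boxes[:, 0]) | (boxes[:, 3] <= boxes[:, 1])).any():
            print("degenerate box:", entry["image"])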

@artuncF artuncF closed this as completed Apr 29, 2018
@frostinassiky
Contributor

I met the same problem. We can simply skip that batch.
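In the training loop that could be as simple as the following guard before the forward pass (variable names are assumptions about what the script unpacks from each batch):

    # Sketch: skip a batch that has no usable ground-truth boxes instead of
    # letting the proposal-target layer raise the ValueError.
    if num_boxes.sum().item() == 0:
        print("Skipping batch %d: no ground-truth boxes" % step)
        continue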

@pengsida

@jwyang Hi, I met this bug when training on COCO 2017, and I found it was caused by a bug in roibatchLoader.py.

To keep the same aspect ratio for images within the same batch, you crop them randomly through this block of code. However, when the image is very elongated and the objects lie near its edge, the crop tends to cut the objects out, so the number of bounding boxes drops to zero. Here is an example:

Before the cropping: [image]

After the cropping: [image]

Because the cropping is random, simply skipping such a batch is a good idea.
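To illustrate the failure mode, here is a self-contained sketch (not the repo's code): boxes are clipped to the crop window, and an unlucky crop can leave zero boxes.

    # Self-contained sketch: clip boxes to a crop window and keep only the
    # boxes that still have positive area.
    import numpy as np

    def boxes_left_after_crop(boxes, crop):
        """boxes: (N, 4) array of [x1, y1, x2, y2]; crop: (x1, y1, x2, y2)."""
        clipped = boxes.copy()
        clipped[:, [0, 2]] = np.clip(clipped[:, [0, 2]], crop[0], crop[2])
        clipped[:, [1, 3]] = np.clip(clipped[:, [1, 3]], crop[1], crop[3])
        keep = (clipped[:, 2] > clipped[:, 0]) & (clipped[:, 3] > clipped[:, 1])
        return clipped[keep]

    boxes = np.array([[10., 10., 60., 60.]])  # single object near the left edge
    print(len(boxes_left_after_crop(boxes, (0, 0, 300, 300))))    # 1: box survives
    print(len(boxes_left_after_crop(boxes, (100, 0, 400, 300))))  # 0: box cropped away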

@kentaroy47

Cleaning up the data/cache files helped me with this.
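Roughly, that amounts to deleting the cached roidb pickles so the dataset is re-parsed on the next run (the data/cache location is the repo's default and is an assumption):

    # Sketch: remove cached .pkl roidb files; they are rebuilt from the
    # annotation files on the next training run.
    import glob
    import os

    for pkl in glob.glob("data/cache/*.pkl"):
        print("removing", pkl)
        os.remove(pkl)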

@mehrazi

mehrazi commented Sep 29, 2018

I have the same issue when training on my own dataset. Cleaning data/cache didn't help either.

@mehrazi

mehrazi commented Sep 29, 2018

"I solved the problem. It was related to my annotation files; the network is now training correctly."

@artuncF What was the problem with your annotations?

@marcunzueta

check #594
