Training of py-faster-rcnn on ImageNet #9
Comments
Same thing for me. I have reduced the PASCAL VOC dataset to a couple of classes. I suspect some errors in the roidb. |
@Jianchao-ICT I am also interested in training a model on ImageNet, so all I need to do is to change code in lib/datasets/ilsvrc.py and factory.py ? Do I need to change |
@SenayG Hi, have you tried to run it without changing any code? |
@sunshineatnoon Yes, you also need to change the |
@Jianchao-ICT Thx. I have trained a fast rcnn model on ImageNet, it seems the method to train faster rcnn is the same, right? |
@Jianchao-ICT as I suspected, in my case it was due to the data (maybe it is the same for you).
and look at the last values before your crash. If some values are empty, the crash is expected, and you have to check your dataset again. |
@sunshineatnoon Well, I guess training the two is basically the same since the core of faster r-cnn is also implemented using fast rcnn. However, I still came across the above problem... |
@SenayG Oh, GREAT THANKS! I will try it now. |
@SenayG Hi, I have added
|
@SenayG Oh, I see, the last |
So clearly the issue is here : {'gt_boxes': array([], shape=(0, 5), dtype=float32), |
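For anyone hitting the same empty `gt_boxes` blob, here is a minimal sketch of how the offending images could be located before training starts. It assumes each roidb entry is a dict with a `boxes` array of shape (N, 4) and that the entries line up with a list of image paths; the function name `find_empty_gt` and the toy data are hypothetical, not part of py-faster-rcnn.

```python
import numpy as np


def find_empty_gt(roidb, image_paths):
    """Return the paths of images whose roidb entry has no ground-truth boxes."""
    empty = []
    for entry, path in zip(roidb, image_paths):
        boxes = np.asarray(entry["boxes"])
        if boxes.shape[0] == 0:
            empty.append(path)
    return empty


# Toy data standing in for a real roidb (hypothetical values).
roidb = [
    {"boxes": np.array([[10, 20, 50, 80]], dtype=np.uint16)},
    {"boxes": np.zeros((0, 4), dtype=np.uint16)},
]
image_paths = ["img_0.JPEG", "img_1.JPEG"]
print(find_empty_gt(roidb, image_paths))
```

Running a check like this once over the whole roidb is cheaper than waiting for the crash and reading the last printed blob.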
You're welcome! Glad I could help you debug! |
@SenayG Thanks! You are really nice 👍 |
@SenayG Oh, the problem is found. It is due to the image |
Good. The issue can now be closed, I think. |
@SenayG Hi, I have closed it. |
@Jianchao-ICT I am training the model on ImageNet, but I have a few problems:
The command I used to train is:
Did you have the same problem? |
|
@Jianchao-ICT I've got rid of the -1 operation, but I still got the same error. Then I found this guy: n02958343_19208.xml had wrong annotations. There is only one object in the image, but there are four objects annotated in the .xml file. Changing the annotation file fixed my problem. |
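Since bad annotation files keep coming up in this thread, here is a rough sketch of a sanity check for one VOC-style annotation. It assumes the XML has `size/width`, `size/height` and `object/bndbox` elements with integer coordinates; the function name `check_annotation` and the sample XML are made up for illustration. The zero-coordinate warning is there because a coordinate of 0 goes negative if the loader subtracts 1 from it.

```python
import xml.etree.ElementTree as ET


def check_annotation(xml_text):
    """Return a list of problems found in one VOC-style annotation string."""
    root = ET.fromstring(xml_text.strip())
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    objects = root.findall("object")
    problems = []
    if not objects:
        problems.append("no objects")
    for i, obj in enumerate(objects):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin >= xmax or ymin >= ymax:
            problems.append("object %d: degenerate box" % i)
        if xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            problems.append("object %d: box outside image" % i)
        if xmin == 0 or ymin == 0:
            # Would become -1 if the loader shifts coordinates by one.
            problems.append("object %d: zero coordinate" % i)
    return problems


# Hypothetical annotation: the second box is wider than the image.
sample = """
<annotation>
  <size><width>500</width><height>375</height></size>
  <object><name>car</name>
    <bndbox><xmin>48</xmin><ymin>24</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object><name>car</name>
    <bndbox><xmin>0</xmin><ymin>10</ymin><xmax>600</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>
"""
print(check_annotation(sample))
```

A check like this cannot tell that an image contains one object when four are annotated, so files such as the one mentioned above still need a manual look.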
@sunshineatnoon Oh, I am being driven crazy by this dataset: it includes many grayscale images; some of the images have Now I also see your problem. I just want to say this dataset does not seem that user-friendly... |
@sunshineatnoon Hi, how is your training going? I have now come across the following error in the Stage 1 Fast RCNN training. Have you seen anything similar?
|
@Jianchao-ICT No, I didn't encounter this problem, so it might be caused by your dataset. I got other errors (#13) during Stage 1 Fast RCNN training. I fixed that error and am still training now, hope no more errors... |
@sunshineatnoon Oh, I train on the ImageNet detections. Well, I will try to figure it out... I also hope no more errors... |
@sunshineatnoon Hi, I have figured out the reason... I made a very naive mistake: I forgot to change the |
@Jianchao-ICT Good for you! You trained on the whole dataset of imagenet? I only trained on two classes on ImageNet and it took a long long time for training. |
@sunshineatnoon Yeah, I train on the 201 categories (with background as one category), but I use the validation set for training since I noticed that the training set has some labels that are not in the 200 categories. |
Hi all: |
It should look more like this, if you refer to the original pascal_voc structure:
Then you have to create the function that creates the dataset with the call vid(split, year). The imdb is the image database where the training data is stored, filled by your own function. If you look at the shell script, you will see that it just calls the Python script. You can do it either way, but the shell script might be more convenient, even for testing later on. Does this help a bit? |
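To illustrate the registration step described above, here is a minimal sketch modeled on the pattern used in factory.py. The `vid` class below is only a hypothetical stand-in for a real dataset class written along the lines of pascal_voc.py, and the year "2015" and the split names are assumptions chosen for the example.

```python
class vid(object):
    """Hypothetical stand-in for a dataset class modeled on pascal_voc."""

    def __init__(self, split, year):
        self.name = "vid_{}_{}".format(year, split)


# Map each dataset name to a function that builds the dataset on demand.
_sets = {}
for year in ["2015"]:
    for split in ["train", "val"]:
        name = "vid_{}_{}".format(year, split)
        # Bind split/year as default arguments so each lambda keeps its own values.
        _sets[name] = (lambda split=split, year=year: vid(split, year))


def get_imdb(name):
    """Look up a dataset by name and construct it."""
    if name not in _sets:
        raise KeyError("Unknown dataset: {}".format(name))
    return _sets[name]()


imdb = get_imdb("vid_2015_train")
print(imdb.name)
```

The dataset name passed to the training script then only has to match one of the registered keys.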
@ednarb29 Thanks for your time! |
The shell script just defines the parameters and calls the python script for training and testing. That is only for automation, so later you just have to call the shell script and everything is done for you. |
@ednarb29 So if I modify the code and set the training and test data paths correctly, then changing some parameters in the shell script will be enough for training and testing, right? |
Yes. It all depends on the other changes you have to make for the new dataset, but the idea is that later experiments are easy to run once it is configured. |
@ednarb29 @Jianchao-ICT Hi! |
@JohnnyY8 it seems that caffemodel and prototxt do not match. |
@leejiajun Yes, I solved it by changing |
@JohnnyY8 argmax is an empty sequence. This problem could come from images in your training set that are too small or too large. BTW: http://blog.csdn.net/jiajunlee/article/details/50470897 |
@leejiajun Thanks for your blog. |
@JohnnyY8 |
@leejiajun Hi! |
@JohnnyY8 Probably! I remember that people have discussed your problem in this issue. |
@leejiajun Yes, but I have checked all the xml files more than once; the remaining ones may be OK. |
@Jianchao-ICT @GregorySenay Hi! |
Can you print the blobs during the iteration and paste the end of the log ? |
@GregorySenay No problem!
|
@leejiajun Hi! |
@JohnnyY8 |
@leejiajun |
@JohnnyY8 I updated my blog http://blog.csdn.net/JiaJunLee/article/details/50470897 |
@leejiajun Thanks for your help! |
@JohnnyY8 |
@leejiajun Thanks a lot for your help and your time. You are so nice! |
Removing
and
helped in my case. |
@soupault Hi |
@Jianchao-ICT @GregorySenay how did you outline which file had empty ground truth values ? |
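One way to answer this without running the training at all is to scan the annotation folder directly. The following is only a sketch, assuming VOC-style XML files where each annotated object is an `object` element directly under the root; the function name `files_without_objects` and the demo files are hypothetical.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET


def files_without_objects(annotation_dir):
    """Return the names of annotation files that contain no object element."""
    empty = []
    for path in sorted(glob.glob(os.path.join(annotation_dir, "*.xml"))):
        root = ET.parse(path).getroot()
        if root.find("object") is None:
            empty.append(os.path.basename(path))
    return empty


# Demo on a throwaway directory with two toy annotation files.
demo_dir = tempfile.mkdtemp()
samples = {
    "good.xml": "<annotation><object><name>car</name></object></annotation>",
    "empty.xml": "<annotation><filename>empty</filename></annotation>",
}
for filename, text in samples.items():
    with open(os.path.join(demo_dir, filename), "w") as handle:
        handle.write(text)

result = files_without_objects(demo_dir)
print(result)
```

The listed files can then be dropped from the image set file or fixed by hand before building the roidb.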
I am trying to train py-faster-rcnn on my custom dataset. Until now I was busy cleaning the dataset of errors like empty gt_boxes, and now I am stuck with the iterations crashing abruptly at the very first one. |
@JohnnyY8 @rbgirshick @GregorySenay @sunshineatnoon @leefionglee @jojotata Where can I find ZF_v2.caffemodel? |
I revised the code in lib/datasets to train on the ImageNet detection data: I created an ilsvrc.py (like the pascal_voc.py) to account for the ImageNet data and modified the corresponding code in factory.py. Then I ran the experiment script ./experiments/scripts/faster_rcnn_alt_opt.sh and everything seemed to be correct at first, but then the training seems to get stuck with the following information, never moving forward again... Could anyone give me some suggestions? Thanks!