
Evaluation always produces mAP of 0.0 when using backbones other than Resnet50 #647

Open
jpxrc opened this issue Aug 26, 2018 · 48 comments

@jpxrc commented Aug 26, 2018

First and foremost, thank you for the awesome package! The dataset I am using consists of satellite images with 29 different classes. I have been able to train and evaluate a retinanet model on this dataset using the default 'resnet50' backbone on a subset of the 29 classes.

However, when I switch over to training and evaluating a model with a different backbone network such as 'densenet121', the mAP score for every class is zero. I don't get any errors during training (I also use the random-transform flag for each epoch) or when converting the model (I also supply the --backbone='densenet121' flag), and it converts successfully. I can also see the losses being optimized during training, so it is definitely detecting and classifying the objects in the images.

I even tried using the original resnet50 model trained on a subset of classes to see if it would pick up those classes on the full dataset with 29 classes, and it still produces an output of zero. I looked at the validation_annotations.csv file for both cases and the formatting is identical, so I don't think the issue is with the annotation files.

I have attached the validation_annotations.csv and classes.csv files (converted to .txt files in order to attach them here):
common_classes.txt
common_validation_annotations.txt

Any ideas what could be going on?

EDIT: I just compared a Resnet50 model and a Densenet121 model, both trained on the same dataset that I know works, and the problem is definitely with the densenet121 implementation, because the Resnet50 model does produce output during evaluation.

jpxrc changed the title from "Evaluation always produces an mAP of 0.0" to "Evaluation always produces mAP of 0.0" on Aug 26, 2018
jpxrc changed the title from "Evaluation always produces mAP of 0.0" to "Evaluation always produces mAP of 0.0 when using models other than Resnet50" on Aug 26, 2018
jpxrc changed the title from "Evaluation always produces mAP of 0.0 when using models other than Resnet50" to "Evaluation always produces mAP of 0.0 when using backbones other than Resnet50" on Aug 26, 2018
@hgaiser (Contributor) commented Aug 30, 2018

Densenet is a community contribution and I have never really used it. If you find out what the problem is then a PR will be welcome :)

@birham-red-bd

I have the same problem since I changed the backbone to ShuffleNet. The loss is decreasing, but the mAP is always zero. However, when I trained the model with the resnet50 backbone, everything was okay. I still have not found the problem. Can anyone give me some advice?

@hgaiser (Contributor) commented Sep 4, 2018

Do you evaluate during training, or after training using the evaluate tool? For densenet and mobilenet I noticed there is a bug when preprocessing the images in the evaluate tool. Those backbones require a mode='tf' in the preprocess_image call which isn't there currently.

@birham-red-bd

Yeah, I evaluate the mAP during training. The dataset I used is Pascal VOC2007. And I did not use a pretrained model, because I did not find pretrained ImageNet weights for ShuffleNet-V2.

@hgaiser (Contributor) commented Sep 4, 2018

@songhuizhong I'm not sure what you mean by shufflenet-V2; we don't have that backbone in our repository.

@birham-red-bd

It is a model I built myself. Here is the paper introducing ShuffleNet-V2: https://arxiv.org/abs/1807.11164

@hgaiser (Contributor) commented Sep 7, 2018

In that case I can't help. I only have experience with ResNet backbones. If you find out the solution to this then a PR is welcome.

@birham-red-bd

Well, I found the problem; it was my fault. The reason the mAP is 0 is that the feature maps fed into the feature pyramid network were in the wrong order.
This is the wrong one:
layer_names = ['1x1conv5_out','stage4/block1/relu_1x1conv_1','stage3/block1/relu_1x1conv_1']
and this is the correct one:
layer_names = ['stage3/block1/relu_1x1conv_1','stage4/block1/relu_1x1conv_1','1x1conv5_out']
But I could only train this model to a best mAP of 44.59% on the VOC2007 test set. The training set is VOC2012 trainval. The input size is changed to 512*512 because I want to use a larger batch size of 28. And I trained this model without using pretrained ImageNet weights.
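For reference, the ordering matters because keras-retinanet's model builder expects the backbone feature maps from shallowest to deepest, i.e. [C3, C4, C5] (strides 8, 16, 32). A minimal sketch of wiring a custom backbone that way; the ShuffleNet layer names are the ones from the comment above, and the retinanet() signature with backbone_layers is assumed from the library's resnet backbone and may differ between versions:

from keras_retinanet.models import retinanet

def shufflenet_retinanet(num_classes, inputs, shufflenet_model, **kwargs):
    # C3, C4, C5 in increasing-stride order (8, 16, 32)
    layer_names = ['stage3/block1/relu_1x1conv_1',   # C3
                   'stage4/block1/relu_1x1conv_1',   # C4
                   '1x1conv5_out']                   # C5
    backbone_layers = [shufflenet_model.get_layer(name).output for name in layer_names]
    return retinanet.retinanet(inputs=inputs, num_classes=num_classes,
                               backbone_layers=backbone_layers, **kwargs)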

@hgaiser (Contributor) commented Sep 18, 2018

Alright, I'll assume this issue is resolved then.

hgaiser closed this as completed Sep 18, 2018
@hgaiser (Contributor) commented Sep 18, 2018

Actually, the original issue was something else entirely (DenseNet), so I'm reopening.

hgaiser reopened this Sep 18, 2018
@jyu-theartofml

I ran into the same issue using resnet50 as the backbone (training and then using evaluate.py). I fixed it by specifying image-min-side and image-max-side; otherwise the default values of those arguments don't match my image dimensions (256x256). An example call is shown below.
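For example (the two flags are standard retinanet-evaluate options; the CSV and model paths are placeholders):

retinanet-evaluate --image-min-side 256 --image-max-side 256 csv val_annotations.csv classes.csv model_inference.h5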

@MAGI003769 commented Dec 7, 2018

Above all, thanks for your awesome work @hgaiser. I met the same problem as @yyannikb. With 'densenet121' as the backbone, I get a massive number of detections and all the box scores are 1; yes, 1 is the only score value that appears. Consequently, it leads to a mAP of zero.

My project uses mammography images from the DDSM dataset. For training, I set image_max_side=666 and image_min_side=400, start from the pretrained model from the keras repo, and use a learning rate of 0.003. Has anyone resolved this problem? I would appreciate it if you could share your experience. Thanks a lot.

@wxinbeings

When I trained with mobilenet224, I got the same issue. Has anyone resolved this problem? I would appreciate it if you could share your experience. Thanks a lot.

@ozyilmaz commented Feb 2, 2019

Do you evaluate during training, or after training using the evaluate tool? For densenet and mobilenet I noticed there is a bug when preprocessing the images in the evaluate tool. Those backbones require a mode='tf' in the preprocess_image call which isn't there currently.

This was the solution for mobilenet. You have to hack the 'evaluate_coco' method in coco_eval.py.
Change this line: image = generator.preprocess_image(image)
To this: image = generator.preprocess_image(image, mode='tf')
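For background, the modes mirror keras' imagenet_utils preprocessing: mode='caffe' converts RGB to BGR and subtracts the ImageNet channel means (what the resnet50 backbone expects), while mode='tf' scales pixel values to the range [-1, 1] (what the mobilenet and densenet backbones expect). For CSV or Pascal VOC evaluation, the analogous call is in the _get_detections function of keras_retinanet/utils/eval.py (as a later comment in this thread also points out); a sketch of the same one-line change there:

# keras_retinanet/utils/eval.py, inside _get_detections()
image = generator.preprocess_image(image, mode='tf')  # was: generator.preprocess_image(image)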

@tommysugiarto

Hi, I also got the same problem when trying to run inference with a densenet121 backbone model. Does anyone have an idea how to solve it?
Thanks a lot

@wxinbeings

Hi, I also got the same problem when trying to run inference with a densenet121 backbone model. Does anyone have an idea how to solve it?
Thanks a lot

It's the same issue as with mobilenet; just change the same place as @ozyilmaz commented.

@tonmoyborah

@ozyilmaz when I do this, it throws an error that preprocess_image got an unexpected keyword argument 'mode'. Is it only me, or is there an obvious step I am missing?

@ozyilmaz

@ozyilmaz when I do this, it throws an error that preprocess_image got an unexpected keyword argument 'mode'. Is it only me, or is there an obvious step I am missing?

@tonmoyborah, it is hard to guess, but it seems like the generator object does not have the correct "preprocess_image" method.
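One way that error can happen: in the versions of the library from around this time, the Generator class in keras_retinanet/preprocessing/generator.py has its own preprocess_image wrapper around the utility in keras_retinanet/utils/image.py, and that wrapper does not accept a mode keyword. A rough sketch of threading it through, assuming that layout (later versions instead let the backbone's preprocess_image be passed into the generators, which is the cleaner fix; see the common_args discussion further down):

# keras_retinanet/preprocessing/generator.py -- sketch, not the exact upstream code
from keras_retinanet.utils.image import preprocess_image as _preprocess_image

class Generator(object):
    def preprocess_image(self, image, mode='caffe'):
        # forward the mode so the evaluation code can request 'tf'-style preprocessing
        return _preprocess_image(image, mode=mode)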

@Cospel commented Apr 30, 2019

The same happened to me when using mobilenetv1/v2.
However, when I set the image to a fixed 800x800 input size for training and evaluation, convergence works great. If I change it to any other resolution, like 801x801, the model does not converge. If I set the input to the mobilenet as inputs = keras.layers.Input(shape=(size, size, 3)) (and not (None, None, 3)), then the model works as expected.

Can anyone explain this strange behaviour?
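For reference, the place where that shape is normally set is where the backbone builds its input tensor when none is supplied; a tiny sketch of the two alternatives (plain keras, the fixed size is the one the comment above reports as working):

import keras

# (None, None, 3) lets the network accept arbitrary image sizes:
inputs = keras.layers.Input(shape=(None, None, 3))
# ...whereas the comment above reports mobilenet only converged with a fixed shape:
inputs = keras.layers.Input(shape=(800, 800, 3))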

@IntelligentIndia7

@tonmoyborah @ozyilmaz Hey, did you solve the error? I am trying to run RetinaNet with the Mobilenet224_1.0 backbone and I got a mAP of 0. When I try to train with the change in eval.py -> _get_detections (the line image = generator.preprocess_image(image) changed to image = generator.preprocess_image(image, mode='tf')) as mentioned by @ozyilmaz, I get the same error about the unexpected keyword mode.

@IntelligentIndia7

when I train normally I get 0 mAP as shown below. Can anyone help me on this?

10000/10000 [==============================] - 3158s 316ms/step - loss: 5.4151 - regression_loss: 2.5757 - classification_loss: 2.8393 - val_loss: 5.3902 - val_regression_loss: 2.5642 - val_classification_loss: 2.8260
Running network: 100% (12704 of 12704) |#################################################################| Elapsed Time: 0:16:48 Time: 0:16:48
Parsing annotations: 100% (12704 of 12704) |#############################################################| Elapsed Time: 0:00:00 Time: 0:00:00
('6066 instances of class', 'M', 'with average precision: 0.0000')
('8803 instances of class', 'W', 'with average precision: 0.0000')
mAP: 0.0000

@hgaiser (Contributor) commented May 21, 2019

Please use the keras-retinanet slack channel for usage questions, or read the readme to find out possible issues.

@thusinh1969 commented May 21, 2019

I have the same issue. Backbone densenet201, with weights downloaded from the keras GitHub. Training with --freeze-backbone on a custom CSV dataset which worked well with any TensorFlow object detection model. Batch size 16, dataset of 27,000 images, one single class.

Up to epoch 3 (-> 81,000 iterations), retinanet-evaluate produces NO predicted bounding box on any of the 3000 evaluation images! Can someone help... please.

Just to add, MobileNet224_2 also does NOT produce any mAP at all. Very tiring ... :( ...

Steve

@liminghuiv

For mobilenet, I saw keras-retinanet being used for vehicle detection:
https://github.com/yangliupku/retinanet_detection
Can someone merge it?
@hgaiser ?

@mariaculman18

Dear all, I came upon the same issue with DenseNet-121. While training, the mAP is estimated correctly, but when using retinanet-evaluate it is just 0. I know this is a community to contribute to, and I would like to help resolve this, but I am just a beginner. So, has anyone found a way around this? I am using my own CSV dataset for training.

@hgaiser (Contributor) commented Aug 20, 2019

Dear all, I came upon the same issue with DenseNet-121. While training, the mAP is estimated correctly, but when using retinanet-evaluate it is just 0. I know this is a community to contribute to, and I would like to help resolve this, but I am just a beginner. So, has anyone found a way around this? I am using my own CSV dataset for training.

The reply below is still valid:

Densenet is a community contribution and I have never really used it. If you find out what the problem is then a PR will be welcome :)

You could use resnet50, that should work.

@hgaiser (Contributor) commented Aug 20, 2019

Could the issue be related to this?

@mariaculman18 commented Aug 22, 2019

Dear all, I came upon the same issue with DenseNet-121. While training, the mAP is estimated correctly, but when using retinanet-evaluate it is just 0. I know this is a community to contribute to, and I would like to help resolve this, but I am just a beginner. So, has anyone found a way around this? I am using my own CSV dataset for training.

The reply below is still valid:

Densenet is a community contribution and I have never really used it. If you find out what the problem is then a PR will be welcome :)

You could use resnet50, that should work.

Yes, I already use ResNet-50 as the backbone and wanted to make a comparison with DenseNet-121.

@uttaransinha

Hi,
I've looked into this problem in detail, and it seems like the problem lies in the model itself. Any backbone other than ResNet-50 predicts box coordinates of -1, labels of -1 and scores of -1. In other words, the model cannot predict anything at all. But since the training phase of the model goes smoothly, I suspect that the conversion of a trained model to an inference model is bugged. Please take a look at the inference conversion code.
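Worth noting: -1 is also the padding value that the inference model's filter_detections step uses when fewer than max_detections boxes pass the score threshold, so an output that is all -1 can simply mean that no detection scored above the threshold, rather than that the conversion itself broke. A quick check, assuming model is a converted inference model and image is already preprocessed and resized:

import numpy as np

# the outputs are padded with -1 up to max_detections; count the real detections
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
print('detections above the score threshold:', int((labels[0] != -1).sum()))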

@mariaculman18

Hi,
I've looked into this problem in detail, and it seems like the problem lies in the model itself. Any backbone other than ResNet-50 predicts box coordinates of -1, labels of -1 and scores of -1. In other words, the model cannot predict anything at all. But since the training phase of the model goes smoothly, I suspect that the conversion of a trained model to an inference model is bugged. Please take a look at the inference conversion code.

Hi @Uttaran-IITH, please see this, where with the guidance of @hgaiser and @ikerodl96 I found a way around this issue.

@uttaransinha

@mariaculman18, thank you for the suggestion. Unfortunately, the solution works only for Densenet, not for Resnet101 or Resnet152. In the case of Densenet121, the mAP is very low even on the training data. I assume that the model you have used in the code is the inference model.

python3 ./keras-retinanet/keras_retinanet/bin/evaluate.py --backbone=resnet101 csv train.csv class.csv retinanet_resnet101_inference.h5

@mariaculman18

@Uttaran-IITH yes, I only tried the solution for DenseNet-121. I cannot give you any suggestions for other backbones, sorry :(

In my case, I got a mAP of 96% with ResNet-50 and 90% with DenseNet-121, on the training data.

@Jorbo19 commented Aug 30, 2019

Hi, I use resnet101, but the loss is always around 1. How can I reduce it?

@Jorbo19 commented Aug 30, 2019

I only have one class

@beibeiZ commented Sep 5, 2019

when I train normally I get 0 mAP as shown below. Can anyone help me on this?

10000/10000 [==============================] - 3158s 316ms/step - loss: 5.4151 - regression_loss: 2.5757 - classification_loss: 2.8393 - val_loss: 5.3902 - val_regression_loss: 2.5642 - val_classification_loss: 2.8260
Running network: 100% (12704 of 12704) |#################################################################| Elapsed Time: 0:16:48 Time: 0:16:48
Parsing annotations: 100% (12704 of 12704) |#############################################################| Elapsed Time: 0:00:00 Time: 0:00:00
('6066 instances of class', 'M', 'with average precision: 0.0000')
('8803 instances of class', 'W', 'with average precision: 0.0000')
mAP: 0.0000

retinanet-evaluate --convert-model ./model/resnet50_csv_100.h5 csv ./train.csv ./class.csv
Using TensorFlow backend.
usage: retinanet-evaluate [-h] [--convert-model] [--backbone BACKBONE]
[--gpu GPU] [--score-threshold SCORE_THRESHOLD]
[--iou-threshold IOU_THRESHOLD]
[--max-detections MAX_DETECTIONS]
[--save-path SAVE_PATH]
[--image-min-side IMAGE_MIN_SIDE]
[--image-max-side IMAGE_MAX_SIDE] [--config CONFIG]
{coco,pascal,csv} ... model
retinanet-evaluate: error: argument dataset_type: invalid choice: './model/resnet50_csv_100.h5' (choose from 'coco', 'pascal', 'csv')
Why do I get this error? Can you help me?
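Judging from the usage string above, the positional model argument has to come after the dataset arguments, so the call would look something like this (same paths as in the command above):

retinanet-evaluate --convert-model csv ./train.csv ./class.csv ./model/resnet50_csv_100.h5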

@ihunhh commented Sep 12, 2019

I guess it is caused by this:
(two screenshots attached in the original comment; not reproduced here)

@yx-wu commented Oct 12, 2019

@MAGI003769 Hello, I met the same problem as you. With 'densenet201' as the backbone, I got strange detection results and all the scores are 1. Has your problem been solved? Thanks a lot.

@hgaiser (Contributor) commented Nov 4, 2019

Has your problem been solved?

I would also like to know if this problem still persists.

@yx-wu commented Nov 4, 2019 via email

@bhavesh907

I'm also facing a very strange problem. I get a non-zero mAP when evaluating during training, but when I use convert_model.py and evaluate.py, I get zero mAP values. I'm facing this issue with the EfficientNet backbones.

@mngata commented Jan 10, 2020

I also encountered the same problem. I changed the backbone to detnet59, which is basically a modification of resnet50. I can see the loss decrease while I am training; however, the per-epoch evaluation on the test set is always 0. The resnet50 backbone works well. I was wondering if there is an error in the evaluation function.

@foghegehog (Contributor)

@mariaculman18

Hi @Uttaran-IITH, please see this, where with the guidance of @hgaiser and @ikerodl96 I found a way around this issue.

Your solution helped me as well. Why not make a PR? (And use **common_args in all generators.)

@mariaculman18

@mariaculman18

Hi @Uttaran-IITH, please see this, where with the guidance of @hgaiser and @ikerodl96 I found a way around this issue.

Your solution helped me as well. Why not make a PR? (And use **common_args in all generators.)

Happy to know it worked for you :)

I guess the contributors are aware of the problem. It would be better if they made the PR; I don't know how to do that :/

@foghegehog (Contributor) commented Feb 25, 2020

@mariaculman18

Happy to know it worked for you :)

I guess the contributors are aware of the problem. It would be better if they made the PR; I don't know how to do that :/

Well, I dared to create the PR: #1290. Let's see if it suits.
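For context, the change under discussion is roughly of this shape: evaluate.py builds a common_args dict that includes the backbone-specific preprocess_image, and that dict should be passed to every generator it creates. A sketch assuming the generator constructors take these keyword arguments; the actual contents of #1290 may differ:

# keras_retinanet/bin/evaluate.py, inside create_generator() -- sketch
common_args = {
    'preprocess_image': backbone.preprocess_image,
    'image_min_side'  : args.image_min_side,
    'image_max_side'  : args.image_max_side,
}

validation_generator = CSVGenerator(
    args.annotations,
    args.classes,
    **common_args
)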

@amindehnavi

when I train normally I get 0 mAP as shown below. Can anyone help me on this?

10000/10000 [==============================] - 3158s 316ms/step - loss: 5.4151 - regression_loss: 2.5757 - classification_loss: 2.8393 - val_loss: 5.3902 - val_regression_loss: 2.5642 - val_classification_loss: 2.8260
Running network: 100% (12704 of 12704) |#################################################################| Elapsed Time: 0:16:48 Time: 0:16:48
Parsing annotations: 100% (12704 of 12704) |#############################################################| Elapsed Time: 0:00:00 Time: 0:00:00
('6066 instances of class', 'M', 'with average precision: 0.0000')
('8803 instances of class', 'W', 'with average precision: 0.0000')
mAP: 0.0000

I have the same problem. Have you found a solution?

@mariaculman18

when I train normally I get 0 mAP as shown below. Can anyone help me on this?
10000/10000 [==============================] - 3158s 316ms/step - loss: 5.4151 - regression_loss: 2.5757 - classification_loss: 2.8393 - val_loss: 5.3902 - val_regression_loss: 2.5642 - val_classification_loss: 2.8260
Running network: 100% (12704 of 12704) |#################################################################| Elapsed Time: 0:16:48 Time: 0:16:48
Parsing annotations: 100% (12704 of 12704) |#############################################################| Elapsed Time: 0:00:00 Time: 0:00:00
('6066 instances of class', 'M', 'with average precision: 0.0000')
('8803 instances of class', 'W', 'with average precision: 0.0000')
mAP: 0.0000

I have the same problem. Have you found a solution?

Use the solution here for Densenet.

@pleaseRedo commented May 23, 2020

I also encountered the same problem. I changed the backbone to detnet59, which is basically a modification of resnet50. I can see the loss decrease while I am training; however, the per-epoch evaluation on the test set is always 0. The resnet50 backbone works well. I was wondering if there is an error in the evaluation function.

@mngata Have you managed to fix this mAP 0.0 issue? I changed the backbone to detnet59 and experienced the same issue as well.

@amindehnavi

when I train normally I get 0 mAP as shown below. Can anyone help me on this?
10000/10000 [==============================] - 3158s 316ms/step - loss: 5.4151 - regression_loss: 2.5757 - classification_loss: 2.8393 - val_loss: 5.3902 - val_regression_loss: 2.5642 - val_classification_loss: 2.8260
Running network: 100% (12704 of 12704) |#################################################################| Elapsed Time: 0:16:48 Time: 0:16:48
Parsing annotations: 100% (12704 of 12704) |#############################################################| Elapsed Time: 0:00:00 Time: 0:00:00
('6066 instances of class', 'M', 'with average precision: 0.0000')
('8803 instances of class', 'W', 'with average precision: 0.0000')
mAP: 0.0000

I have the same problem. Have you found a solution?

Use the solution here for Densenet.

Thanks, it worked.
