Error when using efficientnet-b2 and above during the Efficientdet training on custom Data #57

Closed
karen-gishyan opened this issue Aug 21, 2020 · 14 comments
Labels
bug Something isn't working solution added Solution added to the raised issue

Comments

@karen-gishyan

Hi,

When configuring an EfficientDet model with "efficientnet-b2", "efficientnet-b3", and so on up to "efficientnet-b8", I get the following error:
Given groups=1, weight of size 112 40 1 1, expected input[1, 48, 64, 64] to have 40 channels, but got 48 channels instead.
The numbers change depending on the model. I would highly appreciate your help.

@abhi-kumar
Contributor

Thank you for pointing out the issue. We will try to resolve it as soon as possible.

@abhi-kumar abhi-kumar added the bug Something isn't working label Aug 21, 2020
@abhi-kumar
Contributor

@karen-gishyan
Author

Thanks @abhi-kumar, will try that version.

@abhi-kumar
Contributor

A solution has been added, and the examples have been updated too.
The issue was related to the different convolution sizes used to build the network variants and their corresponding input sizes. Details on which input size to use for each model type are in the example notebooks.

Please check and let us know if the issue is resolved at your end.
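
The channel counts in the reported error (40 expected, 48 received) are consistent with how EfficientNet scales layer widths per variant. Below is a minimal sketch, assuming the width coefficients and rounding rule of the reference EfficientNet implementation and assuming the b0 feature stages have 40, 112, and 320 channels. The names `WIDTH_COEFFICIENTS`, `round_filters`, and `stage_channels` are illustrative and are not taken from this repository's code.

```python
# Width coefficients per EfficientNet variant.
# Assumption: values follow the reference EfficientNet implementation.
WIDTH_COEFFICIENTS = {
    "efficientnet-b0": 1.0,
    "efficientnet-b1": 1.0,
    "efficientnet-b2": 1.1,
    "efficientnet-b3": 1.2,
    "efficientnet-b4": 1.4,
    "efficientnet-b5": 1.6,
    "efficientnet-b6": 1.8,
    "efficientnet-b7": 2.0,
    "efficientnet-b8": 2.2,
}


def round_filters(filters, width_coefficient, divisor=8):
    """Scale a channel count and round it to a multiple of `divisor`."""
    scaled = filters * width_coefficient
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    # Never round down by more than 10 percent.
    if rounded < 0.9 * scaled:
        rounded += divisor
    return int(rounded)


def stage_channels(backbone, base_channels=(40, 112, 320)):
    """Channel counts of the feature stages for a given backbone.

    Assumption: `base_channels` are the b0 widths of the stages fed
    to the detection neck.
    """
    coefficient = WIDTH_COEFFICIENTS[backbone]
    return tuple(round_filters(c, coefficient) for c in base_channels)


if __name__ == "__main__":
    for name in WIDTH_COEFFICIENTS:
        print(name, stage_channels(name))
```

Under these assumptions, a layer built for 40 input channels matches b0 and b1 but not b2, whose corresponding stage is 48 channels wide, which is the mismatch the error message describes.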

@abhi-kumar abhi-kumar added the solution added Solution added to the raised issue label Aug 24, 2020
@karen-gishyan
Author

karen-gishyan commented Aug 24, 2020

Thanks @abhi-kumar for the added solution; it does indeed work. There is one more question I would like to ask. For my experiment, I am training the model with low-resolution images, specifically an image size of 352. Can image sizes lower than 512 be supported with pretrained weights? Thanks again for the solution.

@abhi-kumar
Contributor

There's a Resizer transform in the code which resizes the image at the dataloader level. But if the images are smaller, using higher EfficientDet versions might not serve the purpose, as the images will be distorted. You might try the MobileNet SSD models in Monk's TensorFlow object detection pipelines, or the Faster R-CNN or RetinaNet models in Monk's mmdetection pipeline. These resize images to 300, 320, or 512.

@karen-gishyan
Author

Thanks a lot.

@karen-gishyan
Author

@abhi-kumar could you please provide a bit more detail about resizing for EfficientDet? I read in the official paper that the image size has to be divisible by 128: the D0 version takes an image size of 512, D1 takes 640, and so on. With your EfficientDet implementation, I have been able to train a D1 version with a custom image size of (352, 352). Can you elaborate a bit on this? Were the images resized to 640, or how was this possible in general? Thanks.
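
The sizing rule cited from the paper can be written down directly. This is a small sketch of the paper's scaling formula, R = 512 + 128 * phi, together with the divisibility check. The function names are hypothetical, and published configurations for the largest variants may deviate from the plain formula.

```python
def efficientdet_input_size(phi):
    """Input resolution from the scaling rule R = 512 + 128 * phi."""
    if phi < 0:
        raise ValueError("phi must be non-negative")
    return 512 + 128 * phi


def is_divisible_by_128(size):
    """Check the divisibility constraint mentioned in the question above."""
    return size > 0 and size % 128 == 0


if __name__ == "__main__":
    for phi in range(5):
        print(f"D{phi}: {efficientdet_input_size(phi)}")
    print("352 divisible by 128:", is_divisible_by_128(352))
```

By this rule, 352 is not a valid native input size, so a model that trains on (352, 352) images must be resizing or padding them somewhere in the data pipeline.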

@abhi-kumar
Contributor

Could you please try the same with D2 or D3, training with an image size of (352, 352), and let us know the result.

@karen-gishyan
Author

karen-gishyan commented Sep 15, 2020

@abhi-kumar the latest version no longer allows this; it accepts only the input sizes specified in your Colab notebook. But as my training was done with the previous version, I was curious. Did the old version resize all images based on common_size=512, the way RetinaNet resizes everything based on min_side=608, max_side=1024? And if you pass an image size of 1024 for the b4 model, is that the final input size, or does it get rescaled?

@abhi-kumar
Contributor

Earlier it autoscaled strictly to size 512, irrespective of the pipeline being used. In the latest version, the user provides the input size.
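
One common way to autoscale to a fixed size is to scale the longer side to `common_size` and pad the rest of a square canvas. The sketch below assumes that pattern; it is not verified against the Resizer in this repository, and `letterbox_to_common_size` is a hypothetical helper name. It only computes the resulting geometry and does not touch pixel data.

```python
def letterbox_to_common_size(height, width, common_size=512):
    """Return (new_height, new_width, pad_bottom, pad_right).

    The longer side is scaled to `common_size`, the aspect ratio is kept,
    and the remainder of the square canvas is padding.
    """
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(height, width)
    # Integer arithmetic avoids floating-point rounding surprises.
    new_height = height * common_size // longest
    new_width = width * common_size // longest
    return (
        new_height,
        new_width,
        common_size - new_height,
        common_size - new_width,
    )


if __name__ == "__main__":
    print(letterbox_to_common_size(352, 352))
    print(letterbox_to_common_size(300, 600))
```

Under this assumption, a (352, 352) image would be upscaled to fill the whole 512 by 512 canvas, which would explain why training on that image size ran without errors in the earlier version.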

@karen-gishyan
Author

Thanks a lot @abhi-kumar, and a final question: does your implementation of RetinaNet rescale all images based on the (608, 1024) size?

@abhi-kumar
Contributor

abhi-kumar commented Sep 15, 2020

As per the Resizer class in https://github.com/Tessellate-Imaging/Monk_Object_Detection/blob/master/5_pytorch_retinanet/lib/retinanet/dataloader.py, the images are resized based on min_side=608 and max_side=1024.
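
A min_side/max_side rule of this kind usually scales the shorter side to `min_side` unless that would push the longer side past `max_side`. The following is a sketch of that common pattern, not a copy of the linked file; `retinanet_resized_shape` is a hypothetical name, and any padding the actual Resizer applies afterwards is not modeled.

```python
def retinanet_resized_shape(height, width, min_side=608, max_side=1024):
    """Return the (height, width) after min_side/max_side rescaling."""
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    # First try to bring the shorter side to min_side.
    scale = min_side / min(height, width)
    # If the longer side would exceed max_side, cap the scale instead.
    if max(height, width) * scale > max_side:
        scale = max_side / max(height, width)
    return int(round(height * scale)), int(round(width * scale))


if __name__ == "__main__":
    print(retinanet_resized_shape(352, 352))
    print(retinanet_resized_shape(480, 640))
    print(retinanet_resized_shape(500, 2000))
```

With these defaults, a square (352, 352) image would be upscaled to (608, 608), while a very wide image is limited by `max_side` rather than `min_side`.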

@karen-gishyan
Author

thanks!!
