
Faster R-CNN construction function freezes backbone layers even for a randomly initialized backbone #2164

Closed
muaz-urwa opened this issue Apr 29, 2020 · 2 comments

@muaz-urwa (Contributor)

🐛 Bug

Currently, the fasterrcnn_resnet50_fpn function is used to create a Faster R-CNN with a ResNet-50 backbone and an FPN. This function uses the resnet_fpn_backbone function in backbone_utils, which freezes all backbone layers of the ResNet apart from layer2, layer3, and layer4. This freezing is hard-coded to reflect the Faster R-CNN paper, which froze the initial layers of a pretrained backbone.
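
For reference, the freezing logic in resnet_fpn_backbone looks roughly like this (paraphrased from torchvision 0.5; the exact code may differ):

# Paraphrased from torchvision 0.5's resnet_fpn_backbone; exact code may differ.
# conv1, bn1, and layer1 parameters are frozen unconditionally, even when the
# backbone weights are randomly initialized rather than pretrained.
for name, parameter in backbone.named_parameters():
    if 'layer2' not in name and 'layer3' not in name and 'layer4' not in name:
        parameter.requires_grad_(False)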

If a pretrained backbone is not used and one intends to train the entire network from scratch, no layers should be frozen; otherwise the initial layers keep their randomly initialized weights forever. I think this can be considered a bug, because the layer freezing is not even mentioned in the function docs, so users are not aware of it.

This resulted in poor AP when we trained Faster R-CNN with a ResNet backbone from scratch on the DETRAC dataset, and it took a while to figure out.

I have created a pull request so that others don't run into the same problem when conducting experiments similar to mine.

I have moved the layer-freezing logic into fasterrcnn_resnet50_fpn so that layers are frozen if either a pretrained backbone or a pretrained Faster R-CNN is used, and are not frozen otherwise. A minimal sketch of the intended behavior is shown below.
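
# Illustrative sketch of the proposed behavior, not the exact code from the PR:
# only freeze the early layers when pretrained weights are actually loaded.
if pretrained or pretrained_backbone:
    for name, parameter in backbone.named_parameters():
        if 'layer2' not in name and 'layer3' not in name and 'layer4' not in name:
            parameter.requires_grad_(False)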

Pull Request with correct behavior: #2160

To Reproduce

Steps to reproduce the behavior:

Script:

import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=False, pretrained_backbone=False)

# Print every parameter that is frozen (requires_grad == False).
for name, parameter in model.named_parameters():
    if not parameter.requires_grad:
        print(name)

Output:
backbone.body.conv1.weight
backbone.body.layer1.0.conv1.weight
backbone.body.layer1.0.conv2.weight
backbone.body.layer1.0.conv3.weight
backbone.body.layer1.0.downsample.0.weight
backbone.body.layer1.1.conv1.weight
backbone.body.layer1.1.conv2.weight
backbone.body.layer1.1.conv3.weight
backbone.body.layer1.2.conv1.weight
backbone.body.layer1.2.conv2.weight
backbone.body.layer1.2.conv3.weight

Expected behavior

None of the layers should be frozen, since neither a pretrained network nor a pretrained backbone is used, so the script above should produce no output.

Environment

PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.15.3

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 440.33.01
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.17.4
[pip3] torch==1.4.0
[pip3] torchvision==0.5.0
[conda] Could not collect

Additional context

Furthermore, the number of frozen ResNet backbone layers is an important hyperparameter in my experience, and it needs to be tuned for each dataset, so adding an argument to control this enabled me to integrate it into automated hyperparameter tuning. My pull request exposes the number of trainable layers as an argument of the fasterrcnn_resnet50_fpn function.
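
For example, something along these lines (a sketch assuming the trainable_backbone_layers argument name that later torchvision releases expose; check the merged API for the exact signature):

# Sketch assuming the trainable_backbone_layers argument from later torchvision
# releases: an int from 0 (all backbone layers frozen) to 5 (all trainable).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=True, trainable_backbone_layers=3)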


@fmassa (Member) commented Apr 30, 2020

Hi,

Thanks for the PR, I've commented on it.

@fmassa (Member) commented Oct 21, 2020

This has been fixed by #2160 and #2242.

fmassa closed this as completed on Oct 21, 2020.