
Faster R-CNN construction function freezes backbone layers even for a randomly initialized backbone #2164

Closed
muaz-urwa opened this issue Apr 29, 2020 · 2 comments

@muaz-urwa (Contributor)

🐛 Bug

Currently, the fasterrcnn_resnet50_fpn function is used to create a Faster R-CNN with a ResNet-50 backbone and an FPN. This function uses the resnet_fpn_backbone function in backbone_utils, which freezes all backbone layers of the ResNet apart from layer2, layer3, and layer4. This freezing is hard-coded to reflect the Faster R-CNN paper, which froze the initial layers of a pretrained backbone.
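
For reference, the freezing logic in resnet_fpn_backbone looks roughly like this (paraphrased from torchvision 0.5; the exact code may differ):

# Paraphrased from torchvision 0.5's resnet_fpn_backbone; exact code may differ.
# conv1, bn1, and layer1 parameters are frozen unconditionally, even when the
# backbone weights are randomly initialized rather than pretrained.
for name, parameter in backbone.named_parameters():
    if 'layer2' not in name and 'layer3' not in name and 'layer4' not in name:
        parameter.requires_grad_(False)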

If a pretrained backbone is not used and one intends to train the entire network from scratch, no layers should be frozen; otherwise the initial layers keep their randomly initialized weights forever. I think this can be considered a bug, because the layer freezing is not even mentioned in the function docs, so users are not aware of it.

This resulted in poor AP when we trained Faster R-CNN with a ResNet backbone from scratch on the DETRAC dataset, and it took a while to figure out.

I have created a pull request so that others don't run into the same problem when conducting experiments similar to mine.

I have moved the layer-freezing logic into fasterrcnn_resnet50_fpn so that layers are frozen if either a pretrained backbone or a pretrained Faster R-CNN is used, and are not frozen otherwise. A minimal sketch of the intended behavior is shown below.
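
# Illustrative sketch of the proposed behavior, not the exact code from the PR:
# only freeze the early layers when pretrained weights are actually loaded.
if pretrained or pretrained_backbone:
    for name, parameter in backbone.named_parameters():
        if 'layer2' not in name and 'layer3' not in name and 'layer4' not in name:
            parameter.requires_grad_(False)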

Pull Request with correct behavior: #2160

To Reproduce

Steps to reproduce the behavior:

Script:

import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=False, pretrained_backbone=False)

# Print every parameter that is frozen (requires_grad == False).
for name, parameter in model.named_parameters():
    if not parameter.requires_grad:
        print(name)

Output:
backbone.body.conv1.weight
backbone.body.layer1.0.conv1.weight
backbone.body.layer1.0.conv2.weight
backbone.body.layer1.0.conv3.weight
backbone.body.layer1.0.downsample.0.weight
backbone.body.layer1.1.conv1.weight
backbone.body.layer1.1.conv2.weight
backbone.body.layer1.1.conv3.weight
backbone.body.layer1.2.conv1.weight
backbone.body.layer1.2.conv2.weight
backbone.body.layer1.2.conv3.weight

Expected behavior

None of the layers should be frozen, since neither a pretrained network nor a pretrained backbone is used, so the script above should produce no output.

Environment

PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.15.3

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 440.33.01
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.17.4
[pip3] torch==1.4.0
[pip3] torchvision==0.5.0
[conda] Could not collect

Additional context

Furthermore, the number of frozen ResNet backbone layers is an important hyperparameter in my experience, and it needs to be tuned for each dataset, so adding an argument to control this enabled me to integrate it into automated hyperparameter tuning. My pull request exposes the number of trainable layers as an argument of the fasterrcnn_resnet50_fpn function.
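
For example, something along these lines (a sketch assuming the trainable_backbone_layers argument name that later torchvision releases expose; check the merged API for the exact signature):

# Sketch assuming the trainable_backbone_layers argument from later torchvision
# releases: an int from 0 (all backbone layers frozen) to 5 (all trainable).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=True, trainable_backbone_layers=3)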


@fmassa (Member) commented Apr 30, 2020

Hi,

Thanks for the PR, I've commented on it.

@fmassa (Member) commented Oct 21, 2020

This has been fixed by #2160 and #2242.

fmassa closed this as completed on Oct 21, 2020.