return identity when blocks less than 1 in resnet #3262
ZhiyuanChen wants to merge 3 commits into pytorch:main
Conversation
Some modifications replace certain stages in ResNet. This allows such stages to be initialised to `nn.Identity()`.
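For context, a minimal sketch of the behaviour being proposed (names simplified; the actual change would live inside torchvision's `ResNet._make_layer`, and `make_stage` below is a hypothetical stand-in):

```python
import torch.nn as nn

# Sketch only: a stage configured with zero blocks collapses to nn.Identity()
# instead of stacking blocks. `block` is any callable that builds one block.
def make_stage(block, inplanes, planes, blocks):
    if blocks < 1:
        return nn.Identity()  # no-op stage, trivially replaceable by callers
    layers = [block(inplanes, planes)]
    layers.extend(block(planes, planes) for _ in range(1, blocks))
    return nn.Sequential(*layers)

# Toy usage with a plain conv standing in for BasicBlock/Bottleneck:
conv_block = lambda cin, cout: nn.Conv2d(cin, cout, kernel_size=3, padding=1)
assert isinstance(make_stage(conv_block, 64, 64, blocks=0), nn.Identity)
```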
datumbox left a comment
@ZhiyuanChen Thanks for the PR. One of the tests fails due to typing; I left a suggestion in the comments, please have a look. Also, could you please provide a bit more info on how this change will enable your use cases? This will help us assess whether the proposed change is necessary and whether there is a better recommended way to do it.
@fmassa: As far as I can see this is a backwards-compatible (BC) change, so it should be OK. Any thoughts on whether it can have side effects on any models that use it as a backbone?
Thank you, I have noticed that, but I'm not quite sure whether I should change the return type to nn.Sequential or nn.Identity, or whether I should wrap it with an nn.Sequential.
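For reference, the two options being weighed, sketched with simplified signatures (not the actual `_make_layer` code):

```python
import torch.nn as nn

# Option A: widen the return annotation to the common base class nn.Module
# and return a bare nn.Identity() for empty stages.
def make_stage_a(blocks: int) -> nn.Module:
    return nn.Identity() if blocks < 1 else nn.Sequential()

# Option B: keep the nn.Sequential annotation and wrap the identity.
def make_stage_b(blocks: int) -> nn.Sequential:
    return nn.Sequential(nn.Identity()) if blocks < 1 else nn.Sequential()
```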
Visual Transformers swap the blocks in the last stage of ResNet with a VT module. Vision Transformer's hybrid variant removes stage 4 entirely and extends stage 3 to keep it at 50 layers (for ResNet-50). As transformer architectures are attracting increasing attention, I believe there will be more hybrid architectures that modify ResNets like this.
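To illustrate the use case, hypothetical usage once zero-block stages are supported, roughly the (3, 4, 9) ResNet-50 variant used by the ViT hybrid. This is a sketch and assumes the guard in this PR runs before the stage is built; note the stock classifier head would no longer match the trunk's channel count, so it is dropped here:

```python
import torch.nn as nn
from torchvision.models.resnet import Bottleneck, ResNet

# ViT-hybrid style trunk: stage 4 removed, stage 3 extended. Requires this
# PR's change; stock torchvision still builds one block even when a stage
# is configured with zero blocks.
backbone = ResNet(Bottleneck, [3, 4, 9, 0])  # layer4 becomes nn.Identity()
backbone.fc = nn.Identity()  # fc was sized for 2048 channels; the trunk now ends at 1024
```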
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Hi, thanks for the PR! I think the solution implemented here does not completely address the use case presented. The model still contains a […].

Additionally, the same functionality implemented in this PR can be easily obtained with the following snippet:

```python
import torch.nn as nn
import torchvision

model = torchvision.models.resnet50()
model.layer3 = nn.Identity()
model.layer4 = nn.Identity()
```

which I believe could live in user code. One of the reasons why I would prefer not to natively support this patch is that the meaning of […].

For those reasons, I think I would prefer not to move forward with this PR, but let us know if you have other thoughts.
This will not work if we are using the attributes of this object. For example, if I define a network like this:

```python
from typing import Callable, List, Optional, Type, Union

import torch.nn as nn
from torchvision.models.resnet import BasicBlock, Bottleneck, ResNet


class MyNet(ResNet):
    def __init__(
        self,
        block: Type[Union[BasicBlock, Bottleneck]],
        layers: List[int],
        num_classes: int = 1000,
        zero_init_residual: bool = False,
        groups: int = 1,
        width_per_group: int = 64,
        replace_stride_with_dilation: Optional[List[bool]] = None,
        norm: Optional[Callable[..., nn.Module]] = nn.BatchNorm2d,
    ) -> None:
        super().__init__(block, layers, num_classes, zero_init_residual,
                         groups, width_per_group, replace_stride_with_dilation, norm)
        # Replace the fourth stage with a custom one; `xxx` stands for some
        # custom block class.
        self.layer4 = self._make_xxx_layer(512, layers[3])

    def _make_xxx_layer(self, planes: int, blocks: int) -> nn.Sequential:
        # Builds the stage from the *current* value of self.inplanes.
        return nn.Sequential(*(xxx(self.inplanes, planes) for _ in range(blocks)))
```

It will not work as the […]. I do believe programmers are mature enough to know what they are doing; if they specify that a stage has 0 layers, it should have 0 layers.
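To make the objection concrete, a sketch of why post-construction reassignment is awkward here (standard torchvision assumed, no patch):

```python
import torch.nn as nn
from torchvision.models.resnet import Bottleneck, ResNet

model = ResNet(Bottleneck, [3, 4, 6, 3])  # plain ResNet-50 layout
print(model.inplanes)                     # 2048: bookkeeping already advanced past layer4
model.layer4 = nn.Identity()              # discards a fully built stage (wasted parameters),
                                          # while model.inplanes still reflects the old
                                          # layer4, so subclass code relying on it, such as
                                          # _make_xxx_layer above, sees stale state
```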