Expand use_linear to all models and UpsampleInSpace module#52
Expand use_linear to all models and UpsampleInSpace module#52
Conversation
pzhanggit
left a comment
There was a problem hiding this comment.
Thanks @TsChala. I have two comments: (1) could we be explcit about the assumptions made for the linear changes to be functionally equivalent? For conv2d/conv3d, we’re assuming the patches are non-overlapping. Could we raise an error when that’s not the case but use_linear is on?
(2) The current nn.linear implementations of some functions like UpsampleConv3d do not reproduce the same functionality.
| self.out_proj.append(nn.Linear(channels, channels * kD * kH * kW, bias=False)) | ||
| self.out_proj.append(nn.InstanceNorm3d(channels, affine=True)) | ||
| self.out_proj.append(nn.GELU()) | ||
| # Final head | ||
| kD, kH, kW = self.ks[0] | ||
| self.out_head = nn.Linear(channels, channels * kD * kH * kW) |
There was a problem hiding this comment.
The nn.Linear operations here are not equivalent to UpsampleConv3d, as UpsampleConv3d consists of nn.Upsample and nn.Conv3d.
There was a problem hiding this comment.
Good point. Upon further investigation I think it doesn't make sense to include the use_linear option for the UpsampleConv3D part. We do the upsampling to change the size, then do the Conv3D block such that it doesn't change the input size. We hardcoded stride=1 and padding="same" exactly for this reason. Therefore, it is not really possible to have the non-overlapping scenario here (when kernel size == stride). I'll remove the use_linear option from UpsampleConv3d parts.
| kD, kH, kW = self.ks[-(ilayer+1)] | ||
| # Apply linear, norm, activation | ||
| x = rearrange(x, 'tb c d h w -> (tb d h w) c') | ||
| x = self.out_proj[layer_idx](x) # Linear layer |
There was a problem hiding this comment.
See my comment above. We will need to fix it to make it consistent.
Following #48 , I expanded the option to use a linear layer instead of conv3D. The following are the proposed changes:
use_linearto AViT, SViT and TurbT models as an option, default isFalse.UpsampleInSpacemodule also has conv3D's, these are replaced with linear layers as well ifuse_linear=Trueexpand_projectionsis now compatbile withuse_linear=True, Fix smooth layer and expand_projections regressions from #48 #50 already adressed thehMLP_output, I added the necessary parts forUpsampleInSpaceas well.hMLP_output'sout_headbias. The dimension of the bias for this term isout_chansifuse_linear=False, while it isout_chans * kD * kH * kWifuse_linear=True. I suggest to add the bias after rearranging so it's the same size as in the conv3D case. This keeps the parameter count consistent between linear and conv3D options.