For w32, the downsampling seems to be done like this:
input: 64x48 [32 channels] -> 32x24 [32 channels] -> 16x12 [128 channels]
Shouldn't the downsample from 64x48 to 32x24 widen to 64 output channels, so that the higher-resolution information is preserved in the extra channels?
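
To make the question concrete, here is a minimal sketch of the shape arithmetic only. It assumes each downsampling step halves both spatial dimensions; `downsample_chain` and its `widen_early` flag are hypothetical names for illustration, not part of the actual code base.

```python
def downsample_chain(height, width, in_ch, out_ch, steps, widen_early=False):
    """Return [(height, width, channels), ...] for a chain of stride-2 steps.

    widen_early=False: channels stay at in_ch until the last step (what I observe).
    widen_early=True:  channels double at every intermediate step (what I expected).
    Hypothetical helper: shape bookkeeping only, no convolutions are run.
    """
    shapes = [(height, width, in_ch)]
    ch = in_ch
    for step in range(1, steps + 1):
        # Assumption: every step halves both spatial dimensions.
        height, width = height // 2, width // 2
        if step == steps:
            ch = out_ch      # the final step always maps to the target width
        elif widen_early:
            ch = ch * 2      # proposed: widen already at intermediate steps
        shapes.append((height, width, ch))
    return shapes


observed = downsample_chain(64, 48, 32, 128, steps=2)
proposed = downsample_chain(64, 48, 32, 128, steps=2, widen_early=True)
print(observed)  # [(64, 48, 32), (32, 24, 32), (16, 12, 128)]
print(proposed)  # [(64, 48, 32), (32, 24, 64), (16, 12, 128)]
```

The only difference between the two variants is the intermediate 32x24 tensor: 32 channels as observed versus the 64 I would have expected.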