The EdgeResidual block applies the stride in the pointwise projection layer: https://github.com/rwightman/pytorch-image-models/blob/3a7aa95f7e5fc90a6a2683c756e854e26201d82e/timm/models/efficientnet_blocks.py#L365-L366. It would be more efficient to apply it in the preceding expansion convolution instead, so that the expansion runs at the reduced resolution rather than computing outputs that the strided projection then skips. The TensorFlow reference implementation recently fixed this: https://github.com/tensorflow/tpu/issues/660
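
To put a rough number on the difference, here is a minimal sketch that counts multiply-accumulates (MACs) for the two stride placements. It is not timm code: the helper names `conv_macs` and `edge_residual_macs` and the channel and resolution values are hypothetical, chosen only for illustration. It assumes "same"-style padding and an input size divisible by the stride, and it ignores batch norm, activations, and any squeeze-excite module.

```python
def conv_macs(in_ch, out_ch, kernel, out_h, out_w):
    """MACs for a dense (non-grouped) conv producing an out_h x out_w map."""
    return out_h * out_w * kernel * kernel * in_ch * out_ch


def edge_residual_macs(in_ch, mid_ch, out_ch, kernel, h, w, stride,
                       stride_in_expansion):
    """Approximate MACs for an expansion conv followed by a 1x1 projection.

    Hypothetical helper: assumes 'same'-style padding and h, w divisible
    by stride. Only the two convolutions are counted.
    """
    out_h, out_w = h // stride, w // stride
    if stride_in_expansion:
        # Expansion conv already downsamples, so it runs at the output size.
        exp_h, exp_w = out_h, out_w
    else:
        # Expansion conv runs at full input resolution; the strided 1x1
        # projection then reads only every stride-th position.
        exp_h, exp_w = h, w
    expansion = conv_macs(in_ch, mid_ch, kernel, exp_h, exp_w)
    projection = conv_macs(mid_ch, out_ch, 1, out_h, out_w)
    return expansion + projection


# Illustrative (made-up) shapes: 24 -> 96 -> 48 channels, 3x3 kernel, 56x56 input.
shape = dict(in_ch=24, mid_ch=96, out_ch=48, kernel=3, h=56, w=56, stride=2)
stride_in_projection = edge_residual_macs(**shape, stride_in_expansion=False)
stride_in_expansion = edge_residual_macs(**shape, stride_in_expansion=True)

print(f"stride in projection: {stride_in_projection:,} MACs")
print(f"stride in expansion:  {stride_in_expansion:,} MACs")
print(f"ratio: {stride_in_projection / stride_in_expansion:.2f}x")
```

The projection cost is the same in both cases because its output size does not change. The saving comes entirely from the expansion convolution, whose cost drops by a factor of `stride ** 2` under these assumptions.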