For example, having a (50, 720, 1280) input, we cannot have (1, 128, 128) patches with stride (1, 0, 0)