Can you explain more about using Conv2d to split the image into patches (in the Discriminator)? Applying the Conv changes the image, and it doesn't split it into a grid of parts the way the Vision Transformer does. I've also seen other implementations use Unfold to split the image.
My original implementation used the same kernel_size and stride, in which case the Conv is exactly equivalent to splitting. But I later found that overlapped splitting gives slightly better performance, so I changed to this version. You can still change it back to the non-overlapped version and it will give reasonable results as well.
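To make the equivalence concrete, here is a minimal sketch (the sizes are hypothetical, not the repo's actual dimensions): when kernel_size == stride, each Conv2d output position sees exactly one non-overlapping patch, so the Conv acts as a per-patch linear projection — the same operation as Unfold followed by a shared Linear layer, which is what ViT's patch embedding does.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration; the repo's actual dims may differ.
B, C, H, W, patch, dim = 2, 3, 32, 32, 8, 64
x = torch.randn(B, C, H, W)

# Conv route: kernel_size == stride means each output location is the
# linear projection of exactly one non-overlapping patch.
conv = nn.Conv2d(C, dim, kernel_size=patch, stride=patch)
out = conv(x)  # (B, dim, H/patch, W/patch)
tokens_conv = out.flatten(2).transpose(1, 2)  # (B, num_patches, dim)

# Unfold route: explicitly split into patches, then project each one.
unfold = nn.Unfold(kernel_size=patch, stride=patch)
patches = unfold(x).transpose(1, 2)  # (B, num_patches, C*patch*patch)
linear = nn.Linear(C * patch * patch, dim)
# Copy the conv weights so the two routes are numerically identical:
# conv.weight is (dim, C, patch, patch), flattened in the same
# (C, kh, kw) order that Unfold uses.
with torch.no_grad():
    linear.weight.copy_(conv.weight.view(dim, -1))
    linear.bias.copy_(conv.bias)
tokens_unfold = linear(patches)

print(torch.allclose(tokens_conv, tokens_unfold, atol=1e-5))  # True
```

Setting stride < kernel_size gives the overlapped variant the maintainer mentions: neighboring patches then share pixels, and the Unfold equivalence still holds with the same stride.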
I don't understand how a convolution splits the image, since it convolves the whole image and produces a new map for each channel. And indeed, when I plotted the output it didn't look like a reasonable result.