Hi, thank you so much for sharing this excellent work.
I am confused about the experimental setup for CIFAR10/100. The commonly used augmentation is random cropping with padding=4, and the input image resolution is 32x32. With this setting, however, SwinT does not seem able to produce the 7x7 output resolution described in the paper. Could you please tell me the detailed augmentation settings you used on CIFAR10/100, and whether you made any changes to the network structure of the original VTs?
Thanks again.
Hi @lkhl, for CIFAR10/100 we load the data in this manner: here. All of the datasets then share the same transformation, so the 32x32 CIFAR images are also resized to 224x224. That is why we get the 7x7 output resolution without changing anything in the original network architecture.
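To make the arithmetic concrete, here is a minimal sketch of how the final feature-map size follows from the input size. It assumes the standard Swin-T layout (patch size 4, four stages, 2x patch merging between consecutive stages); the function name `swin_output_resolution` is hypothetical and is not taken from this repository.

```python
def swin_output_resolution(input_size: int, patch_size: int = 4, num_stages: int = 4) -> int:
    """Side length of the last-stage feature map for a square input.

    Assumes the standard Swin-T layout: one patch embedding step followed by
    (num_stages - 1) patch-merging steps, each halving the spatial size.
    """
    if input_size % patch_size != 0:
        raise ValueError("input_size must be divisible by patch_size")
    size = input_size // patch_size      # patch embedding: 224 -> 56, 32 -> 8
    for _ in range(num_stages - 1):      # patch merging between stages
        if size % 2 != 0:
            raise ValueError("feature map side must be even before patch merging")
        size //= 2
    return size


print(swin_output_resolution(224))  # resized CIFAR input
print(swin_output_resolution(32))   # native CIFAR input
```

Under these assumptions the total downsampling factor is 4 x 2 x 2 x 2 = 32, so a 224x224 input ends at 7x7, while a native 32x32 input would collapse to 1x1. This matches the explanation above that resizing, rather than an architecture change, is what yields the 7x7 output.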