Thank you for your nice work | Question on Flowers dataset #65
Comments
Small update: I just ran the same experiment with a learnable positional embedding, and the results are similar to what we observed with the sine-based embedding.
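For reference, the sine-based embedding being compared against is typically the standard sinusoidal table. This is a minimal pure-Python sketch of that construction, not the repository's actual implementation:

```python
import math

def sinusoidal_embedding(num_positions, dim):
    """Standard sine/cosine positional embedding table.

    Even indices carry sin, odd indices carry cos, with the usual
    10000^(2i/dim) frequency schedule. A learnable embedding would
    instead initialize this table randomly and train it.
    """
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

emb = sinusoidal_embedding(num_positions=4, dim=8)
# at position 0, entries alternate sin(0)=0.0 and cos(0)=1.0
```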
Hello, and thank you for your interest. A few notes: 1. At the time, we downloaded Flowers-102 as an image dataset (split into class subdirectories) and loaded it with the ImageFolder dataset class. The latter is unlikely to be the reason, but the former could be the issue here. Any of these could be why you're running into the issue; we'll look into them, but I thought I'd share them in case you'd like to try changing these in the meantime.
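The two loading routes mentioned above can be sketched as follows. This is an illustrative snippet only; the directory path and transform are assumptions, not the repository's actual training setup:

```python
from torchvision import datasets, transforms

tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# Route 1 (what the authors describe): a local copy arranged into
# class subdirectories, e.g. flowers102/train/<class_name>/*.jpg,
# loaded with the generic ImageFolder class.
train_set = datasets.ImageFolder("flowers102/train", transform=tf)

# Route 2: the built-in torchvision dataset, which downloads the
# original Oxford Flowers-102 splits directly.
train_set_tv = datasets.Flowers102(root="data", split="train",
                                   download=True, transform=tf)
```

Note that Route 1 inherits whatever split the downloaded directory encodes, which is exactly where a mismatch with the official split can creep in.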
Hi @alihassanijr, Many thanks for your response.
Thanks again for your response. I will update here in case I succeed in replicating the Flowers result. Thank you!
Okay, so the number of GPUs might be the reason. We only trained the CIFAR-10/100 and MNIST datasets on single GPUs.
Thanks again for your response. I will try using a batch size of 768 and let you know. Thank you!
Hi @alihassanijr, I did a few more experiments:
Hi, could you try it with the same learning rate and epochs? The training script we use doesn't factor batch size into the number of epochs or the LR, so I'd suggest just increasing the per-GPU batch size until the total reaches 768.
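The arithmetic behind that suggestion is simple but worth making explicit. Since the script doesn't rescale LR or epochs with batch size, reproducing a single-GPU run on multiple GPUs means matching the *global* batch. A hypothetical helper (not from the repository):

```python
def per_gpu_batch_size(global_batch, num_gpus):
    """Split a target global batch size evenly across GPUs.

    With data-parallel training, the effective batch per step is
    per_gpu_batch * num_gpus, so to match a single-GPU run with
    global batch 768, each GPU's batch must shrink accordingly.
    """
    assert global_batch % num_gpus == 0, "choose a divisible global batch"
    return global_batch // num_gpus

per_gpu_batch_size(768, 4)   # 192 per GPU reproduces a global batch of 768
per_gpu_batch_size(768, 1)   # 768 on a single GPU, as in the original runs
```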
The green line above uses the same LR and epochs, Ali. The red line is from just increasing the batch size to 768. Thanks again for your reply and willingness to help.
Hi, this is what's causing the issue here. Upon looking further, we found that there are a number of papers on the leaderboard that train on the Kaggle split, as opposed to the original (few shared where they downloaded the dataset, or even checkpoints). Meanwhile, we tried training CCT, ViT, and CaiT variants from scratch on the torchvision split, and found that CCT leads the other models of similar size by about 20% in accuracy.
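To make the split discrepancy concrete: the original Oxford Flowers-102 protocol uses 1,020 training images (10 per class), 1,020 validation images, and 6,149 test images, while the commonly circulated Kaggle split is reported to put the large portion on the training side. A rough illustration under that assumption (the swap below is illustrative, not an exact description of the Kaggle split):

```python
# Original Oxford Flowers-102 split sizes, as used by torchvision.
original = {"train": 1020, "val": 1020, "test": 6149}

# Assumed "swapped" arrangement: the large portion used for training.
swapped = {"train": original["test"], "val": original["val"],
           "test": original["train"]}

ratio = swapped["train"] / original["train"]
# Training on roughly 6x the images makes accuracies from the two
# protocols incomparable, which explains the gap on the leaderboard.
```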
Thank you very much, @alihassanijr, for taking the time to identify what's going wrong. I really appreciate your help!
Hi @alihassanijr,
Many thanks for your super interesting work, and sharing the elegant code with the community.
I am able to replicate your CIFAR-10 and CIFAR-100 results perfectly. However, there is a large gap when it comes to the Flowers dataset.
After running the following command:
I am able to get only 62% accuracy. Please find the wandb report here. I am attaching the logs too:
output.log
The only change that I made to the code was to use the PyTorch dataloaders:
I suspect this is some minor configuration issue for the Flowers dataset, as I am able to replicate the results on CIFAR-10 and CIFAR-100.
Thanks again, and it would be very kind of you if you could help me.
Thanks,
Joseph