Data augmentation causes difference in input batches #6
Thanks for the great work!
I'm currently trying to adopt MixMo for my own projects, and I found some details that differ from Google's MIMO implementation.
For input batch generation, suppose we use input repetition = 1.0; technically, the indices and images for the two inputs of the two experts should then be exactly the same.
In Google's implementation, they read the images first and then construct the two expert inputs based on the input repetition value (shuffling part of the indices and keeping the rest unchanged).
In your implementation, you compute the indices first (based on input repetition) and then read the images accordingly.
The problem is that with the default data augmentation (e.g. random cropping or flipping, which your code also uses), identical indices do not guarantee identical images, because the augmentation is applied to each loaded image independently at random. In Google's implementation this issue does not arise, since each image is read and augmented once before being duplicated.
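The difference described above can be sketched with a toy example. The `augment` function below is a hypothetical stand-in for a random transform such as a horizontal flip; it is not code from either repository, just an illustration of why loading per branch versus loading once changes the result.

```python
import random

def augment(img, rng):
    """Toy stand-in for a random augmentation (here: a coin-flip reversal)."""
    return tuple(reversed(img)) if rng.random() < 0.5 else img

img = (1, 2, 3)          # hypothetical raw "image"
rng = random.Random(0)

# MixMo-style order (indices first): each branch loads and augments the
# image independently, so even identical indices can yield different inputs.
branch_a = augment(img, rng)
branch_b = augment(img, rng)

# MIMO-style order (images first): augment once, then duplicate for both
# branches, so the two inputs are guaranteed to be identical.
shared = augment(img, rng)
branch_c, branch_d = shared, shared

assert branch_c == branch_d          # always identical
# branch_a == branch_b holds only when the two random draws happen to agree
```

Under this sketch, the MIMO-style ordering preserves the invariant that input repetition = 1.0 feeds both experts the same tensor, while the index-first ordering only makes the two branches draws from the same augmentation distribution.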
I hope I've made my point clear.
I'm wondering how this affects performance, and which implementation is correct or more reasonable. I'd like to hear your opinion. Thanks in advance for a prompt reply!