Example on using CutMix Augmentation for Image Classification #425
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with @googlebot I signed it!
ℹ️ Googlers: Go here for more info.
@googlebot I signed it!
@fchollet could you please check and review my PR, if you have time? Thank you!
Thanks for the PR! I'll get to reviewing it soon. I've been pretty swamped.
It's totally understandable!! @fchollet
I'll check out your awesome example soon, thanks for your 👍 effort
Thank you! @8bitmp3
Thanks for the PR!
Great stuff @sayannath I've added some suggestions, PTAL 👍
Thanks, I learned a lot from the paper and your example.
By the way, hope the first paragraph refactoring suggestion is OK. The first sentence explains the why and the second - what CutMix does (using the words from the original paper by Yun et al.):
Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
Thanks for the update! It's looking great. I think we can improve the choice of model.
Is the PR ready for review at this time?
Almost there. I am still improving a few things.
Ok, please ping me when you want me to look at it!
Are you still working on this?
@fchollet Yes. I tried MobileNetV2, but it is not giving me the desired results.
Ok, resnet seems fine then. Any other changes you're still working on?
I am training it for some more epochs. I will commit the changes today and notify you.
@fchollet you can look into it now. I have incorporated all the changes you asked for.
Thanks for the update! It looks great. Have you tried using larger input images?
Note that I pushed some copyedits, please pull them first.
examples/vision/cutmix.py
AUTO = tf.data.experimental.AUTOTUNE
BATCH_SIZE = 32
IMG_SHAPE = 32
Call this IMG_SIZE. IMG_SHAPE would be like (32, 32).
Ya Sure! Will do that
def training_model():
    return resnet_v20((32, 32, 3), 20)
Note: in practice you get much better accuracy on CIFAR-10 by upscaling the input images (perhaps this is why you saw better accuracy with ResNet-20...), i.e. by using (None, None, 3) as the input shape and starting the model with a Resizing(150, 150) layer. Have you tried that?
No, I haven't tried that.
Incorporated this change. @fchollet
Looks good, please add the generated files.
@fchollet I have added the generated files.
Thanks for the great contribution! 👍
Augmentation techniques such as MixUp and CutOut are widely used, but each has drawbacks.
In CutOut, a square region of the image is removed and filled with black pixels or Gaussian noise. This discards informative pixels during training, which can limit what a CNN learns from each image.
In MixUp, the generated images look unnatural, which can confuse the model, especially in localization tasks.
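To make the two drawbacks concrete, here is a minimal sketch of both operations. It is not code from this PR: the helper names `cutout` and `mixup` are hypothetical, and plain nested lists of floats stand in for image tensors so the snippet runs without TensorFlow.

```python
def cutout(image, top, left, size, fill=0.0):
    """Return a copy of `image` with a size x size square overwritten by `fill`.

    `image` is a list of rows of floats (an assumed stand-in for a tensor).
    The pixels inside the square are lost, which is the drawback of CutOut.
    """
    out = [row[:] for row in image]
    for r in range(top, min(top + size, len(image))):
        for c in range(left, min(left + size, len(image[0]))):
            out[r][c] = fill
    return out


def mixup(image_a, label_a, image_b, label_b, lam):
    """Blend two images and their one-hot labels with weight `lam` in [0, 1].

    Every output pixel is a weighted average of both inputs, so the result
    is a see-through overlay rather than a natural-looking image.
    """
    image = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return image, label


# Usage: a 4x4 "white" image and a 4x4 "black" image.
white = [[1.0] * 4 for _ in range(4)]
black = [[0.0] * 4 for _ in range(4)]
holed = cutout(white, top=1, left=1, size=2)
blended, blended_label = mixup(white, [1.0, 0.0], black, [0.0, 1.0], lam=0.25)
```

In `holed`, four of the sixteen pixels carry no information. In `blended`, no pixel matches either source image.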
This example shows how to use the CutMix augmentation technique to overcome these issues. I use the CIFAR-10 dataset, and the model trained with CutMix yields better results than the baseline trained without it.
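As a rough illustration of the idea, the sketch below follows the CutMix recipe described by Yun et al.: sample a mixing ratio from a Beta distribution, cut a box whose area is about `1 - lam` of the image, paste the matching patch from a second image, and mix the labels by the area that was actually replaced. It is not the code in `examples/vision/cutmix.py`. The function names, the `alpha` default and the nested-list image format are assumptions made so the snippet runs with only the standard library.

```python
import math
import random


def sample_box(height, width, lam, rng):
    """Pick a box covering roughly (1 - lam) of the image, clipped to its borders."""
    cut_h = int(height * math.sqrt(1.0 - lam))
    cut_w = int(width * math.sqrt(1.0 - lam))
    cy = rng.randrange(height)  # box centre, sampled uniformly
    cx = rng.randrange(width)
    top = max(cy - cut_h // 2, 0)
    bottom = min(cy + cut_h // 2, height)
    left = max(cx - cut_w // 2, 0)
    right = min(cx + cut_w // 2, width)
    return top, bottom, left, right


def cutmix(image_a, label_a, image_b, label_b, alpha=1.0, rng=random):
    """Paste a random patch of `image_b` into `image_a` and mix the labels.

    Images are same-sized lists of rows (assumed stand-ins for tensors);
    labels are one-hot lists. `alpha` is the Beta distribution parameter.
    """
    height, width = len(image_a), len(image_a[0])
    lam = rng.betavariate(alpha, alpha)
    top, bottom, left, right = sample_box(height, width, lam, rng)

    mixed = [row[:] for row in image_a]
    for r in range(top, bottom):
        mixed[r][left:right] = image_b[r][left:right]

    # Clipping can shrink the box, so recompute lam from the pasted area.
    lam = 1.0 - (bottom - top) * (right - left) / (height * width)
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, label


# Usage: mix an all-ones image (class 0) with an all-zeros image (class 1).
ones = [[1.0] * 8 for _ in range(8)]
zeros = [[0.0] * 8 for _ in range(8)]
mixed, label = cutmix(ones, [1.0, 0.0], zeros, [0.0, 1.0], rng=random.Random(0))
```

Unlike CutOut, the removed region is filled with real pixels from another training image, and unlike MixUp, every pixel comes from exactly one source image. The label weights equal the share of pixels contributed by each image.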