PyTorch-based YoloV5 solution - Global Wheat Detection
A complete PyTorch pipeline of training, cross-validation and inference notebooks used in the Kaggle competition Global Wheat Detection (May-Aug 2020).
Table of Contents
- Brief overview of the competition images
- Notebooks description
- How to use
Brief overview of the competition images
A brief content description is provided here; for detailed descriptions, check the notebook comments.
- Handled noisy labels (boxes that are too big or too small, etc.)
- Stratified 5-fold split based on source
- Albumentations - RandomSizedCrop, HueSaturationValue, RandomBrightnessContrast, RandomRotate90, Flip, Cutout, ShiftScaleRotate
- Mixup - https://arxiv.org/pdf/1710.09412.pdf
2 images are mixed
- Mosaic - https://arxiv.org/pdf/2004.12432.pdf
4 images are cropped and stitched together. By default, YoloV5 stitches the images onto a canvas whose size is a multiple of 32 pixels.
[Figure: mosaic canvas for batch size = 4]
[Figure: mosaic canvas for batch size = 2]
- Default YoloV5 configuration
- YoloV5 uses TensorBoard by default during training; the best model is selected using a "fitness" criterion based on the following parameters:
Some of my TensorBoard training logs can be found at TensorBoard.dev
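The source-stratified 5-fold split listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `assign_folds`, the `image_sources` mapping and the seed are hypothetical, and the notebook may well rely on a library splitter such as scikit-learn's `StratifiedKFold` instead.

```python
import random
from collections import defaultdict


def assign_folds(image_sources, n_folds=5, seed=42):
    """Assign a fold index to every image so that each source is spread
    as evenly as possible over the folds.

    image_sources: dict mapping image_id -> source name (hypothetical input).
    Returns a dict mapping image_id -> fold index in [0, n_folds).
    """
    # Group image ids by the source they come from.
    by_source = defaultdict(list)
    for image_id, source in image_sources.items():
        by_source[source].append(image_id)

    rng = random.Random(seed)
    folds = {}
    offset = 0  # carried across sources so leftovers do not pile up in fold 0
    for source in sorted(by_source):
        ids = sorted(by_source[source])  # sort first so the shuffle is reproducible
        rng.shuffle(ids)
        for i, image_id in enumerate(ids):
            folds[image_id] = (offset + i) % n_folds
        offset += len(ids)
    return folds
```

Within each source, consecutive shuffled images go to consecutive folds, so the per-source counts of any two folds differ by at most one.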
[CV] Cross Validation notebook
- Same as in [TRAIN]
- Support for ensembling of multiple folds of the same model
- Non-Maximum Suppression (NMS) is used to ensemble final predicted boxes
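As an illustration of ensembling folds with NMS, the sketch below pools the boxes predicted by each fold for one image and applies plain greedy NMS. It is written in pure Python for readability; the helper names (`iou`, `nms`, `ensemble_folds`), the `(x1, y1, x2, y2)` box format and the 0.5 IoU threshold are assumptions made for this example, and the notebook itself may use a library routine such as `torchvision.ops.nms`.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


def ensemble_folds(fold_preds, iou_thr=0.5):
    """fold_preds: one list of (box, score) pairs per fold, all for the same image.
    Returns the (box, score) pairs that survive NMS over the pooled predictions."""
    boxes = [box for preds in fold_preds for box, _ in preds]
    scores = [score for preds in fold_preds for _, score in preds]
    return [(boxes[i], scores[i]) for i in nms(boxes, scores, iou_thr)]
```

NMS keeps the highest-scoring box of each overlapping cluster and discards the rest, so boxes from different folds are not averaged.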
Automated Threshold Calculations:
- The confidence threshold is calculated against the ground truth labels
- The optimal final CV score (metric: IoU) is obtained with this threshold
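One way to picture the automated threshold calculation is a simple sweep: try several confidence thresholds, score the surviving boxes against the ground truth, and keep the best one. The sketch below is an assumption-laden illustration rather than the notebook's method. It scores each image as TP / (TP + FP + FN) at a single IoU threshold of 0.5, and the names `image_score` and `best_conf_threshold` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def image_score(pred_boxes, gt_boxes, iou_thr=0.5):
    """TP / (TP + FP + FN) for one image. Predictions are matched greedily,
    in the order given, to the best still-unmatched ground truth box."""
    matched = set()
    tp = 0
    for p in pred_boxes:
        best_j, best_iou = None, iou_thr
        for j, g in enumerate(gt_boxes):
            if j in matched:
                continue
            value = iou(p, g)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j is not None:
            matched.add(best_j)
            tp += 1
    fp = len(pred_boxes) - tp
    fn = len(gt_boxes) - tp
    total = tp + fp + fn
    return tp / total if total else 1.0


def best_conf_threshold(images, candidates):
    """images: non-empty list of (preds, gt_boxes), preds being (box, score) pairs.
    Returns (threshold, mean score) for the best candidate threshold."""
    best_thr, best_score = None, -1.0
    for thr in candidates:
        scores = []
        for preds, gt_boxes in images:
            ranked = sorted(preds, key=lambda p: p[1], reverse=True)
            kept = [box for box, score in ranked if score >= thr]
            scores.append(image_score(kept, gt_boxes))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_thr, best_score = thr, mean_score
    return best_thr, best_score
```

Because the sweep needs ground truth boxes, it can only be run on validation folds; the chosen threshold is then reused where no labels are available.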
[INFERENCE] Submission notebook
Test Time Augmentations:
- Same as in [CV]
- Multi-Round Pseudo Labelling pipeline based on https://arxiv.org/pdf/1908.02983.pdf
- Implemented Cross Validation calculations at the end of each round to decide the best thresholds for Pseudo Labels in the next round
- Training pipeline same as in [TRAIN]
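The multi-round pseudo labelling loop described above can be sketched as follows. This is only a schematic under stated assumptions: `train_fn`, `predict_fn` and `pick_threshold_fn` are hypothetical callables standing in for the training pipeline, the detector's inference and the end-of-round cross-validation, and retraining on the original data plus the pseudo-labelled test images is an assumption about how the rounds are combined.

```python
def pseudo_label_rounds(train_set, test_images, train_fn, predict_fn,
                        pick_threshold_fn, rounds=2, init_thr=0.5):
    """Run several rounds of pseudo labelling.

    train_set: list of (image, boxes) pairs with real labels.
    test_images: list of unlabelled images.
    train_fn(dataset) -> model                  (hypothetical)
    predict_fn(model, image) -> [(box, score)]  (hypothetical)
    pick_threshold_fn(model) -> float           (hypothetical CV-based choice)
    """
    model = train_fn(train_set)
    thr = init_thr
    for _ in range(rounds):
        # Keep only confident predictions as pseudo labels for this round.
        pseudo = []
        for image in test_images:
            boxes = [box for box, score in predict_fn(model, image) if score >= thr]
            if boxes:
                pseudo.append((image, boxes))
        # Retrain on real labels plus the current pseudo labels.
        model = train_fn(train_set + pseudo)
        # Cross-validation at the end of the round sets the next threshold.
        thr = pick_threshold_fn(model)
    return model, thr
```

Passing the three steps in as callables keeps the loop independent of any particular detector, so the same structure applies whichever model produces the pseudo labels.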
Post-Processing and Result:
- Final predictions are made with an ensemble of TTA combinations
How to use
Just change the directories according to your environment.
In case of any deprecation issues or warnings in the future, use the modules available in the YoloV5-Mixup folder.
Acknowledging shortcomings is the first step toward progress, so here are the possible improvements that could have made my model better:
- Ensemble multi-model/fold predictions for pseudo labels; currently a single model is used to make them. This would also have made the model more robust to noise.
- GAN or Style Transfer could've been used to produce more similar labeled images from the current train images for better generalization.
- Relabeling of noisy labels using multi-folds.