PyTorch-based YoloV5 solution - Global Wheat Detection

A complete PyTorch pipeline of training, cross-validation and inference notebooks used in the Kaggle competition Global Wheat Detection (May-Aug 2020)


Brief overview of the competition images

Wheat head images came from various sources.
(A few labeled example images, with blue bounding boxes.)

Notebooks description

A brief description of each notebook is provided here; for details, see the comments in the notebooks.

[TRAIN] notebook

  1. Pre-Processing:
    - Handled the noisy labels (too big/small boxes etc.)
    - Stratified 5 fold split based on source
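
The pre-processing step above can be sketched as follows. This is an illustrative simplification, not the notebook's exact code: the column names (`image_id`, `source`, `w`, `h`) and the box-area thresholds are assumptions.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

def make_folds(df, n_splits=5, seed=42):
    # Drop implausibly small or large boxes (thresholds are illustrative)
    area = df["w"] * df["h"]
    df = df[(area > 50) & (area < 200_000)].copy()

    # One row per image, stratified by acquisition source
    images = df.groupby("image_id")["source"].first().reset_index()
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    images["fold"] = -1
    for fold, (_, val_idx) in enumerate(skf.split(images, images["source"])):
        images.loc[val_idx, "fold"] = fold
    return df.merge(images[["image_id", "fold"]], on="image_id")
```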

  2. Augmentations:
    - Albumentations - RandomSizedCrop, HueSaturationValue, RandomBrightnessContrast, RandomRotate90, Flip, Cutout, ShiftScaleRotate
    - Mixup - two images are blended together
    - Mosaic - four images are cropped and stitched together. YoloV5 by default stitches images onto a canvas whose size is a multiple of 32 pixels. (Canvas illustrations for batch size = 4 and batch size = 2.)
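
The two custom augmentations can be sketched roughly as below. This is a hypothetical simplification: real mosaic also randomly crops, rescales and remaps the boxes, and mixup variants differ in how they weight labels.

```python
import numpy as np

def mixup(img1, boxes1, img2, boxes2, alpha=8.0):
    # Blend two images pixel-wise and keep both images' boxes
    # (a common mixup variant for detection).
    lam = np.random.beta(alpha, alpha)
    mixed = (lam * img1.astype(np.float32)
             + (1 - lam) * img2.astype(np.float32)).astype(img1.dtype)
    return mixed, np.concatenate([boxes1, boxes2], axis=0)

def mosaic4(imgs):
    # Stitch four equally sized HxWxC images into one 2Hx2W canvas,
    # a simplified version of YoloV5's 4-image mosaic.
    top = np.concatenate([imgs[0], imgs[1]], axis=1)
    bottom = np.concatenate([imgs[2], imgs[3]], axis=1)
    return np.concatenate([top, bottom], axis=0)
```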

  3. Configurations:
    - Default YoloV5 configuration

  4. TensorBoard Analysis:
    - YoloV5 uses TensorBoard during training by default; the best model is selected by a "fitness" criterion, a weighted combination of precision, recall, mAP@0.5 and mAP@0.5:0.95
    Some of my TensorBoard training logs can be found at
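
For reference, YoloV5 computes fitness as a weighted sum over [precision, recall, mAP@0.5, mAP@0.5:0.95]; the weights shown below are the repository's defaults around the time of the competition, so treat them as an assumption if you are on a newer version.

```python
import numpy as np

def fitness(metrics):
    # metrics = [precision, recall, mAP@0.5, mAP@0.5:0.95]
    # Default YoloV5 weighting emphasises mAP@0.5:0.95.
    w = np.array([0.0, 0.0, 0.1, 0.9])
    return float((np.asarray(metrics) * w).sum())
```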

[CV] Cross Validation notebook

  1. Pre-Processing:
    - Same as in [TRAIN]

  2. Test Time Augmentations:
    - Flips and Rotate
    - Color shift
    - Scale (scale down with padding)
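
Flip-style TTA requires mapping the predicted boxes back to the original frame. A minimal sketch for horizontal flip, where `predict_fn` is a hypothetical stand-in for the detector's inference call returning `(x1, y1, x2, y2)` boxes:

```python
import numpy as np

def hflip_tta(img, predict_fn):
    # Run the model on the horizontally flipped image, then mirror the
    # predicted boxes back into the original image's coordinate frame.
    W = img.shape[1]
    boxes = predict_fn(img[:, ::-1])
    out = boxes.copy()
    out[:, [0, 2]] = W - boxes[:, [2, 0]]  # swap and mirror x1/x2
    return out
```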

  3. Ensemble:
    - Support for ensembling of multiple folds of the same model
    - Non-Maximum Suppression (NMS) is used to ensemble the final predicted boxes
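
A minimal greedy NMS over the pooled fold predictions might look like this; it is an illustrative implementation, not the notebook's exact routine.

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    # Greedy NMS over [x1, y1, x2, y2] boxes pooled from all folds:
    # keep the highest-scoring box, drop boxes that overlap it too much.
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[rest, 2] - boxes[rest, 0])
                 * (boxes[rest, 3] - boxes[rest, 1]))
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thr]
    return keep
```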

  4. Automated Threshold Calculations:
    - The confidence threshold is calculated against the ground truth labels
    - The optimal final CV score (IoU-based metric) is obtained with this threshold
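
The threshold calculation can be sketched as a simple sweep. Here `scores_fn` is a hypothetical stand-in that evaluates the CV metric when only predictions at or above a given confidence are kept; the threshold grid is illustrative.

```python
import numpy as np

def best_threshold(scores_fn, thresholds=np.arange(0.1, 0.9, 0.05)):
    # Evaluate the CV metric at each candidate confidence cut-off
    # and return the threshold that maximises it.
    vals = [scores_fn(t) for t in thresholds]
    best = int(np.argmax(vals))
    return float(thresholds[best]), float(vals[best])
```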

[INFERENCE] Submission notebook

  1. Test Time Augmentations:
    - Same as in [CV]

  2. Pseudo Labelling:
    - Multi-Round Pseudo Labelling pipeline based on
    - Cross-validation calculations run at the end of each round to choose the best pseudo-label thresholds for the next round
    - Training pipeline same as in [TRAIN]
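
The round structure can be outlined as below. `train_fn`, `predict_fn` and `pick_threshold_fn` are hypothetical stand-ins for the notebooks' own training, inference and CV-threshold routines; the outline only shows the control flow.

```python
def pseudo_label_rounds(train_data, test_data, train_fn, predict_fn,
                        pick_threshold_fn, n_rounds=2):
    # Train, predict on the test set, keep high-confidence predictions
    # as pseudo labels, then retrain on the enlarged set each round.
    model = train_fn(train_data)
    for _ in range(n_rounds):
        preds = predict_fn(model, test_data)   # (sample, box, confidence)
        thr = pick_threshold_fn(model)         # tuned via cross-validation
        pseudo = [(x, b) for x, b, c in preds if c >= thr]
        model = train_fn(train_data + pseudo)  # retrain on the union
    return model
```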

  3. Post-Processing and Result:
    - Final predictions made with ensembled combinations of TTA

How to use

Just change the directories according to your environment.

Google Colab deployed versions are available for
[TRAIN] Open In Colab
[CV] Open In Colab

In case of deprecation issues or warnings in the future, use the modules available in the YoloV5-Mixup folder.


Acknowledging shortcomings is the first step toward progress, so here are improvements that could have made the model better:

  • Ensembling multi-model/multi-fold predictions for pseudo labels; currently a single model generates them. This would also have made the model more robust to noise.
  • A GAN or style transfer could have been used to generate additional images similar to the labeled ones from the current training images, for better generalization.
  • Relabeling of noisy labels using multi-folds.

