# Introduction

In this competition, you’ll detect wheat heads in outdoor images of wheat plants, drawn from wheat datasets around the globe. If successful, researchers will be able to accurately estimate the density and size of wheat heads in different varieties. With improved detection, farmers can better assess their crops, ultimately bringing cereal, toast, and other favorite dishes to your table.

To get large and accurate data about wheat fields worldwide, plant scientists use image detection of "wheat heads"—spikes atop the plant containing grain. These images are used to estimate the density and size of wheat heads in different varieties. 

Models developed for wheat phenotyping need to generalize across different growing environments. Current detection methods involve one- and two-stage detectors (YOLOv3 and Faster R-CNN), but even when they are trained on a large dataset, a bias toward the training region remains.

![https://storage.googleapis.com/kaggle-media/competitions/UofS-Wheat/descriptionimage.png](https://storage.googleapis.com/kaggle-media/competitions/UofS-Wheat/descriptionimage.png)


***N.B. Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.***
[https://en.wikipedia.org/wiki/Object_detection](https://en.wikipedia.org/wiki/Object_detection)

***Example of Object detection:***
![https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg/1024px-Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg](https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg/1024px-Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg)

# Data

The data is images of wheat fields, with bounding boxes for each identified wheat head. **Not all images include wheat heads / bounding boxes.** The images were recorded in many locations around the world.

The CSV data is simple - the image ID matches up with the filename of a given image, and the width and height of the image are included, along with a bounding box (see below). There is a row in train.csv for each bounding box. Not all images have bounding boxes.
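
Because there is one row per bounding box, the number of wheat heads in an image is simply the number of rows sharing its `image_id`. A minimal sketch on a made-up frame (the IDs and box values below are illustrative, not taken from train.csv):

```python
import pandas as pd

# Toy stand-in for train.csv: one row per bounding box (values are made up).
toy = pd.DataFrame({
    "image_id": ["img_a", "img_a", "img_a", "img_b"],
    "bbox": [
        "[10.0, 20.0, 30.0, 40.0]",
        "[50.0, 60.0, 20.0, 20.0]",
        "[5.0, 5.0, 10.0, 10.0]",
        "[0.0, 0.0, 15.0, 25.0]",
    ],
})

# Counting rows per image gives the number of boxes in each image.
boxes_per_image = toy.groupby("image_id").size()
print(boxes_per_image)
```

Images without any bounding box never appear in this count, because they have no rows in the CSV at all.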

**Files**

    train.csv - the training data
    sample_submission.csv - a sample submission file in the correct format
    train.zip - training images
    test.zip - test images

1. train.zip consists of 3422 images
2. test.zip consists of 10 images
3. train.csv covers 3373 images (about 98% of the training images) and contains 117761 unique bounding boxes

**Columns**

    image_id - the unique image ID
    width, height - the width and height of the images
    bbox - a bounding box, formatted as a Python-style list of [xmin, ymin, width, height]
    
1. All images are 1024 × 1024 pixels
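
Since `bbox` is stored as a string, it has to be parsed before use. The sketch below shows one way to do that and to convert the box to corner coordinates. The helper names `parse_bbox` and `to_corners` are hypothetical, and the sample string is only illustrative:

```python
import ast

def parse_bbox(bbox_str):
    """Parse '[xmin, ymin, width, height]' into a tuple of four floats."""
    xmin, ymin, width, height = ast.literal_eval(bbox_str)
    return float(xmin), float(ymin), float(width), float(height)

def to_corners(bbox):
    """Convert (xmin, ymin, width, height) to (xmin, ymin, xmax, ymax)."""
    xmin, ymin, width, height = bbox
    return xmin, ymin, xmin + width, ymin + height

box = parse_bbox("[834.0, 222.0, 56.0, 36.0]")
corners = to_corners(box)
print(box, corners)
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for strings read from a file.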

# Evaluation Metric
This competition is evaluated on the mean average precision at different intersection over union (IoU) thresholds. The IoU of a set of predicted bounding boxes and ground truth bounding boxes is calculated as:
$$ IoU(A,B) = \frac{A \cap B}{A \cup B}. $$

![https://storage.googleapis.com/kaggle-media/competitions/rsna/IoU.jpg](https://storage.googleapis.com/kaggle-media/competitions/rsna/IoU.jpg)
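
For two axis-aligned boxes, the IoU can be computed directly from their coordinates. The sketch below assumes boxes in the `[xmin, ymin, width, height]` format used by this dataset and treats coordinates as continuous values; `box_iou` is a hypothetical helper name, and the official scorer may handle pixel edges differently.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (xmin, ymin, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0

print(box_iou((0, 0, 10, 10), (5, 0, 10, 10)))
```

In this example the boxes share a 5 × 10 region, so the intersection is 50 and the union is 100 + 100 - 50 = 150, giving an IoU of 1/3.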

The metric sweeps over a range of IoU thresholds, at each point calculating an average precision value. The threshold values range from 0.5 to 0.75 with a step size of 0.05. In other words, at a threshold of 0.5, a predicted object is considered a "hit" if its intersection over union with a ground truth object is greater than 0.5.

At each threshold value $t$, a precision value is calculated based on the number of true positives (TP), false negatives (FN), and false positives (FP) resulting from comparing the predicted object to all ground truth objects:
$$ \frac{TP(t)}{TP(t) + FP(t) + FN(t)}. $$

The average precision of a single image is calculated as the mean of the above precision values at each IoU threshold:
$$ \frac{1}{|thresholds|} \sum_t \frac{TP(t)}{TP(t) + FP(t) + FN(t)}.$$

Lastly, the score returned by the competition metric is the mean taken over the individual average precisions of each image in the test dataset.
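
The per-image score described above can be sketched as follows. This is a simplified illustration, not the official scorer: it assumes the predictions are already sorted by confidence, matches each prediction greedily to the best unmatched ground truth box, and returns 0.0 when an image has neither predictions nor ground truth. The names `box_iou`, `precision_at`, and `image_average_precision` are hypothetical.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (xmin, ymin, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0

def precision_at(threshold, preds, truths):
    """TP / (TP + FP + FN) for one image at one IoU threshold."""
    matched = set()
    tp = 0
    for pred in preds:  # assumed to be sorted by descending confidence
        best_iou, best_idx = 0.0, None
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            iou = box_iou(pred, truth)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        # A prediction is a hit only if its best IoU exceeds the threshold.
        if best_idx is not None and best_iou > threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(preds) - tp
    fn = len(truths) - tp
    denom = tp + fp + fn
    # Assumption: an image with no predictions and no ground truth scores 0.0 here.
    return tp / denom if denom else 0.0

def image_average_precision(preds, truths,
                            thresholds=(0.5, 0.55, 0.6, 0.65, 0.7, 0.75)):
    """Mean of the per-threshold precision values for one image."""
    return sum(precision_at(t, preds, truths) for t in thresholds) / len(thresholds)

truths = [(0, 0, 10, 10), (20, 20, 10, 10)]
preds = [(0, 0, 10, 10), (22, 20, 10, 10)]
score = image_average_precision(preds, truths)
print(score)
```

Here the second prediction overlaps its ground truth box with an IoU of 2/3, so it counts as a hit at thresholds 0.5 to 0.65 (precision 1.0) and as a miss at 0.7 and 0.75 (precision 1/3), which gives an image score of 7/9.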

# Importing Libraries

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)


import matplotlib.pyplot as plt

import cv2

import os

# Reading the Dataset

In [None]:
BASE_DIR = '/kaggle/input/global-wheat-detection/'
train_data = pd.read_csv(BASE_DIR+"train.csv")
submission_file = pd.read_csv(BASE_DIR+"sample_submission.csv")
train_images_dir = BASE_DIR + "train/"

# Basic Statistics

In [None]:
train_data.head()

In [None]:
submission_file.head()

In [None]:
print("The training data has {} rows and {} columns".format(train_data.shape[0],train_data.shape[1]))

In [None]:
train_data.isna().sum()

In [None]:
train_data.nunique()

In [None]:
train_data.source.unique()

In [None]:
train_data.groupby("source")["image_id"].nunique()

The submission file doesn't have a `source` column, so it will be difficult to use `source` as a feature at prediction time.

In [None]:
all_images = set(x.split(".")[0] for x in os.listdir(train_images_dir))
images_with_bb = set(train_data.image_id.unique())
# Set difference: images on disk that have no rows (bounding boxes) in train.csv
images_without_bb = all_images - images_with_bb

In [None]:
df_images_without_bb=pd.DataFrame(images_without_bb,columns = ["image_id"])
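
Finding the images that have no bounding boxes is a set operation on image IDs. The sketch below uses made-up IDs to show why a plain difference (`-`) is the safer operator here: a symmetric difference (`^`) gives the same answer only when every ID in the CSV also exists on disk.

```python
# Hypothetical IDs: files found on disk vs. IDs that appear in the CSV.
all_ids = {"img_a", "img_b", "img_c", "img_d"}
ids_with_boxes = {"img_a", "img_c"}

# Difference: IDs on disk that never appear in the CSV.
ids_without_boxes = all_ids - ids_with_boxes

# If the CSV contained an ID with no file on disk, the two operators would disagree.
csv_ids_with_stray = {"img_a", "img_z"}
difference = all_ids - csv_ids_with_stray
symmetric = all_ids ^ csv_ids_with_stray  # also contains the stray "img_z"
print(ids_without_boxes, difference, symmetric)
```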

In [None]:
train_data.head()

In [None]:
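# Split the "[xmin, ymin, width, height]" string into four columns.
# Note: this overwrites the image-level width/height columns with the bbox width/height.
train_data[["x_start","y_start","width","height"]] = pd.DataFrame([i[1:-1].split(',') for i in train_data.bbox.to_list()],index=train_data.index)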

In [None]:
# Parse the strings as floats first ("834.0" cannot be cast to int directly), then truncate to int
bbox_cols = ["x_start","y_start","width","height"]
train_data[bbox_cols] = train_data[bbox_cols].astype(float).astype(int)
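
As an alternative sketch, the bounding box columns can be parsed with `ast.literal_eval` and stored under separate names so that the image-level `width` and `height` columns are kept. The column names `bbox_width` and `bbox_height` and the toy rows are assumptions made for illustration only.

```python
import ast
import pandas as pd

# Toy stand-in for train.csv (values are made up).
toy = pd.DataFrame({
    "image_id": ["img_a", "img_a"],
    "width": [1024, 1024],
    "height": [1024, 1024],
    "bbox": ["[834.0, 222.0, 56.0, 36.0]", "[226.0, 548.0, 130.0, 58.0]"],
})

# Parse each string into a list of four numbers, then spread them into columns.
bbox_cols = ["x_start", "y_start", "bbox_width", "bbox_height"]
parsed = pd.DataFrame(
    toy["bbox"].map(ast.literal_eval).tolist(),
    columns=bbox_cols,
    index=toy.index,
)
toy = toy.join(parsed.astype(int))
print(toy)
```

Note that the plotting code in this notebook reads the box size from `width` and `height`, so it would need the new column names if this variant were used.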

In [None]:
train_data.head()

# Looking at the images

In [None]:
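def plot_images(image_list,rows,cols,title):
    """Show the given image IDs in a rows x cols grid."""
    # squeeze=False always returns a 2D array of axes, even for a 1x1 grid
    fig,ax = plt.subplots(rows,cols,figsize = (5*cols,5*rows),squeeze=False)
    ax = ax.flatten()
    for i, image_id in enumerate(image_list):
        image = cv2.imread(train_images_dir+'{}.jpg'.format(image_id))
        # OpenCV loads images as BGR; matplotlib expects RGB
        image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
        ax[i].imshow(image)
        ax[i].set_axis_off()
        ax[i].set_title(image_id)
    plt.suptitle(title)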

In [None]:
plot_images(train_data[train_data.source == 'arvalis_1'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(train_data[train_data.source == 'arvalis_2'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(train_data[train_data.source == 'arvalis_3'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(train_data[train_data.source == 'ethz_1'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(train_data[train_data.source == 'inrae_1'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(train_data[train_data.source == 'rres_1'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(train_data[train_data.source == 'usask_1'].sample(5)["image_id"].values,1,5,"Images with wheat")

In [None]:
plot_images(df_images_without_bb.sample(10)["image_id"].values,2,5,"Images without wheat")

In [None]:
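def plot_images_with_bb(imageId):
    """Draw every bounding box of one training image and display it."""
    plt.rcParams["figure.figsize"] = (10,10)
    bboxes = train_data[train_data.image_id == imageId]
    image = cv2.imread(train_images_dir+'{}.jpg'.format(imageId))
    image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
    for _, row in bboxes.iterrows():
        # cv2.rectangle takes the top-left and bottom-right corners as plain ints
        top_left = (int(row["x_start"]),int(row["y_start"]))
        bottom_right = (int(row["x_start"]+row["width"]),int(row["y_start"]+row["height"]))
        image = cv2.rectangle(image,top_left,bottom_right,(255,0,0),5)
    # Show the image once, after all boxes have been drawn
    plt.imshow(image)
    plt.axis("off")
    plt.title(imageId)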

In [None]:
plot_images_with_bb("2ae9c276f")
