# Global Wheat Detection

In this project, I take part in the ["Global Wheat Detection"](https://www.kaggle.com/c/global-wheat-detection) Kaggle competition. The goal is to detect the heads of wheat plants in pictures as accurately as possible using object detection. My first attempt uses yolov5.

## Setup
This notebook is meant to run both on my machine and on Kaggle.
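
One way to support both environments is to pick the data directory at runtime. The sketch below assumes that Kaggle mounts the competition data under `/kaggle/input/global-wheat-detection` and that the local copy sits in the working directory. Both paths and the helper name `resolve_data_dir` are assumptions for illustration; the cells in this notebook simply use relative paths.

```python
from pathlib import Path


def resolve_data_dir(kaggle_dir="/kaggle/input/global-wheat-detection", local_dir="."):
    """Return the Kaggle input directory if it exists, otherwise the local one."""
    kaggle_path = Path(kaggle_dir)
    if kaggle_path.is_dir():
        return kaggle_path
    return Path(local_dir)


DATA_DIR = resolve_data_dir()
TRAIN_CSV = DATA_DIR / "train.csv"   # label table read later in the notebook
TRAIN_IMAGES = DATA_DIR / "train"    # folder with the training pictures
```

With this in place, `pd.read_csv(TRAIN_CSV)` and `os.listdir(TRAIN_IMAGES)` would work unchanged in both environments.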

### Imports


In [1]:
import pandas as pd
import os
import numpy as np

### Transforming the Labels for yolov5

yolov5 expects the labels in the following form:

* one txt file per picture
* one line per object (box)
* each line has the format: class x_center y_center width height (separated by spaces)
* box coordinates have to be normalized (0-1)
* class numbers are zero-indexed
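
To make the target format concrete, here is a minimal sketch that converts one box into a yolov5 label line. It assumes the box is given in pixels as [x_min, y_min, width, height]. The helper name `to_yolo_line` and the six-decimal formatting are choices made for this illustration only.

```python
def to_yolo_line(bbox, img_width, img_height, class_id=0):
    """Convert a pixel box [x_min, y_min, width, height] to a yolov5 label line."""
    x_min, y_min, w_box, h_box = bbox
    # The box center is the top-left corner plus half the box size
    x_center = (x_min + 0.5 * w_box) / img_width
    y_center = (y_min + 0.5 * h_box) / img_height
    width = w_box / img_width
    height = h_box / img_height
    # Label lines are space-separated: class x_center y_center width height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


line = to_yolo_line([834.0, 222.0, 56.0, 36.0], 1024, 1024)
print(line)
```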

In [2]:
labels = pd.read_csv('train.csv')

In [3]:
labels

Unnamed: 0,image_id,width,height,bbox,source
0,b6ab77fd7,1024,1024,"[834.0, 222.0, 56.0, 36.0]",usask_1
1,b6ab77fd7,1024,1024,"[226.0, 548.0, 130.0, 58.0]",usask_1
2,b6ab77fd7,1024,1024,"[377.0, 504.0, 74.0, 160.0]",usask_1
3,b6ab77fd7,1024,1024,"[834.0, 95.0, 109.0, 107.0]",usask_1
4,b6ab77fd7,1024,1024,"[26.0, 144.0, 124.0, 117.0]",usask_1
...,...,...,...,...,...
147788,5e0747034,1024,1024,"[64.0, 619.0, 84.0, 95.0]",arvalis_2
147789,5e0747034,1024,1024,"[292.0, 549.0, 107.0, 82.0]",arvalis_2
147790,5e0747034,1024,1024,"[134.0, 228.0, 141.0, 71.0]",arvalis_2
147791,5e0747034,1024,1024,"[430.0, 13.0, 184.0, 79.0]",arvalis_2


The dataset contains the columns:
* `image_id`: the unique identifier of each picture. There is one row per box, so a picture can occur multiple times
* `width` & `height`: width and height of the picture in pixels
* `bbox`: coordinates of the box. They are in pixels and follow the format \[x_min, y_min, width, height\]
* `source`: source of the picture. The pictures were taken by different institutions across the world. The motivation for the competition is that trained models tend to work well in one part of the world but generalize poorly to others.

In [4]:
num_labels = np.shape(labels.image_id.unique())[0]
num_labels

3373

This means 3373 pictures contain at least one box. The total number of pictures should be slightly higher, because the dataset description states that several pictures do not contain any boxes.

In [5]:
num_pics = len(os.listdir("./train"))
num_pics

3422

In [6]:
num_pics - num_labels

49

This means there are 49 pictures without any boxes. This is expected and does not matter for yolov5: it does not expect a .txt file for pictures without boxes, so I do not have to create one.
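
To see which pictures these are instead of only counting them, the file names can be compared with the labelled ids. This is a minimal sketch with toy data; `images_without_boxes` is a hypothetical helper, and it assumes that each file name is the image id plus an extension.

```python
from pathlib import Path


def images_without_boxes(image_files, labelled_ids):
    """Return the sorted ids of pictures that have no row in the label table."""
    # Path(...).stem strips the extension, e.g. "b6ab77fd7.jpg" -> "b6ab77fd7"
    file_ids = {Path(name).stem for name in image_files}
    return sorted(file_ids - set(labelled_ids))


# Toy data; in the notebook the inputs would be os.listdir("./train")
# and labels.image_id.unique()
example_files = ["aaa.jpg", "bbb.jpg", "ccc.jpg"]
example_ids = ["aaa", "ccc"]
print(images_without_boxes(example_files, example_ids))
```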

In [7]:
labels.bbox

0          [834.0, 222.0, 56.0, 36.0]
1         [226.0, 548.0, 130.0, 58.0]
2         [377.0, 504.0, 74.0, 160.0]
3         [834.0, 95.0, 109.0, 107.0]
4         [26.0, 144.0, 124.0, 117.0]
                     ...             
147788      [64.0, 619.0, 84.0, 95.0]
147789    [292.0, 549.0, 107.0, 82.0]
147790    [134.0, 228.0, 141.0, 71.0]
147791     [430.0, 13.0, 184.0, 79.0]
147792     [875.0, 740.0, 94.0, 61.0]
Name: bbox, Length: 147793, dtype: object

Now the individual pixel values need to be extracted. The boxes are stored as strings rather than as lists, so I use the pandas.Series.str.extract() method to pull out the four numbers needed for the txt files.

In [8]:
pixels = labels.bbox.str.extract(r'(\d+\.\d+)\,\s(\d+\.\d+)\,\s(\d+\.\d+)\,\s(\d+\.\d+)')
pixels.columns = ['x_min', 'y_min', 'w_box', 'h_box']

In [9]:
labels = pd.concat([labels,pixels], axis = 1)

In [10]:
labels.head(5)

Unnamed: 0,image_id,width,height,bbox,source,x_min,y_min,w_box,h_box
0,b6ab77fd7,1024,1024,"[834.0, 222.0, 56.0, 36.0]",usask_1,834.0,222.0,56.0,36.0
1,b6ab77fd7,1024,1024,"[226.0, 548.0, 130.0, 58.0]",usask_1,226.0,548.0,130.0,58.0
2,b6ab77fd7,1024,1024,"[377.0, 504.0, 74.0, 160.0]",usask_1,377.0,504.0,74.0,160.0
3,b6ab77fd7,1024,1024,"[834.0, 95.0, 109.0, 107.0]",usask_1,834.0,95.0,109.0,107.0
4,b6ab77fd7,1024,1024,"[26.0, 144.0, 124.0, 117.0]",usask_1,26.0,144.0,124.0,117.0


In [11]:
labels.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 147793 entries, 0 to 147792
Data columns (total 9 columns):
 #   Column    Non-Null Count   Dtype 
---  ------    --------------   ----- 
 0   image_id  147793 non-null  object
 1   width     147793 non-null  int64 
 2   height    147793 non-null  int64 
 3   bbox      147793 non-null  object
 4   source    147793 non-null  object
 5   x_min     126158 non-null  object
 6   y_min     126158 non-null  object
 7   w_box     126158 non-null  object
 8   h_box     126158 non-null  object
dtypes: int64(2), object(7)
memory usage: 10.1+ MB
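
Only 126158 of the 147793 rows have values in the extracted columns, so the pattern did not match 21635 rows. One possible cause is that some coordinates are written without a decimal part (for example `0` instead of `0.0`), which `\d+\.\d+` cannot match. This is an assumption and has not been verified against the data here. A more tolerant sketch parses each string as a Python list instead; `parse_bbox` is a hypothetical helper name.

```python
import ast


def parse_bbox(text):
    """Parse a string such as "[834.0, 222.0, 56.0, 36.0]" into four floats."""
    values = ast.literal_eval(text)  # safely evaluates the list literal
    if len(values) != 4:
        raise ValueError(f"expected 4 values, got {len(values)}: {text!r}")
    return [float(v) for v in values]


print(parse_bbox("[834.0, 222.0, 56.0, 36.0]"))
# Made-up example with integer values, which the strict regex would not match
print(parse_bbox("[0, 654, 37.0, 111.0]"))
```

Applied to the table, this could look like `pd.DataFrame(labels.bbox.map(parse_bbox).tolist(), columns=['x_min', 'y_min', 'w_box', 'h_box'], index=labels.index)`, which also yields float columns directly.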


The `x_min`, `y_min`, `w_box` and `h_box` columns are still strings (dtype object), while `width` and `height` are integers. All six can easily be converted to floats with the pandas.Series.astype('float64') method.

In [12]:
# convert all numeric columns to float64 in one step
num_cols = ['width', 'height', 'x_min', 'y_min', 'w_box', 'h_box']
labels[num_cols] = labels[num_cols].astype('float64')

In [13]:
labels.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 147793 entries, 0 to 147792
Data columns (total 9 columns):
 #   Column    Non-Null Count   Dtype  
---  ------    --------------   -----  
 0   image_id  147793 non-null  object 
 1   width     147793 non-null  float64
 2   height    147793 non-null  float64
 3   bbox      147793 non-null  object 
 4   source    147793 non-null  object 
 5   x_min     126158 non-null  float64
 6   y_min     126158 non-null  float64
 7   w_box     126158 non-null  float64
 8   h_box     126158 non-null  float64
dtypes: float64(6), object(3)
memory usage: 10.1+ MB


Now to compute the normalized coordinates that yolov5 needs. As a first check, the normalized y coordinate of the box centers:

In [14]:
(labels.y_min + 0.5 * labels.h_box) / labels.height

0         0.234375
1         0.563477
2         0.570312
3         0.145020
4         0.197754
            ...   
147788    0.650879
147789    0.576172
147790    0.257324
147791    0.051270
147792    0.752441
Length: 147793, dtype: float64

In [15]:
labels['object_class'] = 0  # all objects are wheat heads, so every box gets yolov5 class 0
labels['x_center_norm'] = (labels.x_min + 0.5 * labels.w_box) / labels.width
labels['y_center_norm'] = (labels.y_min + 0.5 * labels.h_box) / labels.height
labels['width_norm'] = labels.w_box/labels.width
labels['height_norm'] = labels.h_box/labels.height

In [16]:
labels.head(5)

Unnamed: 0,image_id,width,height,bbox,source,x_min,y_min,w_box,h_box,object_class,x_center_norm,y_center_norm,width_norm,height_norm
0,b6ab77fd7,1024.0,1024.0,"[834.0, 222.0, 56.0, 36.0]",usask_1,834.0,222.0,56.0,36.0,0,0.841797,0.234375,0.054688,0.035156
1,b6ab77fd7,1024.0,1024.0,"[226.0, 548.0, 130.0, 58.0]",usask_1,226.0,548.0,130.0,58.0,0,0.28418,0.563477,0.126953,0.056641
2,b6ab77fd7,1024.0,1024.0,"[377.0, 504.0, 74.0, 160.0]",usask_1,377.0,504.0,74.0,160.0,0,0.404297,0.570312,0.072266,0.15625
3,b6ab77fd7,1024.0,1024.0,"[834.0, 95.0, 109.0, 107.0]",usask_1,834.0,95.0,109.0,107.0,0,0.867676,0.14502,0.106445,0.104492
4,b6ab77fd7,1024.0,1024.0,"[26.0, 144.0, 124.0, 117.0]",usask_1,26.0,144.0,124.0,117.0,0,0.085938,0.197754,0.121094,0.114258
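
A quick way to sanity-check the normalized columns is to convert a row back to pixels and compare it with the original box. The sketch below does this for the first row of the table above; `yolo_to_pixels` is a hypothetical helper written for this check.

```python
def yolo_to_pixels(x_center, y_center, width, height, img_width, img_height):
    """Invert the normalization: return [x_min, y_min, w_box, h_box] in pixels."""
    w_box = width * img_width
    h_box = height * img_height
    x_min = x_center * img_width - 0.5 * w_box
    y_min = y_center * img_height - 0.5 * h_box
    return [x_min, y_min, w_box, h_box]


# First row of the table: bbox [834.0, 222.0, 56.0, 36.0] in a 1024 x 1024 picture.
# The inputs are the rounded values printed above, so a small tolerance is needed.
restored = yolo_to_pixels(0.841797, 0.234375, 0.054688, 0.035156, 1024, 1024)
expected = [834.0, 222.0, 56.0, 36.0]
assert all(abs(r - e) < 0.01 for r, e in zip(restored, expected))
```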


Now the only thing left to do is to create the label .txt files.

In [17]:
# unique()[0] is a single id string, so looping over it iterates over its characters
# and matches no rows; slicing with [:1] loops over the first image id instead
for item in labels.image_id.unique()[:1]:
    print(labels[labels.image_id == item])

In [18]:
id_nr = labels.image_id.unique()[0]
# yolov5 expects space-separated values: class x_center y_center width height
os.makedirs('./labels', exist_ok=True)
cols = ['object_class', 'x_center_norm', 'y_center_norm', 'width_norm', 'height_norm']
labels.loc[labels.image_id == id_nr, cols].to_csv('./labels/' + id_nr + '.txt', sep=' ', index=False, header=False)
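
The cell above writes the label file for a single picture. A minimal sketch for writing all of them could look like the following. It assumes the rows are available as (image_id, class, x_center, y_center, width, height) tuples and that space-separated values are wanted. `write_label_files` is a hypothetical helper, and skipping rows with missing coordinates is a choice made for this sketch.

```python
import math
from collections import defaultdict
from pathlib import Path


def write_label_files(rows, out_dir):
    """Write one yolov5 .txt file per image id; return the number of files written."""
    lines_per_image = defaultdict(list)
    for image_id, class_id, x_c, y_c, w, h in rows:
        values = (x_c, y_c, w, h)
        # Skip rows whose coordinates are missing (NaN)
        if any(math.isnan(v) for v in values):
            continue
        coords = " ".join(f"{v:.6f}" for v in values)
        lines_per_image[image_id].append(f"{int(class_id)} {coords}")

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for image_id, lines in lines_per_image.items():
        # One line per box, one file per picture
        (out_path / f"{image_id}.txt").write_text("\n".join(lines) + "\n")
    return len(lines_per_image)
```

With the table from this notebook, the rows could come from `labels[['image_id', 'object_class', 'x_center_norm', 'y_center_norm', 'width_norm', 'height_norm']].itertuples(index=False, name=None)`, and `out_dir` would be `'./labels'`.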