In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

#from tqdm import tqdm_notebook as tqdm
#from tqdm import tqdm 
from tqdm.notebook import tqdm as tqdm

import cv2
import os
import re

import random

from PIL import Image

import albumentations as A
from albumentations.pytorch.transforms import ToTensorV2

import torch
import torchvision

import ast

from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

from torch.utils.data import DataLoader, Dataset
from torch.utils.data.sampler import SequentialSampler

# Deep Learning Detection and Classification of Wheat Heads to Estimate Yield

Please find notebooks for data EDA and for both models on my kaggle page: https://www.kaggle.com/ottiliemitchell/notebooks

## Background
Wheat is a staple cereal in diets around the world. In the 1920s wheat breeding was introduced to increase yields and to build resistance to disease (David et al., 2020).

Wheat's popularity as a staple food has meant that farmers globally have had to grow different varieties to suit their growing conditions. This is vital for food security, as it helps to ensure successful harvests. Breeding has significantly increased yields; however, the rate of increase has plateaued since the early 1990s (Brisson et al., 2010). Farmers now need to look at other ways to increase and estimate wheat yields.

Demand for high-yielding wheat has prompted wide research, including the creation of a large database with a set of images that will be studied in this investigation. The images show "wheat heads", the spikes at the top of the plant's stem. The wheat head holds the grain, which is harvested and processed into cereal for food consumption. It would benefit farmers if they were able to estimate the density and size of wheat heads in order to predict wheat yields.

With advances in GPU performance, deep learning has become more accessible to individuals as well as institutions for performing object detection on large datasets (Alom et al., 2019). Although deep learning can be an accurate way to analyse large datasets, it comes with limitations. As the images are taken of wheat fields outdoors, they are often distorted or blurry because the plants move in the wind. Wheat heads can also look different depending on the variety, growth stage and orientation. This can make it more challenging to identify individual wheat heads.

The Global Wheat Head dataset is led by 9 research institutions from across the world, with the joint ambition of accurately detecting wheat heads. These institutions include The University of Tokyo, Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Arvalis, ETHZ, University of Saskatchewan, University of Queensland, Nanjing Agricultural University and Rothamsted Research.

### Problem outline
The accurate classification and detection of wheat heads would give farmers an invaluable tool in the prediction of wheat yields. Using the Global Wheat Head data, this study will attempt to successfully identify the location of wheat heads within provided images.

## Methods

### Data
The data set was found on Kaggle (https://www.kaggle.com/c/global-wheat-detection).

The data includes two folders of images, train and test, and a metadata file (train.csv) that describes the bounding boxes.

There are 3422 images in the training folder and 10 images in the test folder. Each image is assigned a unique ID, has dimensions of 1024 x 1024 pixels and is made up of 3 colour channels.

The metadata is made up of 5 columns and 147794 rows. The columns are:
* image_id – the unique ID given to each image
* width – the width of each image
* height – the height of each image
* bbox – a bounding box in the form [xmin, ymin, width, height]
* source – where the image came from.

In [None]:
DATA_PATH = "../input/global-wheat-detection/"
TRAIN_DIR = "../input/global-wheat-detection/train"
TEST_DIR = "../input/global-wheat-detection/test"

In [None]:
df = pd.read_csv(os.path.join(DATA_PATH, "train.csv"))
df.head(5)



### Detection and Classification
Classification is simply categorising a stimulus into one of a finite set of classes. This involves recognising the dominant content in a scene, which is given the strongest confidence score, irrespective of its location, scaling or rotation.

An example of this would be recognising the dominant feature within an image, such as cat or dog. Classification does not consider where the dominant feature is.

Detection differs from classification, as it involves the classification and localisation of an object (Tuzel et al., 2006). For this study, we would like to be able to determine the wheat heads and their location, thus it is a detection problem.


### Data Preprocessing
The data needs to be formatted into appropriately pre-processed floating point tensors before being run through the model. 

Firstly, the bounding box coordinates must be extracted to create separate columns.
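
As a small illustration of the parsing step, the sketch below converts one bounding box string into four floats. The string value is made up for the example; real values come from the `bbox` column of train.csv.

```python
import ast

# Illustrative value only; real strings come from the "bbox" column of train.csv
bbox_string = "[834.0, 222.0, 56.0, 36.0]"

# literal_eval turns the text into a Python list without executing arbitrary code
x, y, w, h = (float(v) for v in ast.literal_eval(bbox_string))

print(x, y, w, h)
```

The function in the following cell applies the same idea to every row of the metadata.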

In [None]:
def extract_bbox(DataFrame):
    # Parse each "[xmin, ymin, width, height]" string once with ast.literal_eval.
    # The built-in float replaces np.float, which was removed in NumPy 1.24.
    parsed = [ast.literal_eval(i) for i in DataFrame["bbox"]]
    DataFrame["x"] = [float(b[0]) for b in parsed]
    DataFrame["y"] = [float(b[1]) for b in parsed]
    DataFrame["w"] = [float(b[2]) for b in parsed]
    DataFrame["h"] = [float(b[3]) for b in parsed]
    
extract_bbox(df)
df.head()

The images from the training data are then split into training and validation sets using an 80:20 split. Once the image IDs have been split, they are joined back to the metadata. This shows how many bounding boxes are in each of the two sets.
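
The split in the following cell takes the first 80% of image IDs in file order. If the rows happen to be grouped (for example by source), a shuffled split with a fixed seed is a common alternative. The sketch below is only an illustration of that idea and is not the split used in this study; `split_image_ids` and the example IDs are hypothetical.

```python
import random

def split_image_ids(image_ids, train_fraction=0.8, seed=42):
    """Shuffle the IDs reproducibly, then cut them into train and validation lists."""
    ids = list(image_ids)
    # A local random generator leaves the global random state untouched
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

# Hypothetical IDs for illustration
example_ids = [f"img_{i}" for i in range(10)]
train_ids_example, valid_ids_example = split_image_ids(example_ids)
print(len(train_ids_example), len(valid_ids_example))
```

Because the split is made on image IDs rather than on rows, all boxes from one image stay in the same set.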


In [None]:
train_split = 0.8
images_id   = df["image_id"].unique() 
train_ids   = images_id[:int(len(images_id)*train_split)]
valid_ids   = images_id[int(len(images_id)*train_split):]

print(f'Total Images Number: {len(images_id)}')
print(f'Number of Training images: {len(train_ids)}')
print(f'Number of Validation images: {len(valid_ids)}')

In [None]:
train_df = df[df["image_id"].isin(train_ids)]
valid_df = df[df["image_id"].isin(valid_ids)]

print(f'Shape of train_df: {train_df.shape}')
print(f'Shape of valid_df: {valid_df.shape}')


#### Image Augmentation
Image augmentation applies random transformations, such as flips, brightness changes and blurring, to the training images so that the model sees more varied examples. It is widely used in deep learning and computer vision to improve how well trained models generalise. In this study the Python library Albumentations was used for image augmentation. The model was first run without image augmentations; to improve it, the model was then run again using the Albumentations transforms "HorizontalFlip", "RandomBrightnessContrast" and "Blur".
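
For detection, a geometric augmentation has to move the bounding boxes together with the pixels. The sketch below shows the arithmetic for a horizontal flip on a single box in [xmin, ymin, xmax, ymax] form. `hflip_box` is a hypothetical helper written for this illustration; the library performs the equivalent step internally and may differ slightly in its pixel-indexing convention.

```python
def hflip_box(box, image_width):
    """Mirror a [xmin, ymin, xmax, ymax] box about the vertical centre line of the image."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, measured from the opposite side
    return [image_width - xmax, ymin, image_width - xmin, ymax]

box = [100, 200, 160, 240]      # illustrative box in a 1024-pixel-wide image
flipped = hflip_box(box, 1024)
print(flipped)                  # [864, 200, 924, 240]
```

The box keeps its width and height, and flipping twice returns the original coordinates.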

In [None]:
# Data Transform - Albumentation
def get_train_transform():
    return A.Compose([
        ToTensorV2(p=1.0)
    ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})

def get_valid_transform():
    return A.Compose([
        ToTensorV2(p=1.0)
    ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})

In [None]:
class WheatDataset(Dataset):
    def __init__(self, dataframe, image_dir, transform=None):
        super().__init__()
        self.dataframe = dataframe
        self.image_dir = image_dir
        self.transform = transform
        self.image_ids = dataframe["image_id"].unique()
        
    def __getitem__(self, idx):
        #Load images and details
        image_id = self.image_ids[idx]
        details = self.dataframe[self.dataframe["image_id"]==image_id]
        img_path = os.path.join(self.image_dir, image_id + ".jpg")  # use the directory passed to the constructor
        image = cv2.imread(img_path, cv2.IMREAD_COLOR)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
        image /= 255.0
        
        #Row of Dataframe of a particular index.
        boxes = details[['x', 'y', 'w', 'h']].values
        boxes[:, 2] = boxes[:, 0] + boxes[:, 2]
        boxes[:, 3] = boxes[:, 1] + boxes[:, 3]
        
        #To find area
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        
        #Convert it into tensor dataType
        area = torch.as_tensor(area, dtype=torch.float32)
        
        # there is only one class
        labels = torch.ones((details.shape[0],), dtype=torch.int64)
        
        # suppose all instances are not crowd
        iscrowd = torch.zeros((details.shape[0],), dtype=torch.int64)
        
        target = {}
        target['boxes'] = boxes
        target['labels'] = labels
        target['image_id'] = torch.tensor(idx) 
        target['area'] = area
        target['iscrowd'] = iscrowd
        
        if self.transform:
            sample = {
                'image': image,
                'bboxes': target['boxes'],
                'labels': labels
            }
            
            sample = self.transform(**sample)
            image = sample['image']
            # torchvision detection models expect boxes as a float tensor of shape [N, 4]
            target['boxes'] = torch.as_tensor(sample['bboxes'], dtype=torch.float32).reshape(-1, 4)
        
        return image, target     #, image_id
    
    def __len__(self) -> int:
        return len(self.image_ids)
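
The box conversion inside `__getitem__` can be checked in isolation. The metadata stores boxes as [xmin, ymin, width, height], whereas the `pascal_voc` format expects [xmin, ymin, xmax, ymax]. The helper names and the example values below are illustrative only.

```python
def xywh_to_xyxy(box):
    """Convert [xmin, ymin, width, height] to [xmin, ymin, xmax, ymax]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

def box_area(box_xyxy):
    """Area of a [xmin, ymin, xmax, ymax] box."""
    xmin, ymin, xmax, ymax = box_xyxy
    return (xmax - xmin) * (ymax - ymin)

box = [834.0, 222.0, 56.0, 36.0]    # illustrative values
converted = xywh_to_xyxy(box)
print(converted, box_area(converted))
```

The area computed after conversion equals width times height of the original box, which is a quick sanity check on the conversion.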

In [None]:
def collate_fn(batch):
    return tuple(zip(*batch))
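
A custom `collate_fn` is needed because each image has a different number of boxes, so the targets cannot be stacked into a single tensor. The sketch below uses plain Python stand-ins for the images and targets to show what `zip(*batch)` does.

```python
def collate_fn(batch):
    return tuple(zip(*batch))

# Stand-ins for (image, target) pairs; each image may have a different number of boxes
batch = [
    ("image_a", {"boxes": [[0, 0, 10, 10]]}),
    ("image_b", {"boxes": [[5, 5, 20, 20], [30, 30, 40, 40]]}),
]

images, targets = collate_fn(batch)
print(images)
print([len(t["boxes"]) for t in targets])
```

The list of pairs is regrouped into one tuple of images and one tuple of targets, which is the form the training loop iterates over.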

In [None]:
train_dataset = WheatDataset(train_df, TRAIN_DIR, get_train_transform())
valid_dataset = WheatDataset(valid_df, TRAIN_DIR, get_valid_transform())

In [None]:
indices = torch.randperm(len(train_dataset)).tolist()

train_data_loader = DataLoader(
    train_dataset,
    batch_size=8,
    shuffle=False,
    num_workers=4,
    collate_fn=collate_fn
)

valid_data_loader = DataLoader(
    valid_dataset,
    batch_size=8,
    shuffle=False,
    num_workers=4,
    collate_fn=collate_fn
)

In [None]:
def plot_images(n_num, random_selection=True):
    '''Plot n_num training images with their bounding boxes'''
    if random_selection:
        index = random.sample(range(len(train_dataset)), n_num)
    else:
        index = range(0, n_num)
    plt.figure(figsize=(15,15))
    fig_no = 1
    
    for i in index:
        images, targets = train_dataset[i]
        # The tensor is channels-first (C, H, W); matplotlib and OpenCV expect (H, W, C)
        sample = images.permute(1, 2, 0).numpy().copy()
        boxes = targets["boxes"].numpy().astype(np.int32)
    
        #Plot figure/image
        for box in boxes:
            cv2.rectangle(sample,(box[0], box[1]),(box[2], box[3]),(255,223,0), 2)
        # subplot needs integer grid sizes, so use integer division
        plt.subplot(n_num//2, n_num//2, fig_no)
        plt.imshow(sample)
        fig_no+=1

Now that the data is in the correct form, the data can be run through the model. Here is an example of four of the training images with bounding boxes.

In [None]:
#train images
plot_images(4)

### Model
The model used for this study was a Faster R-CNN with a ResNet-50 FPN backbone, pretrained on the COCO dataset and provided by torchvision. Faster R-CNN is a two-stage detector: a region proposal network first suggests candidate object regions, and a second stage then classifies each region and refines its bounding box.

The model was first trained for 20 epochs with no image augmentation. It was then improved by adding several image augmentations and reducing the number of epochs.

The pretrained model needs to be downloaded and its head replaced with a predictor for the two classes in this study (wheat head and background). The device, trainable parameters, optimiser and number of epochs are also defined.

In [None]:
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained = True)

num_classes = 2  # wheat + background
# get number of input features for the classifier
in_features = model.roi_heads.box_predictor.cls_score.in_features
# replace the pre-trained head with a new one
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)
# lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
lr_scheduler = None

num_epochs = 20
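
In the cell above the scheduler line is commented out and `lr_scheduler` is `None`, so the learning rate stays at 0.005 throughout training. If the commented-out `StepLR(step_size=3, gamma=0.1)` were enabled and stepped once per epoch, the learning rate would follow `base_lr * gamma ** (epoch // step_size)`. The sketch below reproduces that arithmetic in plain Python; `step_lr` is a hypothetical helper for illustration, not part of PyTorch.

```python
def step_lr(base_lr, epoch, step_size=3, gamma=0.1):
    """Learning rate after `epoch` completed epochs under a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)

# Print the rate at the start of each three-epoch block
for epoch in range(0, 9, 3):
    print(epoch, step_lr(0.005, epoch))
```

With these settings the rate would be divided by ten every three epochs.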

The model is now ready to run on the training and validation data.

In [None]:
import time

itr=1

total_train_loss = []
total_valid_loss = []

losses_value = 0
for epoch in range(num_epochs):
  
    start_time = time.time()
    train_loss = []
    model.train()
    
    #<-----------Training Loop---------------------------->
    pbar = tqdm(train_data_loader, desc = 'description')
    for images, targets in pbar:
        
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)
        
        losses = sum(loss for loss in loss_dict.values())
        losses_value = losses.item()
        train_loss.append(losses_value)        
        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
        
        pbar.set_description(f"Epoch: {epoch+1}, Batch: {itr}, loss: {losses_value}")
        itr+=1

    epoch_train_loss = np.mean(train_loss)
    total_train_loss.append(epoch_train_loss)

    #<---------------Validation Loop---------------------->
    with torch.no_grad():
        valid_loss = []

        for images, targets in valid_data_loader:
            images = list(image.to(device) for image in images)
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            
            
            # If you need validation losses
            model.train()
            # Calculate validation losses
            loss_dict = model(images, targets)
            losses = sum(loss for loss in loss_dict.values())
            loss_value = losses.item()
            valid_loss.append(loss_value)
            
    epoch_valid_loss = np.mean(valid_loss)
    total_valid_loss.append(epoch_valid_loss)    
    
    print(f"Epoch Completed: {epoch+1}/{num_epochs}, Time: {time.time()-start_time},\
    Train Loss: {epoch_train_loss}, Valid Loss: {epoch_valid_loss}")  

## Results 

The graph below shows that the training loss and validation loss diverge after approximately 10 epochs. This tells us that the model is overfitting and that 20 epochs was far too many. The training loss decreases steadily from 0.94 to 0.62, whilst the validation loss initially decreases from 0.86 to 0.81 but then increases to 1.03.
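
One simple way to choose the number of epochs from the recorded losses is to pick the epoch at which the validation loss is lowest. The sketch below assumes a list shaped like `total_valid_loss` from the training loop; `best_epoch` is a hypothetical helper and the numbers are made up purely for illustration, not results from this study.

```python
def best_epoch(valid_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    lowest = min(valid_losses)
    return valid_losses.index(lowest) + 1

# Made-up numbers purely for illustration, not results from this study
example_valid_loss = [0.90, 0.84, 0.82, 0.85, 0.91, 0.98]
print(best_epoch(example_valid_loss))  # 3
```

Training beyond that epoch lowers the training loss further but no longer improves performance on unseen images.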


In [None]:
import seaborn as sns
plt.figure(figsize=(8,5))
sns.set_style(style="whitegrid")
# Recent seaborn versions require x and y to be passed as keyword arguments
sns.lineplot(x=list(range(1, len(total_train_loss)+1)), y=total_train_loss, label="Training Loss")
sns.lineplot(x=list(range(1, len(total_train_loss)+1)), y=total_valid_loss, label="Valid Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()

As the first model was clearly overfitting the data, I reduced the number of epochs to 9 and introduced image augmentation to try to decrease the validation loss further.

In [None]:
DATA_PATH = "../input/global-wheat-detection/"
TRAIN_DIR = "../input/global-wheat-detection/train"
TEST_DIR = "../input/global-wheat-detection/test"
df = pd.read_csv(os.path.join(DATA_PATH, "train.csv"))

In [None]:
def extract_bbox(DataFrame):
    # The built-in float replaces np.float, which was removed in NumPy 1.24
    parsed = [ast.literal_eval(i) for i in DataFrame["bbox"]]
    DataFrame["x"] = [float(b[0]) for b in parsed]
    DataFrame["y"] = [float(b[1]) for b in parsed]
    DataFrame["w"] = [float(b[2]) for b in parsed]
    DataFrame["h"] = [float(b[3]) for b in parsed]
    
extract_bbox(df)

In [None]:
train_split = 0.8
images_id   = df["image_id"].unique() 
train_ids   = images_id[:int(len(images_id)*train_split)]
valid_ids   = images_id[int(len(images_id)*train_split):]

train_df = df[df["image_id"].isin(train_ids)]
valid_df = df[df["image_id"].isin(valid_ids)]

In [None]:
# Data Transform - Albumentation
def get_train_transform():
    return A.Compose([
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.2),
        A.Blur(p=1),
        ToTensorV2(p=1.0)
    ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})

def get_valid_transform():
    return A.Compose([
        ToTensorV2(p=1.0)
    ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})

In [None]:
class WheatDataset(Dataset):
    def __init__(self, dataframe, image_dir, transform=None):
        super().__init__()
        self.dataframe = dataframe
        self.image_dir = image_dir
        self.transform = transform
        self.image_ids = dataframe["image_id"].unique()
        
    def __getitem__(self, idx):
        #Load images and details
        image_id = self.image_ids[idx]
        details = self.dataframe[self.dataframe["image_id"]==image_id]
        img_path = os.path.join(self.image_dir, image_id + ".jpg")  # use the directory passed to the constructor
        image = cv2.imread(img_path, cv2.IMREAD_COLOR)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
        image /= 255.0
        
        #Row of Dataframe of a particular index.
        boxes = details[['x', 'y', 'w', 'h']].values
        boxes[:, 2] = boxes[:, 0] + boxes[:, 2]
        boxes[:, 3] = boxes[:, 1] + boxes[:, 3]
        
        #To find area
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
               
        #Convert it into tensor dataType
        area = torch.as_tensor(area, dtype=torch.float32)
        
        # there is only one class
        labels = torch.ones((details.shape[0],), dtype=torch.int64)
        
        # suppose all instances are not crowd
        iscrowd = torch.zeros((details.shape[0],), dtype=torch.int64)
        
        target = {}
        target['boxes'] = boxes
        target['labels'] = labels
        target['image_id'] = torch.tensor(idx)
        target['area'] = area
        target['iscrowd'] = iscrowd
        
        if self.transform:
            sample = {
                'image': image,
                'bboxes': target['boxes'],
                'labels': labels
            }
            
            sample = self.transform(**sample)
            image = sample['image']
            # torchvision detection models expect boxes as a float tensor of shape [N, 4]
            target['boxes'] = torch.as_tensor(sample['bboxes'], dtype=torch.float32).reshape(-1, 4)
        
        return image, target     #, image_id
    
    def __len__(self) -> int:
        return len(self.image_ids)

In [None]:
def collate_fn(batch):
    return tuple(zip(*batch))

train_dataset = WheatDataset(train_df, TRAIN_DIR, get_train_transform())
valid_dataset = WheatDataset(valid_df, TRAIN_DIR, get_valid_transform())

In [None]:
indices = torch.randperm(len(train_dataset)).tolist()

train_data_loader = DataLoader(
    train_dataset,
    batch_size=8,
    shuffle=False,
    num_workers=4,
    collate_fn=collate_fn
)

valid_data_loader = DataLoader(
    valid_dataset,
    batch_size=8,
    shuffle=False,
    num_workers=4,
    collate_fn=collate_fn
)

In [None]:
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained = True)
num_classes = 2  # wheat + background
# get number of input features for the classifier
in_features = model.roi_heads.box_predictor.cls_score.in_features
# replace the pre-trained head with a new one
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)
# lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
lr_scheduler = None

num_epochs = 9

In [None]:
import time

itr=1

total_train_loss = []
total_valid_loss = []

losses_value = 0
for epoch in range(num_epochs):
  
    start_time = time.time()
    train_loss = []
    model.train()
    
    #<-----------Training Loop---------------------------->
    pbar = tqdm(train_data_loader, desc = 'description')
    for images, targets in pbar:
        
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)
        
        losses = sum(loss for loss in loss_dict.values())
        losses_value = losses.item()
        train_loss.append(losses_value)        
        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
        
        pbar.set_description(f"Epoch: {epoch+1}, Batch: {itr}, loss: {losses_value}")
        itr+=1

    epoch_train_loss = np.mean(train_loss)
    total_train_loss.append(epoch_train_loss)
    
    
    #<---------------Validation Loop---------------------->
    with torch.no_grad():
        valid_loss = []

        for images, targets in valid_data_loader:
            images = list(image.to(device) for image in images)
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            
            
            # If you need validation losses
            model.train()
            # Calculate validation losses
            loss_dict = model(images, targets)
            losses = sum(loss for loss in loss_dict.values())
            loss_value = losses.item()
            valid_loss.append(loss_value)
            
    epoch_valid_loss = np.mean(valid_loss)
    total_valid_loss.append(epoch_valid_loss)
    
    print(f"Epoch Completed: {epoch+1}/{num_epochs}, Time: {time.time()-start_time},\
    Train Loss: {epoch_train_loss}, Valid Loss: {epoch_valid_loss}")   

In the graph below you can see that, with the augmentations added and the number of epochs reduced, the validation loss of the model ends just below 0.825, compared with 1.03 at the end of the first run.

In [None]:
import seaborn as sns
plt.figure(figsize=(8,5))
sns.set_style(style="whitegrid")
# Recent seaborn versions require x and y to be passed as keyword arguments
sns.lineplot(x=list(range(1, len(total_train_loss)+1)), y=total_train_loss, label="Training Loss")
sns.lineplot(x=list(range(1, len(total_train_loss)+1)), y=total_valid_loss, label="Valid Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()

In [None]:
# Data Transform - Test Albumentation
def get_test_transform():
    return A.Compose([
        ToTensorV2(p=1.0)
    ])

In [None]:
class WheatTestDatasetTest(Dataset):
    def __init__(self, dataframe, image_dir, transform=None):
        super().__init__()
        self.dataframe = dataframe
        self.image_dir = image_dir
        self.transform = transform
        self.image_ids = dataframe["image_id"].unique()
        
    def __getitem__(self, idx):
        image_id = self.image_ids[idx]
        details = self.dataframe[self.dataframe["image_id"]==image_id]
        img_path = os.path.join(self.image_dir, image_id + ".jpg")  # use the directory passed to the constructor
        image = cv2.imread(img_path, cv2.IMREAD_COLOR)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
        image /= 255.0
        
        if self.transform:
            sample = {
                'image': image,
            }
            
            sample = self.transform(**sample)
            image = sample['image'] 
            
        return image, image_id
    
    def __len__(self) -> int:
        return len(self.image_ids)

In [None]:
df_test = pd.read_csv(os.path.join(DATA_PATH, "sample_submission.csv"))

In [None]:
def collate_fn(batch):
    return tuple(zip(*batch))

test_dataset = WheatTestDatasetTest(df_test, TEST_DIR, get_test_transform())
print(f"Length of test dataset: {len(test_dataset)}")

test_data_loader = DataLoader(
    test_dataset,
    batch_size=4,
    shuffle=False,
    num_workers=4,
    drop_last=False,
    collate_fn=collate_fn
)

In [None]:
model.eval()
x = model.to(device)

In [None]:
detection_threshold = 0.5
output_list = []

# Inference only, so gradient tracking is switched off to save memory
with torch.no_grad():
    for images, image_ids in test_data_loader:

        images = list(image.to(device) for image in images)
        outputs = model(images)

        for i, image in enumerate(images):
            # Store the raw predictions in [xmin, ymin, xmax, ymax] form;
            # the plotting functions apply detection_threshold themselves
            output_dict = {
                'image_id': image_ids[i],
                'boxes': outputs[i]['boxes'].cpu().numpy(),
                'scores': outputs[i]['scores'].cpu().numpy()
            }
            output_list.append(output_dict)
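
The model returns boxes as [xmin, ymin, xmax, ymax] together with a confidence score for each. The sketch below shows, on made-up predictions, how a score threshold can be applied and how the kept boxes can be converted back to the [xmin, ymin, width, height] form used in the training metadata. `filter_and_convert` is a hypothetical helper for illustration.

```python
def filter_and_convert(boxes_xyxy, scores, threshold=0.5):
    """Keep boxes scoring at least `threshold` and return them as [xmin, ymin, width, height]."""
    kept = []
    for (xmin, ymin, xmax, ymax), score in zip(boxes_xyxy, scores):
        if score >= threshold:
            kept.append([xmin, ymin, xmax - xmin, ymax - ymin])
    return kept

# Made-up predictions for illustration
boxes = [[10, 20, 60, 80], [100, 100, 150, 130], [5, 5, 15, 15]]
scores = [0.92, 0.48, 0.50]
print(filter_and_convert(boxes, scores))  # [[10, 20, 50, 60], [5, 5, 10, 10]]
```

A higher threshold keeps fewer but more confident detections, so the choice of 0.5 trades missed wheat heads against false detections.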

In [None]:
## Plot image prediction

def predict_images(n_num, random_selection=True):
    '''Plot N Number of Predicted Images'''
    if random_selection:
        index = random.sample(range(0, len(df_test["image_id"].unique())), n_num)
    else:
        index = range(0, n_num)
        
    plt.figure(figsize=(15,15))
    fig_no = 1
    
    for i in index:
        images, image_id = test_dataset.__getitem__(i)
        sample = images.permute(1,2,0).cpu().numpy()
        boxes = output_list[i]['boxes']
        scores = output_list[i]['scores']
        boxes = boxes[scores >= detection_threshold].astype(np.int32)
        #Plot figure/image
        for box in boxes:
            cv2.rectangle(sample,(box[0], box[1]),(box[2], box[3]),(255,223,0), 2)
        plt.subplot(n_num//2, n_num//2, fig_no)  # subplot needs integer grid sizes
        plt.imshow(sample)
        fig_no+=1

In [None]:
def print_image(i):
    '''Plot a single predicted test image'''
    plt.figure(figsize=(6,6))
    images, image_id = test_dataset[i]
    sample = images.permute(1,2,0).cpu().numpy()
    boxes = output_list[i]['boxes']
    scores = output_list[i]['scores']
    boxes = boxes[scores >= detection_threshold].astype(np.int32)
    #Plot figure/image
    for box in boxes:
        cv2.rectangle(sample,(box[0], box[1]),(box[2], box[3]),(255,223,0), 2)
    plt.imshow(sample)

The test images were run through the improved model to see how well it could detect the wheat heads. As no bounding box annotations are provided for the test data, the study was unable to produce a numeric accuracy; however, the images below show that wheat heads were successfully detected.

In [None]:
predict_images(4, True)
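
The validation images do have annotated boxes, so a numeric measure could in principle be computed there. A common building block for such a measure is the intersection over union (IoU) between a predicted box and an annotated box. The sketch below is an illustration of that calculation only; it was not part of this study, and `iou` is a hypothetical helper.

```python
def iou(box_a, box_b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    # Corners of the overlapping rectangle, if there is one
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10 x 10 boxes that overlap by half
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap; a prediction is usually counted as correct when its IoU with an annotated box exceeds a chosen cut-off.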

## Discussion
As wheat is such an important global food source, the ability to successfully identify wheat heads from a set of images can help farmers predict their yield and so support food security.

The outcome of this study shows that the use of augmentations is essential to improving the accuracy of a model. Although the model was successful in detecting wheat heads, further analysis of the wheat heads is needed to support farmers in predicting wheat yields.

The model has the potential to be developed through further analysis of images taken throughout the growing period, to recognise the development of the crop through its growth stages. This would give farmers more information to support decisions on requirements such as irrigation and fertilisation. Accurate images could also indicate the health of the crop through the detection of disease and pest infestation.

The global agri-tech sector is experiencing rapid growth, with government support and investment, to deal with the challenges facing the global agricultural sector. The use of technology is becoming crucial in the industry, to feed a growing global population under added constraints on land and farming inputs. With farmers under pressure to produce higher yields, continuing with traditional farming methods could put the security of the food chain at risk.

Many farmers have turned to intensive farming strategies to meet this demand. Although this has increased productivity, it comes at a cost to the environment (Mennerat et al., 2010). Agri-tech aims to improve yields, profitability and efficiency whilst reducing the impact on the environment and building sustainability and resilience across crop cultivation.

Farm work has historically been manual; however, more and more farmers have begun to consult data about essential variables such as soil, crops, irrigation, fertilisers and weather (Goedde et al., 2020). Adopting these technologies could free up significant time for farmers to pursue work outside their core industry. Farmers must embrace the change to a digital future to overcome increased demand and disruptive forces.

Drones use computer vision to analyse field conditions and give farmers the ability to survey crops more quickly over large areas and to relay real-time data (Goedde et al., 2020). The collected data could be used with machine learning to quickly analyse crop yields and to aid farmers' decisions.

With advances in the agri-tech sector, the application of machine learning to imagery will aid yield prediction and should be considered an essential tool for farmers in the future.

## References 
Alom, M.Z., Taha, T.M., Yakopcic, C. et al. (2019). A state-of-the-art survey on deep learning theory and architectures. *Electronics*, 8:3, pg. 292.

Brisson, N.P., Gate, P., Gouache, D., Charmet, G., Oury, F.X. & Huard, F. (2010). Why are wheat yields stagnating in Europe? A comprehensive data analysis for France. *Field Crops Research*, 119:1, pg. 201-212.

David, E., Madec, S., Sadeghi-Tehran, P., Aasen, H., Zheng, B., Liu, S. (2020). Global wheat head detection (GWHD) dataset: A large and diverse dataset of high-resolution RGB-labelled images to develop and benchmark wheat head detection methods. *Plant Phenomics*, 2020, pg. 12.

Goedde, L., Katz, J., Menard, A. & Revellat, J. (2020). Agriculture’s connected future: How technology can yield new growth. Available at: https://www.mckinsey.com/industries/agriculture/our-insights/agricultures-connected-future-how-technology-can-yield-new-growth (Accessed: 6th Jan. 2021)

Kaggle. (2020). Global wheat detection. Available at: https://www.kaggle.com/c/global-wheat-detection (Accessed: 3rd Dec. 2020)

Mennerat, A., Nilsen, F., Ebert, D. et al. (2010) Intensive Farming: Evolutionary Implications for Parasites and Pathogens. *Evol Biol* 37, pg. 59–67.

Tuzel, O., Porikli, F. & Meer, P. (2006). Region covariance: A fast descriptor for detection and classification. *Lecture Notes in Computer Science*, 3952, pg. 589-600.