# Global Wheat Detection

### An international computer science competition to count wheat ears more effectively, using image analysis

<div align="center"><img width="512" height="116" src="http://www.global-wheat.com/wp-content/uploads/2019/11/temporary_gwd_logo-2.png"/></div>


### The Problem 

For several years, agricultural research has been using sensors to observe plants at key moments in their development. However, some important plant traits are still measured manually. One example of this is the manual counting of wheat ears from digital images – a long and tedious job. Factors that make it difficult to manually count wheat ears from digital images include the possibility of overlapping ears, variations in appearance according to maturity and genotype, the presence or absence of barbs, head orientation and even wind.  
 

### The Need 

There is a need for a robust and accurate computer model that is capable of counting wheat ears from digital images. This model will benefit phenotyping research and help producers around the world assess ear density, health and maturity more effectively. Some deep learning work has already been done, but it has produced too little data to train a generic model.


Refer to [this page](http://www.global-wheat.com/) for more details.

<div align="center"><img src="http://www.global-wheat.com/wp-content/uploads/2020/04/ILLU_01_EN.jpg" width="800"/></div>

# Let's Code!

We can list the contents of the input directory with the `os` module.

In [None]:
import os
print(os.listdir('../input/global-wheat-detection/'))

There are two folders, `train` and `test`, and two CSV files: one for the training annotations and one sample submission. Now we explore the file `train.csv`.

In [None]:
import pandas as pd
train_csv = pd.read_csv('../input/global-wheat-detection/train.csv')
train_csv.head()

In [None]:
train_csv.tail()

How large is the table? We can check its shape.

In [None]:
train_csv.shape

Are there empty values in the file? Let's check.

In [None]:
train_csv.isnull().any().any()
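
`isnull().any().any()` collapses the whole table into a single boolean. When something is missing, a per-column count is usually more informative. The sketch below runs on a small made-up frame (`toy` is a stand-in, not the competition data) that only borrows the column names of `train.csv`:

```python
import pandas as pd

# Toy stand-in for train.csv (made-up values, one missing source on purpose).
toy = pd.DataFrame({
    "image_id": ["a1", "a1", "b2"],
    "width": [1024, 1024, 1024],
    "height": [1024, 1024, 1024],
    "bbox": ["[0, 0, 10, 10]", "[5, 5, 20, 30]", "[1, 2, 3, 4]"],
    "source": ["ethz_1", None, "usask_1"],
})

# Number of missing values in each column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Single yes/no answer for the whole table.
has_missing = bool(toy.isnull().any().any())
print(has_missing)
```

On the real data, `train_csv.isnull().sum()` would give the same per-column view.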

Good, that will make the data easier to process. Now we get the full summary.

In [None]:
train_csv.info()

### For width and height

In [None]:
print(train_csv.width.unique())
print(train_csv.height.unique())

### For source

In [None]:
unique = train_csv['source'].unique()
print(unique)

In [None]:
train_csv.image_id.value_counts()

In [None]:
train_csv.groupby("source").image_id.count()

In [None]:
nunique = train_csv.image_id.nunique()
print(nunique)
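
Since the table holds one `bbox` per row, the number of rows sharing an `image_id` can be read as the number of annotated boxes in that image. Note that `groupby("source").image_id.count()` counts rows (boxes) per source, not images. A minimal sketch on made-up data (`toy` and its values are hypothetical) shows both views:

```python
import pandas as pd

# Toy stand-in: one row per bounding box (made-up ids and sources).
toy = pd.DataFrame({
    "image_id": ["a1", "a1", "a1", "b2", "c3", "c3"],
    "source": ["ethz_1", "ethz_1", "ethz_1", "usask_1", "rres_1", "rres_1"],
})

# One row per box, so the group size is the number of boxes per image.
boxes_per_image = toy.groupby("image_id").size()
print(boxes_per_image.describe())

# Number of distinct images contributed by each source.
images_per_source = toy.groupby("source")["image_id"].nunique()
print(images_per_source)
```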

### For image_id

In [None]:
train_csv.groupby("source").image_id.value_counts()

We can visualize this.

In [None]:
#Visualization
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
plt.rcParams['figure.figsize'] = (15, 10)
# Pass the column through x= so one bar is drawn per source
sns.countplot(x='source', data=train_csv, palette = 'hsv')
plt.title('Distribution of Source', fontsize = 20)
plt.show()

In [None]:
# Take the labels from value_counts() so they always match the slice order
source_counts = train_csv['source'].value_counts()
plt.rcParams['figure.figsize'] = (7, 7)
plt.pie(source_counts, labels=source_counts.index, explode = [0.0,0.0,0.05,0.05,0.2,0.2,0.2], autopct = '%.2f%%')
plt.title('Source', fontsize = 21)
plt.axis('off')
plt.legend(loc='lower center', bbox_to_anchor=(1, 1))
plt.show()

### For bbox

In [None]:
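from ast import literal_eval

def get_bbox_area(bbox):
    # bbox is stored as a string: "[xmin, ymin, width, height]"
    bbox = literal_eval(bbox)
    return bbox[2] * bbox[3]

train_csv['bbox_area'] = train_csv['bbox'].apply(get_bbox_area)
# Histogram of the areas themselves (not of how often each area occurs)
train_csv['bbox_area'].hist(bins=33)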
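
The `bbox` column is stored as a string, which is why `literal_eval` is needed before any arithmetic. Many detection libraries expect corner coordinates rather than width and height, so a small conversion helper can be handy. The sketch below assumes each string holds `[xmin, ymin, width, height]`; `parse_bbox` and `to_corners` are hypothetical helper names, and the example value is made up:

```python
from ast import literal_eval

def parse_bbox(bbox_str):
    """Turn a string such as '[834.0, 222.0, 56.0, 36.0]' into four floats."""
    x, y, w, h = literal_eval(bbox_str)
    return float(x), float(y), float(w), float(h)

def to_corners(bbox_str):
    """Convert (xmin, ymin, width, height) to (xmin, ymin, xmax, ymax)."""
    x, y, w, h = parse_bbox(bbox_str)
    return x, y, x + w, y + h

example = "[834.0, 222.0, 56.0, 36.0]"  # made-up example value
print(parse_bbox(example))
print(to_corners(example))
```

Applied to the table, this could look like `train_csv['bbox'].apply(to_corners)`.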

# Exploratory Analysis: train (images)

In [None]:
train_dir = '../input/global-wheat-detection/train'
test_dir = '../input/global-wheat-detection/test'

print('total train images:', len(os.listdir(train_dir)))
print('total test images:', len(os.listdir(test_dir)))
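
The number of files in `train` can be compared with the number of unique `image_id` values in `train.csv`; any difference would point to images that have no boxes in the annotation file. A sketch of that comparison, assuming the files are named `<image_id>.jpg` (the function name and the sample values are made up for illustration):

```python
import os

def images_without_boxes(file_names, annotated_ids):
    """Return image ids that exist as files but never appear in the annotations."""
    file_ids = {os.path.splitext(name)[0] for name in file_names}
    return sorted(file_ids - set(annotated_ids))

# Made-up example: three files on disk, two of them annotated.
files = ["a1.jpg", "b2.jpg", "c3.jpg"]
annotated = ["a1", "a1", "c3"]
print(images_without_boxes(files, annotated))
```

With the real data the call would be `images_without_boxes(os.listdir(train_dir), train_csv['image_id'])`.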

In [None]:
import matplotlib.image as mpimg

pic_index = 100
# Sort the listing so the same four images are shown on every run
train_files = sorted(os.listdir(train_dir))

# Paths of the four images just before pic_index
next_train = [os.path.join(train_dir, fname)
              for fname in train_files[pic_index-4:pic_index]]

for img_path in next_train:
    img = mpimg.imread(img_path)
    plt.imshow(img)
    plt.axis('off')
    plt.show()

In [None]:
from pandas_profiling import ProfileReport
profile = ProfileReport(train_csv, title='Report', progress_bar=False)
profile.to_widgets()

## Get Predictions