## Bird species detection using the TensorFlow Object Detection API
This project uses a Faster R-CNN model, trained on images provided by our professor, Dr. Carl Chalmers, to categorise bird species.
A comparison with the SSD MobileNet model will also be made to assess how the two perform.

## The Data
The input dataset for this project consists of bird images provided by our lecturer, Dr. Carl Chalmers. In total, we received approximately 4000 images of four bird species, from which we were required to tag three classes with a minimum of 500 tags per class. <br>
*All Bird Species*:
- Erithacus_Rubecula
- Periparus_ater
- Pica_pica
- Turdus_merula

*Selected Bird Species*:
- Erithacus_Rubecula
- Periparus_ater
- Pica_pica

I chose the three species above because they are visually distinct from one another. *Erithacus rubecula* has a distinctive colour, *Pica pica* has a long tail that aids identification, and *Periparus ater* resembles *Pica pica* but has different shades and a shorter tail. These differences should help the model learn to separate the classes.<br>
For the model, a total of 2402 photos were tagged with RenomTag: 801 images with *Erithacus rubecula*, 801 with *Periparus ater*, and 801 with *Pica pica*. The tags are in the Pascal VOC format.
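
The per-class tag counts can be checked directly from the annotation files. The sketch below is not part of the original notebook; it assumes the usual Pascal VOC layout, where each tagged object is an `<object>` element with a `<name>` child, and the helper name `count_voc_tags` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_voc_tags(annotation_dir):
    """Count tagged objects per class across the .xml files in a folder.

    Assumes Pascal VOC annotations: one <object> element per tag,
    with the class label stored in its <name> child.
    """
    counts = Counter()
    for xml_path in sorted(Path(annotation_dir).glob('*.xml')):
        root = ET.parse(str(xml_path)).getroot()
        for obj in root.iter('object'):
            label = obj.findtext('name')
            if label:
                counts[label.strip()] += 1
    return counts
```

Calling `count_voc_tags('./images')` would return a `Counter` mapping each class label to its number of tags, which can be compared against the 500-tag minimum.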

# Pre-processing

## Importing Libraries
Let's start by importing libraries that we are going to use!

In [1]:
import os
import numpy as np
from matplotlib.pyplot import imread
%matplotlib inline

The code below reads every file in `./images` that is not a Pascal VOC `.xml` annotation and records its height (`dim1`) and width (`dim2`). Any entry that cannot be read as a three-channel image is printed.
The *splitext* function, which is part of the *os.path* module, splits a path name into two parts: root and extension.<br>

Only the folder names were printed, which means every image file was read successfully.

In [2]:
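dim1 = []  # image heights in pixels
dim2 = []  # image widths in pixels
for image_filename in os.listdir('./images'):
    name, ext = os.path.splitext(image_filename)
    if ext == '.xml':
        continue  # skip Pascal VOC annotation files
    try:
        img = imread(os.path.join('./images', image_filename))
        d1, d2, color = img.shape
        dim1.append(d1)
        dim2.append(d2)
    except Exception:
        # Not a readable three-channel image (e.g. a subfolder): report it
        print(image_filename)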

test
cleaned
train
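
To confirm explicitly that every image has an annotation file, one option is to pair files by base name. The sketch below is an illustration rather than the notebook's own code: it assumes each image is tagged by an `.xml` file with the same base name in the same folder, and both the extension set and the helper name `find_untagged_images` are assumptions.

```python
import os

# Assumed image formats; adjust to match the actual dataset.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}


def find_untagged_images(folder):
    """Return the image filenames in `folder` that have no same-named .xml file."""
    tagged_names = set()
    image_files = []
    for filename in os.listdir(folder):
        name, ext = os.path.splitext(filename)
        if ext.lower() == '.xml':
            tagged_names.add(name)
        elif ext.lower() in IMAGE_EXTENSIONS:
            image_files.append(filename)
    # Keep only the images whose base name has no matching annotation
    return sorted(f for f in image_files
                  if os.path.splitext(f)[0] not in tagged_names)
```

An empty list from `find_untagged_images('./images')` would indicate that every image in that folder has a matching annotation file. Subfolders are ignored because they have no image extension.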


**Checking the mean value of dim1**

In [3]:
np.mean(dim1)

789.1639617145236

Printing the maximum and the minimum value of dim1

In [4]:
print('Minimum:', np.min(dim1))
print('Maximum:', np.max(dim1))

Minimum: 142
Maximum: 1024


**Checking the mean value of dim2**

In [5]:
np.mean(dim2)

938.9546400332918

Printing the maximum and the minimum value of dim2

In [6]:
print('Minimum:', np.min(dim2))
print('Maximum:', np.max(dim2))

Minimum: 140
Maximum: 1024
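
The separate checks on `dim1` and `dim2` could also be gathered into one helper. This is a minimal sketch, not the notebook's own code; the function name `summarise_dimension` is hypothetical.

```python
import numpy as np


def summarise_dimension(values):
    """Return the minimum, mean, and maximum of a list of pixel sizes."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        raise ValueError('no image dimensions were collected')
    return {
        'min': float(arr.min()),
        'mean': float(arr.mean()),
        'max': float(arr.max()),
    }
```

For example, `summarise_dimension(dim1)` and `summarise_dimension(dim2)` would report the same minimum, mean, and maximum values as the cells above, one dictionary per dimension.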
