<h1 align="center">Tutorial: Bird by Bird using Deep Learning</h1>
<h2 align="center">Advancing deep learning models for fine-grained classification of bird species</h2>
<h3 align="center">Author: Sofya Lipnitskaya</h3>

### This repository is related to [Bird by Bird using Deep Learning](https://github.com/slipnitskaya/caltech-birds-advanced-classification)

**Overview**

In this tutorial, you will tackle an established problem in computer vision: fine-grained classification of bird species. The notebook demonstrates how to classify bird images from the Caltech-UCSD Birds-200-2011 ([CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)) dataset using PyTorch, one of the most popular open-source frameworks for deep learning experiments.

**Project outline** 

The outline below lists the sections of the tutorial (the respective CRISP-DM stages of the project are indicated in parentheses):

* Introducing the bird species recognition problem [(Business Understanding)](#motiv)
* Exploratory analysis of the CUB-200-2011 dataset [(Data Understanding)](#data)
* Transforming images and splitting the data [(Data Preparation)](#prep)
* Training and evaluation of the baseline model [(Modelling, part 1/2)](#model-base)
* Advancing the deep learning model [(Modelling, part 2/2)](#model-adv)
* Conclusions and future work [(Evaluation)](#eval)

**Learning Goals**

By the end of the tutorial, you will be able to:
- Understand the basics of the image classification problem for bird species.
- Determine a data-driven image pre-processing strategy.
- Create your own deep learning pipeline for image classification.
- Build, train and evaluate a ResNet-50 model to predict bird species.
- Improve the model performance by using different techniques.

***

## Introducing the bird species recognition problem<a class="anchor" id="motiv"></a>

**Motivation** 

Bird species recognition is a difficult task that challenges the visual abilities of both human experts and computers. One interesting task related to this problem is the classification of birds by species using imagery collected from aerial surveys. Bird populations are important biodiversity indicators, so collecting reliable data is quite [important](https://dl.acm.org/doi/10.1016/j.patrec.2015.08.015) to ecologists. Recognition of bird species also benefits companies developing wind farms for renewable energy, since their construction requires a prior risk assessment of bird collisions, which threaten many of the world’s species with extinction.

Solving this problem within a single notebook would, of course, be a very ambitious plan, so let's keep it simple and focus on bird classification. More concretely, we are going to create and evaluate a deep learning model to classify bird images from the Caltech-UCSD Birds-200-2011 ([CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)) dataset. In this tutorial, you will learn how to perform data-driven image pre-processing, build a baseline ResNet-based classifier, and improve its performance in bird species recognition using different techniques, which are described later on.

**Questions to be answered in the notebook:**

1. Do corrupted images exist in our dataset?
2. What would be the optimal data transformation strategy?
3. Are there any image-specific biases that can limit the model performance?
4. How can we handle overfitting given the limited number of training samples?
5. How can we improve the model performance in bird species recognition?

***

First, let's import packages that we will use in this tutorial:

In [10]:
# import packages
import os
import csv
import random
import tarfile
import multiprocessing as mp

import tqdm
import requests

import numpy as np
import sklearn.model_selection as skms

import torch
import torch.utils.data as td
import torch.nn.functional as F

import torchvision as tv
import torchvision.transforms.functional as TF

import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# define constants
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
OUT_DIR = 'results'
RANDOM_SEED = 42

# create an output folder (exist_ok=True makes this a no-op if it already exists)
os.makedirs(OUT_DIR, exist_ok=True)

            
def get_model_desc(pretrained=False, num_classes=200, use_attention=False):
    """
    Generates description string.
    """
    desc = []
    
    if pretrained:
        desc.append('Transfer')
    else:
        desc.append('Baseline')
        
    if num_classes == 204:
        desc.append('Multitask')

    if use_attention:
        desc.append('Attention')
    
    return '-'.join(desc)


def log_accuracy(path_to_csv, desc, acc, sep='\t', newline='\n'):
    """
    Logs accuracy into a CSV-file
    """
    # check before opening: append mode creates the file if it is missing
    file_exists = os.path.exists(path_to_csv)

    # use `f` rather than `csv` so the imported csv module is not shadowed
    with open(path_to_csv, 'a') as f:
        if not file_exists:
            # write the header once, when the file is first created
            f.write(f'setup{sep}accuracy{newline}')

        f.write(f'{desc}{sep}{acc}{newline}')
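
The log written by `log_accuracy` is a plain tab-separated file with a `setup` and an `accuracy` column, where `setup` holds strings such as `Baseline` or `Transfer-Multitask-Attention` produced by `get_model_desc`. As a minimal sketch of how such a log could be read back for comparison, the snippet below parses it with the standard `csv` module. The helper `read_accuracy_log`, the file name `accuracy.tsv`, and the accuracy values are assumptions made for illustration only; they are not part of the tutorial and are not real results.

```python
import csv
import os
import tempfile

# Placeholder rows in the layout log_accuracy writes: a header line, then one
# tab-separated "setup<TAB>accuracy" line per experiment.
# The accuracy values are made up for illustration, not measured results.
sample_log = 'setup\taccuracy\nBaseline\t0.5\nTransfer-Multitask-Attention\t0.75\n'


def read_accuracy_log(path_to_csv, sep='\t'):
    """Return {setup: accuracy} parsed from a log file (hypothetical helper)."""
    with open(path_to_csv, newline='') as f:
        reader = csv.DictReader(f, delimiter=sep)
        return {row['setup']: float(row['accuracy']) for row in reader}


with tempfile.TemporaryDirectory() as tmp_dir:
    # write the placeholder log to a temporary file, then parse it
    log_path = os.path.join(tmp_dir, 'accuracy.tsv')
    with open(log_path, 'w', newline='') as f:
        f.write(sample_log)
    results = read_accuracy_log(log_path)

# pick the setup with the highest logged accuracy
best_setup = max(results, key=results.get)
print(best_setup, results[best_setup])
```

Keeping the log as a flat text file means each experiment can append one row, and the comparison across setups can be done later without rerunning any training.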

## Data collection<a class="anchor" id="data"></a>

In this tutorial, we are going to use the CUB-200-2011 dataset, which consists of 11788 images of birds belonging to 200 species.

The dataset archive can be downloaded and extracted manually from the [dataset page](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html 'www.vision.caltech.edu'), or, alternatively, using the following code:

In [None]:
class GoogleDriveDownloader(object):
    """
    Downloads a file stored on Google Drive by its URL.
    If the link points to another resource, the redirect chain is followed.
    
    The download() method returns the output path.
    """
    
    base_url = 'https://docs.google.com/uc?export=download'
    chunk_size = 32768
    
    def __init__(self, url, out_dir):
        super().__init__()
        
        self.out_name = url.rsplit('/', 1)[-1]
        self.url = self._get_redirect_url(url)
        self.out_dir = out_dir
        
    @staticmethod
    def _get_redirect_url(url):
        response = requests.get(url)
        if response.url != url and response.url is not None:
            redirect_url = response.url
            return redirect_url
        else:
            return url
        
    @staticmethod
    def _get_confirm_token(response):
        for key, value in response.cookies.items():
            if key.startswith('download_warning'):
                return value
        return None
    
    def _save_response_content(self, response):
        with open(self.fpath, 'wb') as f:
            bar = tqdm.tqdm(total=None)
            progress = 0
            for chunk in response.iter_content(self.chunk_size):
                if chunk:
                    f.write(chunk)
                    progress += len(chunk)
                    bar.update(progress - bar.n)
            bar.close()
    
    @property
    def file_id(self):
        return self.url.split('?')[0].split('/')[-2]
    
    @property
    def fpath(self):
        return os.path.join(self.out_dir, self.out_name)
    
    def download(self):
        os.makedirs(self.out_dir, exist_ok=True)
        
        if os.path.isfile(self.fpath):
            print("File is already downloaded: ", self.fpath)
        else:
            session = requests.Session()
            response = session.get(self.base_url, params={'id': self.file_id}, stream=True)
            token = self._get_confirm_token(response)
            
            if token:
                response = session.get(self.base_url, params={'id': self.file_id, 'confirm': token}, stream=True)
            else:
                raise RuntimeError('No download confirmation token found in the response cookies')
                
            self._save_response_content(response)
        
        # fpath is the property defined above; the class has no `path` attribute
        return self.fpath
    
# download an archive containing the dataset and store it into the output directory
url = 'http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/CUB_200_2011.tgz'
dl = GoogleDriveDownloader(url, 'data')
dl.download()
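
To see what the downloader derives from a URL, here is a small sketch that mirrors the string handling of the `file_id` property and of `out_name` as standalone functions. The function names `drive_file_id` and `output_name` are hypothetical, and the Drive-style share link is an assumed example with a placeholder ID, not the actual location of the dataset.

```python
def drive_file_id(url):
    """Same expression as the file_id property: drop the query string,
    then take the second-to-last path segment."""
    return url.split('?')[0].split('/')[-2]


def output_name(url):
    """Same expression as out_name: everything after the last slash."""
    return url.rsplit('/', 1)[-1]


# Assumed Drive-style share link; 'FILE_ID_PLACEHOLDER' is not a real file ID.
drive_url = 'https://drive.google.com/file/d/FILE_ID_PLACEHOLDER/view?usp=sharing'
# The dataset URL used in the tutorial.
dataset_url = 'http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/CUB_200_2011.tgz'

print(drive_file_id(drive_url))    # FILE_ID_PLACEHOLDER
print(output_name(dataset_url))    # CUB_200_2011.tgz
print(drive_file_id(dataset_url))  # CUB-200-2011
```

The last line shows why the redirect step matters: applied to the original dataset URL, the same expression yields a directory name rather than a Drive file ID, so the download can only work if the URL actually redirects to a Drive link. If it does not, downloading the archive manually from the dataset page remains the fallback.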

After downloading the compressed file, we extract it and compute some statistics to verify that the gathered data contain the expected number of classes and images. Here's an example execution: