<h1 align="center">Tutorial: Bird by Bird using Deep Learning</h1>
<h2 align="center">Advancing deep learning models for fine-grained classification of bird species</h2>
<h3 align="center">Author: Sofya Lipnitskaya</h3>

### This repository is related to [Bird by Bird using Deep Learning](https://github.com/slipnitskaya/caltech-birds-advanced-classification)

**Overview**

In this tutorial, you will tackle an established problem in computer vision: fine-grained classification of bird species. The notebook demonstrates how to classify bird images from the Caltech-UCSD Birds-200-2011 ([CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)) dataset using PyTorch, one of the most popular open-source frameworks for deep learning experiments.

**Project outline** 

The tutorial is organized as follows (the corresponding CRISP-DM stage of each part is indicated in parentheses):

* Introducing the bird species recognition problem [(Business Understanding)](#motiv)
* Exploratory analysis of the CUB-200-2011 dataset [(Data Understanding)](#data)
* Transforming images and splitting the data [(Data Preparation)](#prep) 
* Training and evaluation of the baseline model [(Modelling, part 1/2)](#model-base)
* Advancing the deep learning model [(Modelling, part 2/2)](#model-adv)
* Conclusions and Future work [(Evaluation)](#eval)

**Learning Goals**

By the end of the tutorial, you will be able to:
- Understand the basics of the bird species image classification problem.
- Determine a data-driven image pre-processing strategy.
- Create your own deep learning pipeline for image classification.
- Build, train, and evaluate a ResNet-50 model to predict bird species.
- Improve the model's performance using different techniques.

***

## Introducing the bird species recognition problem<a class="anchor" id="motiv"></a>

**Motivation** 

Bird species recognition is a difficult task that challenges the visual abilities of both human experts and computers. One interesting task related to this problem is the classification of birds by species using imagery collected from aerial surveys. Bird populations are important biodiversity indicators, so collecting reliable data is quite [important](https://dl.acm.org/doi/10.1016/j.patrec.2015.08.015) to ecologists. Recognition of bird species also benefits companies developing wind farms for renewable energy, since wind farm construction requires a prior risk assessment of bird collisions, which threaten many of the world’s species with extinction.

Solving this problem within a single notebook would, of course, be very ambitious, so let's keep it simple and focus on bird classification. More concretely, we are going to create and evaluate a deep learning model to classify bird images from the Caltech-UCSD Birds-200-2011 ([CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)) dataset. In this tutorial, you will learn how to perform data-driven image pre-processing, build a baseline ResNet-based classifier, and improve its performance in bird species recognition using different techniques, which are described later on.

**Questions to be answered in this notebook:**

1. Do corrupted images exist in our dataset?
2. What would be the optimal data transformation strategy?
3. Are there any image-specific biases that can limit the model performance?
4. How can we handle overfitting given the limited number of training samples?
5. How can we improve the model's performance in bird species recognition?

***

First, let's import packages that we will use in this tutorial: