# Deep Learning for Coders, Part 1
### Assignment #1: Building a Baldwin Classifier

This is my first assignment for the [fast.ai](http://www.fast.ai/) [_Practical Deep Learning for Coders, Part 1_](http://course.fast.ai/) course, which I'm taking at the [Data Institute at USF](https://www.usfca.edu/data-institute) in the fall of 2018.  

The [original notebook uses an existing dataset to classify breeds of cats and dogs](https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson1-pets.ipynb).  Our assignment was to run that example, then come up with our own dataset.  

I decided to build something the world _truly_ needs right now: a Baldwin brother classifier.

Aside from the obvious benefits to our society of programmatic Baldwin classification (you're welcome!), I was really interested to try this twist on creating a model for a few reasons:
- Get experience creating my own dataset
- Refine the model on a comparatively small number of photographs (~180-220 per Baldwin)
- Try the optimization techniques on a dataset of faces at different resolutions and ages, from a group of people who look somewhat similar

<img src="baldwins.jpg" />

### Getting the data
Getting the photos in a programmatic way was more of a challenge than I thought.  Here's what I tried:

1) [*Google Custom Search API*](https://developers.google.com/custom-search/) (includes an image search) - I was already using Google Cloud to run the notebook so I thought this would be convenient, but I ended up abandoning it after spending 30 minutes trying to figure out a poorly documented 400 error.  I'll try it again in a future project.

2) [*Bing Image Search API*](https://azure.microsoft.com/en-us/services/cognitive-services/bing-image-search-api/) - This API is really well documented and the browser search got me slightly better results than Google (as in more solo photos of each person at high resolutions), but the API results were significantly less suitable and not as relevant.

3) [*Mix of JS and Python from a PyImageSearch post*](https://www.pyimagesearch.com/2017/12/04/how-to-create-a-deep-learning-dataset-using-google-images/) - This technique involves executing some JS code in the console to export a `.txt` file of the image URLs you want based on your search in the browser, then running some Python to download them to your computer.  This ended up being the fastest option this time, and working through it was a good learning experience.
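The Python half of that third approach can be sketched roughly as follows. This is a minimal sketch, not the code from the PyImageSearch post: the helper names `read_urls` and `download_images`, the numbered `.jpg` naming scheme, and the injectable `fetch` argument are all assumptions made for illustration.

```python
from pathlib import Path
from urllib.request import urlretrieve


def read_urls(url_file):
    """Return the unique, non-empty URLs in a text file, in their original order."""
    seen = set()
    urls = []
    for line in Path(url_file).read_text().splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def download_images(urls, out_dir, fetch=urlretrieve):
    """Save each URL into out_dir as 00000.jpg, 00001.jpg, ... and return the saved paths.

    Failed downloads are skipped so one dead link does not stop the run.
    The .jpg extension is assumed; real search results may mix formats.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        target = out_dir / f"{i:05d}.jpg"
        try:
            fetch(url, str(target))
        except (OSError, ValueError) as err:
            print(f"skipping {url}: {err}")
            continue
        saved.append(target)
    return saved
```

A call such as `download_images(read_urls("urls.txt"), "images/alec")` would then fill one folder per brother, where `urls.txt` is a hypothetical name for the exported file.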

Once I got the images downloaded, I made a pass through them to crop any that had other people in the shot and to remove anything that was too low resolution or not relevant.  

Then I had to separate them into `test`, `train` and `valid` folders.  I used 80% of each brother's pictures for training and 20% for testing, then took 20% of the training set out for validation.  So my file structure looked like this:
```
/images

 |- test
   |- alec
   |- daniel
   |- stephen
   |- william
   
 |- train
   |- alec
   |- daniel
   |- stephen
   |- william
   
 |- valid
   |- alec
   |- daniel
   |- stephen
   |- william
```
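A split like this can be scripted instead of done by hand. The following is a minimal sketch under stated assumptions: the helper names `split_names` and `split_folder` are hypothetical, and the default fractions (20% held out for test, then 20% of the remainder for validation) are parameters you can change, not a record of the exact split used here.

```python
import random
import shutil
from pathlib import Path


def split_names(names, test_frac=0.2, valid_frac=0.2, seed=42):
    """Shuffle names, then split them into train/valid/test lists.

    test_frac is taken from the full list; valid_frac is then taken
    from what remains. Both defaults are assumptions.
    """
    names = sorted(names)
    random.Random(seed).shuffle(names)  # seeded so the split is repeatable
    n_test = round(len(names) * test_frac)
    test, rest = names[:n_test], names[n_test:]
    n_valid = round(len(rest) * valid_frac)
    valid, train = rest[:n_valid], rest[n_valid:]
    return {"train": train, "valid": valid, "test": test}


def split_folder(src_dir, dest_root, label, **kwargs):
    """Copy the files in src_dir into dest_root/<split>/<label>/ and return counts per split."""
    files = [p.name for p in Path(src_dir).iterdir() if p.is_file()]
    splits = split_names(files, **kwargs)
    for split, names in splits.items():
        target = Path(dest_root) / split / label
        target.mkdir(parents=True, exist_ok=True)
        for name in names:
            shutil.copy2(Path(src_dir) / name, target / name)
    return {split: len(names) for split, names in splits.items()}
```

For example, `split_folder("downloads/alec", "images", "alec")` would populate `images/train/alec`, `images/valid/alec` and `images/test/alec`, where `downloads/alec` is a hypothetical folder of cleaned photos.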

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

We import all the necessary packages. We are going to work with the [fastai V1 library](http://www.fast.ai/2018/10/02/fastai-ai/), which sits on top of [PyTorch 1.0](https://hackernoon.com/pytorch-1-0-468332ba5163). The fastai library provides many useful functions that enable us to quickly and easily build neural networks and train our models.

In [None]:
from fastai import *
from fastai.vision import *

## Looking at the data

In [None]:
path_img = './images'

The first thing we do when we approach a problem is to take a look at the data. We _always_ need to understand very well what the problem is and what the data looks like before we can figure out how to solve it. Taking a look at the data means understanding how the data directories are structured, what the labels are and what some sample images look like.

Image classification datasets differ mainly in the way labels are stored. In this particular dataset, the label is the name of the folder each image sits in (`alec`, `daniel`, `stephen` or `william`), so there is nothing to extract from the filenames. Fortunately, the fastai library has a handy function made exactly for this layout: `ImageDataBunch.from_folder` gets the labels from the folder names.
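With images organized as `<split>/<brother>/<file>`, the label of any image can be read straight off its parent folder. Here is a minimal sketch of that idea in plain Python, which also counts the images per split and label as a quick sanity check. It is not fastai's implementation, and the helper names and the list of image suffixes are assumptions.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def label_from_path(image_path):
    """Return the class label, i.e. the name of the folder holding the image."""
    return Path(image_path).parent.name


def count_images(root):
    """Count images per (split, label) under root/<split>/<label>/."""
    counts = Counter()
    for p in Path(root).glob("*/*/*"):
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES:
            counts[(p.parent.parent.name, label_from_path(p))] += 1
    return counts
```

Running `count_images("./images")` is a cheap way to confirm that every brother has photos in every split before any training starts.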

In [None]:
path = Path('./images')
path.ls()

In [None]:
tfms = get_transforms(do_flip=False)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=224)

In [None]:
data.show_batch(rows=3, figsize=(7,6))

In [None]:
print(data.classes)
len(data.classes),data.c

## Training: resnet34

Now we will start training our model. We will use a [convolutional neural network](http://cs231n.github.io/convolutional-networks/) backbone and a fully connected head with a single hidden layer as a classifier. Don't know what these things mean? Not to worry, we will dive deeper in the coming lessons. For the moment you need to know that we are building a model which will take images as input and will output the predicted probability for each of the categories (in this case, it will have 4 outputs, one per brother).

We will train for 10 epochs (10 cycles through all our data).
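To make "predicted probability for each of the categories" concrete: a classifier of this kind typically produces one raw score per class, and a softmax turns those scores into probabilities that sum to 1. The sketch below shows only that last step in plain Python. The scores are made-up numbers, not output from this model.

```python
import math

classes = ["alec", "daniel", "stephen", "william"]  # folder names from this dataset


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 0.5, 0.1, 1.0]  # made-up numbers, not real model output
probs = softmax(scores)
prediction = classes[probs.index(max(probs))]  # class with the highest probability
print(dict(zip(classes, (round(p, 3) for p in probs))), "->", prediction)
```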

In [None]:
learn = ConvLearner(data, models.resnet34, metrics=error_rate)

In [None]:
learn.fit_one_cycle(10)

In [None]:
learn.save('stage-1')

## Results

Let's see what results we have got. 

We will first see which were the categories that the model most confused with one another. We will try to see if what the model predicted was reasonable or not. In this case the mistakes look reasonable (none of the mistakes seems obviously naive). This is an indicator that our classifier is working correctly. 

Furthermore, when we plot the confusion matrix, we can see that the distribution is heavily skewed: the model makes the same mistakes over and over again but it rarely confuses other categories. This suggests that it just finds it difficult to distinguish some specific categories from each other; this is normal behaviour.
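The ideas behind the error rate, the confusion matrix and the "most confused" list are simple enough to sketch in plain Python. This is an illustration of the bookkeeping only, not fastai's implementation, and the labels below are made up rather than taken from this notebook's results.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))


def error_rate(actual, predicted):
    """Fraction of predictions that do not match the true label."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


def most_confused(actual, predicted, min_val=1):
    """List (actual, predicted, count) for wrong pairs seen at least min_val times, most frequent first."""
    pairs = confusion_counts(actual, predicted)
    mistakes = [(a, p, n) for (a, p), n in pairs.items() if a != p and n >= min_val]
    return sorted(mistakes, key=lambda t: t[2], reverse=True)


# Made-up labels for illustration only, not results from this notebook.
actual = ["alec", "alec", "daniel", "daniel", "stephen", "william", "william", "william"]
predicted = ["alec", "daniel", "daniel", "william", "stephen", "daniel", "daniel", "william"]

print(error_rate(actual, predicted))
print(most_confused(actual, predicted, min_val=2))
```

A skewed confusion matrix shows up here as a short list with a few large counts, which is the pattern described above.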

In [None]:
interp = ClassificationInterpretation.from_learner(learn)

In [None]:
interp.plot_top_losses(9, figsize=(15,11))

In [None]:
doc(interp.plot_top_losses)

In [None]:
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

In [None]:
interp.most_confused(min_val=2)

## Unfreezing, fine-tuning, and learning rates

Since our model is working as we expect it to, we will *unfreeze* our model and train some more.

In [None]:
learn.unfreeze()

In [None]:
learn.fit_one_cycle(5)

In [None]:
learn.load('stage-1')

In [None]:
learn.lr_find()

In [None]:
learn.recorder.plot()

In [None]:
learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-4,1e-3))
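Passing `max_lr=slice(1e-4,1e-3)` gives fastai a range of learning rates rather than a single one: roughly, the earliest layers train with the smallest rate, the final layers with the largest, and the groups in between get rates spaced by a constant multiplier. The sketch below reproduces only that arithmetic in plain Python. The function name `spread_lrs` and the choice of three layer groups are assumptions for illustration, not fastai's actual code.

```python
def spread_lrs(start, stop, n_groups):
    """Return n_groups learning rates from start to stop, spaced by a constant multiplier."""
    if n_groups == 1:
        return [stop]
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]


# Three layer groups is an assumption made for this example.
lrs = spread_lrs(1e-4, 1e-3, 3)
print(lrs)
```

The point of the spread is that pretrained early layers already detect general features and need only small updates, while the later layers are adjusted more aggressively.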

## Training: resnet50

Now we will train in the same way as before, with one change: instead of using resnet34 as our backbone we will use resnet50. resnet34 is a 34-layer residual network, while resnet50 has 50 layers; later in the course you can learn the details in the [resnet paper](https://arxiv.org/pdf/1512.03385.pdf).

Basically, resnet50 usually performs better because it is a deeper network with more parameters. Let's see if we can achieve higher performance here.

In [None]:
data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), size=299, bs=30)
data.normalize(imagenet_stats)

In [None]:
learn = ConvLearner(data, models.resnet50, metrics=error_rate)

In [None]:
learn.fit_one_cycle(10)

In [None]:
learn.save('stage-1-50')

In [None]:
learn.unfreeze()
learn.fit_one_cycle(1, max_lr=slice(1e-5,1e-4))

In this case the extra fine-tuning doesn't help, so let's go back to our previously saved model.

In [None]:
learn.load('stage-1-50')

In [None]:
interp = ClassificationInterpretation.from_learner(learn)

In [None]:
interp.most_confused(min_val=1)