### *Neonympha* classification with `pytorch` and `fastai`

Chris Hamm - 2019-09-19 (first code)

I have a lot of photos of *Neonympha* butterflies. Can a CNN tell them apart?

### Preliminaries

Prepare the computing environment

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
%autosave 60

Autosaving every 60 seconds


In [4]:
from fastai import *
from fastai.vision.all import *

import numpy as np
batch_size = 16

Set the random seed

In [5]:
np.random.seed(1138)
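Why the seed matters: if the loader holds out a random validation subset, the seed decides which images land in it. Below is a minimal sketch of a seeded split, assuming the hold-out fraction is 20% (`valid_pct = 0.2` is my assumption, as is the helper name `random_split`).

```python
import numpy as np

def random_split(n_items, valid_pct=0.2, seed=1138):
    """Return (train_idx, valid_idx) drawn from a seeded random permutation."""
    rng = np.random.RandomState(seed)       # same seed -> same permutation
    perm = rng.permutation(n_items)
    n_valid = int(valid_pct * n_items)      # size of the held-out subset
    return perm[n_valid:], perm[:n_valid]

# Two independent calls with the same seed give the same validation indices.
train_a, valid_a = random_split(100)
train_b, valid_b = random_split(100)
same_split = np.array_equal(valid_a, valid_b)
print(same_split)
```

With a fixed seed, error rates from different runs are computed on the same held-out images and can be compared directly.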

Read in the data

In [14]:
image_path = '../data/images/train_test_validate'

In [15]:
file_names = get_image_files(image_path)
print(file_names[:4])

[Path('../data/images/train_test_validate/test/Nfr/Nfr_4339.JPG'), Path('../data/images/train_test_validate/test/Nfr/Nfr_4334.JPG'), Path('../data/images/train_test_validate/test/Nfr/Nfr_4268.JPG'), Path('../data/images/train_test_validate/test/Nfr/Nfr_4344.JPG')]


Create the pattern to identify the categories (Nmi, Nfr, Nar, Nhe)

In [16]:
cat_pat = '(N[a-z][a-z])'
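A quick check of what this pattern captures, using only the standard `re` module. The first path comes from the listing above; the other paths and the stricter `strict_pat` are hypothetical, added to show that the loose pattern takes the first capital `N` plus two lowercase letters anywhere in the path.

```python
import re

cat_pat = '(N[a-z][a-z])'

example_paths = [
    '../data/images/train_test_validate/test/Nfr/Nfr_4339.JPG',   # from the listing above
    '../data/images/train_test_validate/train/Nmi/Nmi_0001.JPG',  # hypothetical path
]
labels = [re.search(cat_pat, p).group(1) for p in example_paths]
print(labels)

# Hypothetical path with a parent folder that also starts with a capital N.
tricky_path = '../Neonympha/images/Nar/Nar_0012.JPG'
loose_label = re.search(cat_pat, tricky_path).group(1)

# A stricter, assumed alternative: anchor the code to the file name itself.
strict_pat = r'/(N[a-z]{2})_\d+\.JPG$'
strict_label = re.search(strict_pat, tricky_path).group(1)
print(loose_label, strict_label)
```

For the directory layout used here the loose pattern is enough, because no folder above the species folders starts with a capital `N`.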

In [9]:
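# ImageDataBunch and get_transforms are fastai v1 names; with the v2 import above
# (fastai.vision.all) this cell raises NameError. The v2 counterpart is
# ImageDataLoaders.from_name_re.
image_data = ImageDataBunch.from_name_re(image_path, file_names,
                                         pat = cat_pat, ds_tfms = get_transforms(),
                                         size = 301, bs = batch_size)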

NameError: name 'ImageDataBunch' is not defined

Normalize the data

In [None]:
image_data.normalize(imagenet_stats)
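What normalizing with the ImageNet statistics amounts to: subtract the per-channel mean and divide by the per-channel standard deviation, so the inputs match what the pretrained network saw. This is a NumPy sketch on a channel-last array for readability; the helper name `normalize` and the tiny gray test image are my own.

```python
import numpy as np

# Standard ImageNet per-channel statistics (RGB, pixel values scaled to [0, 1]).
imagenet_mean = np.array([0.485, 0.456, 0.406])
imagenet_std = np.array([0.229, 0.224, 0.225])

def normalize(image, mean=imagenet_mean, std=imagenet_std):
    """Normalize an H x W x 3 float image channel by channel."""
    return (image - mean) / std   # broadcasting applies the stats to every pixel

gray = np.full((2, 2, 3), 0.5)    # made-up 2 x 2 mid-gray image
normalized = normalize(gray)
print(normalized[0, 0])
```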

In [None]:
print(image_data.classes)

Print some images

In [None]:
image_data.show_batch(rows = 3, figsize = (6, 6))

In [None]:
len(image_data.classes), image_data.c

### Train `resnet34`

In [None]:
image_learn34 = cnn_learner(image_data, models.resnet34, metrics = error_rate)

In [None]:
image_learn34.fit_one_cycle(4)
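For intuition, a one-cycle schedule warms the learning rate up to a peak and then anneals it back down. The sketch below is a simplified cosine version; `max_lr`, `pct_start`, and `div_factor` are illustrative assumptions and I have not checked them against the library's defaults.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_start=0.3, div_factor=25.0):
    """Simplified one-cycle schedule: cosine warm-up, then cosine annealing to zero."""
    warm = int(total_steps * pct_start)          # number of warm-up steps
    start_lr = max_lr / div_factor               # learning rate at step 0
    if step < warm:
        frac = step / max(1, warm)
        return start_lr + (max_lr - start_lr) * (1 - math.cos(math.pi * frac)) / 2
    frac = (step - warm) / max(1, total_steps - warm - 1)
    return max_lr * (1 + math.cos(math.pi * frac)) / 2

schedule = [one_cycle_lr(i, 100) for i in range(100)]
peak_step = schedule.index(max(schedule))
print(schedule[0], peak_step, schedule[-1])
```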

### Results

In [None]:
interpretation34 = ClassificationInterpretation.from_learner(image_learn34)

losses34, idxs34 = interpretation34.top_losses()

len(image_data.valid_ds) == len(losses34) == len(idxs34)
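The length check above makes sense if there is one loss per validation image, sorted from worst to best. Here is a NumPy sketch of that idea, assuming per-image cross-entropy; the function body and the probabilities are made up for illustration and are not the library's code.

```python
import numpy as np

def top_losses(probs, targets):
    """Per-item cross-entropy, returned largest first together with the item indices."""
    per_item = -np.log(probs[np.arange(len(targets)), targets])
    order = np.argsort(-per_item)            # indices from highest to lowest loss
    return per_item[order], order

# Made-up predicted probabilities for three images over four classes.
probs = np.array([
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.20, 0.60, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])
targets = np.array([0, 1, 3])                # made-up true class indices
losses, idxs = top_losses(probs, targets)
print(losses, idxs)
```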

Save the model

In [None]:
image_learn34.save('model34_1')

In [None]:
interpretation34.plot_top_losses(6, figsize = (15, 11))

In [None]:
# doc(interpretation34.plot_top_losses)

In [None]:
interpretation34.plot_confusion_matrix(figsize = (6, 6), dpi = 200)

In [None]:
interpretation34.most_confused(min_val = 0) # only Nhe and Nar are confused
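To make the confusion matrix and the "most confused" list concrete, here is a small NumPy sketch. The labels are made up, the alphabetical class order is an assumption, and both helpers are my own illustrations rather than the library's implementation.

```python
import numpy as np

classes = ['Nar', 'Nfr', 'Nhe', 'Nmi']       # assumed alphabetical class order

def confusion_matrix(actual, predicted, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

def most_confused(cm, class_names, min_val=1):
    """Off-diagonal (actual, predicted, count) triples, largest count first."""
    pairs = [(class_names[i], class_names[j], int(cm[i, j]))
             for i in range(len(class_names))
             for j in range(len(class_names))
             if i != j and cm[i, j] >= min_val]
    return sorted(pairs, key=lambda t: t[2], reverse=True)

actual    = [0, 0, 0, 2, 2, 2, 1, 3]         # made-up true labels
predicted = [0, 2, 2, 2, 0, 2, 1, 3]         # made-up predictions
cm = confusion_matrix(actual, predicted, len(classes))
confused = most_confused(cm, classes)
print(confused)
```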

### `resnet34` unfreezing, fine-tuning, and learning rates

In [None]:
image_learn34.load('model34_1')

In [None]:
# Need to add validation set
image_learn34.lr_find()

In [None]:
image_learn34.recorder.plot()
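The learning-rate finder trains briefly while raising the learning rate exponentially and records the loss at each step; the plot is loss against learning rate on a log axis. This sketch shows only the schedule. The endpoints and iteration count are assumed values, not confirmed defaults.

```python
import numpy as np

def lr_range_test_schedule(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Learning rates that grow by a constant factor from start_lr to end_lr."""
    steps = np.arange(num_it)
    return start_lr * (end_lr / start_lr) ** (steps / (num_it - 1))

lrs = lr_range_test_schedule()
growth = lrs[1:] / lrs[:-1]                  # constant ratio between consecutive steps
print(lrs[0], lrs[-1], growth[0])
```

A common way to read the plot is to pick a rate somewhat below the point where the loss starts to climb.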

### Train `resnet50`

In [None]:
image_learn50 = cnn_learner(image_data, models.resnet50, metrics = error_rate)

In [None]:
image_learn50.fit_one_cycle(4) #error_rate 0.032967 as low as can go
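`error_rate` is the fraction of validation images that are misclassified, so a reported value hints at how many images that is. The sketch below searches for the smallest validation-set size consistent with the two error rates noted in this notebook (0.032967 here and 0.021978 after fine-tuning). The helper `smallest_fraction` is my own, and a multiple of the size it finds would fit equally well.

```python
def smallest_fraction(rate, max_n=1000, digits=6):
    """Smallest (errors, n_items) whose ratio rounds to the reported rate."""
    for n in range(1, max_n + 1):
        k = round(rate * n)                  # nearest whole number of errors for this n
        if k > 0 and round(k / n, digits) == round(rate, digits):
            return k, n
    return None

first_pass = smallest_fraction(0.032967)
fine_tuned = smallest_fraction(0.021978)
print(first_pass, fine_tuned)
```

Both rates are consistent with a validation set of 91 images, with 3 and 2 errors respectively. With so few errors, a single image shifts the error rate by about one percentage point.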

### Results

In [None]:
interpretation50 = ClassificationInterpretation.from_learner(image_learn50)

losses50, idxs50 = interpretation50.top_losses()

len(image_data.valid_ds) == len(losses50) == len(idxs50)

In [None]:
interpretation50.plot_top_losses(3, figsize = (15, 11))

In [None]:
interpretation50.plot_confusion_matrix(figsize = (6, 6), dpi = 200)

In [None]:
interpretation50.most_confused(min_val = 0) # class pairs confused by resnet50

### Update model

In [None]:
# Need to add validation set
image_learn50.lr_find()

In [None]:
image_learn50.recorder.plot()

In [None]:
image_learn50.unfreeze()
image_learn50.fit_one_cycle(2, max_lr = slice(1e-5, 1e-4)) #0.021978 error_rate at 229 pixel images
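Passing a `slice` as the maximum learning rate gives the early layers a smaller rate than the later ones. Below is a sketch of how such a range could be spread across layer groups, assuming three groups and geometric spacing. Both are my assumptions, and `layer_group_lrs` is a hypothetical helper.

```python
import numpy as np

def layer_group_lrs(lr_range, n_groups):
    """Spread learning rates geometrically from lr_range.start to lr_range.stop."""
    return np.geomspace(lr_range.start, lr_range.stop, n_groups)

# Earliest layers get the smallest rate, the head gets the largest.
group_lrs = layer_group_lrs(slice(1e-5, 1e-4), n_groups=3)
print(group_lrs)
```

The idea is that pretrained early layers hold general features and should move little, while the later layers adapt more to the butterfly images.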