**Objective:** Currently, trained scientists visit designated areas and take note of the species inhabiting them. Using such a highly qualified workforce is expensive and slow, and it is still insufficient, since humans cannot cover large areas when sampling. The goal is to use deep learning (DL) to predict the presence or absence of invasive species in areas that have not been sampled.

In [None]:
# Get automatic reloading and inline plotting
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
### Import Required Libraries
# Using Fastai Libraries
from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *
import numpy as np
import pandas as pd
import torch

PATH is the path to your data, and sz is the size that the images will be resized to so that training runs quickly. bs is the batch size, that is, the number of images processed together in each training step. arch is the selected architecture of the neural network model.

In [None]:
import os
PATH = "../input"
print(os.listdir(PATH))
TMP_PATH = "/tmp/tmp"
MODEL_PATH = "/tmp/model/"
sz= 224
bs = 58
arch = resnet34

CUDA is the programming framework used behind the scenes to work with NVIDIA GPUs. To improve performance further, we also check for cuDNN, an NVIDIA package of accelerated functions for deep learning.

In [None]:
### Checking GPU Set up
print(torch.cuda.is_available())
print(torch.backends.cudnn.enabled)

In [None]:
files = os.listdir(f'{PATH}/train')[:5]
files

Let's explore what the data images look like:

In [None]:
img = plt.imread(f'{PATH}/train/{files[0]}')
plt.imshow(img);

In [None]:
img.shape

In [None]:
img[:4,:4]

Read the labels and set aside a validation set (the distribution of image sizes is plotted a few cells below):

In [None]:
label_csv = f'{PATH}/train_labels.csv'
n = len(list(open(label_csv))) - 1 # header is not counted (-1)
val_idxs = get_cv_idxs(n) # random 20% data for validation set
print(n)
print(len(val_idxs))
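
As a rough illustration of what a random 20% hold-out does, the sketch below picks validation indices with plain NumPy. The helper name `sample_val_idxs` and its `val_pct` and `seed` parameters are hypothetical; this is only an assumption about the general idea, not fastai's `get_cv_idxs` implementation.

```python
import numpy as np

def sample_val_idxs(n, val_pct=0.2, seed=42):
    """Return a random subset of row indices to hold out (hypothetical helper)."""
    rng = np.random.RandomState(seed)     # fixed seed so the split is reproducible
    n_val = int(val_pct * n)              # number of held-out rows
    return rng.permutation(n)[:n_val]     # first n_val entries of a random shuffle

val = sample_val_idxs(1000)
train = np.setdiff1d(np.arange(1000), val)   # everything that was not held out
print(len(val), len(train))                  # prints: 200 800
```

Fixing the seed keeps the same images in the validation set across runs, which makes accuracy figures comparable between experiments.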

In [None]:
label_df = pd.read_csv(label_csv)
label_df.head()

In [None]:
### Count of both classes
label_df.pivot_table(index="invasive", aggfunc=len).sort_values('name', ascending=False)

In [None]:
tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_csv(PATH, 'train', f'{PATH}/train_labels.csv', test_name='test', # we need to specify where the test set is if you want to submit to Kaggle competitions
                                   val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)

In [None]:
fn = f'{PATH}/' + data.trn_ds.fnames[0]  # fnames already include the 'train/' prefix
#img = PIL.Image.open(fn)
size_d = {k: PIL.Image.open(f'{PATH}/' + k).size for k in data.trn_ds.fnames}
row_sz, col_sz = list(zip(*size_d.values()))
row_sz = np.array(row_sz); col_sz = np.array(col_sz)
plt.hist(row_sz);

**Our first model**
To make the process quick, we first run a *pretrained* model and observe the results; we can then tweak the model for improvements. A pretrained model is one created by someone else to solve a different problem; its learned weights are saved and reused here as a starting point.
The chosen architecture to start with: **resnet34**

In [None]:
## Data Sizes
len(data.trn_ds), len(data.test_ds)

In [None]:
def get_data(sz, bs): # sz: image size, bs: batch size
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    data = ImageClassifierData.from_csv(PATH, 'train', f'{PATH}/train_labels.csv', test_name='test',
                                       val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)
    
    return data #if sz > 500 else data.resize(512,TMP_PATH) 
# Reading the jpgs and resizing is slow for big images, so resizing them all once up front
# (the commented-out data.resize call above) would save time

In [None]:
data = get_data(sz, bs)
learn = ConvLearner.pretrained(arch, data, precompute=True,tmp_name=TMP_PATH, models_name=MODEL_PATH)
learn.fit(1e-2, 3)

Current accuracy: approximately 93-95%. Now, let's try to understand what is happening by evaluating performance metrics and looking at the right and wrong predictions. That is, we will explore:
* A few correct labels at random
* A few incorrect labels at random
* The most correct labels of each class (i.e. those with highest probability that are correct)
* The most incorrect labels of each class (i.e. those with highest probability that are incorrect)
* The most uncertain labels (i.e. those with probability closest to 0.5).
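
The quantities behind these categories can be illustrated on toy numbers. The sketch below uses made-up probabilities and labels (not model output) to show how predicted classes, the most uncertain predictions, and the most confident mistake can be derived from log-probabilities; all variable names are illustrative.

```python
import numpy as np

# Made-up class probabilities for 5 images and 2 classes (not model output)
toy_probs = np.array([[0.90, 0.10],
                      [0.20, 0.80],
                      [0.45, 0.55],
                      [0.70, 0.30],
                      [0.05, 0.95]])
toy_log_preds = np.log(toy_probs)
toy_y = np.array([0, 1, 0, 1, 1])                # made-up true labels

toy_preds = np.argmax(toy_log_preds, axis=1)     # predicted class per image
p_invasive = np.exp(toy_log_preds[:, 1])         # probability of class 1

correct = toy_preds == toy_y
# Most uncertain: probability of class 1 closest to 0.5
most_uncertain = np.argsort(np.abs(p_invasive - 0.5))[:2]
# Most incorrect: the wrong prediction made with the highest confidence
wrong = np.where(~correct)[0]
most_wrong = wrong[np.argmax(np.abs(p_invasive[wrong] - 0.5))]
print(toy_preds, most_uncertain, most_wrong)
```

Here images 2 and 3 are misclassified, and image 3 is the more confident mistake of the two.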

In [None]:
# this gives prediction for validation set. Predictions are in log scale
log_preds = learn.predict()
log_preds.shape

In [None]:
preds = np.argmax(log_preds, axis=1)  # from log probabilities to 0 or 1
probs = np.exp(log_preds[:,1])        # pr(1) # Where Species = Invasive is class 1
data.classes

In [None]:
def rand_by_mask(mask):
    idxs = np.where(mask)[0]
    # sample at most 4 items, and never more than the mask actually selects
    return np.random.choice(idxs, min(len(idxs), 4), replace=False)
def rand_by_correct(is_correct): return rand_by_mask((preds == data.val_y)==is_correct)
def plots(ims, figsize=(12,6), rows=1, titles=None):
    f = plt.figure(figsize=figsize)
    for i in range(len(ims)):
        sp = f.add_subplot(rows, len(ims)//rows, i+1)
        sp.axis('Off')
        if titles is not None: sp.set_title(titles[i], fontsize=16)
        plt.imshow(ims[i])
def load_img_id(ds, idx): return np.array(PIL.Image.open(f'{PATH}/'+ds.fnames[idx]))

def plot_val_with_title(idxs, title):
    imgs = [load_img_id(data.val_ds,x) for x in idxs]
    title_probs = [probs[x] for x in idxs]
    print(title)
    return plots(imgs, rows=1, titles=title_probs, figsize=(16,8)) if len(imgs)>0 else print('Not Found.')

def most_by_mask(mask, mult):
    idxs = np.where(mask)[0]
    return idxs[np.argsort(mult * probs[idxs])[:4]]

def most_by_correct(y, is_correct): 
    mult = -1 if (y==1)==is_correct else 1
    return most_by_mask(((preds == data.val_y)==is_correct) & (data.val_y == y), mult)

In [None]:
# 1. A few correct labels at random
plot_val_with_title(rand_by_correct(True), "Correctly classified")

In [None]:
# 2. A few incorrect labels at random
plot_val_with_title(rand_by_correct(False), "Incorrectly classified")

In [None]:
plot_val_with_title(most_by_correct(0, True), "Most correct classifications: Class 0")

In [None]:
plot_val_with_title(most_by_correct(1, True), "Most correct classifications: Class 1")

In [None]:
plot_val_with_title(most_by_correct(0, False), "Most incorrect classifications: Actual Class 0 Predicted Class 1")

In [None]:
plot_val_with_title(most_by_correct(1, False), "Most incorrect classifications: Actual Class 1 Predicted Class 0")

In [None]:
most_uncertain = np.argsort(np.abs(probs -0.5))[:4]
plot_val_with_title(most_uncertain, "Most uncertain predictions")

Scope of Improvement:
* Find an Optimal Learning Rate
* Use Data Augmentation techniques
* Instead of training only the last layer of the pretrained model, unfreeze the earlier layers and fine-tune them on our dataset

In [None]:
## How does loss change with changes in Learning Rate (For the Last Layer)
learn.lr_find()
learn.sched.plot_lr()

In [None]:
# Note that the loss is still clearly improving up to lr=1e-2 (0.01).
# The LR can vary as a part of the stochastic gradient descent over time.
learn.sched.plot()
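
The idea behind a learning rate finder can be sketched on a toy problem: raise the learning rate geometrically after every step and record the loss. The quadratic loss, the range of rates, and the function name `lr_range_test` below are illustrative assumptions, not fastai's `lr_find` implementation.

```python
import numpy as np

def lr_range_test(start_lr=1e-5, end_lr=10.0, n_steps=100, w0=1.0):
    """Gradient descent on loss(w) = 0.5 * w**2 while the learning rate grows geometrically."""
    lrs = np.geomspace(start_lr, end_lr, n_steps)   # exponentially spaced learning rates
    losses = np.empty(n_steps)
    w = w0
    for i, lr in enumerate(lrs):
        w = w - lr * w               # the gradient of 0.5 * w**2 is w
        losses[i] = 0.5 * w ** 2     # loss after the step taken at this rate
    return lrs, losses

lrs, losses = lr_range_test()
best = np.argmin(losses)
print(f"loss is lowest at lr={lrs[best]:.3g}; beyond that it starts to diverge")
```

On a real model one would usually pick a rate somewhat below the point where the loss bottoms out, in the region where the loss is still clearly decreasing.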

**Data Augmentation**
Data augmentation helps reduce overfitting. By randomly cropping, zooming, or rotating the images, we make it less likely that the model learns patterns specific to the training data, so it generalizes better to new data.
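
As a minimal sketch of the idea, the snippet below applies a random horizontal flip and a random crop with plain NumPy. The function `random_flip_and_crop` is hypothetical and much simpler than the transforms fastai applies; it only shows that each call yields a slightly different view of the same image.

```python
import numpy as np

def random_flip_and_crop(img, crop, rng):
    """Randomly mirror an H x W x C image left-right, then cut a random crop x crop window."""
    if rng.rand() < 0.5:
        img = img[:, ::-1]                       # horizontal flip
    h, w = img.shape[:2]
    top = rng.randint(0, h - crop + 1)           # upper-left corner of the window
    left = rng.randint(0, w - crop + 1)
    return img[top:top + crop, left:left + crop]

rng = np.random.RandomState(0)
image = np.arange(8 * 8 * 3).reshape(8, 8, 3)    # stand-in for a real photo
views = [random_flip_and_crop(image, 6, rng) for _ in range(4)]
print([v.shape for v in views])
```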

In [None]:
def get_augs():
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    data = ImageClassifierData.from_csv(PATH, 'train', f'{PATH}/train_labels.csv',
                                        bs = 2, tfms=tfms,
                    suffix='.jpg', val_idxs=val_idxs, test_name='test')
    x,_ = next(iter(data.aug_dl))
    return data.trn_ds.denorm(x)[1]

In [None]:
# An Example of data augmentation
ims = np.stack([get_augs() for i in range(6)])
plots(ims, rows=2)

In [None]:
#tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.2)
data = ImageClassifierData.from_csv(PATH,'train', f'{PATH}/train_labels.csv', tfms=tfms,
                                      suffix='.jpg', val_idxs=val_idxs, test_name='test')
learn = ConvLearner.pretrained(arch, data, precompute=True,tmp_name=TMP_PATH, models_name=MODEL_PATH)
learn.fit(1e-2, 3)

A pretrained learner starts with all layers frozen except the last one, so only the weights of the last layer are updated with our dataset. With precompute=True, the activations of the frozen layers are additionally computed once and cached; because the cached activations do not change, data augmentation has no effect until precompute is set to False. Now we will train the model with precompute set to False and cycle_len enabled. cycle_len turns on a technique called stochastic gradient descent with restarts (SGDR), a variant of learning rate annealing: the learning rate gradually decreases as training progresses and is reset to its starting value at the beginning of each cycle. Decreasing the rate is helpful because as we get closer to the optimal weights, we want to take smaller steps.
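
The shape of such a schedule can be sketched as cosine annealing with restarts. The function `sgdr_schedule` and the `iters_per_epoch` value below are illustrative assumptions, not fastai's scheduler; the sketch only shows how cycle_len and cycle_mult determine when the rate is reset.

```python
import numpy as np

def sgdr_schedule(lr_max, n_cycles, cycle_len, cycle_mult=1, iters_per_epoch=10):
    """Per-iteration learning rates for cosine annealing with warm restarts."""
    lrs = []
    epochs = cycle_len
    for _ in range(n_cycles):
        n_iter = epochs * iters_per_epoch
        t = np.arange(n_iter)
        # within a cycle the rate decays from lr_max towards 0 along half a cosine
        lrs.append(lr_max / 2 * (1 + np.cos(np.pi * t / n_iter)))
        epochs *= cycle_mult          # the next cycle is cycle_mult times longer
    return np.concatenate(lrs)

sched = sgdr_schedule(1e-2, n_cycles=3, cycle_len=1, cycle_mult=2)
print(len(sched))   # 3 cycles of 1, 2 and 4 epochs at 10 iterations each: 70
```

With cycle_mult=1 every cycle has the same length; with cycle_mult=2 each restart is followed by a cycle twice as long as the previous one.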

In [None]:
learn.precompute=False
learn.fit(1e-2, 3, cycle_len=1)

In [None]:
learn.sched.plot_lr()

To train the earlier layers as well, we call unfreeze, which unfreezes all layers. We will also try differential learning rates for the respective layer groups.

In [None]:
learn.unfreeze()
lr=np.array([1e-4,1e-3,1e-2])
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)

In [None]:
learn.sched.plot_lr()

Above, we have the learning rate of the final layers. The learning rates of the earlier layers are fixed at the same multiples of the final layer rates as we initially requested (i.e. the first layers have 100x smaller, and middle layers 10x smaller learning rates, since we set lr=np.array([1e-4,1e-3,1e-2])).
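
The fixed multiples can be checked with a few lines of NumPy. The annealed values in `final_group_lr` are hypothetical and only stand in for whatever the schedule produces for the final layers.

```python
import numpy as np

lrs = np.array([1e-4, 1e-3, 1e-2])     # earliest, middle, and final layer groups
ratios = lrs / lrs[-1]                  # fixed multiples of the final-group rate

# Hypothetical annealed values of the final-group rate during one cycle
final_group_lr = np.array([1e-2, 5e-3, 1e-3])
per_group = np.outer(final_group_lr, ratios)   # rows: time steps, columns: layer groups
print(ratios)
print(per_group)
```

Each row of `per_group` keeps the same 100x and 10x gaps between groups, whatever the current rate of the final layers is.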

To get a better picture, we can use test time augmentation (TTA), that is, we apply data augmentation techniques to our validation set. By averaging the predictions for each validation image and its augmented versions, we usually get more accurate predictions.
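
The averaging step can be sketched on made-up numbers. The count of five views per image and the random probabilities below are assumptions for illustration only; the point is that probabilities, not log-probabilities, are averaged across views before picking a class.

```python
import numpy as np

rng = np.random.RandomState(0)
n_views, n_images, n_classes = 5, 4, 2     # assumed: 1 original + 4 augmented views

# Made-up per-view class probabilities; every row sums to 1
view_probs = rng.dirichlet(np.ones(n_classes), size=(n_views, n_images))
view_log_preds = np.log(view_probs)        # shape: (views, images, classes)

avg_probs = np.mean(np.exp(view_log_preds), axis=0)   # average probabilities across views
tta_preds = np.argmax(avg_probs, axis=1)              # final class per image
print(avg_probs.shape, tta_preds.shape)
```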

In [None]:
log_preds,y = learn.TTA()
probs = np.mean(np.exp(log_preds),0)

In [None]:
accuracy_np(probs, y)

**Results:**

In [None]:
log_preds = learn.predict()
preds = np.argmax(log_preds, axis=1)  # from log probabilities to 0 or 1
probs = np.exp(log_preds[:,1])        # pr(1) # Where Species = Invasive is class 1
# Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y, preds)
plot_confusion_matrix(cm, data.classes)

In [None]:
plot_val_with_title(most_by_correct(0, False), "Most incorrect classifications: Actual Class 0 Predicted Class 1")

In [None]:
plot_val_with_title(most_by_correct(1, False), "Most incorrect Classifications: Actual Class 1 Predicted Class 0")

**Code Summary and Explanation**
**Steps**
*Data Exploration:*
*     Explore the data size and get an idea of what the images look like.
*     Check the distribution of image sizes. Resizing the images (standardizing) might be required to speed up the process.

*Model Tweaking:*
*      Run a quick model (a small number of epochs) with precompute=True, that is, only updating the weights of the last layer.
*      Evaluate the performance by observing the train and validation loss and the overall accuracy.
*      Explore the images of the most correct/incorrect classifications to understand if there are any visible patterns or reasons for the wrong classifications. It helps to get more comfortable with what the model is doing.
*      Find an optimal learning rate using lr_find(). We want a learning rate where the loss is still clearly improving.
*      Train the last layer from precomputed activations for 1-2 epochs.
*      Use data augmentation (precompute=False) and train the last layer again (cycle_len = 1).
*      Unfreeze all layers and retrain the model. Set the earlier layers to a 3x-10x lower learning rate than the next higher layer.
*      Recheck the Learning Rate (lr_find).
*      Train full network with cycle_mult=2 until over-fitting.
*      Use Test time augmentation to get a better picture regarding the accuracy.

In [None]:
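log_preds, y = learn.TTA(is_test=True) # use test dataset rather than validation dataset
probs = np.mean(np.exp(log_preds),0)   # average the probabilities over the augmented versions
df = pd.DataFrame(probs)
df.columns = data.classes
df = pd.DataFrame(df.loc[:, '1'])      # keep only the probability of class '1' (invasive)
df.insert(0, 'id', [o[5:-4] for o in data.test_ds.fnames]) # strip the 'test/' prefix and '.jpg' suffix
df['temp'] = df['id'].astype(float)    # numeric copy of the id, used only for sorting
df = df.sort_values('temp',ascending=True)
df = df.loc[:,['id','1']]
df.columns = ['name','invasive']       # same column names as train_labels.csv
df.to_csv("submit.csv", index=False)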