Fastai vision on iWildCam 2020 data resized to 256x256.
Resnet50 + mixup + TTA

## Data pre-processing

In [None]:
import json
import math
import os
import random

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

from fastai import *
from fastai.vision import *

%matplotlib inline

List the resized train and test image files, then read in the training annotations.

In [None]:
test_images = os.listdir("../input/iwildcam2020-256/256_images/test/images/")
train_images = os.listdir("../input/iwildcam2020-256/256_images/train/images/")

In [None]:
with open(r'/kaggle/input/iwildcam-2020-fgvc7/iwildcam2020_train_annotations.json') as json_file:
    train_data = json.load(json_file)

In [None]:
df_train = pd.DataFrame({'id': [item['id'] for item in train_data['annotations']],
                         'category_id': [item['category_id'] for item in train_data['annotations']],
                         'image_id': [item['image_id'] for item in train_data['annotations']],
                         'location': [item['location'] for item in train_data['images']],
                         'file_name': [item['file_name'] for item in train_data['images']]})

df_train.head()

In [None]:
df_train.shape

In [None]:
df_train = df_train[df_train['file_name'].isin(train_images)]

In [None]:
df_train.shape

Split the data into training and validation sets by category: for each category, 70% of its annotations (and at least one) go to the training set, the rest to the validation set.

In [None]:
# Group the annotations by category.
cat_images = dict()
for annotation in train_data['annotations']:
    cat = annotation['category_id']
    cat_images.setdefault(cat, []).append({'image_id': annotation['image_id'], 'category': cat})

# Shuffle each category and keep 70% of it (at least one entry) for training.
train_split = []
val_images = []
for cat, imgs in cat_images.items():
    n_train = max(1, math.floor(len(imgs) * 0.70))
    random.shuffle(imgs)
    train_split += imgs[:n_train]
    val_images += imgs[n_train:]

val_img_dt = pd.DataFrame(val_images)
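
The same per-category split can also be sketched directly in pandas. This is only an illustration on a toy dataframe, not the code used in this notebook: the helper `tag_validation`, its `seed` argument and the `toy` dataframe are made up for the example.

```python
import math

import pandas as pd


def tag_validation(df, label_col="category_id", train_frac=0.70, seed=0):
    """Return a boolean Series that is True for rows assigned to validation."""
    is_valid = pd.Series(False, index=df.index)
    for _, group in df.groupby(label_col):
        # Keep at least one row of every category in the training set.
        n_train = max(1, math.floor(len(group) * train_frac))
        shuffled = group.sample(frac=1.0, random_state=seed)
        is_valid.loc[shuffled.index[n_train:]] = True
    return is_valid


# Toy data: nine images of category 0 and a single image of category 1.
toy = pd.DataFrame({"image_id": list("abcdefghij"),
                    "category_id": [0] * 9 + [1]})
toy["is_valid"] = tag_validation(toy)
print(toy.groupby(["category_id", "is_valid"]).size())
```

A category with a single image ends up entirely in the training set, which is also what the `_train < 1` guard in the cell above is for.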



Tag validation set.

In [None]:
df_train['is_valid'] = df_train.image_id.isin(val_img_dt['image_id'])

Check how the camera locations are distributed between the training and validation sets.

In [None]:
loc_valid = df_train.loc[df_train['is_valid']].location.unique()
loc_train = df_train.loc[~df_train['is_valid']].location.unique()

# Print explicitly: a cell only displays the value of its last expression.
print(loc_valid.shape, loc_train.shape)
df_train.category_id.unique().shape

In [None]:
df_train.groupby('is_valid').size()

Remove corrupted images.

In [None]:
df_train.drop(df_train.loc[df_train['file_name']=='87022118-21bc-11ea-a13a-137349068a90.jpg'].index, inplace=True)
df_train.drop(df_train.loc[df_train['file_name']=='8792549a-21bc-11ea-a13a-137349068a90.jpg'].index, inplace=True)

In [None]:
df_train.category_id.unique().shape

Read test data.

In [None]:
with open(r'/kaggle/input/iwildcam-2020-fgvc7/iwildcam2020_test_information.json') as f:
    test_data = json.load(f)

In [None]:
df_test = pd.DataFrame.from_records(test_data['images'])
df_test.head()

## Modelling part

I create an `ImageDataBunch`, the object that bundles the data for the model, in a few steps.
First I create an `ImageList` for the training data and one for the test data.
Secondly I define the transformations that will be applied to the pictures.
I declare that the training labels come from the dataframe and are stored in the `category_id` column, and I add the test set.
Finally I create the databunch: I apply the transformations, resize all pictures to 128x128 and use reflection padding. I use a batch size `bs` of 256 images and normalize the data with `imagenet_stats`.
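
To make the last step more concrete, here is a minimal NumPy sketch of what reflection padding and normalization with ImageNet statistics do to a single image. It is an illustration, not fastai's implementation: the helper `pad_and_normalize` and the random toy image are made up, and fastai applies the padding as part of its transforms rather than as a separate step. The per-channel means and standard deviations are the standard ImageNet values.

```python
import numpy as np

# Standard ImageNet per-channel statistics (RGB).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def pad_and_normalize(img, pad=2):
    """img: float array of shape (H, W, 3) with values in [0, 1]."""
    # Reflection padding mirrors the pixels next to the border
    # instead of filling the new area with black.
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    # Normalization subtracts the channel mean and divides by the channel std.
    return (padded - IMAGENET_MEAN) / IMAGENET_STD


toy_img = np.random.default_rng(0).random((4, 4, 3))
out = pad_and_normalize(toy_img)
print(out.shape)
```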

In [None]:
train, test = [ImageList.from_df(df, path='../input/iwildcam2020-256/256_images/', cols='file_name', folder=folder, suffix='') 
               for df, folder in zip([df_train, df_test], ['train/images', 'test/images'])]
trfm = get_transforms(max_rotate=20, max_zoom=1.3, max_lighting=0.4, max_warp=0.4,
                      p_affine=1., p_lighting=1.)
src = (train.use_partial_data(1)
        .split_from_df(col='is_valid')
        .label_from_df(cols='category_id')
        .add_test(test))
data = (src.transform(trfm, size = 128, padding_mode = 'reflection')
        .databunch(path=Path('.'), bs = 256).normalize(imagenet_stats))

In [None]:
print(data.classes)

In [None]:
org_classes = pd.DataFrame({"org_category": data.classes})
org_classes['Category'] = org_classes.index

In [None]:
def _plot(i,j,ax):
    x,y = data.train_ds[1]
    x.show(ax, y=y)

plot_multi(_plot, 3, 3, figsize=(8,8))

Show batch.

In [None]:
data.show_batch()

Show number of categories in the data.

In [None]:
data.c

I use transfer learning: I take a model pre-trained on ImageNet, in this case ResNet50, and adapt it to my dataset. In transfer learning we keep the convolutional layers (the body, or backbone) with their pre-trained weights and only define a new head. I use the head defined by the fastai library.

I use accuracy as the metric to print. I also add mixup, so the model is not trained on the original photos but on random linear combinations of pairs of them, with the labels mixed in the same proportion.
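
The idea behind mixup can be sketched in a few lines of NumPy. This is an illustration under assumptions, not fastai's code: `mixup_batch` is a hypothetical helper, the labels are assumed to be one-hot encoded, and the mixing weight is drawn from a Beta(0.4, 0.4) distribution (0.4 is, as far as I know, the default `alpha` in fastai v1).

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha=0.4, rng=None):
    """Blend every image (and its label) with a randomly chosen partner from the batch."""
    rng = np.random.default_rng(0) if rng is None else rng
    lam = rng.beta(alpha, alpha, size=len(x))
    # Keep the larger weight on the original image.
    lam = np.maximum(lam, 1 - lam)
    perm = rng.permutation(len(x))
    # Reshape lam so that it broadcasts over the image dimensions.
    lam_x = lam.reshape(-1, *([1] * (x.ndim - 1)))
    x_mixed = lam_x * x + (1 - lam_x) * x[perm]
    y_mixed = lam[:, None] * y_onehot + (1 - lam[:, None]) * y_onehot[perm]
    return x_mixed, y_mixed


# Toy batch: four 2x2 RGB "images", each with its own class.
x = np.arange(4 * 2 * 2 * 3, dtype=float).reshape(4, 2, 2, 3)
y = np.eye(4)
x_mixed, y_mixed = mixup_batch(x, y)
print(y_mixed.round(2))
```

Each row of `y_mixed` still sums to 1, but the probability mass is now shared between the two classes that were blended.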

In [None]:
learn = cnn_learner(data, base_arch=models.resnet50, metrics=accuracy).mixup()

The most important hyperparameter to set is the learning rate, the step size the optimizer takes towards the loss minimum. To find it I use `lr_find`: it starts with a very small learning rate, increases it with every batch and records the loss. The losses are then plotted against the learning rates.

In [None]:
learn.lr_find()
learn.recorder.plot(suggestion=True)

Recommended ways of choosing the learning rate:
 * at the steepest decline of the loss
 * 10x smaller than the learning rate at the minimum loss
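
Both the learning rate sweep and the two heuristics can be sketched on a toy problem. This is a simplified illustration, not what `lr_find` does internally: `lr_range_test` is a hypothetical helper, a quadratic loss stands in for the network, and the sweep range (1e-7 to 10 in 100 steps) and the stopping rule are assumptions loosely modelled on fastai's defaults.

```python
import numpy as np


def lr_range_test(loss_fn, grad_fn, w0, start_lr=1e-7, end_lr=10.0, num_it=100):
    """Take one SGD step per learning rate and record the loss before each step."""
    # Learning rates grow exponentially from start_lr to end_lr.
    lrs = start_lr * (end_lr / start_lr) ** (np.arange(num_it) / (num_it - 1))
    w = np.asarray(w0, dtype=float)
    losses = []
    for lr in lrs:
        loss = loss_fn(w)
        losses.append(loss)
        # Stop once the loss has clearly diverged.
        if not np.isfinite(loss) or loss > 4 * min(losses):
            break
        w = w - lr * grad_fn(w)
    return lrs[:len(losses)], np.array(losses)


# Toy problem standing in for the network: loss(w) = 0.5 * ||w||^2.
lrs, losses = lr_range_test(lambda w: 0.5 * np.sum(w ** 2), lambda w: w, w0=[3.0, -2.0])

# Heuristic 1: the learning rate where the loss falls fastest (with respect to log10(lr)).
lr_steepest = lrs[np.gradient(losses, np.log10(lrs)).argmin()]
# Heuristic 2: one order of magnitude below the learning rate at the minimum loss.
lr_min_div_10 = lrs[losses.argmin()] / 10
print(lr_steepest, lr_min_div_10)
```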

In [None]:
learn.recorder.min_grad_lr

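The fastai.vision module splits the architecture into 3 layer groups and can train them with different learning rates, depending on what you pass as the learning rate: a single number is used for all groups, while a `slice` gives the earlier groups smaller learning rates than the head. (The early layers usually don't need large changes to their parameters.)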
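
How a `slice` turns into one learning rate per layer group can be sketched as follows. The function `layer_group_lrs` is a hypothetical re-implementation based on my reading of fastai v1, not the library's own code: `slice(stop)` gives the last group `stop` and the other groups `stop / 10`, while `slice(start, stop)` spaces the rates geometrically between the two values.

```python
import numpy as np


def layer_group_lrs(lr, n_groups=3):
    """Return one learning rate per layer group, from the first group to the head."""
    if not isinstance(lr, slice):
        # A plain number is used for every group.
        return np.full(n_groups, lr)
    if lr.start:
        # slice(start, stop): geometric spacing from start to stop.
        step = (lr.stop / lr.start) ** (1 / (n_groups - 1))
        return np.array([lr.start * step ** i for i in range(n_groups)])
    # slice(stop): the head gets stop, the earlier groups stop / 10.
    return np.array([lr.stop / 10] * (n_groups - 1) + [lr.stop])


frozen_lrs = layer_group_lrs(slice(0.01))
unfrozen_lrs = layer_group_lrs(slice(1e-5, 1e-4))
print(frozen_lrs, unfrozen_lrs)
```

The two calls use the same slices as the `fit_one_cycle` cells in this notebook.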

Additionally, with `fit_one_cycle` the learning rate of every group follows the one-cycle schedule: it is first warmed up to the group's maximum and then annealed.
First I train only the head and keep the body weights frozen (`cnn_learner` freezes the body by default).
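
The shape of the one-cycle schedule can be sketched with NumPy. This is a simplified illustration, not fastai's scheduler: `one_cycle_lrs` is a hypothetical helper, and the defaults (30% warm-up, start at `lr_max / 25`, cosine interpolation in both phases, a final rate of `lr_max / (25 * 1e4)`) are my assumptions about fastai v1. The real scheduler also cycles the momentum, which is left out here.

```python
import numpy as np


def annealing_cos(start, end, pct):
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2 * (np.cos(np.pi * pct) + 1)


def one_cycle_lrs(lr_max, n_iter, pct_start=0.3, div_factor=25.0):
    """Learning rate for every iteration of one cycle."""
    n_up = max(1, int(n_iter * pct_start))
    n_down = n_iter - n_up
    # Warm-up from lr_max / div_factor to lr_max ...
    up = annealing_cos(lr_max / div_factor, lr_max, np.arange(n_up) / n_up)
    # ... followed by annealing to a very small learning rate.
    down = annealing_cos(lr_max, lr_max / (div_factor * 1e4), np.arange(n_down) / n_down)
    return np.concatenate([up, down])


head_lrs = one_cycle_lrs(lr_max=0.01, n_iter=100)
print(head_lrs[0], head_lrs.max(), head_lrs[-1])
```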

In [None]:
learn.fit_one_cycle(10, slice(0.01))

After the random weights in the head are trained a bit, we can unfreeze the weights in the whole network and train everything.

In [None]:
learn.unfreeze()
learn.lr_find()
learn.recorder.plot(suggestion=True)

In [None]:
learn.fit_one_cycle(10, slice(1e-5, 1e-4))

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

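Make predictions on the test set using test time augmentation (TTA). TTA predicts on several augmented versions of each test image, built from the training set transforms, averages these predictions and blends the average with the prediction on the unaugmented image (weighted by `beta`, 0.4 by default).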
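
The averaging step can be sketched with NumPy. This is an illustration with made-up numbers, not fastai's implementation: `tta_blend` is a hypothetical helper, and the blending of the averaged augmented predictions with the plain prediction through `beta=0.4` reflects my understanding of `TTA` in fastai v1.

```python
import numpy as np


def tta_blend(plain_preds, augmented_preds, beta=0.4):
    """plain_preds: (n_images, n_classes); augmented_preds: (n_aug, n_images, n_classes)."""
    # Average the predictions over the augmented versions of each image.
    avg_aug = np.mean(augmented_preds, axis=0)
    # Blend the plain prediction with the augmented average.
    return beta * plain_preds + (1 - beta) * avg_aug


# One image, two classes, two augmented versions.
plain = np.array([[0.6, 0.4]])
augmented = np.array([[[0.2, 0.8]],
                      [[0.4, 0.6]]])
blended = tta_blend(plain, augmented)
print(blended, blended.argmax(axis=1))
```

In this toy case the plain prediction favours class 0, but the augmented versions outvote it and the blended prediction is class 1.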

In [None]:
preds,y = learn.TTA(ds_type=DatasetType.Test)

In [None]:
pred_csv = pd.DataFrame(preds.numpy())
pred_csv['Id'] = learn.data.test_ds.items
pred_csv.to_csv("output_preds.csv", index = False)

In [None]:
submission = pd.read_csv('../input/iwildcam-2020-fgvc7/sample_submission.csv')
id_list = list(submission.Id)
# Predicted class index (position in data.classes) for every test image.
pred_list = list(np.argmax(preds.numpy(), axis=1))
pred_dict = dict((key, value.item()) for (key, value) in zip(learn.data.test_ds.items, pred_list))
# Reorder the predictions to match the order of the sample submission.
pred_ordered = [pred_dict['../input/iwildcam2020-256/256_images/test/images/' + img_id + '.jpg'] for img_id in id_list]
submission_with_idx = pd.DataFrame({'Id': id_list, 'Category': pred_ordered})
# Map the class indices back to the original category ids.
submission_fixed_labels = pd.merge(submission_with_idx, org_classes, on='Category', how='left')
submission_fixed_labels = submission_fixed_labels.drop(['Category'], axis=1)
submission_fixed_labels.rename(columns={'org_category': 'Category'}, inplace=True)

submission_fixed_labels.to_csv("submission.csv", index=False)
print("Done")