# About this notebook
* This kernel is intended for Fastai beginners. Fastai gives easy access to state-of-the-art techniques, so a good baseline model takes only a few lines of code to set up.

In [None]:
import os

import numpy as np
import pandas as pd

from fastai import *
from fastai.imports import *
from fastai.vision.all import *

Make sure the GPU is on: both values below need to be True. If the first is False, turn on the GPU in the notebook settings. If the second is False, please search online for help with cuDNN. :D

In [None]:
torch.cuda.is_available(),torch.backends.cudnn.enabled

In [None]:
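def seed_everything(seed):
    # Seed every random number generator in use (Python, NumPy, PyTorch CPU and GPU).
    os.environ['PYTHONHASHSEED'] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Ask cuDNN for deterministic algorithms and disable its auto-tuner.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(42)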

# Quick data exploration

In [None]:
path = Path('../input/cassava-leaf-disease-classification/')
os.listdir(path)

In [None]:
train = pd.read_csv(path/'train.csv')
train.head()

In [None]:
train.value_counts('label')

There is a large volume of data, and the classes are imbalanced.
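To put a number on the imbalance, it can help to look at the share of each class and at the ratio between the largest and the smallest class. The snippet below is a minimal sketch on a made-up list of labels (not the competition data); `toy_labels`, `shares` and `imbalance_ratio` are illustrative names. On the real DataFrame, `train['label'].value_counts(normalize=True)` should give the same kind of shares.

```python
from collections import Counter

# Toy labels for illustration only; NOT the competition data.
toy_labels = [3, 3, 3, 3, 3, 3, 1, 1, 2, 0]

counts = Counter(toy_labels)
n_samples = len(toy_labels)

# Share of the dataset taken by each class.
shares = {label: count / n_samples for label, count in counts.items()}

# Ratio between the most and the least frequent class; 1.0 would mean perfectly balanced.
imbalance_ratio = max(counts.values()) / min(counts.values())

for label in sorted(counts):
    print(f"class {label}: {counts[label]} images, share {shares[label]:.2f}")
print("imbalance ratio:", imbalance_ratio)
```

A ratio far above 1 is a hint that plain accuracy may hide poor performance on the rare classes.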

In [None]:
train['label'].hist(figsize = (8, 6));

Adding the path of each image makes life easier when running inference.

In [None]:
train['image_id'] = train['image_id'].map(lambda x:path/'train_images'/x)
train.head()

# Loading data into Fastai format

* Fastai provides several options for loading images: from a CSV file, folders, a list of paths, a DataFrame...
* In this case, the images are loaded from the pandas DataFrame, which contains the path of each image. You can also pass the image folder's path to ImageDataLoaders instead of adding the path to the df explicitly.
* item_tfms resizes each image to the given size in pixels; larger sizes greatly increase computation.
* aug_transforms() applies image augmentations. For full details of the transformations applied, check out their doc: https://docs.fast.ai/vision.augment.html#aug_transforms.
* Normalize.from_stats(*imagenet_stats): normalizes images using the per-channel mean and std of the ImageNet dataset, since the pretrained ResNet model was trained on images normalized this way.
* batch_tfms applies the above two points to each batch of images.
* Have a look at their doc if you want to learn more about the parameters and other methods to load data: https://docs.fast.ai/vision.data.html.
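To make the normalization step concrete, here is a minimal pure-Python sketch of what it does to a single RGB pixel. The mean and std values are the ImageNet statistics that fastai exposes as `imagenet_stats`; the helper names `normalize_pixel` and `denormalize_pixel` are made up for this illustration, and fastai itself applies the same arithmetic to whole batches of tensors on the GPU.

```python
# ImageNet per-channel statistics (R, G, B) for pixel values scaled to [0, 1].
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Subtract the channel mean and divide by the channel std."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]

def denormalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Invert normalize_pixel, e.g. before displaying an image."""
    return [value * s + m for value, m, s in zip(rgb, mean, std)]

# Hypothetical pixel, chosen so the result is easy to read.
pixel = [0.485, 0.680, 0.181]
normalized = normalize_pixel(pixel)
restored = denormalize_pixel(normalized)

print([round(v, 3) for v in normalized])
print([round(v, 3) for v in restored])
```

After normalization, a channel value equal to the ImageNet mean maps to 0, and values one std above or below the mean map to about +1 or -1.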

In [None]:
tfms = aug_transforms()
data = ImageDataLoaders.from_df(train, valid_pct=0.2, item_tfms=Resize(512), batch_tfms=[*tfms,Normalize.from_stats(*imagenet_stats)])

In [None]:
data.show_batch()

# Modelling

* Fastai uses a single trainer class, Learner (built here with cnn_learner), that takes in the data, model, metrics, loss function, optimizer function, etc.
* Here, the ResNet-50 model is used. It's a convolutional neural network with 50 layers whose weights were pretrained on the ImageNet dataset (over a million images).
* You can also easily use techniques such as label smoothing with the Learner object if you wish.
* to_native_fp16() switches training to mixed precision: most operations run in 16-bit floating point, which is faster on GPUs and uses less memory.
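For readers curious about the label smoothing mentioned above, here is a minimal pure-Python sketch of the idea: instead of putting all the target probability on the true class, a small share `eps` is spread uniformly over all classes. The function names and the example logits are made up for this illustration. In fastai, a ready-made loss named `LabelSmoothingCrossEntropy` can be passed to the learner through `loss_func`; the sketch is not that implementation, only the same idea on a single example.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def label_smoothing_loss(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target distribution.

    (1 - eps) of the weight goes to the true class and eps is spread
    uniformly over all classes (including the true one).
    """
    log_probs = log_softmax(logits)
    nll = -log_probs[target]                      # plain cross-entropy term
    uniform = -sum(log_probs) / len(log_probs)    # average over all classes
    return (1 - eps) * nll + eps * uniform

# Hypothetical raw scores for one image; class 0 is the true label.
logits = [2.0, 0.5, -1.0, 0.0, 0.3]
plain = label_smoothing_loss(logits, target=0, eps=0.0)
smoothed = label_smoothing_loss(logits, target=0, eps=0.1)
print(round(plain, 4), round(smoothed, 4))
```

With `eps=0` this reduces to ordinary cross-entropy. With `eps>0` a confident correct prediction is penalized slightly, which discourages over-confident outputs.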

As the internet cannot be used in this competition, the pretrained resnet50 weights are attached as an input dataset and copied into the folder where the downloaded model would normally be cached.

In [None]:
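# Create the torch hub cache folder if it is missing, then copy the pretrained weights into it.
# The copy runs even when the folder already exists, so the weights are never left out.
os.makedirs('/root/.cache/torch/hub/checkpoints/', exist_ok=True)
!cp '../input/resnet50preload/resnet50-19c8e357.pth' '/root/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth'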

In [None]:
learn = cnn_learner(data, models.resnet50, pretrained=True, metrics=accuracy, model_dir='../working/').to_native_fp16()

**Training (Short version)**
1. First we find the learning rate.
2. Freeze the model and train only the last layer (the newly added head, which has random weights).
> As a pretrained model is used, the weights of the earlier layers already have 'good' values. Hence we freeze those layers and train only the last layer, which starts from random weights.
3. Unfreeze, find the learning rate again, and train all layers.
> The model is unfrozen so that all layers are trained.

In [None]:
learn.lr_find()

* fit_one_cycle is widely used to train models; you can also try other fit methods.
* We freeze the model and train the last layer.
* The first parameter is the number of epochs, i.e. the number of times the CNN iterates over the whole dataset. The second is the learning rate.
* You can increase the epochs to experiment. More epochs often give better results, but training time grows with every extra epoch.
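The learning rate passed to fit_one_cycle is not used as a constant: it is the peak of a schedule that warms up and then decays. The sketch below reproduces that shape in pure Python. The default values (`div=25`, `div_final=1e5`, `pct_start=0.25`) and the cosine interpolation are written from memory of fastai v2 and should be treated as assumptions to check against the fastai docs; the function names are made up for this illustration.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from `start` (pos=0) to `end` (pos=1)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def one_cycle_lr(progress, lr_max, div=25.0, div_final=1e5, pct_start=0.25):
    """Learning rate at `progress` (0 to 1) through the whole training run.

    Assumed shape: warm up from lr_max/div to lr_max during the first
    `pct_start` of training, then decay to lr_max/div_final.
    """
    if progress < pct_start:
        return cos_anneal(lr_max / div, lr_max, progress / pct_start)
    return cos_anneal(lr_max, lr_max / div_final,
                      (progress - pct_start) / (1 - pct_start))

# Print the schedule at a few points for a peak learning rate of 1e-4.
for progress in (0.0, 0.125, 0.25, 0.5, 0.75, 1.0):
    print(f"{progress:5.3f} -> {one_cycle_lr(progress, lr_max=1e-4):.2e}")
```

Under these assumptions, most of the training run is spent on the way down from the peak, which is why the value you pass in should be read as a maximum rather than a fixed rate.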

In [None]:
learn.freeze()
learn.fit_one_cycle(1,1e-1)

Plotting the training and validation loss.

In [None]:
learn.recorder.plot_loss()

* We unfreeze the model to train all the layers.
* The suitable learning rate changes once all layers are unlocked, so we run the learning rate finder again.

In [None]:
learn.unfreeze()
learn.lr_find()

* An approximate value of the learning rate is used. You can try others; smaller rates need more epochs to converge, which increases computation.
* Use more epochs, at least 10 (experiment to find the lowest validation loss), for the actual model.

In [None]:
learn.fit_one_cycle(5, 1e-4)

In [None]:
learn.recorder.plot_loss()

Change back to 32-bit precision.

In [None]:
learn = learn.to_native_fp32()

# Inference

* It is easy to get predictions on the test set as long as the test data is in the same format as the train data.
* Add the path of the test images, as was done for the train images.

In [None]:
test = pd.read_csv(path/'sample_submission.csv')
tmp_test = test.copy()
tmp_test['image_id'] = tmp_test['image_id'].map(lambda x:path/'test_images'/x)
tmp_test.head()

Fastai provides a method (test_dl) that can parse the test data with the same parameters as the training data.

In [None]:
test_data = data.test_dl(tmp_test)
test_data.show_batch()

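* tta (test-time augmentation) applies the same image transformations as on the training data, this time to the test images at prediction time.
* In practice, each image is transformed n times and its predicted results are averaged. With beta=0, only these averaged augmented predictions are used, without blending in the predictions on the un-augmented images.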
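The averaging step can be sketched in a few lines of pure Python. The probabilities below are made up for two images, three classes and two augmented passes; the helper names are illustrative, and fastai performs the equivalent operations on tensors. The `blend` helper shows the linear interpolation that the `beta` argument is understood to control.

```python
def average_predictions(passes):
    """Average class probabilities over several augmented passes.

    passes[k][i][c] is the probability of class c for image i in pass k.
    """
    n_passes = len(passes)
    n_items = len(passes[0])
    n_classes = len(passes[0][0])
    return [[sum(p[i][c] for p in passes) / n_passes for c in range(n_classes)]
            for i in range(n_items)]

def blend(aug_avg, plain, beta):
    """Interpolate: beta=0 keeps only aug_avg, beta=1 only the plain predictions."""
    return [[a + beta * (p - a) for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(aug_avg, plain)]

def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)

# Made-up probabilities: 2 augmented passes, 2 images, 3 classes.
passes = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]],
]
# Made-up predictions on the un-augmented images.
plain = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]]

aug_avg = average_predictions(passes)
final = blend(aug_avg, plain, beta=0.0)
labels = [argmax(row) for row in final]
print(labels)
```

In this toy example the two passes disagree on both images when taken alone, and the average settles the vote, which is the effect test-time augmentation relies on.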

In [None]:
predict, t = learn.tta(dl=test_data, n=8, beta=0)

In [None]:
test['label'] = predict.argmax(dim=-1).numpy()
test.head()

In [None]:
test.to_csv('submission.csv', index=False)