In [None]:
from fastcore.all import *
from fastai.vision.all import *

## Set up the paths to each of the datasets

The training root folder should contain two subfolders, one for the training set and one for the validation set. The training set is used to adjust the weights of the model, and the validation set is used to assess the loss after each training epoch.

In [None]:
training_root = "/data/"
test_root = "/test/"
training_set_folder = "train"
validation_set_folder = "val"
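
Before building the data block, it can help to confirm that the folders are laid out as expected. The following is a minimal sketch using only the standard library. The helper name `count_images` and the list of image extensions are assumptions made for illustration; they are not part of fastai.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images(root, split_folders=("train", "val")):
    """Count image files per (split, class) under root/<split>/<class>/."""
    counts = Counter()
    for split in split_folders:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            continue  # a missing split folder simply contributes no counts
        for path in split_dir.rglob("*"):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                # The class label is the folder that directly contains the image.
                counts[(split, path.parent.name)] += 1
    return counts
```

Calling `count_images(training_root, (training_set_folder, validation_set_folder))` should show a non-zero count for every class in both splits; a class missing from one split usually points to a misnamed folder.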

## Create a datablock and data loaders

```
GrandparentSplitter(train_name=training_set_folder, valid_name=validation_set_folder),
```
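This tells the data block that the training and validation sets live in separate folders, whose names are held in `training_set_folder` and `validation_set_folder` (`train` and `val` here). Each image is assigned to a set according to the name of its grandparent folder.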

```
get_items=get_image_files,
```
Defines how the data block finds its items. In this case we just collect every image file under the target directory, including its subfolders.

```
get_y=parent_label,
```
This tells the data block that each image's class label is the name of its parent folder, so each class exists as a subfolder underneath the training and validation set folders.
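
To make the folder convention concrete, here is a minimal pure-Python sketch of how a split and a label can be read off a file path: the grandparent folder gives the split and the parent folder gives the class. It imitates the idea behind `GrandparentSplitter` and `parent_label` and is not fastai's implementation. The function name `split_and_label` and the example file names are hypothetical.

```python
from pathlib import PurePosixPath


def split_and_label(path, train_name="train", valid_name="val"):
    """Return (split, label) for a path shaped like .../<split>/<label>/<file>."""
    p = PurePosixPath(path)
    label = p.parent.name                # parent folder -> class label
    split_folder = p.parent.parent.name  # grandparent folder -> which set
    if split_folder == train_name:
        split = "train"
    elif split_folder == valid_name:
        split = "valid"
    else:
        split = None  # the file belongs to neither set
    return split, label


# Hypothetical file names, for illustration only.
print(split_and_label("/data/train/valorant/img001.png"))
print(split_and_label("/data/val/not_valorant/img002.png"))
```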

```
batch_tfms=aug_transforms(size=224),
```
Defines a set of data augmentations applied to the input images during the training phase. It randomizes cropping, warping, and so on, in order to get more robust results and reduce overfitting, and outputs images that are 224px square. These run on the GPU during training.

```
item_tfms=[Resize(600, method='squish')]
```
Item transformations are run by the CPU, one image at a time, before the images are batched and sent to the GPU. Here we need all the images to be the same size, so we squish each one to 600x600px without preserving its aspect ratio.

```
.dataloaders(training_root, bs=64)
```
Converts the block definition into data loaders (one for the training set and one for the validation set) and sets the batch size to 64. Batch size is the number of images pushed to the GPU at once.
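
As a rough guide to what the batch size means for an epoch, the sketch below computes how many batches a loader produces for a given number of images. The helper `batches_per_epoch` is hypothetical, and whether the final partial batch is kept or dropped depends on the loader's settings, so both cases are shown.

```python
import math


def batches_per_epoch(n_items, batch_size, drop_last=False):
    """Number of batches needed to go through n_items once."""
    if drop_last:
        # Any incomplete final batch is discarded.
        return n_items // batch_size
    # The final batch may hold fewer than batch_size items.
    return math.ceil(n_items / batch_size)


# Example with an assumed dataset of 1000 images and bs=64.
print(batches_per_epoch(1000, 64))
print(batches_per_epoch(1000, 64, drop_last=True))
```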

In [None]:
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock), 
    get_items=get_image_files, 
    splitter=GrandparentSplitter(train_name=training_set_folder, valid_name=validation_set_folder),
    get_y=parent_label,
    batch_tfms=aug_transforms(size=224),
    item_tfms=[Resize(600, method='squish')]
).dataloaders(training_root, bs=64)

## Show a batch of training data

Just to examine what the image augmentations are doing and what our data set looks like, here we load a few images at random and inspect.

In [None]:
dls.show_batch(max_n=12)

## Create the learner

Here we load a resnet50 pretrained model and pass it our data loaders.

In [None]:
learn = vision_learner(dls, resnet50, metrics=error_rate)
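
The `error_rate` metric passed to the learner reports the fraction of validation images that were misclassified, i.e. one minus the accuracy. The following pure-Python sketch shows the calculation on a handful of made-up predictions. The function name `error_rate_sketch` is hypothetical, and the code illustrates the idea rather than reproducing fastai's implementation.

```python
def error_rate_sketch(predicted, actual):
    """Fraction of predictions that do not match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one prediction is required")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Made-up labels for illustration: two of the four predictions are wrong.
predicted = ["valorant", "not_valorant", "valorant", "valorant"]
actual = ["valorant", "valorant", "valorant", "not_valorant"]
print(error_rate_sketch(predicted, actual))
```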

## Train the model

Now we fine-tune the model for 4 epochs. Fine-tuning resets a few of the top layers of the model without erasing the weights of the lower layers. For image classification this generally reduces the amount of training time, since the model is already set up to do things like recognize edges and sets of smaller features that will help our model. In fastai, `fine_tune(4)` first trains only the new top layers for one epoch with the pretrained layers frozen, then unfreezes the model and trains all layers for 4 epochs.

A training epoch involves taking each batch of the training set and using it to adjust the weights of the layers that are currently unfrozen, then measuring the loss using the validation set.

In [None]:
learn.fine_tune(4)

## Inspecting results

In order to have some visibility into where our model had problems, we create an interpretation and then plot the top losses that occurred during validation.

In [None]:
interp = Interpretation.from_learner(learn)
interp.plot_top_losses(5)
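
The idea behind "top losses" can be sketched without fastai: compute a loss for each validation image and sort from worst to best. The sketch below assumes a cross-entropy loss, which for a single image is the negative log of the probability the model gave to the correct class. The function name `top_losses` and the probabilities are made up for illustration.

```python
import math


def top_losses(probabilities, targets, k=5):
    """Return the k (loss, index) pairs with the highest cross-entropy loss."""
    losses = []
    for idx, (probs, target) in enumerate(zip(probabilities, targets)):
        # Clamp to avoid log(0) when the model assigns zero probability.
        p_correct = max(probs[target], 1e-12)
        losses.append((-math.log(p_correct), idx))
    losses.sort(reverse=True)  # largest loss first
    return losses[:k]


# Made-up predicted probabilities for three images and two classes.
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
targets = [0, 0, 1]  # index of the correct class for each image
for loss, idx in top_losses(probabilities, targets, k=2):
    print(f"image {idx}: loss {loss:.3f}")
```

Images where the model was confident and wrong end up at the top of the list, which is why they are the most useful ones to inspect.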

## Testing against a separate dataset

Create a test dataset and validate against it. Test data should be structured like the training set but without a separate validation set.
So the root should look like this:

- /test_root
    - valorant
    - not_valorant

In [None]:
test_dl = learn.dls.test_dl(get_image_files(test_root), with_labels=True, shuffle=True)
learn.validate(dl=test_dl)