# Cassava Classification - fastai Starter


The datasets for this notebook come from [this notebook](https://www.kaggle.com/tanlikesmath/cassava-classification-eda-fastai-starter), and this tutorial puts a twist on what has already been done there!

## What will this tutorial cover?

We will be looking at how to use the high-level `DataBlock` API for this challenge, how to use some advanced training features in the library, as well as some advanced inference features.

## Importing the Library

First, let's import the `fastai.vision` library for us to use and work with:

In [None]:
from fastai.vision.all import *

For these results to be reproducible on your end, we will go ahead and set the `random`, `torch`, and `numpy` seeds with the `set_seed` function:

In [None]:
set_seed(16)

## Setting up our data

Alright, now that we have our imports let's go ahead and look at our data.

First we'll make a `Path` object so we can see what all we have available to us:

In [None]:
path = Path("../input")

We can then run `ls()` (a method that `fastcore` monkey-patches onto `Path`) to see all the files and directories in here:

In [None]:
path.ls()

We can see that both the pretrained resnet50 model and the competition data are there. We'll make a `data_path` to point to the competition directory and take another peek:

In [None]:
data_path = path/'cassava-leaf-disease-classification'
data_path.ls()

Our images are stored away in `train_images` and `test_images`, and we have a `train.csv` for our labels. Let's load it into `pandas`:

In [None]:
df = pd.read_csv(data_path/'train.csv')

And take a look:

In [None]:
df.head()

### Adjusting the `image_id`

We have an `image_id` and a `label`. We're going to modify our values in `image_id` to make our lives easier when it comes to running inference. 

Why? 


In fastai we have a `get_x` and a `get_y`, and these dictate how it will *always* look for our data, regardless of how it is stored. If we built a `get_x` based on the current `DataFrame`, it would look something like so:

In [None]:
def get_x(row): return data_path/'train_images'/row['image_id']

And we can see it work below:

In [None]:
PILImage.create(get_x(df.iloc[0]))

That's great! ***But*** there is a very large issue here. We always have our `get_x` tied to the training directory which makes it more complicated for us to work with our `test_images` directory.

What's the solution? 

Add `train_images` into the dataframe through a `lambda` function:

In [None]:
df.head()

In [None]:
df['image_id'] = df['image_id'].apply(lambda x: f'train_images/{x}')

And now we can see our new table:

In [None]:
df.head()

Now we won't run into an issue when we're testing. 
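To make the payoff concrete, here is a minimal sketch that uses plain dictionaries in place of `DataFrame` rows. The file names are made up for illustration; the point is only that a single `get_x`, once the folder lives inside `image_id`, resolves both training and test paths:

```python
from pathlib import Path

# Same competition folder as `data_path` above
data_path = Path("../input") / "cassava-leaf-disease-classification"

def get_x(row):
    # The folder is part of `image_id`, so no directory is hard-coded here
    return data_path / row["image_id"]

# Hypothetical rows; plain dicts stand in for DataFrame rows
train_row = {"image_id": "train_images/example_train.jpg"}
test_row = {"image_id": "test_images/example_test.jpg"}

train_file = get_x(train_row)
test_file = get_x(test_row)
```

Both calls go through the same function, and only the prefix stored in the row decides which directory is used.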


### Adjusting our label

What else can we do?

Let's change our labels into something more readable through a dictionary (these come from the `json` file):

In [None]:
idx2lbl = {0:"Cassava Bacterial Blight (CBB)",
          1:"Cassava Brown Streak Disease (CBSD)",
          2:"Cassava Green Mottle (CGM)",
          3:"Cassava Mosaic Disease (CMD)",
          4:"Healthy"}

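# Assign the result back: an in-place `replace` on a selected column may not update `df` in newer pandas
df['label'] = df['label'].replace(idx2lbl)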

In [None]:
df.head()

And now we're good to go! Let's build the `DataBlock`.

## Building the `DataBlock`

Let's think about how our problem looks. `fastai` provides blocks that cover *most* situations, and this is no exception.

We know our input is an image and our output is a category, so let's use `ImageBlock` and `CategoryBlock`:

In [None]:
blocks = (ImageBlock, CategoryBlock)

Next we'll want to split our data somehow. We'll use a `RandomSplitter` and split our data 80/20:

In [None]:
splitter = RandomSplitter(valid_pct=0.2)
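Conceptually, a random 80/20 split shuffles the row indices and sets one fifth of them aside for validation. The sketch below illustrates that idea with the standard library only; `random_split` is a hypothetical helper written for this example, not fastai's implementation:

```python
import random

def random_split(n_items, valid_pct=0.2, seed=None):
    "Shuffle the indices and hold out `valid_pct` of them for validation."
    rng = random.Random(seed)
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    cut = int(valid_pct * n_items)
    # The first `cut` shuffled indices become validation, the rest training
    return idxs[cut:], idxs[:cut]

train_idxs, valid_idxs = random_split(100, valid_pct=0.2, seed=16)
```

Because the split is random, fixing the seed (as we did with `set_seed`) is what keeps the validation set the same from run to run.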

Our `DataBlock` is also going to want to know how to get our data. Since our data all stems from a `csv`, we will make a `get_x` and `get_y` function:

> We already made our `get_x` earlier, so I have brought it down here

In [None]:
def get_x(row): return data_path/row['image_id']

def get_y(row): return row['label']

When we write custom `get_` functions, each one receives a single *row* of our `DataFrame`, so we can pull out whichever columns we need.

Next we'll come up with some basic data augmentations. 

Our `item_tfms` should ensure everything is ready to go into a batch, so we will use `Resize`.

Our `batch_tfms` should apply any extra augmentations we may want. We'll use `RandomResizedCropGPU`, `aug_transforms`, and apply our `Normalize`:
> We will normalize our data based on ImageNet, since that is what our pretrained model was trained with

In [None]:
item_tfms = [Resize(448)]
batch_tfms = [RandomResizedCropGPU(224), *aug_transforms(), Normalize.from_stats(*imagenet_stats)]
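Normalizing with ImageNet statistics means subtracting a per-channel mean and dividing by a per-channel standard deviation. Here is a small sketch of that arithmetic on a single RGB pixel with values scaled to [0, 1]. The numbers are the commonly quoted ImageNet statistics, and `normalize_pixel` is a hypothetical helper for illustration only:

```python
# Commonly quoted ImageNet per-channel statistics (R, G, B)
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    "Apply (value - mean) / std to each channel of one pixel."
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]

# A pixel equal to the mean maps to zero in every channel
mean_pixel = normalize_pixel(IMAGENET_MEAN)
# A pure white pixel ends up a couple of standard deviations above zero
white_pixel = normalize_pixel([1.0, 1.0, 1.0])
```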

Finally, let's build the `DataBlock`!

In [None]:
block = DataBlock(blocks = blocks,
                 get_x = get_x,
                 get_y = get_y,
                 splitter = splitter,
                 item_tfms = item_tfms,
                 batch_tfms = batch_tfms)

And now we can turn this into some `DataLoaders`. We're going to pass in some items (which in our case is our `DataFrame`) and a batch size to use. We will use 64:

In [None]:
dls = block.dataloaders(df, bs=64)

Let's look at a batch of data to make sure everything looks alright:

In [None]:
dls.show_batch(figsize=(12,12))

Looks great! Let's move on to training our model.

## The Model

The code here is from tanlikesmath's notebook linked at the start. This will move our pretrained weights to where fastai will expect it:

In [None]:
# Making pretrained weights work without needing to find the default filename
os.makedirs('/root/.cache/torch/hub/checkpoints/', exist_ok=True)
!cp '../input/resnet50/resnet50.pth' '/root/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth'
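`!cp` is an IPython shell escape, so the cell above only runs inside a notebook. If you'd rather stay in Python, a sketch along these lines should do the same job. `stage_weights` is a hypothetical helper, and the commented-out call simply reuses the paths from the cell above:

```python
import shutil
from pathlib import Path

def stage_weights(src, cache_dir, filename):
    "Copy a weights file into `cache_dir` under the name the loader expects."
    cache_dir = Path(cache_dir)
    # Create the cache directory (and any parents) if it is missing
    cache_dir.mkdir(parents=True, exist_ok=True)
    dest = cache_dir / filename
    shutil.copy(src, dest)
    return dest

# On Kaggle this would be called with the paths from the cell above:
# stage_weights('../input/resnet50/resnet50.pth',
#               '/root/.cache/torch/hub/checkpoints',
#               'resnet50-19c8e357.pth')
```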

Now that our weights are set up, let's look at how to use `cnn_learner`. We're going to use a few tricks during our training that fastai can help us out with.

Specifically we will be using the `ranger` optimizer function and `LabelSmoothingCrossEntropy` as our loss function.

Along with these, we'll use the `accuracy` metric, since that is how this competition grades our results:

In [None]:
learn = cnn_learner(dls, resnet50, opt_func=ranger, loss_func=LabelSmoothingCrossEntropy(), metrics=accuracy)
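To get a feel for what label smoothing does to the loss, here is a plain-Python sketch of one common formulation: a weighted mix of the usual cross-entropy and the average negative log-probability over all classes. The smoothing factor `eps=0.1` is an assumed default, and `label_smoothing_ce` is a hypothetical helper rather than the library's code:

```python
import math

def log_softmax(logits):
    "Numerically stable log-softmax over a list of logits."
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def label_smoothing_ce(logits, target, eps=0.1):
    "Cross-entropy against a target smoothed toward the uniform distribution."
    log_p = log_softmax(logits)
    nll = -log_p[target]                 # ordinary cross-entropy term
    uniform = -sum(log_p) / len(log_p)   # average over every class
    return (1 - eps) * nll + eps * uniform

logits = [4.0, 0.5, -1.0, 0.0, 1.0]      # a confident, correct prediction for class 0
plain_loss = label_smoothing_ce(logits, target=0, eps=0.0)
smoothed_loss = label_smoothing_ce(logits, target=0, eps=0.1)
```

For a confident and correct prediction the smoothed loss is higher than the plain one, which is how this loss discourages the model from becoming overconfident.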

`fastai` has a `fit_flat_cos` function designed to best utilize the `ranger` optimizer function. Jeremy and Sylvain also came up with a `fine_tune` function for transfer learning, which uses `fit_one_cycle`, or the One-Cycle Policy. We're going to create our own hybrid `fine_tune` method that follows a similar approach.

We can also tie it to our `Learner` objects through the `@patch` functionality. First we'll look at what `fine_tune`'s source code looks like, and rewrite it for `fit_flat_cos`:

In [None]:
def fine_tune(self:Learner, epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100,
              pct_start=0.3, div=5.0, **kwargs):
    "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
    self.freeze()
    self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    base_lr /= 2
    self.unfreeze()
    self.fit_one_cycle(epochs, slice(base_lr/lr_mult, base_lr), pct_start=pct_start, div=div, **kwargs)

Here's what the rewritten version looks like:

> I have also added in the potential for callbacks, we will see more on why later

In [None]:
@patch
def fine_tune_flat(self:Learner, epochs, base_lr=4e-3, freeze_epochs=1, lr_mult=100, pct_start=0.75,
                   first_callbacks=None, second_callbacks=None, **kwargs):
    "Fine-tune applied to `fit_flat_cos`"
    # Train only the head first, keeping the learning rate flat for nearly the whole phase
    self.freeze()
    self.fit_flat_cos(freeze_epochs, slice(base_lr), pct_start=0.99, cbs=first_callbacks, **kwargs)
    # Halve the learning rate, then train every layer with discriminative learning rates
    base_lr /= 2
    self.unfreeze()
    self.fit_flat_cos(epochs, slice(base_lr/lr_mult, base_lr), pct_start=pct_start, cbs=second_callbacks, **kwargs)
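The schedule behind `fit_flat_cos` keeps the learning rate flat for the first `pct_start` of training and then anneals it along a cosine curve. The sketch below reproduces that shape in plain Python. `flat_cos_lr` is a hypothetical helper, and the final divisor of `1e5` is my understanding of the library default, so treat it as an assumption:

```python
import math

def flat_cos_lr(pos, lr, pct_start=0.75, div_final=1e5):
    "Learning rate at training progress `pos` in [0, 1]."
    if pos < pct_start or pct_start >= 1:
        return lr  # flat phase
    # Cosine phase: rescale progress to [0, 1] and anneal lr -> lr / div_final
    p = (pos - pct_start) / (1 - pct_start)
    end = lr / div_final
    return end + (lr - end) * (1 + math.cos(math.pi * p)) / 2

# Sample the schedule at 0%, 10%, ..., 100% of training
schedule = [flat_cos_lr(i / 10, lr=1e-3, pct_start=0.72) for i in range(11)]
```

With `pct_start=0.99`, as in the frozen phase, almost the entire epoch runs at the full learning rate.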

Now we're good to train! Let's find a learning rate first:

In [None]:
learn.lr_find()

We'll choose a learning rate of roughly 4e-3 to start. 

Finally, remember how we had those extra callback parameters? We're going to utilize the `MixUp` training methodology with a decreasing `MixUp` percentage:

In [None]:
cbs1 = [MixUp(alpha = 0.7)]
cbs2 = [MixUp(alpha = 0.3)]
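Why does a smaller `alpha` mean less mixing? As far as I understand the callback, the mixing weight is drawn from a Beta(`alpha`, `alpha`) distribution and the larger of the weight and its complement is kept. The standard-library sketch below follows that assumption; `sample_lam` and `mix` are hypothetical helpers:

```python
import random

def sample_lam(alpha, rng):
    "Draw a mixing weight and keep the larger of lam and 1 - lam."
    lam = rng.betavariate(alpha, alpha)
    return max(lam, 1 - lam)

def mix(x1, x2, lam):
    "Blend two inputs element-wise: lam * x1 + (1 - lam) * x2."
    return [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]

rng = random.Random(16)
n = 10_000
mean_lam_07 = sum(sample_lam(0.7, rng) for _ in range(n)) / n
mean_lam_03 = sum(sample_lam(0.3, rng) for _ in range(n)) / n

# Toy "images" with two values each, blended with a fixed weight
mixed = mix([1.0, 0.0], [0.0, 1.0], lam=0.8)
```

On average the weight for `alpha=0.3` sits closer to 1 than the weight for `alpha=0.7`, so each image keeps more of itself in the second phase.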

Let's train for 1 epoch frozen and 5 unfrozen with a `pct_start` of 0.72:

In [None]:
learn.fine_tune_flat(5, base_lr=1e-3, pct_start=0.72, first_callbacks=cbs1, second_callbacks=cbs2)

Next up we'll move to submissions

## Submitting some results

Let's look at the sample submission dataframe first:

In [None]:
sample_df = pd.read_csv(data_path/'sample_submission.csv')
sample_df.head()

We'll want this to look like our training data, so let's prepend `test_images/` to each `image_id`:

In [None]:
sample_copy = sample_df.copy()

In [None]:
sample_copy['image_id'] = sample_copy['image_id'].apply(lambda x: f'test_images/{x}')

Next we'll make an inference dataloader through the `test_dl` method:

In [None]:
test_dl = learn.dls.test_dl(sample_copy)

We'll look at a batch of data to make sure it all looks okay:

In [None]:
test_dl.show_batch()

Great! Next we'll grab some predictions. We will use the `.tta` method to run test-time augmentation, which can help boost our accuracy a little:

In [None]:
preds, _ = learn.tta(dl=test_dl)
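The idea behind test-time augmentation is to predict on several augmented views of each image, average those predictions, and blend the result with the prediction on the unaugmented image. Here is a rough sketch of that arithmetic. `tta_average` is a hypothetical helper, and the blend weight `beta=0.25` is my understanding of the default, so treat both as assumptions:

```python
def tta_average(plain_probs, augmented_probs, beta=0.25):
    "Blend plain predictions with the mean of predictions on augmented views."
    n = len(augmented_probs)
    aug_mean = [sum(p[i] for p in augmented_probs) / n
                for i in range(len(plain_probs))]
    # `beta` weights the plain prediction, (1 - beta) the augmented mean
    return [beta * p + (1 - beta) * a for p, a in zip(plain_probs, aug_mean)]

# Made-up probabilities for a two-class toy problem
plain = [0.5, 0.5]
augmented = [[0.9, 0.1], [0.7, 0.3]]
blended = tta_average(plain, augmented)
```

Here the undecided plain prediction is pulled toward the first class because the augmented views agree on it.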

And now we can submit them:

In [None]:
sample_df['label'] = preds.argmax(dim=-1).numpy()
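One subtlety: `argmax` returns positions in the `DataLoaders` vocab, not the competition's original integer labels. Assuming the vocab is the alphabetically sorted list of label strings (the default behavior, as far as I know), the two happen to line up for this dataset, which the sketch below checks. `lbl2idx` and `pred_positions` are hypothetical names used only for illustration:

```python
idx2lbl = {0: "Cassava Bacterial Blight (CBB)",
           1: "Cassava Brown Streak Disease (CBSD)",
           2: "Cassava Green Mottle (CGM)",
           3: "Cassava Mosaic Disease (CMD)",
           4: "Healthy"}

# Assumption: the category vocab is the sorted list of unique label strings
vocab = sorted(set(idx2lbl.values()))

# Inverse mapping from readable label back to the competition's integer id
lbl2idx = {lbl: idx for idx, lbl in idx2lbl.items()}

# Hypothetical argmax output: positions into `vocab`
pred_positions = [4, 0, 3]
competition_ids = [lbl2idx[vocab[p]] for p in pred_positions]
```

If the label names sorted differently, going through an inverse dictionary like this would be the safer route than submitting the raw `argmax` values.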

In [None]:
sample_df.to_csv('submission.csv',index=False)

And that's it! We looked at a few of the neat tricks fastai can offer while also taking a look at how the `DataBlock` API can be used for such a problem.

If you enjoyed this notebook or it helped you get started, please leave an upvote, and if you have any questions, please leave a comment! Thanks!