# Training a State-of-the-Art Model

## Imagenette

In [None]:
from fastai.vision.all import *
path = untar_data(URLs.IMAGENETTE)

In [None]:
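# Presizing: resize every item to 460 px first, then augment and crop to 224 px per batch
dblock = DataBlock(blocks=(ImageBlock(), CategoryBlock()),
                   get_items=get_image_files,
                   get_y=parent_label,   # the label is the name of the parent folder
                   item_tfms=Resize(460),
                   batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = dblock.dataloaders(path, bs=64)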

epoch,train_loss,valid_loss,accuracy,time
0,1.583403,2.064317,0.401792,01:03
1,1.208877,1.260106,0.601568,01:02
2,0.925265,1.036154,0.664302,01:03
3,0.73019,0.700906,0.777819,01:03
4,0.585707,0.54181,0.825243,01:03


## Normalization

(TensorImage([0.4842, 0.4711, 0.4511], device='cuda:5'),
 TensorImage([0.2873, 0.2893, 0.3110], device='cuda:5'))

In [None]:
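def get_dls(bs, size):
    "Build Imagenette `DataLoaders` with batch size `bs` and final image size `size`"
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       item_tfms=Resize(460),
                       # Augment and crop to `size`, then normalize with the ImageNet statistics
                       batch_tfms=[*aug_transforms(size=size, min_scale=0.75),
                                   Normalize.from_stats(*imagenet_stats)])
    return dblock.dataloaders(path, bs=bs)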

In [None]:
dls = get_dls(64, 224)

(TensorImage([-0.0787,  0.0525,  0.2136], device='cuda:5'),
 TensorImage([1.2330, 1.2112, 1.3031], device='cuda:5'))
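
The effect of `Normalize.from_stats(*imagenet_stats)` can be sketched in plain Python. This is an illustration of the arithmetic only, not fastai's implementation: each channel has a mean subtracted and is divided by a standard deviation. The constants are the ImageNet statistics that `imagenet_stats` holds; the helper names `normalize_pixel` and `denormalize_pixel` are hypothetical.

```python
# ImageNet per-channel statistics (RGB), the values exposed by fastai as `imagenet_stats`.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Return (value - mean) / std for each channel of one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


def denormalize_pixel(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Invert normalize_pixel: value * std + mean for each channel."""
    return [v * s + m for v, m, s in zip(pixel, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
centered = normalize_pixel(IMAGENET_MEAN)

# Normalizing and then denormalizing recovers the original pixel (up to rounding).
original = [0.2, 0.5, 0.9]
round_trip = denormalize_pixel(normalize_pixel(original))
```

Because the statistics come from ImageNet rather than from this dataset, a normalized batch should end up with channel means near 0 and standard deviations near 1, not exactly at those values.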

epoch,train_loss,valid_loss,accuracy,time
0,1.632865,2.250024,0.391337,01:02
1,1.294041,1.579932,0.517177,01:02
2,0.960535,1.069164,0.657207,01:04
3,0.73022,0.767433,0.771845,01:05
4,0.577889,0.550673,0.824496,01:06


## Progressive Resizing

epoch,train_loss,valid_loss,accuracy,time
0,1.902943,2.447006,0.401419,00:30
1,1.315203,1.572992,0.525765,00:30
2,1.001199,0.767886,0.759149,00:30
3,0.765864,0.665562,0.797984,00:30


epoch,train_loss,valid_loss,accuracy,time
0,0.985213,1.654063,0.565721,01:06


epoch,train_loss,valid_loss,accuracy,time
0,0.706869,0.689622,0.784541,01:07
1,0.739217,0.928541,0.712472,01:07
2,0.629462,0.788906,0.764003,01:07
3,0.491912,0.502622,0.836445,01:06
4,0.41488,0.431332,0.863331,01:06


## Test Time Augmentation

0.8737863898277283
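
The idea behind test time augmentation can be sketched without any library: predict on several augmented views of the same image and average the class probabilities. The sketch below is a toy under stated assumptions, not what fastai's `Learner.tta` does internally (which may choose and weight the views differently). The names `tta_predict`, `toy_predict` and `flip` are hypothetical, and the "image" is just a list of floats.

```python
def tta_predict(predict, image, augmentations):
    """Average class probabilities over the original image and its augmented views."""
    views = [image] + [aug(image) for aug in augmentations]
    all_probs = [predict(view) for view in views]
    n_classes = len(all_probs[0])
    return [sum(p[c] for p in all_probs) / len(all_probs) for c in range(n_classes)]


def toy_predict(image):
    """Toy two-class 'model': the first value of the image is the probability of class 0."""
    return [image[0], 1 - image[0]]


def flip(image):
    """Toy augmentation: reverse the image."""
    return image[::-1]


image = [0.9, 0.1]
single = toy_predict(image)                    # prediction from one view only
averaged = tta_predict(toy_predict, image, [flip])  # prediction averaged over two views
```

Each extra view costs one more forward pass, so inference takes longer in exchange for predictions that depend less on a single crop or orientation.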

## Mixup
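
As a rough illustration of the technique, Mixup blends two training examples and their one-hot targets with the same random weight, drawn from a Beta distribution. The sketch below uses only the standard library; `mixup_pair` is a hypothetical helper, `alpha=0.4` is an illustrative value rather than a library default, and the inputs are plain lists standing in for image tensors.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Blend two (input, one-hot target) pairs with a weight lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)

    def mix(a, b):
        # Linear combination: lam of the first item plus (1 - lam) of the second.
        return [lam * u + (1 - lam) * v for u, v in zip(a, b)]

    return mix(x1, x2), mix(y1, y2), lam


rng = random.Random(0)  # seeded only to make the sketch reproducible
x, y, lam = mixup_pair([1.0, 1.0], [1.0, 0.0],   # "image" and target of class 0
                       [0.0, 0.0], [0.0, 1.0],   # "image" and target of class 1
                       rng=rng)
```

The blended target `y` is no longer one-hot: it holds `lam` for the first class and `1 - lam` for the second, so the model is asked to predict how much of each image is present.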

### Sidebar: Papers and Math

### End sidebar

## Label Smoothing
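
A minimal sketch of how smoothed targets can be built, assuming the common formulation in which every class receives `eps / N` and the true class receives `1 - eps + eps / N`. The function name `smooth_targets` and the value `eps=0.1` are illustrative choices, not taken from this notebook.

```python
def smooth_targets(true_index, n_classes, eps=0.1):
    """Return label-smoothed targets: eps/N everywhere, 1 - eps + eps/N on the true class."""
    off_value = eps / n_classes
    targets = [off_value] * n_classes
    targets[true_index] = 1 - eps + off_value
    return targets


# Four classes, true class at index 2: three small values and one large one.
targets = smooth_targets(true_index=2, n_classes=4, eps=0.1)
```

The targets still sum to 1, but the true class is no longer exactly 1, which discourages the model from pushing its predictions toward extreme confidence.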

### Sidebar: Label Smoothing, the Paper

### End sidebar

## Conclusion

## Questionnaire

1. What is the difference between ImageNet and Imagenette? When is it better to experiment on one versus the other?
1. What is normalization?
1. Why didn't we have to care about normalization when using a pretrained model?
1. What is progressive resizing?
1. Implement progressive resizing in your own project. Did it help?
1. What is test time augmentation? How do you use it in fastai?
1. Is using TTA at inference slower or faster than regular inference? Why?
1. What is Mixup? How do you use it in fastai?
1. Why does Mixup prevent the model from being too confident?
1. Why does training with Mixup for five epochs end up worse than training without Mixup?
1. What is the idea behind label smoothing?
1. What problems in your data can label smoothing help with?
1. When using label smoothing with five categories, what is the target associated with the index 1?
1. What is the first step to take when you want to prototype quick experiments on a new dataset?

### Further Research

1. Use the fastai documentation to build a function that crops an image to a square in each of the four corners, then implement a TTA method that averages the predictions on a center crop and those four crops. Did it help? Is it better than the TTA method of fastai?
1. Find the Mixup paper on arXiv and read it. Pick one or two more recent articles introducing variants of Mixup and read them, then try to implement them on your problem.
1. Find the script training Imagenette using Mixup and use it as an example to build a script for a long training on your own project. Execute it and see if it helps.
1. Read the sidebar "Label Smoothing, the Paper", look at the relevant section of the original paper and see if you can follow it. Don't be afraid to ask for help!