**INITIALIZATION**
- I put these three lines at the top of every notebook. The first two reload imported modules automatically whenever their source changes, which prevents stale code when rerunning the same project. The third line renders Matplotlib visualizations inside the notebook.

In [1]:
#@ INITIALIZATION: 
%reload_ext autoreload
%autoreload 2
%matplotlib inline

**LIBRARIES AND DEPENDENCIES**
- I install and import all the libraries and dependencies required for the project in one place.

In [3]:
#@ INSTALLING DEPENDENCIES: UNCOMMENT BELOW: 
# !pip install -Uqq fastbook
# import fastbook
# fastbook.setup_book()

In [4]:
#@ IMPORTING LIBRARIES AND DEPENDENCIES: 
from fastbook import *                                  # Getting the fastbook Utilities. 
from fastai.callback.fp16 import *                      # Getting Mixed Precision Support. 
from fastai.vision.all import *                         # Getting the fastai Vision Libraries.

**GETTING THE DATASET**
- I will use the **Imagenette** dataset here, a 10-class subset of ImageNet.

In [5]:
#@ GETTING THE DATASET: 
path = untar_data(URLs.IMAGENETTE)                      # Path to the Dataset. 
path.ls()                                               # Inspecting the Dataset. 

(#3) [Path('/root/.fastai/data/imagenette2/train'),Path('/root/.fastai/data/imagenette2/val'),Path('/root/.fastai/data/imagenette2/noisy_imagenette.csv')]

**DATABLOCK AND DATALOADERS**
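- I will load the dataset into a **DataLoaders** object using **Presizing**: each image is first resized to a larger size (460 pixels here), and the augmentations and the final crop to the training size (224 pixels) are then applied as batch transforms.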

In [7]:
#@ INITIALIZING DATABLOCK AND DATALOADERS:
dblock = DataBlock(blocks=(ImageBlock(), CategoryBlock()),        # Initializing Image and Category Blocks. 
                   get_items=get_image_files,                     # Getting Images. 
                   get_y=parent_label,                            # Getting Labels. 
                   item_tfms=Resize(460),                         # Presizing: Resizing each Image to 460. 
                   batch_tfms=aug_transforms(size=224, 
                                             min_scale=0.75))     # Augmenting and Cropping to 224 per Batch. 
dls = dblock.dataloaders(path, bs=64)                             # Initializing DataLoaders with Batch Size 64. 

**TRAINING THE MODEL**

In [9]:
#@ TRAINING THE MODEL: BASELINE:
model = xresnet50()                                               # Initializing the Model. 
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(), 
                metrics=accuracy).to_fp16()                       # Initializing the Learner. 
learn.fit_one_cycle(5, 3e-3)                                      # Training the Model. 

epoch,train_loss,valid_loss,accuracy,time
0,1.630768,2.106363,0.429052,02:12
1,1.248832,1.514785,0.532487,02:13
2,0.974968,0.999128,0.700149,02:13
3,0.738964,0.789645,0.748693,02:15
4,0.583899,0.583539,0.820762,02:15


**NORMALIZATION**
- Training is easier when the input data is normalized, i.e. when each channel has a mean of 0 and a standard deviation of 1.
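
As a minimal sketch of what normalization does, the plain-Python snippet below computes the per-channel mean and standard deviation of a tiny made-up batch and rescales each channel with them. The helper names `channel_stats` and `normalize` and the toy values are illustrative assumptions, not part of fastai; in this notebook, `Normalize.from_stats` applies the same subtract-and-divide step to real tensors.

```python
from statistics import fmean, pstdev

def channel_stats(batch):
    """Return per-channel (means, stds) for a batch shaped [image][channel][pixel]."""
    n_channels = len(batch[0])
    means, stds = [], []
    for c in range(n_channels):
        # Pool every pixel of channel c across the whole batch.
        values = [v for image in batch for v in image[c]]
        means.append(fmean(values))
        stds.append(pstdev(values))  # population SD, so the rescaled SD is 1
    return means, stds

def normalize(batch, means, stds):
    """Subtract the channel mean and divide by the channel SD."""
    return [[[(v - means[c]) / stds[c] for v in channel]
             for c, channel in enumerate(image)]
            for image in batch]

# Toy batch: 2 images, 3 channels, 4 pixels per channel (hypothetical values).
batch = [
    [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.2, 0.2, 0.4, 0.4]],
    [[0.5, 0.6, 0.7, 0.8], [0.1, 0.2, 0.3, 0.4], [0.6, 0.6, 0.8, 0.8]],
]
means, stds = channel_stats(batch)
normalized = normalize(batch, means, stds)
new_means, new_stds = channel_stats(normalized)  # close to 0 and 1 per channel
```

The sketch normalizes with statistics computed from the batch itself. The notebook instead passes the precomputed `imagenet_stats`, which is likely why the batch statistics it prints after normalization are near, rather than exactly at, 0 and 1.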

In [10]:
#@ GETTING MEAN AND STANDARD DEVIATION: 
x, y = dls.one_batch()                          # Getting a Batch of Data. 
x.mean(dim=[0, 2, 3]), x.std(dim=[0, 2, 3])     # Getting per Channel Mean and SD. 

(TensorImage([0.4465, 0.4698, 0.4701], device='cuda:0'),
 TensorImage([0.2860, 0.2840, 0.3174], device='cuda:0'))

In [11]:
#@ INITIALIZING NORMALIZATION: 
def get_dls(bs, size):
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),                     # Initializing Image and Category Blocks.
                       get_items=get_image_files,                              # Getting Images. 
                       get_y=parent_label,                                     # Getting Labels.
                       item_tfms=Resize(460),                                  # Resizing Images.
                       batch_tfms=[*aug_transforms(size=size, min_scale=0.75), # Augmenting and Cropping to Size.
                                   Normalize.from_stats(*imagenet_stats)])     # Normalizing with ImageNet Statistics. 
    return dblock.dataloaders(path, bs=bs)                                     # Getting DataLoaders. 

#@ GETTING DATALOADERS: 
dls = get_dls(64, 224)                                            # Getting DataLoaders. 
#@ GETTING MEAN AND STANDARD DEVIATION: 
x, y = dls.one_batch()                                            # Getting a Batch of Data. 
x.mean(dim=[0, 2, 3]), x.std(dim=[0, 2, 3])                       # Getting Mean and SD. 

(TensorImage([-0.0286,  0.0227,  0.1717], device='cuda:0'),
 TensorImage([1.2934, 1.3130, 1.3981], device='cuda:0'))

In [12]:
#@ TRAINING THE MODEL: AFTER NORMALIZATION: 
model = xresnet50()                                               # Initializing the Model. 
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(), 
                metrics=accuracy).to_fp16()                       # Initializing the Learner. 
learn.fit_one_cycle(5, 3e-3)                                      # Training the Model. 

epoch,train_loss,valid_loss,accuracy,time
0,1.654204,2.848061,0.401419,02:14
1,1.247107,1.241298,0.613891,02:14
2,0.951346,1.396691,0.634802,02:13
3,0.747125,0.74411,0.771471,02:13
4,0.582027,0.594142,0.820388,02:14


**PROGRESSIVE RESIZING**
- **Progressive Resizing** means starting training with small images and gradually switching to larger and larger images as training progresses. 
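
The schedule itself can be sketched independently of any library. In the snippet below, `make_loaders` and `fit` are hypothetical stand-ins for whatever builds the data loaders and trains the model (in this notebook, `get_dls` and the `Learner` methods play those roles), so only the control flow is meant to carry over.

```python
def progressive_resizing(stages, make_loaders, fit):
    """Train the same model through stages of increasing image size.

    stages: list of (image_size, batch_size, epochs), smallest size first.
    make_loaders, fit: hypothetical callables supplied by the caller.
    """
    sizes = [size for size, _, _ in stages]
    if sizes != sorted(sizes):
        raise ValueError("stages must be ordered from small to large images")
    for size, batch_size, epochs in stages:
        loaders = make_loaders(batch_size, size)  # rebuild the data at the new size
        fit(loaders, epochs)                      # keep training the same weights

# Stand-ins that only record what they were asked to do.
log = []

def fake_make_loaders(batch_size, size):
    return {"bs": batch_size, "size": size}

def fake_fit(loaders, epochs):
    log.append((loaders["size"], loaders["bs"], epochs))

# Roughly the stages used in this notebook: 128 px first, then 224 px.
progressive_resizing([(128, 128, 4), (224, 64, 5)], fake_make_loaders, fake_fit)
```

Larger images usually need a smaller batch size to fit in memory, which is why each stage carries its own batch size.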

In [13]:
#@ TRAINING THE MODEL: SMALL IMAGES: 
dls = get_dls(128, 128)                                           # Getting DataLoaders. 
learn = Learner(dls,xresnet50(),loss_func=CrossEntropyLossFlat(), 
                metrics=accuracy).to_fp16()                       # Initializing Learner. 
learn.fit_one_cycle(4, 3e-3)                                      # Training the Model. 

epoch,train_loss,valid_loss,accuracy,time
0,1.887902,2.439564,0.415235,02:00
1,1.311301,1.190988,0.6236,01:59
2,0.970431,0.912942,0.711352,01:59
3,0.757482,0.686185,0.786034,02:00


In [14]:
#@ TRAINING THE MODEL: LARGE IMAGES: 
learn.dls = get_dls(64, 224)                                      # Switching to Larger Images. 
learn.fine_tune(5, 1e-3)                                          # Fine-tuning: 1 Initial Epoch, then 5 More. 

epoch,train_loss,valid_loss,accuracy,time
0,0.854225,0.900401,0.712845,02:14


epoch,train_loss,valid_loss,accuracy,time
0,0.673041,0.746189,0.76699,02:15
1,0.645187,0.793768,0.759895,02:16
2,0.597342,0.772277,0.76475,02:15
3,0.481323,0.48045,0.851755,02:14
4,0.441154,0.445771,0.865198,02:14


**TEST TIME AUGMENTATION**
- **Test Time Augmentation** (TTA) creates multiple versions of each image using data augmentation during inference or validation, and then takes the average or the maximum of the predictions across those augmented versions. 
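
A minimal sketch of the idea in plain Python, covering both the average and the maximum variants. The `toy_predict` function and the two toy augmentations are hypothetical stand-ins for a trained model and real image transforms; `learn.tta()` in this notebook applies the idea to real batches.

```python
def tta_predict(predict, image, augmentations, reduce="mean"):
    """Combine class probabilities over the original and augmented images."""
    versions = [image] + [aug(image) for aug in augmentations]
    all_probs = [predict(v) for v in versions]
    per_class = list(zip(*all_probs))  # group the predictions by class
    if reduce == "mean":
        return [sum(p) / len(p) for p in per_class]
    if reduce == "max":
        return [max(p) for p in per_class]
    raise ValueError("reduce must be 'mean' or 'max'")

# Hypothetical 1-D "image" and a toy two-class "model", for illustration only.
def toy_predict(image):
    score = image[0]                  # looks only at the first pixel
    return [score, 1.0 - score]

def flip(image):
    return image[::-1]

def darken(image):
    return [0.5 * v for v in image]

image = [0.8, 0.4]
avg_probs = tta_predict(toy_predict, image, [flip, darken])
max_probs = tta_predict(toy_predict, image, [flip, darken], reduce="max")
```

Averaging keeps the result a valid probability distribution, while taking the maximum does not necessarily sum to 1.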

In [15]:
#@ INSPECTING ACCURACY: TEST TIME AUGMENTATION: 
preds, targs = learn.tta()                          # Implementation of TTA. 
accuracy(preds, targs).item()                       # Getting Accuracy. 

0.8711724877357483

**MIXUP AUGMENTATION**
- **Mixup Augmentation** works as follows for each image: 
    - Select another image from the dataset at random. 
    - Pick a weight at random. 
    - Take a weighted average of the selected image and your image using that weight. This is the independent variable. 
    - Take a weighted average of the selected image's label and your image's label using the same weight. This is the dependent variable. 
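
The four steps above can be sketched in plain Python on flattened pixel lists and one-hot labels. Drawing the weight from a Beta distribution is an assumption borrowed from the original mixup paper, since the text only says the weight is random; `blend`, `mixup`, and the `alpha` parameter are hypothetical names for illustration.

```python
import random

def blend(a, b, lam):
    """Weighted average of two equal-length lists."""
    return [lam * x + (1 - lam) * y for x, y in zip(a, b)]

def mixup(dataset, index, alpha=0.4, rng=random):
    """Return a mixed (image, label, weight) triple for dataset[index].

    dataset: list of (pixels, one_hot_label) tuples.
    alpha: Beta distribution parameter (assumed, not from the notebook).
    """
    image, label = dataset[index]
    other_image, other_label = rng.choice(dataset)   # step 1: random partner
    lam = rng.betavariate(alpha, alpha)              # step 2: random weight
    mixed_image = blend(image, other_image, lam)     # step 3: independent variable
    mixed_label = blend(label, other_label, lam)     # step 4: dependent variable
    return mixed_image, mixed_label, lam

# Toy dataset: two "images" of 3 pixels with one-hot labels for 2 classes.
dataset = [([1.0, 1.0, 1.0], [1.0, 0.0]),
           ([0.0, 0.0, 0.0], [0.0, 1.0])]
mixed_image, mixed_label, lam = mixup(dataset, 0, rng=random.Random(0))
```

A mixed label such as `[0.7, 0.3]` tells the model that the input is 70% one class and 30% the other, so the labels must be one-hot encoded before they can be averaged.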

In [17]:
#@ TRAINING THE MODEL WITH MIXUP AUGMENTATION: REQUIRES MORE EPOCHS: 
model = xresnet50()                                               # Initializing the Model. 
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(), 
                metrics=accuracy, cbs=MixUp).to_fp16()            # Initializing the Learner. 
learn.fit_one_cycle(5, 3e-3)                                      # Training the Model. 

epoch,train_loss,valid_loss,accuracy,time
0,2.161976,2.607004,0.365198,01:59
1,1.699805,1.532858,0.56236,02:00
2,1.47142,1.041581,0.665795,01:59
3,1.284537,0.951021,0.703137,01:58
4,1.182544,0.708046,0.790142,02:00


**LABEL SMOOTHING**
- **Label Smoothing** replaces the 1s in the one-hot encoded labels with a number slightly less than 1 and the 0s with a number slightly more than 0 during training. This makes training more robust to mislabeled data and results in a model that generalizes better at inference. 
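
A small sketch of how the smoothed targets and the resulting loss can be computed. It assumes the common formulation in which a smoothing parameter `eps` is spread evenly over all `n_classes` classes, so each 0 becomes `eps / n_classes` and the 1 becomes `1 - eps + eps / n_classes`. The function names, the model outputs, and `eps=0.1` are illustrative choices, not taken from the notebook.

```python
import math

def smooth_labels(target, n_classes, eps=0.1):
    """One-hot encode `target`, then spread `eps` of the mass over all classes."""
    off_value = eps / n_classes
    labels = [off_value] * n_classes
    labels[target] = 1.0 - eps + off_value
    return labels

def cross_entropy(probs, soft_labels):
    """Cross-entropy between predicted probabilities and (possibly soft) labels."""
    return -sum(t * math.log(p) for p, t in zip(probs, soft_labels))

labels = smooth_labels(target=2, n_classes=4, eps=0.1)
# labels is approximately [0.025, 0.025, 0.925, 0.025]

confident = [0.001, 0.001, 0.997, 0.001]   # hypothetical model outputs
moderate = [0.03, 0.03, 0.91, 0.03]
loss_confident = cross_entropy(confident, labels)
loss_moderate = cross_entropy(moderate, labels)
```

With smoothed targets, the extremely confident prediction receives a higher loss than the moderately confident one, which is how label smoothing discourages overconfidence.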

In [18]:
#@ TRAINING THE MODEL WITH LABEL SMOOTHING: REQUIRES MORE EPOCHS: 
model = xresnet50()                                               # Initializing the Model. 
learn = Learner(dls,model,loss_func=LabelSmoothingCrossEntropy(), 
                metrics=accuracy).to_fp16()                       # Initializing the Learner. 
learn.fit_one_cycle(5, 3e-3)                                      # Training the Model. 

epoch,train_loss,valid_loss,accuracy,time
0,2.746474,3.716041,0.340553,01:59
1,2.235328,2.053193,0.643391,02:00
2,1.954926,2.016363,0.651606,02:00
3,1.756929,1.641183,0.789768,01:59
4,1.615035,1.590236,0.809186,02:00
