# <span id='0'></span> Understanding Learner.freeze and freeze_to 
- [How to use `freeze` and `freeze_to`](#use)
- [Prepare Data](#1)
- [Model Res18](#2)
- [Model Res34](#3)
- [Model Res152](#4)
- [DenseNet121](#densenet)

## <span id='use'></span> How to use `freeze` and `freeze_to`

`freeze` Docs:
> Freeze up to last layer group.    
> Sets every layer group except the last to untrainable (i.e. requires_grad=False).

What does 'the last layer group' mean?    
In transfer learning, for example `learn = cnn_learner(data, models.resnet18, metrics=error_rate)`, printing `learn.model` shows two large groups of layers, (0) Sequential and (1) Sequential, in the structure below. We can treat the last conv layer as the dividing line between the two groups.
```
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace)
    ...
    
            (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
             (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): Flatten()
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25)
    (4): Linear(in_features=1024, out_features=512, bias=True)
    (5): ReLU(inplace)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5)
    (8): Linear(in_features=512, out_features=12, bias=True)
  )
)
```

`learn.freeze` freezes the first group and leaves the second (last) group free to train. The last group contains several layers, which is why it is called a 'group', as the `learn.summary()` output below shows. BatchNorm layers inside the frozen group still show `Trainable = True` because fastai keeps batchnorm layers trainable by default (`train_bn=True`). For how to read the table, see the [model summary docs](https://docs.fast.ai/callbacks.hooks.html#model_summary).

```
======================================================================
Layer (type)         Output Shape         Param #    Trainable 
======================================================================
...
...
...
______________________________________________________________________
Conv2d               [1, 512, 4, 4]       2,359,296  False     
______________________________________________________________________
BatchNorm2d          [1, 512, 4, 4]       1,024      True      
______________________________________________________________________
AdaptiveAvgPool2d    [1, 512, 1, 1]       0          False     
______________________________________________________________________
AdaptiveMaxPool2d    [1, 512, 1, 1]       0          False     
______________________________________________________________________
Flatten              [1, 1024]            0          False     
______________________________________________________________________
BatchNorm1d          [1, 1024]            2,048      True      
______________________________________________________________________
Dropout              [1, 1024]            0          False     
______________________________________________________________________
Linear               [1, 512]             524,800    True      
______________________________________________________________________
ReLU                 [1, 512]             0          False     
______________________________________________________________________
BatchNorm1d          [1, 512]             1,024      True      
______________________________________________________________________
Dropout              [1, 512]             0          False     
______________________________________________________________________
Linear               [1, 12]              6,156      True      
______________________________________________________________________

Total params: 11,710,540
Total trainable params: 543,628
Total non-trainable params: 11,166,912
```
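
The totals in this summary can be checked by hand. The sketch below is only an illustration: it assumes the standard torchvision ResNet-18 layout for the BatchNorm2d channel counts and the 12-class head printed earlier, and the helper names `linear_params` and `batchnorm_params` are hypothetical, not part of fastai. It suggests that the trainable count is the head's parameters plus the BatchNorm2d parameters of the frozen body.

```python
# Parameter counts for the layers that stay trainable after learn.freeze().
# Helper names are illustrative only; they are not part of fastai.

def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """One scale (weight) and one shift (bias) per feature."""
    return 2 * n_features

# Head: the (1) Sequential printed earlier.
head = (batchnorm_params(1024) + linear_params(1024, 512)
        + batchnorm_params(512) + linear_params(512, 12))

# Body: BatchNorm2d layers of a standard ResNet-18 (assumed torchvision layout).
# Stem norm, then per stage: 2 blocks x 2 norms, plus one downsample norm
# in each of stages 2 to 4.
body_bn_channels = [64] + [64] * 4 + [128] * 5 + [256] * 5 + [512] * 5
body_bn = sum(batchnorm_params(c) for c in body_bn_channels)

trainable = head + body_bn
total = 11_710_540  # "Total params" from the summary

print(head, body_bn, trainable, total - trainable)
```

Under these assumptions the head contributes 534,028 parameters and the body's BatchNorm2d layers contribute 9,600, which together match the 543,628 trainable parameters reported in the summary.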

`freeze_to(n:int)` Docs:
> Freeze layers up to layer group `n`.    

How to understand the use of integer `n`?     
If you experiment with the `learn` object from `learn.freeze` above, you will come to the following conclusions:
- `freeze()` is equivalent to `freeze_to(-1)`: every layer group except the last is frozen and cannot be trained.
- `freeze_to(-3)` is equivalent to `unfreeze()`: all trainable parameters are ready to train.
- `freeze_to(-2)` freezes only the first part of the conv layers in the (0) Sequential; the rest of (0) and all of (1) remain trainable.
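
These conclusions follow from ordinary Python slicing, if we assume (as appears to be the case in fastai v1) that `freeze_to(n)` freezes `layer_groups[:n]`, trains `layer_groups[n:]`, and that `cnn_learner` splits these models into three layer groups. The toy function below is a minimal sketch of that behaviour, not fastai's implementation; the group contents and layer names are made up for illustration.

```python
# Toy model of freeze_to: groups before index n are frozen,
# groups from n onward are trainable. BatchNorm layers are left trainable
# when train_bn is True (assumed to mirror the fastai default).

def freeze_to(layer_groups, n, train_bn=True):
    """Return {layer_name: trainable} after freezing layer_groups[:n]."""
    state = {}
    for group in layer_groups[:n]:
        for name, is_bn in group:
            state[name] = train_bn and is_bn
    for group in layer_groups[n:]:
        for name, _ in group:
            state[name] = True
    return state

# Three hypothetical groups: early body, late body, head.
# Each layer is (name, is_batchnorm).
groups = [
    [("conv_a", False), ("bn_a", True)],
    [("conv_b", False), ("bn_b", True)],
    [("bn_head", True), ("linear_head", False)],
]

frozen = freeze_to(groups, -1)    # what freeze() does
partial = freeze_to(groups, -2)   # only the first group is frozen
unfrozen = freeze_to(groups, -3)  # groups[:-3] is empty, so nothing is frozen

print(frozen)
print(partial)
print(unfrozen == freeze_to(groups, 0))
```

With three groups, `groups[:-3]` and `groups[:0]` are both empty, which is why `freeze_to(-3)` behaves like `unfreeze()` in this sketch.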

The four models below verify this understanding.

# <span id='1'></span> Prepare Data

In [None]:
import fastai.vision as fv

In [None]:
fv.__version__

In [None]:
path_test =  fv.Path('/kaggle/input/test');
path_train = fv.Path('/kaggle/input/train'); path_train.ls()

In [None]:
fv.np.random.seed(1)

### Create the DataBunch

data = fv.ImageDataBunch.from_folder(path_train,
                                  test=path_test, 
                                  ds_tfms=fv.get_transforms(),
                                  valid_pct=0.25,
                                  size=128, 
                                  bs=32,
                                  num_workers=0)
data.normalize(fv.imagenet_stats)
data

[back](#0)

# <span id='2'></span> Model Res18

In [None]:
learn = fv.cnn_learner(data, 
                      fv.models.resnet18, 
                      metrics=fv.error_rate,
                      model_dir="/kaggle/working/")

In [None]:
learn.save('start')
!ls .

In [None]:
learn.load('start')
learn.summary()

In [None]:
learn.load('start')
learn.freeze()
learn.summary() # only Linear and BatchNorm layers are trainable

In [None]:
learn.load('start')
learn.freeze_to(-1) # same as learn.freeze()
learn.summary()

In [None]:
learn.model 
# it has two large Sequential blocks

In [None]:
learn.load('start')
learn.freeze_to(-2)
learn.summary() # it seems about half of the conv layers are trainable

In [None]:
learn.unfreeze()
learn.summary() # all trainable layers are free to train

[back](#0)

# <span id='3'></span> Model Res34

In [None]:
learn = fv.cnn_learner(data, 
                      fv.models.resnet34, 
                      metrics=fv.error_rate,
                      model_dir="/kaggle/working/")

In [None]:
learn.save('start34')
!ls .

In [None]:
learn.load('start34')
learn.summary()

In [None]:
learn.load('start34')
learn.freeze()
learn.summary() 

In [None]:
learn.load('start34')
learn.freeze_to(-1) # same as learn.freeze()
learn.summary()

In [None]:
learn.model 
# it has two large Sequential blocks;
# after freeze(), 553,628 parameters are trainable:
# the last block plus the BatchNorm params in the first Sequential

In [None]:
learn.load('start34')
learn.freeze_to(-2)
learn.summary() # params of the first 16 conv layers are frozen

In [None]:
learn.load('start34')
learn.freeze_to(-3)
learn.summary() # same as unfreeze(): all layer groups are trainable

In [None]:
learn.load('start34')
learn.unfreeze()
learn.summary() # all trainable layers are free to train

[back](#0)

# <span id='4'></span> Model Res152

In [None]:
learn = fv.cnn_learner(data, 
                      fv.models.resnet152, 
                      metrics=fv.error_rate,
                      model_dir="/kaggle/working/")

In [None]:
learn.summary()

In [None]:
learn.model # still just two overarching Sequential blocks

[back](#0)

# <span id='densenet'></span> DenseNet121

In [None]:
learn = fv.cnn_learner(data, 
                      fv.models.densenet121, 
                      metrics=fv.error_rate,
                      model_dir="/kaggle/working/")

In [None]:
learn.summary()

In [None]:
learn.model

[back](#0)