This notebook explains how our freeze.py module works and how to freeze layers during training. It has three parts:

1. Get the model and, optionally, a pretrained checkpoint, and set up the appropriate arguments, as in our notebook 0_The_EDSRx4_model.ipynb. (This part can be skipped if it has already been done.)
2. Inspect the trainable parameters of the model and see how to prevent them from updating during training (freezing).
3. Set up the repository to use the freeze.py module.

We will use the following directories:

In [1]:
import os

dir_base = os.getcwd()
dir_edsrpytorch = os.path.join(dir_base,"EDSR-PyTorch")     # ./EDSR-PyTorch
dir_src = os.path.join(dir_edsrpytorch,"src")               # ./EDSR-PyTorch/src
dir_pretrain = os.path.join(dir_edsrpytorch, "pre-train")   # ./EDSR-PyTorch/pre-train

# 1. Getting the model ready

## 1.1 Getting the repository

In [2]:
# Cloning the repository as instructed in https://github.com/sanghyun-son/EDSR-PyTorch
!git clone https://github.com/thstkdgus35/EDSR-PyTorch {dir_edsrpytorch}

Cloning into '/content/EDSR-PyTorch'...
remote: Enumerating objects: 806, done.
remote: Total 806 (delta 0), reused 0 (delta 0), pack-reused 806
Receiving objects: 100% (806/806), 63.09 MiB | 30.40 MiB/s, done.
Resolving deltas: 100% (516/516), done.


## 1.2 Get a pretrained model

In [4]:
import os
import urllib.request

# Pre-train folder
if not os.path.exists(dir_pretrain):
    os.makedirs(dir_pretrain, exist_ok=True)

# Pretrained model
pretrain_model = "edsr_x4-4f62e9ef.pt"
pretrain_model_path = os.path.join(dir_pretrain, pretrain_model)

# Download it if not present
if not os.path.isfile(pretrain_model_path):
  url = "https://cv.snu.ac.kr/research/EDSR/models/edsr_x4-4f62e9ef.pt"
  with urllib.request.urlopen(url) as response, open(pretrain_model_path, 'wb') as out_file:
    data = response.read()
    out_file.write(data)
  print(f"Pretrained model {pretrain_model} has been downloaded inside {dir_pretrain}")
else :
  print(f"Using pretrained model {pretrain_model_path}")

Pretrained model edsr_x4-4f62e9ef.pt has been downloaded inside /content/EDSR-PyTorch/pre-train
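
The cell above reads the whole checkpoint into memory before writing it. As an optional variant, the sketch below copies the response in chunks and renames the file only once the copy has finished. The function name download_if_missing and the ".part" suffix are our own choices for illustration; they are not part of the EDSR-PyTorch repository.

```python
import os
import shutil
import urllib.request


def download_if_missing(url, dest_path, chunk_size=1 << 20):
    """Download url to dest_path unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.isfile(dest_path):
        return False
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    tmp_path = dest_path + ".part"  # hypothetical suffix for the partial file
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out_file:
        # Copy in chunks instead of reading the whole file into memory
        shutil.copyfileobj(response, out_file, chunk_size)
    # Rename only after the copy finished, so an interrupted download
    # is never mistaken for a complete checkpoint
    os.replace(tmp_path, dest_path)
    return True
```

With the names used in the cell above, the call would be download_if_missing(url, pretrain_model_path).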


## 1.3 Move current working directory to the src folder

In [5]:
## Change the current working directory to EDSR-PyTorch/src, to run the model
import os

os.chdir(dir_src)
print(f"The Current working directory is: {os.getcwd()}")

The Current working directory is: /content/EDSR-PyTorch/src


We can change the current working directory back to the base directory at any time by uncommenting and running:

In [6]:
#os.chdir(dir_base)
#print(f"The Current working directory is: {os.getcwd()}")

## 1.4 Setting up the option.py and template.py files

In [7]:
# Setting up the option.py file
"""
NOTE: This will always try to read the option.py file from a file
named as option-backup.py, which corresponds to the original
option.py from the EDSR-PyTorch repository.

If such file is not found, the code below will assume it's a fresh
clone of the repository, and it will make it.
"""

import os

# Line we want to add
new_line = 'parser.add_argument("-f", "--file", required=False)\n\n'

# File paths
option_file_path = os.path.join(dir_src,'option.py')                 # Original
option_file_path_backup = os.path.join(dir_src,'option-backup.py')   # Back-up

# If the back-up file is there, read it, or make it if not
if os.path.isfile(option_file_path_backup):
  # Read the content of the back-up file
  with open(option_file_path_backup, 'r') as file:
      lines = file.readlines()
else:
  # Read the content of the original file
  with open(option_file_path, 'r') as file:
      lines = file.readlines()
  # Write the content to the back-up file
  with open(option_file_path_backup, 'w') as file:
      file.writelines(lines)

# Insert the new line at the desired location (between lines 145 and 146)
lines.insert(145, new_line)

# Write the modified content back to the file
with open(option_file_path, 'w') as file:
    file.writelines(lines)

print(f"Option file successfully updated in {option_file_path}")

Option file successfully updated in /content/EDSR-PyTorch/src/option.py
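
The same "read from the back-up, insert, write back" pattern is used for every file we patch in this notebook. A minimal sketch of that pattern as one reusable function is shown below. The name insert_lines_with_backup is hypothetical and only meant to illustrate the idea; the cells in this notebook keep the explicit version.

```python
import os


def insert_lines_with_backup(path, backup_path, index, new_lines):
    """Insert new_lines into the file at path, starting at list index `index`.

    The pristine content is always read from backup_path when it exists;
    otherwise the current file is treated as pristine and backed up first.
    Because we always start from the back-up, re-running the cell does not
    insert the same lines twice.
    """
    source = backup_path if os.path.isfile(backup_path) else path
    with open(source, "r") as f:
        lines = f.readlines()
    if source == path:
        # First run on a fresh clone: keep a copy of the original file
        with open(backup_path, "w") as f:
            f.writelines(lines)
    # Slice assignment inserts the whole block at once, keeping its order
    lines[index:index] = new_lines
    with open(path, "w") as f:
        f.writelines(lines)
    return lines
```

For the cell above, the equivalent call would be insert_lines_with_backup(option_file_path, option_file_path_backup, 145, [new_line]).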


In [8]:
# Setting up the template.py file
"""
NOTE: This will always try to read the template.py file from a file
named as template-backup.py, which corresponds to the original
template.py from the EDSR-PyTorch repository.

If such file is not found, the code below will assume it's a fresh
clone of the repository, and it will make it.
"""

# Lines to add to the file
template = [
    '    if args.template.find(\'EDSR_custom\') >= 0:\n',
    '        args.dir_data =  \"../image-data\"\n',
    '        args.data_train = \"Custom\"\n',
    '        args.data_test = \"Custom\"\n',
    '        args.data_range = \"1-2400/2401-2500\"\n',
    '        args.ext = \"sep\"\n',
    '        args.scale = \"4\"\n',
    '        args.model = \"EDSR\"\n',
    '        args.pre_train = \"../pre-train/edsr_x4-4f62e9ef.pt\"\n',
    '        args.n_resblocks = 32\n',
    '        args.n_feats = 256\n',
    '        args.res_scale = 0.1\n',
    '        args.test_every = 100\n',
    '        args.epochs = 11\n',
    '        args.batch_size = 16\n',
    '        args.save = \"edsr_x4-train\"\n'
]
#Note: spaces/tabulations are important.

# Specify the file path
template_file_path = os.path.join(dir_src,'template.py')

# Have a back-up of the original template.py
template_file_path_backup = os.path.join(dir_src,'template-backup.py')

# If the back-up file is there, read it, or make it if not
if os.path.isfile(template_file_path_backup):
  # Read the content of the back-up file
  with open(template_file_path_backup, 'r') as file:
      lines = file.readlines()
else:
  # Read the content of the original file
  with open(template_file_path, 'r') as file:
      lines = file.readlines()
  # Write the content to the back-up file
  with open(template_file_path_backup, 'w') as file:
      file.writelines(lines)

# Insert the new lines at the desired location (on line 54)
for idx,line in enumerate(template):
  lines.insert(54+idx, line)

# Write the modified content back to the file
with open(template_file_path, 'w') as file:
    file.writelines(lines)

print(f"Template successfully added to {template_file_path}")

Template successfully added to /content/EDSR-PyTorch/src/template.py


## 1.5 Setting up the arguments args

In [9]:
from option import args
import template

# Use the desired template of args:
args.template = "EDSR_custom"
template.set_template(args)

# Set-up args as in the option.py file, after setting the template:
args.scale = list(map(lambda x: int(x), args.scale.split('+')))
args.data_train = args.data_train.split('+')
args.data_test = args.data_test.split('+')

if args.epochs == 0:
  args.epochs = 1e8

for arg in vars(args):
  if vars(args)[arg] == 'True':
    vars(args)[arg] = True
  elif vars(args)[arg] == 'False':
    vars(args)[arg] = False

# Set-up additional parameters
args.chop = True
args.save_results = True
args.param_to_freeze = ""
args.body_to_freeze = ""
args.tail_to_freeze = ""
args.print_frozen_param = True
args.save_models_each = 3

# 2. Looking for the trainable parameters on the model

## 2.1 Loading the model

In [10]:
import torch

import utility
import model

torch.manual_seed(args.seed)
checkpoint = utility.checkpoint(args)

# Load the model
model = model.Model(args, checkpoint)

# Print it (optional)
#print(model)

Making model...
Load the model from ../pre-train/edsr_x4-4f62e9ef.pt


## 2.2 Model parameters

Once the model is loaded, we can see its "named parameters" by using the named_parameters() method, as shown below:

In [11]:
for name,value in model.named_parameters():
  print(name)

model.sub_mean.weight
model.sub_mean.bias
model.add_mean.weight
model.add_mean.bias
model.head.0.weight
model.head.0.bias
model.body.0.body.0.weight
model.body.0.body.0.bias
model.body.0.body.2.weight
model.body.0.body.2.bias
model.body.1.body.0.weight
model.body.1.body.0.bias
model.body.1.body.2.weight
model.body.1.body.2.bias
model.body.2.body.0.weight
model.body.2.body.0.bias
model.body.2.body.2.weight
model.body.2.body.2.bias
model.body.3.body.0.weight
model.body.3.body.0.bias
model.body.3.body.2.weight
model.body.3.body.2.bias
model.body.4.body.0.weight
model.body.4.body.0.bias
model.body.4.body.2.weight
model.body.4.body.2.bias
model.body.5.body.0.weight
model.body.5.body.0.bias
model.body.5.body.2.weight
model.body.5.body.2.bias
model.body.6.body.0.weight
model.body.6.body.0.bias
model.body.6.body.2.weight
model.body.6.body.2.bias
model.body.7.body.0.weight
model.body.7.body.0.bias
model.body.7.body.2.weight
model.body.7.body.2.bias
model.body.8.body.0.weight
model.body.8.body.0

The listing shows the "weight" and "bias" parameters of each layer of the model. The layers are:

* Sub_mean
* Add_mean
* Head
* Body: 32 Resblocks (from 0 to 31) and a last Conv2d layer (32). Each Resblock is also sub-divided into two more sub-layers (0 and 2)
* Tail: Upsampler (0) divided in two sub-layers (0 and 2), and a final Conv2d layer (1).
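
The structure in the list above can also be recovered from the parameter names themselves, since each name has the form model.<block>.<index>... The sketch below groups a few names by block using plain string handling. The helper group_by_block is hypothetical, and the sample list contains only a handful of names copied from the listing, not the full model.

```python
def group_by_block(param_names):
    """Map each top-level block to the sorted sub-layer indices seen under it."""
    groups = {}
    for name in param_names:
        parts = name.split(".")  # e.g. ['model', 'body', '0', 'body', '0', 'weight']
        block = parts[1]
        indices = groups.setdefault(block, set())
        # Blocks such as sub_mean have no numeric sub-layer index
        if len(parts) > 3 and parts[2].isdigit():
            indices.add(int(parts[2]))
    return {block: sorted(idx) for block, idx in groups.items()}


# A few names copied from the listing above
sample = [
    "model.sub_mean.weight",
    "model.head.0.weight",
    "model.head.0.bias",
    "model.body.0.body.0.weight",
    "model.body.0.body.2.bias",
    "model.body.1.body.0.weight",
]
print(group_by_block(sample))
# {'sub_mean': [], 'head': [0], 'body': [0, 1]}
```

With the loaded model, the input would be the names yielded by model.named_parameters().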

## 2.3 Disabling gradients on the parameters

We can "freeze" a parameter by setting its requires_grad attribute to False. For example, to disable gradients for the head layer:

In [13]:
parameter_to_freeze = "head"

# We want:
#model.head.0.weight.requires_grad = False
#model.head.0.bias.requires_grad = False

# We do:
for name,value in model.named_parameters():
  if name.split(".")[1] == parameter_to_freeze :
    value.requires_grad = False

# We could print them again to see
for name,value in model.named_parameters():
  print(f"Is {name} trainable? {value.requires_grad}")


Is model.sub_mean.weight trainable? False
Is model.sub_mean.bias trainable? False
Is model.add_mean.weight trainable? False
Is model.add_mean.bias trainable? False
Is model.head.0.weight trainable? False
Is model.head.0.bias trainable? False
Is model.body.0.body.0.weight trainable? True
Is model.body.0.body.0.bias trainable? True
Is model.body.0.body.2.weight trainable? True
Is model.body.0.body.2.bias trainable? True
Is model.body.1.body.0.weight trainable? True
Is model.body.1.body.0.bias trainable? True
Is model.body.1.body.2.weight trainable? True
Is model.body.1.body.2.bias trainable? True
Is model.body.2.body.0.weight trainable? True
Is model.body.2.body.0.bias trainable? True
Is model.body.2.body.2.weight trainable? True
Is model.body.2.body.2.bias trainable? True
Is model.body.3.body.0.weight trainable? True
Is model.body.3.body.0.bias trainable? True
Is model.body.3.body.2.weight trainable? True
Is model.body.3.body.2.bias trainable? True
Is model.body.4.body.0.weight trainabl
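
Note that in the output above sub_mean and add_mean already report False, although we only selected head. To summarize the result as numbers instead of reading a long listing, we can count how many values are trainable and how many are frozen. The sketch below is one way to do it. It only relies on numel() and requires_grad, so it accepts model.named_parameters() directly; the FakeParam class and its sizes are stand-ins so that the example runs without PyTorch, and they are not the real model.

```python
def count_parameters(named_params):
    """Return (trainable, frozen) element counts.

    Works with anything that yields (name, param) pairs where param has
    numel() and requires_grad, such as model.named_parameters() in PyTorch.
    """
    trainable = frozen = 0
    for _name, param in named_params:
        if param.requires_grad:
            trainable += param.numel()
        else:
            frozen += param.numel()
    return trainable, frozen


class FakeParam:
    """Stand-in for a tensor so the sketch runs without PyTorch (illustrative only)."""

    def __init__(self, n, requires_grad=True):
        self._n = n
        self.requires_grad = requires_grad

    def numel(self):
        return self._n


# Illustrative sizes only: two frozen head entries and one trainable body entry
fake = [
    ("model.head.0.weight", FakeParam(6912, requires_grad=False)),
    ("model.head.0.bias", FakeParam(256, requires_grad=False)),
    ("model.body.0.body.0.weight", FakeParam(589824)),
]
print(count_parameters(fake))
# (589824, 7168)
```

With the loaded model, the call would be count_parameters(model.named_parameters()).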

## 2.4 Visualizing it with torchinfo

NOTE: If torchinfo is not installed, we can install it in Google Colab by doing:

In [None]:
# Checking for torchinfo
try:
  from torchinfo import summary
except ImportError:
  !pip install -q torchinfo
  from torchinfo import summary

In [None]:
from torchinfo import summary

batch_size = 16
image_dim = (3, 64, 64)

summary(model=model,
        input_size=(batch_size, *image_dim),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        cache_forward_pass=True,
        depth=5,
        mode="train",
        row_settings=["var_names"],
        idx_scale=0
)

The last column, "Trainable", shows whether each layer is trainable or frozen.

# 3. Setting up the freeze.py module

We prepared a freeze.py file with a freeze_model() function that selects which parameters of the model to freeze, if any.

To use it, we need to:
* Place the freeze.py file inside of the src folder.
* Modify the option.py file in order to add the arguments required to use the freeze_model() function.
* Modify the main.py file in order to insert the use of the freeze_model() function.
* Pass the corresponding parameters to the command to train the model on the terminal.

## 3.1 Changes to the option.py file

We need to add the following parameters:
* param_to_freeze: names of the parameters to freeze (sub_mean, add_mean, head, body or tail)
* body_to_freeze: layers of the body to freeze
* tail_to_freeze: layers of the tail to freeze
* print_frozen_param: if True, prints the number of parameters whose gradients are disabled during training
* torchinfo_summary: if True, prints a summary of the model structure using torchinfo
* torchinfo_inputsize: input size to use in the summary function from torchinfo

We can do it with the following code:

In [14]:
"""
NOTE: This will always try to read the option.py file from a file
named as option-backup.py, which corresponds to the original
option.py from the EDSR-PyTorch repository.

If such file is not found, the code below will assume it's a fresh
clone of the repository, and it will make it.
"""

import os

# 1. Add this line because the argument parser was giving errors on Google Colab
new_line = 'parser.add_argument("-f", "--file", required=False)\n\n'

# 2. Add to the arg parser the arguments for freezing the model before training
freezing_arguments = [
    "# Transfer Learning specifications\n",
    "parser.add_argument('--param_to_freeze', type=str, default='',\n",
    "                    help='named parameters in the model to freeze during training')\n",
    "parser.add_argument('--body_to_freeze', type=str, default='',\n",
    "                    help='range of layers to freeze in the body of the model during training')\n",
    "parser.add_argument('--tail_to_freeze', type=str, default='',\n",
    "                    help='range of layers to freeze in the tail of the model during training')\n",
    "parser.add_argument('--print_frozen_param', action='store_true',\n",
    "                    help='print the number of parameters from the model frozen during training')\n",
    "parser.add_argument('--torchinfo_summary', action='store_true',\n",
    "                    help='print the model summary using torchinfo')\n",
    "parser.add_argument('--torchinfo_inputsize', type=str, default='510,339',\n",
    "                    help='input size of a test image to use for printing torchinfo summary')\n\n"
]
#Note: spaces/tabulations are important.


# File paths
option_file_path = os.path.join(dir_src,'option.py')                 # Original
option_file_path_backup = os.path.join(dir_src,'option-backup.py')   # Back-up

# If the back-up file is there, read it, or make it if not
if os.path.isfile(option_file_path_backup):
  # Read the content of the back-up file
  with open(option_file_path_backup, 'r') as file:
      lines = file.readlines()
else:
  # Read the content of the original file
  with open(option_file_path, 'r') as file:
      lines = file.readlines()
  # Write the content to the back-up file
  with open(option_file_path_backup, 'w') as file:
      file.writelines(lines)

# 1. Insert new_line at the desired location (between lines 145 and 146)
lines.insert(145, new_line)

# 2. Insert the freezing arguments lines at the desired location (line 145)
for idx,line in enumerate(freezing_arguments):
  lines.insert(145+idx, line)

# Write the modified content back to the file
with open(option_file_path, 'w') as file:
    file.writelines(lines)

print(f"Option file successfully updated in {option_file_path}")

Option file successfully updated in /content/EDSR-PyTorch/src/option.py
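
To check how these arguments behave without running main.py, we can declare them on a standalone parser and parse a sample command line. The sketch below copies the six add_argument calls from the cell above; the parser itself and the sample values are only for illustration and are separate from the parser in option.py.

```python
import argparse

# Standalone parser holding only the freezing arguments (illustration only)
parser = argparse.ArgumentParser(description="Freezing options (standalone sketch)")
parser.add_argument('--param_to_freeze', type=str, default='',
                    help='named parameters in the model to freeze during training')
parser.add_argument('--body_to_freeze', type=str, default='',
                    help='range of layers to freeze in the body of the model during training')
parser.add_argument('--tail_to_freeze', type=str, default='',
                    help='range of layers to freeze in the tail of the model during training')
parser.add_argument('--print_frozen_param', action='store_true',
                    help='print the number of parameters from the model frozen during training')
parser.add_argument('--torchinfo_summary', action='store_true',
                    help='print the model summary using torchinfo')
parser.add_argument('--torchinfo_inputsize', type=str, default='510,339',
                    help='input size of a test image to use for printing torchinfo summary')

# Parse an explicit list instead of sys.argv, so this also works inside a notebook
demo_args = parser.parse_args([
    "--param_to_freeze", "sub_mean+add_mean+head+body",
    "--body_to_freeze", "0-32",
    "--print_frozen_param",
])
print(demo_args.param_to_freeze)      # sub_mean+add_mean+head+body
print(demo_args.print_frozen_param)   # True  (flag was given)
print(demo_args.torchinfo_summary)    # False (flag was omitted)
print(demo_args.torchinfo_inputsize)  # 510,339 (default)
```

The two store_true flags default to False, so freezing statistics and the torchinfo summary are only printed when the flags are passed explicitly.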


## 3.2 Changes to the main.py file

We need to import the freeze module and apply the freeze_model() function to the model before passing it to the Trainer in the main.py script. We can do it with:

In [17]:
"""
NOTE: This will always try to read the main.py file from a file
named as main-backup.py, which corresponds to the original
main.py from the EDSR-PyTorch repository.

If such file is not found, the code below will assume it's a fresh
clone of the repository, and it will make it.
"""

import os

# Lines to add
new_line1 = "import freeze\n"
new_line2 = "            _model = freeze.freeze_model(_model, args)\n"
#Note: spaces/tabulations are important.


# File paths
main_file_path = os.path.join(dir_src,'main.py')                 # Original
main_file_path_backup = os.path.join(dir_src,'main-backup.py')   # Back-up

# If the back-up file is there, read it, or make it if not
if os.path.isfile(main_file_path_backup):
  # Read the content of the back-up file
  with open(main_file_path_backup, 'r') as file:
      lines = file.readlines()
else:
  # Read the content of the original file
  with open(main_file_path, 'r') as file:
      lines = file.readlines()
  # Write the content to the back-up file
  with open(main_file_path_backup, 'w') as file:
      file.writelines(lines)

# Insert the new lines at the desired location (from bottom to top)
lines.insert(23, new_line2)
lines.insert(8, new_line1)

# Write the modified content back to the file
with open(main_file_path, 'w') as file:
    file.writelines(lines)

print(f"Main file successfully updated in {main_file_path}")

Main file successfully updated in /content/EDSR-PyTorch/src/main.py


## 3.3 Pass the arguments to the command on the terminal for a training session

For example, to run a training session that freezes all the layers except the tail, we can run a command like the following in the terminal:

In [None]:
# Command to train the model (from the src folder)
!python main.py --template EDSR_custom --param_to_freeze sub_mean+add_mean+head+body --body_to_freeze "0-32" --print_frozen_param --save_models --chop
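
To make the effect of these flags more concrete, the sketch below shows one way such values could be turned into a list of parameter names to freeze. The freeze.py file is not reproduced in this notebook, so this is not its actual code. It assumes that block names are separated by "+", that a range such as "0-32" is inclusive on both ends, and that an empty body range means the whole body. The helpers parse_range and names_to_freeze are hypothetical, and tail_to_freeze is left out for brevity.

```python
def parse_range(spec):
    """Turn 'a-b' into an inclusive range of ints; '' means no layers (assumed format)."""
    if not spec:
        return range(0)
    start, _, end = spec.partition("-")
    end = end or start  # allow a single index such as '5'
    return range(int(start), int(end) + 1)


def names_to_freeze(param_names, param_to_freeze, body_to_freeze=""):
    """Select the parameter names matched by the (assumed) flag semantics."""
    blocks = set(filter(None, param_to_freeze.split("+")))
    body_layers = parse_range(body_to_freeze)
    selected = []
    for name in param_names:
        parts = name.split(".")  # e.g. ['model', 'body', '0', 'body', '0', 'weight']
        block = parts[1]
        if block not in blocks:
            continue
        # Restrict the body to the requested layers; an empty range keeps all of it
        if block == "body" and body_to_freeze and int(parts[2]) not in body_layers:
            continue
        selected.append(name)
    return selected


names = [
    "model.head.0.weight",
    "model.body.0.body.0.weight",
    "model.body.5.body.0.weight",
    "model.tail.1.bias",
]
print(names_to_freeze(names, "head+body", "0-3"))
# ['model.head.0.weight', 'model.body.0.body.0.weight']
```

Each selected name would then have requires_grad set to False on its parameter, as in section 2.3.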