# torch.no_grad() vs. param.requires_grad

- **torch.no_grad()** -> disables gradient tracking for everything computed inside the block
    - A context manager: operations run inside it are not recorded by autograd, so no backward pass can go through them. It does not change the `requires_grad` flag of any parameter.
    - Suitable for the **eval phase**, or for modules that only do feature extraction in the forward pass and are never updated by a backward pass.
- **param.requires_grad** -> freeze parameters
    - Setting `param.requires_grad = False` explicitly freezes the parameters of selected modules (layers): no gradients are computed for them, so they receive no updates.
    - Works at the parameter / layer / module level.
    - More flexible: part of a model can be frozen while the rest keeps training.
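To make the contrast concrete, here is a minimal sketch on a toy `nn.Linear` layer (the layer and its sizes are illustrative assumptions, not part of the BERT example in this notebook). It shows that an output computed under `torch.no_grad()` cannot be backpropagated at all, while freezing a single parameter still lets gradients reach the others.

```python
import torch
from torch import nn

layer = nn.Linear(4, 2)   # toy layer, assumed only for illustration
x = torch.randn(3, 4)

# Case 1: forward under no_grad. No graph is recorded, so backward fails.
with torch.no_grad():
    y = layer(x)
try:
    y.sum().backward()
    backward_failed = False
except RuntimeError:
    backward_failed = True

# Case 2: freeze only the weight. Backward still runs for the bias.
layer.weight.requires_grad = False
layer(x).sum().backward()
weight_grad = layer.weight.grad   # stays None: the weight is frozen
bias_grad = layer.bias.grad       # populated: the bias is still trainable

print(backward_failed, weight_grad, bias_grad)
```

In case 2 the bias gradient is the batch size for each output unit, because the loss is a plain sum over three rows.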

In [1]:
from transformers import BertModel
import torch
from torch import nn

In [9]:
model_name = 'bert-base-uncased'

bert = BertModel.from_pretrained(model_name)

In [4]:
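def calc_learnable_params(model):
    """Count the parameters that autograd will compute gradients for."""
    total_param = 0
    for name, param in model.named_parameters():
        # Only parameters with requires_grad=True are trainable.
        if param.requires_grad:
            total_param += param.numel()
    return total_param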

In [5]:
calc_learnable_params(bert)

109482240

In [6]:
with torch.no_grad():
    print(calc_learnable_params(bert))

109482240
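The count is unchanged because `torch.no_grad()` only switches off graph recording; the parameter flags stay as they were. A small sketch of what does change inside the block, again using a toy `nn.Linear` as an assumed stand-in for the real model:

```python
import torch
from torch import nn

layer = nn.Linear(4, 2)   # toy stand-in, assumed only for illustration
x = torch.randn(3, 4)

with torch.no_grad():
    grad_mode_inside = torch.is_grad_enabled()                    # recording is off
    flags_inside = [p.requires_grad for p in layer.parameters()]  # flags untouched
    y_inside = layer(x)                                           # no graph attached

grad_mode_outside = torch.is_grad_enabled()                       # back on after the block
y_outside = layer(x)                                              # graph attached again

print(grad_mode_inside, flags_inside, y_inside.grad_fn)
print(grad_mode_outside, y_outside.grad_fn is not None)
```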


In [7]:
# Freeze every parameter of the model (bert.requires_grad_(False) does the same in one call).
for name, param in bert.named_parameters():
    if param.requires_grad:
        param.requires_grad = False

In [8]:
calc_learnable_params(bert)

0
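Freezing everything is the extreme case. The flexibility of `requires_grad` shows when only part of a model is frozen. The sketch below is an assumption-laden toy: `encoder` and `head` are hypothetical small layers standing in for a pretrained backbone and a task head, so that it runs without downloading BERT.

```python
import torch
from torch import nn

torch.manual_seed(0)

encoder = nn.Linear(8, 16)   # hypothetical stand-in for a pretrained backbone
head = nn.Linear(16, 2)      # hypothetical task-specific head
model = nn.Sequential(encoder, nn.ReLU(), head)

def count_trainable(m):
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

before = count_trainable(model)    # encoder (144) + head (34)
encoder.requires_grad_(False)      # freeze the backbone only
after = count_trainable(model)     # head only

# Hand the optimizer only the parameters that are still trainable.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)
encoder_weight_before = encoder.weight.clone()
head_bias_before = head.bias.clone()

loss = model(torch.randn(4, 8)).sum()
loss.backward()
optimizer.step()

encoder_unchanged = torch.equal(encoder.weight, encoder_weight_before)
head_changed = not torch.equal(head.bias, head_bias_before)
print(before, after, encoder_unchanged, head_changed)
```

Filtering the parameter list passed to the optimizer is optional here, since frozen parameters never receive a `.grad`, but it keeps the optimizer state limited to what is actually trained.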