# Containers

## Module

In [1]:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

In [2]:
model = Model()
model

Model(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(20, 20, kernel_size=(5, 5), stride=(1, 1))
)

### 1 `Module.add_module()`

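`add_module(name, module)` registers `module` as a child of the current module under `name`, after which it can be accessed as an attribute with that name. This is particularly useful when sub-modules must be added dynamically, for example when the structure of the network is only determined at runtime.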

In [3]:
class MyModel(nn.Module):

    def __init__(self):
        super(MyModel, self).__init__()

        self.add_module('linear1', nn.Linear(10, 20))
        self.add_module('relu', nn.ReLU())
        self.add_module('linear2', nn.Linear(20, 5))

    def forward(self, x):
        x = self.linear1(x)
        x = self.relu(x)
        x = self.linear2(x)

        return x

model = MyModel()
print(model)


MyModel(
  (linear1): Linear(in_features=10, out_features=20, bias=True)
  (relu): ReLU()
  (linear2): Linear(in_features=20, out_features=5, bias=True)
)


In [4]:
linear_layer_1 = model.linear1
linear_layer_2 = model.linear2
relu_layer = model.relu

linear_layer_1, linear_layer_2, relu_layer


(Linear(in_features=10, out_features=20, bias=True),
 Linear(in_features=20, out_features=5, bias=True),
 ReLU())
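
The runtime-determined case mentioned above can be sketched as follows. This is a minimal illustration, not part of the original notebook: the class name `DynamicMLP`, the `layer_sizes` argument, and the `linear{i}` naming scheme are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class DynamicMLP(nn.Module):  # hypothetical class, for illustration only
    def __init__(self, layer_sizes):
        super().__init__()
        # Register one Linear layer per consecutive pair of sizes
        for i, (n_in, n_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
            self.add_module(f"linear{i}", nn.Linear(n_in, n_out))
        self.num_layers = len(layer_sizes) - 1

    def forward(self, x):
        for i in range(self.num_layers):
            x = getattr(self, f"linear{i}")(x)
            # Apply ReLU between layers, but not after the last one
            if i < self.num_layers - 1:
                x = torch.relu(x)
        return x

dynamic_model = DynamicMLP([10, 20, 20, 5])
out = dynamic_model(torch.randn(3, 10))
print(out.shape)  # torch.Size([3, 5])
```

For a plain stack of layers like this one, `nn.Sequential` or `nn.ModuleList` would usually be the simpler choice; `add_module` is mainly useful when the names themselves need to be generated.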

### 2 `Module.apply()`

`Module.apply(fn)` calls `fn` on every submodule recursively (as returned by `.children()`) and then on the module itself, and returns the module. This makes it useful for tasks such as initializing, modifying, or inspecting the parameters and layers of a model.

In this example, `model.apply(custom_weights_initialization)` passes every submodule of `MyModel` to `custom_weights_initialization`, which acts only on `nn.Linear` layers and initializes their weights with Xavier normal initialization.

Because `apply` handles the traversal, the function only has to decide what to do with each submodule it receives.

In [7]:
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        self.fc1 = nn.Linear(10, 5)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(5, 2)

# Function to initialize the weights of linear layers with custom initialization
def custom_weights_initialization(module):
    if isinstance(module, nn.Linear):
        # xavier_normal_ is the in-place initializer; xavier_normal is deprecated
        nn.init.xavier_normal_(module.weight)

model = MyModel()
model.apply(custom_weights_initialization)
print(model)


MyModel(
  (fc1): Linear(in_features=10, out_features=5, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=5, out_features=2, bias=True)
)
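
To make the traversal order concrete, the sketch below records the class name of every module that `apply` visits. The names `net`, `visited`, and `record` are illustrative and not part of the original notebook.

```python
import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

visited = []

def record(m):
    # Called once per submodule, then once for the top-level module
    visited.append(type(m).__name__)

returned = net.apply(record)
print(visited)  # ['Linear', 'ReLU', 'Linear', 'Sequential']
```

The children are visited first and the container last, and `apply` returns the module it was called on, which is why the notebook cells end by displaying the model.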




#### Other examples for `Module.apply`

#### 2.1 Parameter Inspection or Modification

`apply` can inspect or modify the parameters of certain layers based on a condition, for instance to freeze the parameters of specific layers during training.

In [9]:
def freeze_parameters(m):
    if isinstance(m, nn.Linear):
        m.weight.requires_grad = False
        m.bias.requires_grad = False

model.apply(freeze_parameters)

MyModel(
  (fc1): Linear(in_features=10, out_features=5, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=5, out_features=2, bias=True)
)
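
One way to check that freezing had the intended effect is to count trainable parameters before and after. The sketch below rebuilds an equivalent small network under the assumed name `net` so that it runs on its own.

```python
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))

def count_trainable(module):
    # Sum the element counts of parameters that still receive gradients
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

trainable_before = count_trainable(net)

def freeze_linear(m):
    if isinstance(m, nn.Linear):
        for p in m.parameters(recurse=False):
            p.requires_grad_(False)

net.apply(freeze_linear)
trainable_after = count_trainable(net)

print(trainable_before, trainable_after)  # 67 0
```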

#### 2.2 Logging or Printing information

`apply` can print or log information about the layers in a model, such as the input and output feature sizes of each linear layer.

In [11]:
def log_information(m):
    if isinstance(m, nn.Linear):
        print(f"Layer: {m.__class__.__name__}, Input Shape: {m.in_features}, Output Shape: {m.out_features}")

model.apply(log_information)

Layer: Linear, Input Shape: 10, Output Shape: 5
Layer: Linear, Input Shape: 5, Output Shape: 2


MyModel(
  (fc1): Linear(in_features=10, out_features=5, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=5, out_features=2, bias=True)
)
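
`in_features` and `out_features` are layer attributes rather than tensor shapes. If the shapes of the actual input and output tensors are wanted, one option is to use `apply` to attach forward hooks and then run a batch through the model. This is a sketch under that assumption; `net`, `shapes`, `shape_hook`, and `attach_hook` are illustrative names.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))

shapes = []

def shape_hook(module, inputs, output):
    # `inputs` is a tuple of positional arguments passed to forward()
    shapes.append((type(module).__name__, tuple(inputs[0].shape), tuple(output.shape)))

def attach_hook(m):
    if isinstance(m, nn.Linear):
        m.register_forward_hook(shape_hook)

net.apply(attach_hook)
net(torch.randn(4, 10))  # a batch of 4 samples triggers the hooks

for name, in_shape, out_shape in shapes:
    print(f"Layer: {name}, Input Shape: {in_shape}, Output Shape: {out_shape}")
```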

#### 2.3 Custom Layer Operations

`apply` can also call a custom method on specific types of layers in the model. The cell below is left commented out because `MyCustomLayer` is not defined in this notebook.

In [12]:
# def custom_operation(m):
#     if isinstance(m, MyCustomLayer):
#         m.custom_function()

# model.apply(custom_operation)
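
A runnable version of this pattern might look like the following. The layer `ScaledLinear` and its `set_scale` method are hypothetical stand-ins for `MyCustomLayer` and `custom_function`, invented only to show the mechanics.

```python
import torch
import torch.nn as nn

class ScaledLinear(nn.Module):  # hypothetical custom layer
    def __init__(self, n_in, n_out):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)
        self.scale = 1.0

    def set_scale(self, value):  # hypothetical custom method
        self.scale = value

    def forward(self, x):
        return self.scale * self.linear(x)

custom_net = nn.Sequential(ScaledLinear(10, 5), nn.ReLU(), ScaledLinear(5, 2))

def custom_operation(m):
    # Only the custom layers are touched; every other module is skipped
    if isinstance(m, ScaledLinear):
        m.set_scale(0.5)

custom_net.apply(custom_operation)
scales = [m.scale for m in custom_net.modules() if isinstance(m, ScaledLinear)]
print(scales)  # [0.5, 0.5]
```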

#### 2.4 Learning Rate Scheduling

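Learning rates are stored in the optimizer's parameter groups, not on the modules themselves (an `nn.Module` has no `optimizer` attribute). `apply` can still help by collecting the layers that should share a schedule, which are then given their own parameter groups.

In [14]:
import torch

original_learning_rate = 0.001
epoch = 70
scheduled_layers = []

def collect_layers(m):
    # This model has no Conv2d layers, so nn.Linear is matched instead
    if isinstance(m, nn.Linear):
        scheduled_layers.append(m)

model.apply(collect_layers)
# One parameter group per collected layer; SGD is an illustrative choice
optimizer = torch.optim.SGD([{'params': layer.parameters()} for layer in scheduled_layers], lr=original_learning_rate)
for group in optimizer.param_groups:
    group['lr'] = original_learning_rate * (0.1 ** (epoch // 10))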
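
The step decay in the formula above (multiply the rate by 0.1 every 10 epochs) is also available as the built-in `torch.optim.lr_scheduler.StepLR`. The sketch below assumes a single `nn.Linear` layer and an SGD optimizer purely for illustration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(10, 5)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.001)

# Decay the learning rate by a factor of 0.1 every 10 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(70):
    # A real training loop would compute a loss and call backward() here
    optimizer.step()
    scheduler.step()

final_lr = scheduler.get_last_lr()[0]
print(final_lr)  # approximately 0.001 * 0.1 ** 7
```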

#### 2.5 Model Structure modification

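`apply` can also modify the structure of a model based on certain conditions or parameters. Note that the example below only registers a `Dropout` as a child of each `nn.Linear`; `nn.Linear.forward` never calls it, so the model's output is unchanged.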

In [15]:
def add_drop_out_layers(m):
    if isinstance(m, nn.Linear):
        m.add_module('dropout', nn.Dropout(0.5))

model.apply(add_drop_out_layers)

MyModel(
  (fc1): Linear(
    in_features=10, out_features=5, bias=True
    (dropout): Dropout(p=0.5, inplace=False)
  )
  (relu): ReLU()
  (fc2): Linear(
    in_features=5, out_features=2, bias=True
    (dropout): Dropout(p=0.5, inplace=False)
  )
)
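
Registering a `Dropout` inside an `nn.Linear` changes the printed module tree but not the computation, because `nn.Linear.forward` does not call its children. If the dropout should actually run, one possible approach is to replace each linear layer with an `nn.Sequential` that wraps it. The names `SmallNet` and `wrap_linear_with_dropout` are assumptions made for this sketch.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):  # illustrative model mirroring MyModel
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 5)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(5, 2)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))

def wrap_linear_with_dropout(parent, p=0.5):
    # Copy the child list first, because children are replaced while looping
    for name, child in list(parent.named_children()):
        if isinstance(child, nn.Linear):
            setattr(parent, name, nn.Sequential(child, nn.Dropout(p)))
        else:
            wrap_linear_with_dropout(child, p)

small_net = SmallNet()
wrap_linear_with_dropout(small_net)
print(small_net)

wrapped_out = small_net(torch.randn(3, 10))
```

Since `forward` refers to `self.fc1` and `self.fc2` by name, it picks up the wrapped versions without any change to the class.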

### 3 `bfloat16()`

Casts all floating point parameters and buffers to the `bfloat16` datatype. The cast happens in place and the method returns the module itself.
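
A small sketch of the cast, assuming an illustrative two-layer network named `bf16_net`. The batch norm layer is included because it carries both floating point buffers and an integer buffer, which shows that only the floating point tensors are converted.

```python
import torch
import torch.nn as nn

bf16_net = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8))
returned_net = bf16_net.bfloat16()

weight_dtype = bf16_net[0].weight.dtype
running_mean_dtype = bf16_net[1].running_mean.dtype
counter_dtype = bf16_net[1].num_batches_tracked.dtype

print(weight_dtype)        # torch.bfloat16
print(running_mean_dtype)  # torch.bfloat16
print(counter_dtype)       # torch.int64 (integer buffers are left unchanged)
```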

### 4 `buffers(recurse=True)`

`buffers(recurse=True)` returns an iterator over the buffers of an `nn.Module`, including those of its submodules when `recurse=True`. Buffers are tensors registered with `register_buffer` that store persistent state which is not a parameter: they are not updated by the optimizer, but they are part of the module's state.

In this example, the `MyModel` class registers two buffers (`running_mean` and `running_var`). Iterating over `model.buffers(recurse=True)` then yields every buffer in the model, including any in its submodules. This is useful for inspecting or manipulating non-learnable state across the entire model.

The purpose of using buffers, and subsequently the buffers method, includes:

i. Maintaining State: Buffers are useful for maintaining state information that is not a learnable parameter. For example, in batch normalization layers, running mean and running variance are typically stored as buffers.

ii. Inspection and Modification: The buffers method allows you to iterate over buffers, inspect their values, and potentially modify them. This can be useful for tasks like resetting or updating certain states during training or inference.

iii. Serialization and Saving: Buffers are included in the model's `state_dict` by default, so they are saved and loaded with the model. They are part of the model's persistent state.

In [4]:
import torch
import torch.nn as nn

class MyModel(nn.Module):

    def __init__(self):
        super(MyModel, self).__init__()

        self.register_buffer('running_mean', torch.zeros(5))
        self.register_buffer('running_var', torch.ones(5))

        self.layer = nn.Linear(5, 2)

model = MyModel()



for buffer in model.buffers(recurse=True):
    print(f"Buffer Values: {buffer}")

Buffer Values: tensor([0., 0., 0., 0., 0.])
Buffer Values: tensor([1., 1., 1., 1., 1.])
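
The points listed above can be checked with `named_buffers()`, `parameters()`, and `state_dict()`. The sketch below uses a hypothetical class `StatsModel` with one extra buffer registered with `persistent=False`, to show that such a buffer is still returned by `buffers()` but is left out of the `state_dict`.

```python
import torch
import torch.nn as nn

class StatsModel(nn.Module):  # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.register_buffer('running_mean', torch.zeros(5))
        # Non-persistent buffers are kept on the module but not serialized
        self.register_buffer('scratch', torch.zeros(5), persistent=False)
        self.layer = nn.Linear(5, 2)

stats_model = StatsModel()

buffer_names = [name for name, _ in stats_model.named_buffers()]
param_names = [name for name, _ in stats_model.named_parameters()]
state_keys = set(stats_model.state_dict().keys())

print(buffer_names)  # ['running_mean', 'scratch']
print(param_names)   # ['layer.weight', 'layer.bias']
print(sorted(state_keys))
```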


### 5 `children()`

`children()` returns an iterator over the immediate child modules of an `nn.Module`. It yields only the modules themselves; `named_children()` yields `(name, module)` pairs instead.

- `Children`: In the context of PyTorch modules, "children" are the immediate submodules contained within a module, not the submodules nested inside them.

In [6]:
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        # Submodules
        self.layer1 = nn.Linear(10, 5)
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(5, 2)

# Instantiate the model
model = MyModel()

for child in model.children():
    print(f"Child Module : {child}")

Child Module : Linear(in_features=10, out_features=5, bias=True)
Child Module : ReLU()
Child Module : Linear(in_features=5, out_features=2, bias=True)
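
The difference between `children()`, `named_children()`, and the fully recursive `modules()` is easiest to see on a nested model. The nested `nn.Sequential` below is an assumed example, chosen so that the immediate children and the full module tree differ.

```python
import torch.nn as nn

nested = nn.Sequential(
    nn.Sequential(nn.Linear(4, 3), nn.ReLU()),
    nn.Linear(3, 1),
)

# Immediate children only: the inner Sequential and the final Linear
child_types = [type(c).__name__ for c in nested.children()]

# named_children() yields (name, module) pairs
child_names = [name for name, _ in nested.named_children()]

# modules() walks the whole tree, including the top-level module itself
all_types = [type(m).__name__ for m in nested.modules()]

print(child_types)  # ['Sequential', 'Linear']
print(child_names)  # ['0', '1']
print(all_types)    # ['Sequential', 'Sequential', 'Linear', 'ReLU', 'Linear']
```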


The purpose of using children() includes:

i. `Inspection and Modification`: You can use children() to iterate over immediate child modules for inspection or modification. For example, you might want to freeze the parameters of specific layers or apply a certain operation to each submodule.

ii. `Dynamic Model Construction`: If you are dynamically constructing or modifying your model based on certain conditions, children() allows you to traverse the immediate child modules for such operations.

iii. `Layer-wise Operations`: You may want to perform layer-wise operations, and children() provides a convenient way to access individual layers within your model.

In [7]:
for child in model.children():
    if isinstance(child, nn.Linear):
        for param in child.parameters():
            param.requires_grad = False

The example above freezes the parameters of all linear layers that are immediate children of the model by setting `requires_grad` to `False`, which shows how `children()` can be used for layer-wise operations.
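
A common variation is to freeze every child except the last and hand only the remaining trainable parameters to the optimizer. This is a sketch of that idea; the network `finetune_net`, the choice of SGD, and the learning rate are assumptions for illustration.

```python
import torch
import torch.nn as nn

finetune_net = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))

# Freeze all immediate children except the last one
for child in list(finetune_net.children())[:-1]:
    for param in child.parameters():
        param.requires_grad = False

# Pass only the parameters that still require gradients to the optimizer
trainable_params = [p for p in finetune_net.parameters() if p.requires_grad]
finetune_optimizer = torch.optim.SGD(trainable_params, lr=0.01)

print(len(trainable_params))  # 2 (weight and bias of the last layer)
```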