Transfer weights from one model to another that performs the same operations but is defined differently

In [15]:
import torch
import torch.nn as nn

model_a = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64,10), nn.ReLU())

def block(in_features, out_features):
    return nn.Sequential(nn.Linear(in_features, out_features),
                        nn.ReLU())

model_b = nn.Sequential(block(1,64), block(64,10))

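They are the same network, just defined in two different ways: `model_a` is a flat `Sequential`, while `model_b` wraps each `Linear` + `ReLU` pair in its own nested `Sequential`. Let's print both, then initialize the weights of `model_a` to all zeros.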

In [24]:
model_a

Sequential(
  (0): Linear(in_features=1, out_features=64, bias=True)
  (1): ReLU()
  (2): Linear(in_features=64, out_features=10, bias=True)
  (3): ReLU()
)

In [25]:
model_b

Sequential(
  (0): Sequential(
    (0): Linear(in_features=1, out_features=64, bias=True)
    (1): ReLU()
  )
  (1): Sequential(
    (0): Linear(in_features=64, out_features=10, bias=True)
    (1): ReLU()
  )
)
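The nesting shown above is why the weights cannot simply be copied with `load_state_dict`: the parameter names differ even though the tensors have the same shapes. The sketch below illustrates this on freshly built copies of the two architectures; the names `flat` and `nested` are only illustrative and are not part of the original notebook.

```python
import torch.nn as nn

def block(in_features, out_features):
    return nn.Sequential(nn.Linear(in_features, out_features), nn.ReLU())

# Illustrative copies of the two architectures (hypothetical names).
flat = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 10), nn.ReLU())
nested = nn.Sequential(block(1, 64), block(64, 10))

# Parameter names encode the module hierarchy, so they differ.
flat_keys = list(flat.state_dict().keys())
nested_keys = list(nested.state_dict().keys())
print(flat_keys)    # ['0.weight', '0.bias', '2.weight', '2.bias']
print(nested_keys)  # ['0.0.weight', '0.0.bias', '1.0.weight', '1.0.bias']

# The tensors themselves line up one-to-one, in the same order.
flat_shapes = [tuple(t.shape) for t in flat.state_dict().values()]
nested_shapes = [tuple(t.shape) for t in nested.state_dict().values()]
print(flat_shapes == nested_shapes)  # True
```

Because the keys do not match, `nested.load_state_dict(flat.state_dict())` would raise an error about missing and unexpected keys under the default `strict=True`.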

In [17]:
# Zero the weight matrix of every Linear layer in model_a (biases are left as they are)
for m in model_a.modules():
    if isinstance(m, nn.Linear):
        nn.init.zeros_(m.weight)

model_a[0].weight.sum()

tensor(0., grad_fn=<SumBackward0>)

The first layer of `model_b`, on the other hand, still has its random initial weights, which sum to a nonzero value:

In [20]:
model_b[0][0].weight.sum()

tensor(-3.1065, grad_fn=<SumBackward0>)

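Now we want to transfer the weights of `model_a` to `model_b`. We can use `ModuleTransfer` and call it with a dummy input. The input only needs a shape both models accept, here a batch of one sample with one feature.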

In [22]:
from torchlego.utils import ModuleTransfer
x = torch.ones(1, 1)

trans = ModuleTransfer(src=model_a, dest=model_b)
trans(x)

Let's check that it worked:

In [23]:
model_b[0][0].weight.sum()

tensor(0., grad_fn=<SumBackward0>)

Yep: the weights of the first layer of `model_b` now sum to zero, matching `model_a`.
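For intuition, here is a minimal sketch of how a transfer driven by a dummy input could work. It is not taken from torchlego, and `traced_leaves` and `transfer_weights` are hypothetical helper names. The assumption is that both models call their parameterized layers in the same order during a forward pass, so the layers can be matched by position and copied with `load_state_dict`.

```python
import torch
import torch.nn as nn

def traced_leaves(model, x):
    """Run one forward pass and return the parameterized leaf modules in call order."""
    called, handles = [], []
    for m in model.modules():
        is_leaf = len(list(m.children())) == 0
        if is_leaf and len(m.state_dict()) > 0:  # skips ReLU and containers
            handles.append(
                m.register_forward_hook(lambda mod, inp, out: called.append(mod))
            )
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return called

def transfer_weights(src, dest, x):
    """Copy parameters from src to dest, matching layers by forward-call order."""
    src_ops, dest_ops = traced_leaves(src, x), traced_leaves(dest, x)
    if len(src_ops) != len(dest_ops):
        raise ValueError(f"{len(src_ops)} source ops vs {len(dest_ops)} destination ops")
    for s, d in zip(src_ops, dest_ops):
        d.load_state_dict(s.state_dict())  # copies values, does not share tensors

def block(in_features, out_features):
    return nn.Sequential(nn.Linear(in_features, out_features), nn.ReLU())

# Fresh, illustrative copies of the two architectures.
src = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 10), nn.ReLU())
dest = nn.Sequential(block(1, 64), block(64, 10))
for m in src.modules():
    if isinstance(m, nn.Linear):
        nn.init.zeros_(m.weight)

transfer_weights(src, dest, torch.ones(1, 1))
print(dest[0][0].weight.sum())
```

Matching by call order rather than by name is what makes the nesting irrelevant; the trade-off is that it silently relies on both models executing equivalent layers in the same sequence, which is why the sketch at least checks that the number of traced layers agrees.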