# Parameter Management

In [1]:
import torch
from torch import nn

In [2]:
net = nn.Sequential(nn.LazyLinear(8), nn.ReLU(), nn.LazyLinear(1))
X = torch.rand(size=(2, 4))
net(X).shape



torch.Size([2, 1])

When a model is defined with the `Sequential` class, we can access any layer by indexing into the model as though it were a list. Each layer's parameters are stored as attributes of that layer, and `state_dict()` returns them as a mapping from parameter name to tensor.

In [4]:
# these are the parameters of the second fully connected layer
net[2].state_dict()

OrderedDict([('weight',
              tensor([[-0.0554, -0.0875, -0.2165, -0.2495, -0.0541, -0.2581,  0.2215,  0.1358]])),
             ('bias', tensor([0.1051]))])

These values are represented as instances of the `Parameter` class. To manipulate the numerical values, we first need to access the underlying tensor.

The following code extracts the bias from the second fully connected layer, which returns a `Parameter` instance, and then accesses that parameter's value through its `data` attribute:

In [5]:
type(net[2].bias), net[2].bias.data

(torch.nn.parameter.Parameter, tensor([0.1051]))
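
Besides its value, a `Parameter` also keeps track of its gradient. The sketch below is illustrative rather than part of the original notebook: it rebuilds the same small network, overwrites the bias in place under `torch.no_grad()` (the value `0.5` is an arbitrary choice), and then runs one backward pass on a scalar made up for the example (the sum of the outputs) to show where the gradient ends up.

```python
import torch
from torch import nn

net = nn.Sequential(nn.LazyLinear(8), nn.ReLU(), nn.LazyLinear(1))
X = torch.rand(size=(2, 4))
net(X)  # a forward pass initializes the lazy layers

bias = net[2].bias
print(bias.requires_grad)  # parameters track gradients by default
print(bias.grad is None)   # no backward pass has run yet

# Overwrite the value in place without recording the operation in autograd.
with torch.no_grad():
    bias.fill_(0.5)
print(net[2].bias.data)

# After backpropagation, the gradient is stored alongside the value.
net(X).sum().backward()
print(bias.grad.shape)
```

Writing through `torch.no_grad()` and writing through `.data` both bypass autograd; the former is used here only as one reasonable option.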

We can also retrieve all the parameters of the network at once, together with their names:

In [6]:
[(name, param.shape) for name, param in net.named_parameters()]

[('0.weight', torch.Size([8, 4])),
 ('0.bias', torch.Size([8])),
 ('2.weight', torch.Size([1, 8])),
 ('2.bias', torch.Size([1]))]
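
The dotted names in this listing can also be used as lookup keys. As a small sketch (the variable names `params`, `total`, and `state_bias` are introduced here only for illustration), the code below fetches a parameter by name, counts the scalar parameters in the network, and reads the same bias from `state_dict()`.

```python
import torch
from torch import nn

net = nn.Sequential(nn.LazyLinear(8), nn.ReLU(), nn.LazyLinear(1))
net(torch.rand(size=(2, 4)))  # initialize the lazy layers

# Build a name -> Parameter mapping from the same iterator used above.
params = dict(net.named_parameters())
same = params['2.bias'] is net[2].bias  # the very same Parameter object

# Total number of scalar parameters: 8*4 + 8 + 1*8 + 1.
total = sum(p.numel() for p in net.parameters())

# state_dict() exposes the same values under the same keys.
state_bias = net.state_dict()['2.bias']
print(same, total, state_bias)
```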

In [7]:
# Sharing parameters across multiple layers.
# We need to give the shared layer a name so that we can refer to its parameters.
shared = nn.LazyLinear(8)
net = nn.Sequential(nn.LazyLinear(8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU(),
                    nn.LazyLinear(1))
net(X)

# check whether the parameters are the same
print(net[2].weight.data[0] == net[4].weight.data[0])
net[2].weight.data[0, 0] = 100
# make sure that they are actually the same object rather than just having the same value
print(net[2].weight.data[0] == net[4].weight.data[0])

tensor([True, True, True, True, True, True, True, True])
tensor([True, True, True, True, True, True, True, True])
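
Elementwise equality after a write is indirect evidence that the two layers are tied. A more direct check, sketched below under the same setup, is to compare object identity and the underlying storage pointer. The sketch also lists the names reported by `named_parameters()`, which by default yields each shared parameter only once.

```python
import torch
from torch import nn

shared = nn.LazyLinear(8)
net = nn.Sequential(nn.LazyLinear(8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU(),
                    nn.LazyLinear(1))
X = torch.rand(size=(2, 4))
net(X)  # initialize the lazy layers

# Both positions hold the same Parameter object backed by the same memory.
is_same_object = net[2].weight is net[4].weight
same_storage = net[2].weight.data_ptr() == net[4].weight.data_ptr()
print(is_same_object, same_storage)

# Duplicates are skipped, so the shared layer appears under one index only.
names = [name for name, _ in net.named_parameters()]
print(names)
```

Because the same tensor is used at two positions in the network, the gradient contributions from both positions accumulate in a single `.grad` during backpropagation.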


