
# Linear Layer

In [None]:
import torch

## Raw Linear Layer

$$\begin{gathered}
y=x\cdot{W}+b, \\
\text{where }x\in\mathbb{R}^{N\times{n}}\text{, }y\in\mathbb{R}^{N\times{m}}. \\
\\
\text{Thus, }W\in\mathbb{R}^{n\times{m}}\text{ and }b\in\mathbb{R}^m.
\end{gathered}$$

In [None]:
W = torch.FloatTensor([[1, 2],
                       [3, 4],
                       [5, 6]])
b = torch.FloatTensor([2, 2])

In [None]:
print(W.size())
print(b.size())

torch.Size([3, 2])
torch.Size([2])


In [None]:
def linear(x, W, b):
    y = torch.matmul(x, W) + b

    return y

In [None]:
x = torch.FloatTensor([[1, 1, 1],
                       [2, 2, 2],
                       [3, 3, 3],
                       [4, 4, 4]])

print(x.size())

torch.Size([4, 3])


In [None]:
y = linear(x, W, b)

In [None]:
print(y.size())

torch.Size([4, 2])
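
The size check above confirms only the shape. As a minimal sketch, the values of `y` can also be checked by hand against the formula $y = x \cdot W + b$, reusing the same `W`, `b`, and `x` as in the cells above (the name `expected` is introduced here for illustration).

```python
import torch

W = torch.FloatTensor([[1, 2],
                       [3, 4],
                       [5, 6]])
b = torch.FloatTensor([2, 2])
x = torch.FloatTensor([[1, 1, 1],
                       [2, 2, 2],
                       [3, 3, 3],
                       [4, 4, 4]])

# Same computation as linear(x, W, b): matrix product, then b is
# broadcast-added to every row of the (4, 2) result.
y = torch.matmul(x, W) + b
print(y)

# Row 0 by hand: [1, 1, 1] . W = [1+3+5, 2+4+6] = [9, 12]; adding b gives [11, 14].
# Rows 1..3 are 2x, 3x, 4x the product [9, 12], each plus b.
expected = torch.FloatTensor([[11, 14],
                              [20, 26],
                              [29, 38],
                              [38, 50]])
print(torch.equal(y, expected))  # True
```

Because `b` has size `(2,)` and the product has size `(4, 2)`, the addition relies on broadcasting: the same bias vector is added to each of the four samples.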


## nn.Module

In [None]:
import torch.nn as nn

In [None]:
class MyLinear(nn.Module):  # Inherit from nn.Module.
    # Subclasses usually override the __init__ and forward methods.

    # Set up in advance whatever forward will need later.
    def __init__(self, input_dim=3, output_dim=2):
        self.input_dim = input_dim
        self.output_dim = output_dim

        super().__init__()

        self.W = torch.FloatTensor(input_dim, output_dim)
        self.b = torch.FloatTensor(output_dim)

    # Override the forward method to implement the actual computation.
    # Its input arguments and outputs can be designed as you wish.
    def forward(self, x):
        # |x| = (batch_size, input_dim)
        y = torch.matmul(x, self.W) + self.b
        # |y| = (batch_size, input_dim) * (input_dim, output_dim)
        #     = (batch_size, output_dim)

        return y

In [None]:
linear = MyLinear(3, 2)

y = linear(x)

In [None]:
print(y.size())

torch.Size([4, 2])


In [None]:
for p in linear.parameters():
    print(p)

The loop prints nothing: the module has no learnable parameters, because `W` and `b` are plain tensors and are never registered with the module.
This approach can run the forward computation, but the module cannot be trained.
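
The missing registration can be checked directly. The sketch below uses a hypothetical class name, `PlainTensorLinear`, and zero-filled tensors instead of `torch.FloatTensor(...)` so that the values are deterministic. Both choices are assumptions made for illustration. It counts the registered parameters and then tries to build an optimizer from them.

```python
import torch
import torch.nn as nn

class PlainTensorLinear(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()
        # Plain tensors are stored as ordinary attributes;
        # nn.Module does not register them as parameters.
        self.W = torch.zeros(input_dim, output_dim)
        self.b = torch.zeros(output_dim)

    def forward(self, x):
        return torch.matmul(x, self.W) + self.b

model = PlainTensorLinear()

# parameters() yields nothing, so there is nothing to optimize.
num_params = len(list(model.parameters()))
print(num_params)  # 0

# The output is not connected to any tensor that tracks gradients.
y = model(torch.ones(4, 3))
print(y.requires_grad)  # False

# Building an optimizer from an empty parameter list raises ValueError.
try:
    torch.optim.SGD(model.parameters(), lr=0.1)
    optimizer_error = None
except ValueError as e:
    optimizer_error = str(e)
print(optimizer_error)
```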

### Correct way: nn.Parameter

In [None]:
class MyLinear(nn.Module):

    def __init__(self, input_dim=3, output_dim=2):
        self.input_dim = input_dim
        self.output_dim = output_dim

        super().__init__()

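        # Wrapping the tensors in nn.Parameter registers them with the module.
        # torch.FloatTensor(sizes) allocates uninitialized memory, so the initial values are arbitrary.
        self.W = nn.Parameter(torch.FloatTensor(input_dim, output_dim))
        self.b = nn.Parameter(torch.FloatTensor(output_dim))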

    def forward(self, x):
        # |x| = (batch_size, input_dim)
        y = torch.matmul(x, self.W) + self.b
        # |y| = (batch_size, input_dim) * (input_dim, output_dim)
        #     = (batch_size, output_dim)

        return y

Reference: https://pytorch.org/docs/stable/nn.html#torch.nn.Parameter

A kind of Tensor that is to be considered a module parameter.

Parameters are Tensor subclasses, that have a very special property when used with Modules - when they’re assigned as Module attributes they are automatically added to the list of its parameters, and will appear e.g. in parameters() iterator. Assigning a Tensor doesn’t have such effect. This is because one might want to cache some temporary state, like last hidden state of the RNN, in the model. If there was no such class as Parameter, these temporaries would get registered too.
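
The distinction drawn in the quoted passage can be sketched with a module that holds both kinds of attribute. The class name `WithCache` and the attribute `last_output` are hypothetical, chosen only to mimic the "cached temporary state" case. Only the `nn.Parameter` should show up in `named_parameters()`.

```python
import torch
import torch.nn as nn

class WithCache(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # Registered: wrapped in nn.Parameter.
        self.W = nn.Parameter(torch.zeros(3, 2))
        # Not registered: a plain tensor used as temporary state.
        self.last_output = torch.zeros(2)

    def forward(self, x):
        y = torch.matmul(x, self.W)
        # Cache the latest result without keeping the autograd graph.
        self.last_output = y.detach()
        return y

module = WithCache()
_ = module(torch.ones(4, 3))

names = [name for name, _ in module.named_parameters()]
print(names)  # ['W']
```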

In [None]:
linear = MyLinear(3, 2)

y = linear(x)

In [None]:
print(y.size())

torch.Size([4, 2])


In [None]:
for p in linear.parameters():
    print(p)

Parameter containing:
tensor([[ 0.0000e+00, -0.0000e+00],
        [ 9.1002e+31, -2.8586e-42],
        [ 8.4078e-45,  0.0000e+00]], requires_grad=True)
Parameter containing:
tensor([4.7428e+30, 7.1429e+31], requires_grad=True)
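
The values printed above look arbitrary because `torch.FloatTensor(input_dim, output_dim)` only allocates memory and does not initialize it. A minimal sketch of one alternative, assuming standard-normal weights and a zero bias (an illustrative choice, not the scheme `nn.Linear` uses), also confirms that gradients reach the registered parameters. The class name `InitLinear` is hypothetical.

```python
import torch
import torch.nn as nn

class InitLinear(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()
        # Explicit initialization instead of uninitialized memory.
        self.W = nn.Parameter(torch.randn(input_dim, output_dim))
        self.b = nn.Parameter(torch.zeros(output_dim))

    def forward(self, x):
        # |x| = (batch_size, input_dim)
        return torch.matmul(x, self.W) + self.b

torch.manual_seed(0)
model = InitLinear(3, 2)

x = torch.ones(4, 3)
loss = model(x).sum()
loss.backward()

# With an all-ones input of 4 rows and a plain sum as the loss,
# every entry of both gradients equals the batch size, 4.
print(model.W.grad)
print(model.b.grad)
```

Since `W` and `b` now have populated `.grad` fields, an optimizer built from `model.parameters()` could update them.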


## nn.Linear

In [None]:
# With nn.Linear, the hand-written MyLinear above is no longer needed.
linear = nn.Linear(3, 2)

y = linear(x)

In [None]:
print(y.size())

torch.Size([4, 2])


In [None]:
for p in linear.parameters():
    print(p)

Parameter containing:
tensor([[-0.3386, -0.5355, -0.4991],
        [ 0.1993,  0.4776,  0.1894]], requires_grad=True)
Parameter containing:
tensor([0.1819, 0.0941], requires_grad=True)
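
Note that the printed weight has size `(2, 3)`, not `(3, 2)`: `nn.Linear` stores its weight as `(out_features, in_features)` and applies its transpose. The sketch below reproduces the layer's output by hand. The variable names `layer`, `manual`, and `matches` are introduced here for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 2)
x = torch.ones(4, 3)

print(layer.weight.size())  # torch.Size([2, 3])
print(layer.bias.size())    # torch.Size([2])

# nn.Linear computes x @ weight.T + bias, which corresponds to
# y = x . W + b with W = weight.T of size (3, 2).
manual = torch.matmul(x, layer.weight.t()) + layer.bias
matches = torch.allclose(layer(x), manual, atol=1e-6)
print(matches)
```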


### An nn.Module can contain instances of other nn.Module subclasses

In [None]:
class MyLinear(nn.Module):

    def __init__(self, input_dim=3, output_dim=2):
        self.input_dim = input_dim
        self.output_dim = output_dim

        super().__init__()

        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        # |x| = (batch_size, input_dim)
        y = self.linear(x)
        # |y| = (batch_size, output_dim)

        return y

In [None]:
linear = MyLinear(3, 2)

y = linear(x)

In [None]:
print(y.size())

torch.Size([4, 2])


In [None]:
for p in linear.parameters():
    print(p)

Parameter containing:
tensor([[-0.1267,  0.0563,  0.3951],
        [ 0.2291,  0.3214,  0.2595]], requires_grad=True)
Parameter containing:
tensor([0.3659, 0.4013], requires_grad=True)
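
The two parameters printed above belong to the inner `nn.Linear`. The outer module collects them recursively. A short sketch using `named_parameters()` and `named_children()` makes the nesting visible. The class name `Wrapper` is hypothetical and mirrors the last `MyLinear` definition.

```python
import torch
import torch.nn as nn

class Wrapper(nn.Module):  # hypothetical name, mirrors the last MyLinear
    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()
        # Assigning an nn.Module as an attribute registers it as a child.
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        return self.linear(x)

model = Wrapper(3, 2)

# Parameter names are prefixed with the attribute name of the child module.
names = [name for name, _ in model.named_parameters()]
print(names)     # ['linear.weight', 'linear.bias']

children = [name for name, _ in model.named_children()]
print(children)  # ['linear']

# 3 * 2 weights plus 2 biases.
total = sum(p.numel() for p in model.parameters())
print(total)     # 8
```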
