# Linear Layer

In [1]:
import torch

## Raw Linear Layer

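- The most basic building block of a neural network
- Also called a fully-connected (FC) layer
- A function that performs a linear transformation determined by its internal parameters
- **Each input node is multiplied by its weight, the products are summed, and then the bias is added.**
- Can be implemented as a matrix multiplication
- A ``linear transformation function`` from n dimensions to m dimensions (strictly speaking, an affine transformation once the bias is added)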

$$\begin{gathered}
y=x\cdot{W}+b, \\
\text{where }x\in\mathbb{R}^{N\times{n}}\text{, }y\in\mathbb{R}^{N\times{m}}. \\
\\
\text{Thus, }W\in\mathbb{R}^{n\times{m}}\text{ and }b\in\mathbb{R}^m.
\end{gathered}$$

- Defined by its internal parameters: the weight W and the bias b
- If these parameters are tuned well, the function produces the desired output for a given input

In [2]:
W = torch.FloatTensor([[1, 2],
                       [3, 4],
                       [5, 6]])
b = torch.FloatTensor([2, 2])

In [3]:
print(W.size())
print(b.size())

torch.Size([3, 2])
torch.Size([2])


In [4]:
def linear(x, W, b):
    y = torch.matmul(x, W) + b
    
    return y

In [5]:
x = torch.FloatTensor([[1, 1, 1],
                       [2, 2, 2],
                       [3, 3, 3],
                       [4, 4, 4]])

print(x.size())

torch.Size([4, 3])


In [6]:
y = linear(x, W, b)

In [7]:
print(y.size())

torch.Size([4, 2])
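
To make the "multiply, sum, add bias" description concrete, the sketch below recomputes the first output row by hand and compares it with the matrix product. It reuses the same `W`, `b`, and `x` values as the cells above; `manual_row` is a name introduced here purely for illustration.

```python
import torch

W = torch.FloatTensor([[1, 2],
                       [3, 4],
                       [5, 6]])
b = torch.FloatTensor([2, 2])
x = torch.FloatTensor([[1, 1, 1],
                       [2, 2, 2],
                       [3, 3, 3],
                       [4, 4, 4]])

# b has shape (2,) and is broadcast across the 4 rows of x @ W.
y = torch.matmul(x, W) + b

# Hand computation for the first sample x[0] = [1, 1, 1]:
# output j = sum_i x[0, i] * W[i, j] + b[j]
manual_row = [
    sum(x[0, i].item() * W[i, j].item() for i in range(3)) + b[j].item()
    for j in range(2)
]

print(manual_row)  # [11.0, 14.0]
print(y[0])        # tensor([11., 14.])
```

Each row of `x` is transformed independently, which is why the batch dimension (4) is preserved while the feature dimension changes from 3 to 2.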


## nn.Module 

In [8]:
import torch.nn as nn

In [9]:
class MyLinear(nn.Module):

    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()

        self.input_dim = input_dim
        self.output_dim = output_dim

        # Plain tensors (allocated without initialization).
        # nn.Module does not register these as parameters.
        self.W = torch.FloatTensor(input_dim, output_dim)
        self.b = torch.FloatTensor(output_dim)

    # Override the 'forward' method to define the module's computation.
    # Its input arguments and outputs can be designed as needed.
    def forward(self, x):
        # |x| = (batch_size, input_dim)
        y = torch.matmul(x, self.W) + self.b
        # |y| = (batch_size, input_dim) * (input_dim, output_dim)
        #     = (batch_size, output_dim)
        
        return y

In [10]:
linear = MyLinear(3, 2)

y = linear(x)

In [11]:
print(y.size())

torch.Size([4, 2])


In [12]:
# Nothing is printed: W and b are plain tensors, so they were never registered as parameters.
for p in linear.parameters():
    print(p)

As you can see, the module has no learnable parameters.
This version can run the forward computation, but it cannot be trained.
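
One way to see why such a module cannot be trained is to count its parameters and try to hand them to an optimizer. The following is a minimal sketch under stated assumptions: `PlainTensorLinear` is a hypothetical name that mirrors the class above, zeros stand in for the tensor values, and `torch.optim.SGD` with `lr=0.1` is chosen only for illustration.

```python
import torch
import torch.nn as nn


class PlainTensorLinear(nn.Module):  # hypothetical name, mirrors the class above
    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()
        # Plain tensors: nn.Module does not register these as parameters.
        self.W = torch.zeros(input_dim, output_dim)
        self.b = torch.zeros(output_dim)

    def forward(self, x):
        return torch.matmul(x, self.W) + self.b


model = PlainTensorLinear()
num_params = len(list(model.parameters()))
print(num_params)  # 0

# An optimizer needs at least one parameter to update,
# so constructing one from an empty parameter list fails.
try:
    torch.optim.SGD(model.parameters(), lr=0.1)
    optimizer_created = True
except ValueError as err:
    optimizer_created = False
    print(err)
```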

### Correct way: nn.Parameter

In [13]:
class MyLinear(nn.Module):

    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()

        self.input_dim = input_dim
        self.output_dim = output_dim

        # Wrap the tensors in nn.Parameter so that they are registered as learnable parameters.
        self.W = nn.Parameter(torch.FloatTensor(input_dim, output_dim))
        self.b = nn.Parameter(torch.FloatTensor(output_dim))
        
    def forward(self, x):
        # |x| = (batch_size, input_dim)
        y = torch.matmul(x, self.W) + self.b
        # |y| = (batch_size, input_dim) * (input_dim, output_dim)
        #     = (batch_size, output_dim)
        
        return y

Reference: https://pytorch.org/docs/stable/nn.html#torch.nn.Parameter

A kind of Tensor that is to be considered a module parameter.

Parameters are Tensor subclasses, that have a very special property when used with Modules - when they’re assigned as Module attributes they are automatically added to the list of its parameters, and will appear e.g. in parameters() iterator. Assigning a Tensor doesn’t have such effect. This is because one might want to cache some temporary state, like last hidden state of the RNN, in the model. If there was no such class as Parameter, these temporaries would get registered too.
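
The behavior described in the quoted documentation can be checked directly. In this small sketch, `Demo`, `weight`, and `cache` are hypothetical names introduced only for illustration: one attribute is an `nn.Parameter`, the other a plain tensor.

```python
import torch
import torch.nn as nn


class Demo(nn.Module):  # hypothetical module for illustration
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(3, 2))  # registered as a parameter
        self.cache = torch.zeros(3, 2)                 # plain tensor: not registered


demo = Demo()
names = [name for name, _ in demo.named_parameters()]

print(names)                                  # ['weight']
print(isinstance(demo.weight, torch.Tensor))  # True
print(demo.weight.requires_grad)              # True
```

Only `weight` shows up in `named_parameters()`, and it requires gradients by default, while `cache` remains ordinary temporary state.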

In [14]:
linear = MyLinear(3, 2)

y = linear(x)  # Calling the module invokes forward(); nn.Module.__call__ handles this.

In [15]:
print(y.size())

torch.Size([4, 2])
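
The call `linear(x)` works without naming `forward` because `nn.Module` defines `__call__`, which dispatches to `forward`. A minimal sketch of this dispatch, using a hypothetical `Doubler` module with a call counter added only for illustration:

```python
import torch
import torch.nn as nn


class Doubler(nn.Module):  # hypothetical module for illustration
    def __init__(self):
        super().__init__()
        self.forward_calls = 0  # plain Python attribute, counts forward() calls

    def forward(self, x):
        self.forward_calls += 1
        return 2 * x


m = Doubler()
out = m(torch.FloatTensor([1, 2, 3]))  # no explicit m.forward(...)

print(out)              # tensor([2., 4., 6.])
print(m.forward_calls)  # 1
```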


In [16]:
for p in linear.parameters():
    print(p)

Parameter containing:
tensor([[ 0.0000e+00, -0.0000e+00],
        [ 9.1002e+31, -2.8586e-42],
        [ 8.4078e-45,  0.0000e+00]], requires_grad=True)
Parameter containing:
tensor([4.7428e+30, 7.1429e+31], requires_grad=True)
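
The extreme values printed above (for example `9.1002e+31`) appear because `torch.FloatTensor(input_dim, output_dim)` allocates memory without initializing it. A minimal sketch of one way to give the parameters well-defined starting values follows. The class name `InitLinear` is hypothetical, and the choice of `nn.init.normal_` with `std=0.01` plus a zero bias is an assumption made for illustration, not the scheme `nn.Linear` uses.

```python
import torch
import torch.nn as nn


class InitLinear(nn.Module):  # hypothetical name
    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()
        # torch.empty allocates without initializing, like torch.FloatTensor(n, m).
        self.W = nn.Parameter(torch.empty(input_dim, output_dim))
        self.b = nn.Parameter(torch.empty(output_dim))
        # Overwrite the uninitialized memory in place (illustrative choice).
        nn.init.normal_(self.W, mean=0.0, std=0.01)
        nn.init.zeros_(self.b)

    def forward(self, x):
        return torch.matmul(x, self.W) + self.b


model = InitLinear(3, 2)
print(model.W)
print(model.b)
```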


## nn.Linear (linear transformation function)
There is no need to implement the class yourself: PyTorch provides the same layer as the built-in `nn.Linear` module.

In [17]:
linear = nn.Linear(3, 2)

y = linear(x)

In [18]:
print(y.size())

torch.Size([4, 2])


In [19]:
for p in linear.parameters():
    print(p)

Parameter containing:
tensor([[-0.3386, -0.5355, -0.4991],
        [ 0.1993,  0.4776,  0.1894]], requires_grad=True)
Parameter containing:
tensor([0.1819, 0.0941], requires_grad=True)
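
Note that the printed weight has shape (2, 3), not (3, 2): `nn.Linear` stores its weight as (out_features, in_features) and computes `x @ weight.T + bias`. The sketch below checks this against a manual computation; `y_manual` is a name introduced here for illustration, and the tolerance passed to `torch.allclose` is an arbitrary choice.

```python
import torch
import torch.nn as nn

x = torch.FloatTensor([[1, 1, 1],
                       [2, 2, 2],
                       [3, 3, 3],
                       [4, 4, 4]])

linear = nn.Linear(3, 2)
print(linear.weight.size())  # torch.Size([2, 3])
print(linear.bias.size())    # torch.Size([2])

y = linear(x)
# Transpose the stored weight to recover the (n, m) layout used in y = x W + b.
y_manual = torch.matmul(x, linear.weight.t()) + linear.bias

print(torch.allclose(y, y_manual, atol=1e-5))
```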


### An nn.Module can contain instances of other nn.Module subclasses.

In [20]:
class MyLinear(nn.Module):

    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()

        self.input_dim = input_dim
        self.output_dim = output_dim

        self.linear = nn.Linear(input_dim, output_dim)
        
    def forward(self, x):
        # |x| = (batch_size, input_dim)
        y = self.linear(x)
        # |y| = (batch_size, output_dim)
        
        return y

In [21]:
linear = MyLinear(3, 2)

y = linear(x)

In [22]:
print(y.size())

torch.Size([4, 2])


In [23]:
for p in linear.parameters():
    print(p)

Parameter containing:
tensor([[-0.1267,  0.0563,  0.3951],
        [ 0.2291,  0.3214,  0.2595]], requires_grad=True)
Parameter containing:
tensor([0.3659, 0.4013], requires_grad=True)
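
The parameters printed above belong to the inner `nn.Linear`, yet they are reachable from the outer module because `parameters()` recurses into submodules. A short sketch using `named_parameters()` makes the nesting visible; the class is redefined here only so that the example is self-contained.

```python
import torch
import torch.nn as nn


class MyLinear(nn.Module):
    def __init__(self, input_dim=3, output_dim=2):
        super().__init__()
        # Assigning an nn.Module as an attribute registers it as a submodule.
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        return self.linear(x)


model = MyLinear(3, 2)
names = [name for name, _ in model.named_parameters()]
shapes = [tuple(p.size()) for p in model.parameters()]

print(names)   # ['linear.weight', 'linear.bias']
print(shapes)  # [(2, 3), (2,)]
```

The `linear.` prefix comes from the attribute name under which the submodule was assigned.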
