# Differences between nn and nn.functional

### Features provided by nn

* Parameter
* Containers 
* Conv
* Pooling 
* Padding 
* Non-linear Activation
* Normalization
* Recurrent 
* Linear
* Dropout
* Sparse
* Distance
* Loss 
* Vision
* Data Parallel
* Utilities 

### Features provided by nn.functional

* Conv
* Pooling
* Non-linear activation
* Normalization
* Linear function(=Fully connected layer) 
* Dropout
* Distance
* Loss
* Vision

### The difference, illustrated with Conv2d

#### nn - torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True) 

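Note that in_channels is a single integer, not a shape. It must match the channel dimension of the input tensor. If the input has shape (1,1,3,3), the first 1 is the batch_size, the second 1 is the number of channels, the first 3 is the height, and the second 3 is the width.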
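
The shape convention can be checked with a short sketch. The choice of out_channels=2 and the variable names are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

# A batch of one single-channel 3x3 image: (batch_size, channels, height, width)
x = torch.ones(1, 1, 3, 3)
batch_size, channels, height, width = x.shape

# in_channels must equal the channel dimension of the input, not its full shape
conv = nn.Conv2d(in_channels=channels, out_channels=2, kernel_size=3)
y = conv(x)

print(x.shape)            # torch.Size([1, 1, 3, 3])
print(conv.weight.shape)  # torch.Size([2, 1, 3, 3]): (out_channels, in_channels, kH, kW)
print(y.shape)            # torch.Size([1, 2, 1, 1])
```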

#### nn.functional - torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)

Here you have to pass the weight yourself, so the filter tensor (whose last two dimensions are the kernel size) must be created outside the call.
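
As a rough sketch of how the two interfaces relate: the module form stores its own weight and bias, and the functional form gives the same result when it is handed those same tensors. The variable names here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.ones(1, 1, 3, 3)

# nn.Conv2d creates and stores its own weight and bias as Parameters
layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
out_module = layer(x)

# F.conv2d is stateless: the caller supplies the weight (and optionally the bias).
# The weight has shape (out_channels, in_channels / groups, kH, kW)
out_functional = F.conv2d(x, layer.weight, layer.bias)

print(torch.allclose(out_module, out_functional))  # True
```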

To make this concrete, let's look at the nn.functional approach first.

***

# 3x3 Conv2d with nn.functional

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
# Variable is deprecated since PyTorch 0.4; the outputs below come from an older release
from torch.autograd import Variable
import numpy as np

In [2]:
input = torch.ones(1,1,3,3)
filter = torch.ones(1,1,3,3)

input = Variable(input, requires_grad=True)
print(input)

Variable containing:
(0 ,0 ,.,.) = 
  1  1  1
  1  1  1
  1  1  1
[torch.FloatTensor of size 1x1x3x3]



In [3]:
# The filter plays the role of the weight w here, so requires_grad is not set on it.
filter = Variable(filter) 

out = F.conv2d(input, filter)
print(out)

Variable containing:
(0 ,0 ,.,.) = 
  9
[torch.FloatTensor of size 1x1x1x1]



In [4]:
out.backward()

print(out.grad_fn)

print(input.grad)

<ConvNdBackward object at 0x7f7f6a7eaa58>
Variable containing:
(0 ,0 ,.,.) = 
  1  1  1
  1  1  1
  1  1  1
[torch.FloatTensor of size 1x1x3x3]
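
For readers on PyTorch 0.4 or later, the same computation can be sketched without Variable, since tensors track gradients themselves. The names x and w are chosen here to avoid shadowing the Python builtins input and filter.

```python
import torch
import torch.nn.functional as F

# Tensors carry requires_grad directly; no Variable wrapper is needed
x = torch.ones(1, 1, 3, 3, requires_grad=True)
w = torch.ones(1, 1, 3, 3)  # fixed filter, no gradient tracked

out = F.conv2d(x, w)  # shape (1, 1, 1, 1), value 9
out.backward()        # allowed without arguments because out has a single element

print(out.item())  # 9.0
print(x.grad)      # a 1x1x3x3 tensor of ones, i.e. d(out)/dx equals w
```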



***

# 3x3 Conv2d with nn

In [5]:
input1 = torch.ones(1,1,3,3)
input1 = Variable(input1, requires_grad=True)
print(input1)

Variable containing:
(0 ,0 ,.,.) = 
  1  1  1
  1  1  1
  1  1  1
[torch.FloatTensor of size 1x1x3x3]



In [6]:
func = nn.Conv2d(1,1,3)
print(func.weight)

Parameter containing:
(0 ,0 ,.,.) = 
 -0.1806 -0.1525  0.2559
  0.2310 -0.0641 -0.1582
  0.3043  0.1705 -0.2148
[torch.FloatTensor of size 1x1x3x3]



In [7]:
out1 = func(input1)
print(out1)

print(input1.grad)

Variable containing:
(0 ,0 ,.,.) = 
1.00000e-02 *
   5.5095
[torch.FloatTensor of size 1x1x1x1]

None


In [8]:
out1.backward()

print(input1.grad)

Variable containing:
(0 ,0 ,.,.) = 
 -0.1806 -0.1525  0.2559
  0.2310 -0.0641 -0.1582
  0.3043  0.1705 -0.2148
[torch.FloatTensor of size 1x1x3x3]
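
The gradient printed above equals the layer's weight because, with a 3x3 input and a 3x3 kernel, the single output is sum(weight * input) + bias. A small sketch that checks this for whatever random initial weights the layer gets (the seed is arbitrary and only makes the run repeatable):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

conv = nn.Conv2d(1, 1, 3)
x = torch.ones(1, 1, 3, 3, requires_grad=True)

out = conv(x)   # single output element: sum(weight * x) + bias
out.backward()

# d(out)/dx is the kernel itself, regardless of its initial values
print(torch.allclose(x.grad, conv.weight))  # True
```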



***

# Conv2d on a 5x5 input with a 3x3 filter

In [9]:
input2 = torch.ones(1,1,5,5)
input2 = Variable(input2, requires_grad=True)

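func = nn.Conv2d(1,1,3, bias=False)  # bias expects a bool; False disables the bias term
# There is only a single channel so far, so in_channels and out_channels are fixed to 1: (1,1,?)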
print(func.weight)

Parameter containing:
(0 ,0 ,.,.) = 
  0.0934 -0.2952 -0.2234
 -0.1953  0.1912  0.3060
  0.1122 -0.0918  0.2457
[torch.FloatTensor of size 1x1x3x3]



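You could supply the weight through nn.functional, but with nn you can also set the weight of a Conv2d layer directly. This is useful later on when you build your own layers or need to load weights from a pre-trained network into a layer. In those cases, do it as in the code below.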

In [10]:
func.weight = torch.nn.Parameter(torch.ones(1,1,3,3) + 1)
print(func.weight)

Parameter containing:
(0 ,0 ,.,.) = 
  2  2  2
  2  2  2
  2  2  2
[torch.FloatTensor of size 1x1x3x3]
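
An alternative sketch, not taken from the original notebook: copy the values into the existing Parameter in place instead of replacing the Parameter object. The tensor pretrained_weight is a hypothetical stand-in for weights taken from another network.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, 3, bias=False)

# Hypothetical stand-in for weights taken from a pre-trained network
pretrained_weight = torch.full((1, 1, 3, 3), 2.0)

# Copy in place without recording the operation in the autograd graph;
# the existing Parameter object (and any optimizer reference to it) is kept
with torch.no_grad():
    conv.weight.copy_(pretrained_weight)

out = conv(torch.ones(1, 1, 5, 5))
print(out.shape)  # torch.Size([1, 1, 3, 3])
print(out[0, 0])  # every entry is 18 = 9 window elements * 2
```

Replacing the Parameter, as in the cell above, also works. The in-place copy is mainly convenient when an optimizer has already been built on the layer's parameters.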



In [11]:
out2 = func(input2)
print(out2)

Variable containing:
(0 ,0 ,.,.) = 
  18  18  18
  18  18  18
  18  18  18
[torch.FloatTensor of size 1x1x3x3]



In [12]:
out2.backward()

RuntimeError: grad can be implicitly created only for scalar outputs

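This error is not caused by setting the weight manually. backward() called without arguments only works on a scalar (single-element) output, and out2 has shape 1x1x3x3. Reduce the output to a scalar first, for example with out2.sum().backward(), or pass a gradient tensor with the same shape as out2 to backward().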
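
A minimal sketch of two usual ways to backpropagate from a non-scalar output, written for current PyTorch (plain tensors instead of Variable). The kernel is filled with 2s to mirror the cell above; the names conv, x, and grad_from_sum are chosen here for illustration.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, 3, bias=False)
with torch.no_grad():
    conv.weight.fill_(2.0)  # all-twos 3x3 kernel

x = torch.ones(1, 1, 5, 5, requires_grad=True)
out = conv(x)  # shape (1, 1, 3, 3): not a scalar

# Option 1: reduce to a scalar first
out.sum().backward()
grad_from_sum = x.grad.clone()

# Option 2: pass the upstream gradient explicitly (all ones has the same effect as sum)
x.grad = None
conv(x).backward(torch.ones_like(out))

print(torch.allclose(grad_from_sum, x.grad))  # True
print(x.grad[0, 0])  # 2 at the corners, rising to 18 at the centre
```

Each input position receives 2 times the number of 3x3 windows that cover it, which is why the corners get 2 and the centre gets 18.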

***

# ReLU, Max_Pooling, Sigmoid, Tanh

### ReLU

In [13]:
act_input = Variable(torch.randn(1,1,3,3))
print(act_input)

act = F.relu(act_input)
print(act)

Variable containing:
(0 ,0 ,.,.) = 
  0.8566  0.2786  1.3763
  0.1923 -0.7132  2.1710
 -1.3916  0.4761  0.9420
[torch.FloatTensor of size 1x1x3x3]

Variable containing:
(0 ,0 ,.,.) = 
  0.8566  0.2786  1.3763
  0.1923  0.0000  2.1710
  0.0000  0.4761  0.9420
[torch.FloatTensor of size 1x1x3x3]



### Max_Pooling

In [14]:
m = nn.MaxPool2d(2, stride=1)
m_out = m(act)

print(m_out)

Variable containing:
(0 ,0 ,.,.) = 
  0.8566  2.1710
  0.4761  2.1710
[torch.FloatTensor of size 1x1x2x2]



### Sigmoid

In [15]:
act2 = F.sigmoid(m_out)
print(act2)

Variable containing:
(0 ,0 ,.,.) = 
  0.7019  0.8976
  0.6168  0.8976
[torch.FloatTensor of size 1x1x2x2]



### Tanh

In [16]:
act3 = F.tanh(m_out)
print(act3)

Variable containing:
(0 ,0 ,.,.) = 
  0.6945  0.9743
  0.4431  0.9743
[torch.FloatTensor of size 1x1x2x2]
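
These operations have no learnable parameters, so the module form and the function form compute the same values. The sketch below checks that on small hand-picked tensors. It uses torch.sigmoid and torch.tanh because F.sigmoid and F.tanh were deprecated in later PyTorch releases; the input values are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([[-1.0, 0.0, 2.0]])

# Stateless activations: module and function forms give identical results
same_relu = torch.equal(nn.ReLU()(x), F.relu(x))
same_sigmoid = torch.equal(nn.Sigmoid()(x), torch.sigmoid(x))
same_tanh = torch.equal(nn.Tanh()(x), torch.tanh(x))

# Pooling also exists in both forms
img = torch.arange(9.0).reshape(1, 1, 3, 3)
same_pool = torch.equal(nn.MaxPool2d(2, stride=1)(img),
                        F.max_pool2d(img, 2, stride=1))

print(same_relu, same_sigmoid, same_tanh, same_pool)  # True True True True
```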



***

# Wrapping up: putting it all together in a class

In [17]:
class model(nn.Module):
  def __init__(self):
    super(model, self).__init__()
    self.Max_Pooling = nn.MaxPool2d(2, stride=1)
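    # NOTE: despite its name, this layer is a MaxPool2d (the printed model below confirms it);
    # use nn.AvgPool2d(2, stride=1) for actual average pooling.
    self.Avg_Pooling = nn.MaxPool2d(2, stride=1)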
  def forward(self, x):
    x = F.relu(x)
    x = self.Max_Pooling(x)
    x = F.tanh(x)
    x = self.Avg_Pooling(x)
    return x

In [18]:
out = model()
print(out(act_input))

# Print the model itself to see which layers were declared.
print(out)

Variable containing:
(0 ,0 ,.,.) = 
  0.9743
[torch.FloatTensor of size 1x1x1x1]

model(
  (Max_Pooling): MaxPool2d(kernel_size=(2, 2), stride=(1, 1), dilation=(1, 1), ceil_mode=False)
  (Avg_Pooling): MaxPool2d(kernel_size=(2, 2), stride=(1, 1), dilation=(1, 1), ceil_mode=False)
)
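
The layer named Avg_Pooling above is in fact a MaxPool2d, which is why the result is 0.9743, the maximum of the four tanh values. If average pooling was intended, a variant could look like the sketch below. The class name PoolModel and the attribute names are hypothetical, and the input reuses the act_input values printed earlier so the run is repeatable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PoolModel(nn.Module):  # hypothetical name for this sketch
    def __init__(self):
        super().__init__()
        self.max_pooling = nn.MaxPool2d(2, stride=1)
        self.avg_pooling = nn.AvgPool2d(2, stride=1)  # real average pooling

    def forward(self, x):
        x = F.relu(x)
        x = self.max_pooling(x)  # 3x3 -> 2x2
        x = torch.tanh(x)
        x = self.avg_pooling(x)  # 2x2 -> 1x1, mean of the four values
        return x


# The values printed for act_input earlier, hard-coded for repeatability
act_input = torch.tensor([[[[0.8566, 0.2786, 1.3763],
                            [0.1923, -0.7132, 2.1710],
                            [-1.3916, 0.4761, 0.9420]]]])

net = PoolModel()
result = net(act_input)
print(result.shape)  # torch.Size([1, 1, 1, 1])
print(net)           # lists max_pooling and avg_pooling as registered submodules
```

With average pooling the final value is the mean of the four tanh outputs, so it comes out lower than the 0.9743 produced by the max-pooling version.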
