## 4. Build Model

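torch.nn provides all the building blocks needed to construct a neural network.

Every module in PyTorch must be a subclass of torch.nn.Module.

A subclass of nn.Module implements its computation in the forward method. (Do not call forward directly; call the model instance instead, e.g. model(X).)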
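One reason for the warning above is that calling the instance goes through `nn.Module.__call__`, which runs any registered hooks around `forward`, while calling `forward` directly skips them. The sketch below illustrates this with a hypothetical toy module named `Doubler` (not part of this notebook) and a forward hook that only records that it fired.

```python
import torch
from torch import nn


class Doubler(nn.Module):
    """Hypothetical toy module that doubles its input (for illustration only)."""

    def forward(self, x):
        return 2 * x


calls = []  # records each time the forward hook fires

module = Doubler()
# The hook returns None (list.append returns None), so the output is left unchanged.
handle = module.register_forward_hook(lambda mod, inp, out: calls.append("hook"))

x = torch.ones(3)
out_call = module(x)             # goes through nn.Module.__call__, so the hook runs
out_forward = module.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()

print(torch.equal(out_call, out_forward))  # the values match either way
print(len(calls))                          # the hook fired for only one of the two calls
```

The returned values are identical here, so the difference only shows up once hooks (or other logic attached to `__call__`) are involved.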

In [1]:
import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

**Get device for training**

In [3]:
device = "mps:0" if torch.backends.mps.is_available() else "cpu"
print(f'Using {device} device')

Using mps:0 device
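The cell above only checks for Apple's MPS backend. On a machine with an NVIDIA GPU, a slightly more general selection might look like the following sketch. The helper name `pick_device` and the CUDA-first priority order are assumptions for illustration, not something this notebook prescribes.

```python
import torch


def pick_device() -> str:
    """Return the first available backend: CUDA, then MPS, then CPU (assumed order)."""
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()

# Allocating a small tensor confirms that the chosen device is actually usable.
x = torch.zeros(2, 2, device=device)
print(f"Using {x.device.type} device")
```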


**Define the class**

In [4]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

In [5]:
model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
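The printed structure can be cross-checked by counting parameters by hand: each `Linear` layer holds `in_features * out_features` weights plus `out_features` biases. The sketch below rebuilds the same layer sizes as a plain `nn.Sequential` so it runs on its own; it is an equivalent stand-in, not the `NeuralNetwork` class itself.

```python
import torch
from torch import nn

# Same layer sizes as NeuralNetwork, rebuilt here so the snippet is self-contained.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Count parameters reported by PyTorch and compare with the hand calculation.
total = sum(p.numel() for p in model.parameters())
expected = (784 * 512 + 512) + (512 * 512 + 512) + (512 * 10 + 10)
print(total, expected)

# A batch of 4 images of size 28x28 should produce 4 rows of 10 logits.
batch = torch.rand(4, 28, 28)
out = model(batch)
print(out.shape)
```

`Flatten` and `ReLU` contribute no parameters, which is why only the three `Linear` layers appear in the sum.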


In [30]:
X = torch.rand((2, 28, 28), device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)

print(f'Predicted class: {y_pred[0]}')
print(f'Predicted probability: {pred_probab[0][y_pred[0]]}')

Predicted class: 8
Predicted probability: 0.10586117208003998
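The predicted probability of roughly 0.106 is close to 1/10, which is what one would expect from an untrained model over 10 classes. The sketch below uses random logits as a stand-in for the model output (an assumption, so the values differ from the cell above). It checks two properties of softmax: each row sums to 1, and the arg-max class is the same before and after normalization.

```python
import torch

torch.manual_seed(0)
logits = torch.randn(2, 10)  # stand-in for model output: 2 samples, 10 classes

# dim=1 normalizes across the class dimension, one distribution per sample.
probs = torch.softmax(logits, dim=1)
print(probs.sum(dim=1))

# Softmax preserves ordering, so the top class can be read from the logits directly.
same_argmax = torch.equal(logits.argmax(dim=1), probs.argmax(dim=1))
print(same_argmax)
```

Since the ordering is preserved, softmax is only needed when the probability values themselves are of interest, not for picking the predicted class.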


**Model parameters**

In [39]:
print(f'Model structure: {model}\n\n')

for name, param in model.named_parameters():
    print(f'Layer: {name}, Param shape: {param.shape}')
    print(f'Param values: \n {param[:2]} \n')

Model structure: NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)


Layer: linear_relu_stack.0.weight, Param shape: torch.Size([512, 784])
Param values: 
 tensor([[-0.0279, -0.0204, -0.0153,  ..., -0.0335, -0.0043, -0.0124],
        [-0.0110,  0.0248,  0.0199,  ..., -0.0241, -0.0311, -0.0109]],
       device='mps:0', grad_fn=<SliceBackward0>) 

Layer: linear_relu_stack.0.bias, Param shape: torch.Size([512])
Param values: 
 tensor([ 0.0117, -0.0070], device='mps:0', grad_fn=<SliceBackward0>) 

Layer: linear_relu_stack.2.weight, Param shape: torch.Size([512, 512])
Param values: 
 tensor([[-0.0179,  0.0067,  0.0067,  ...,  0.0243,  0.0263, -0.0154],
        [ 0.0148, -0.0069, -0.0125,  ..., -0.0135,  0.0350, -0.0090]],
     