# Introduction to Deep Learning with PyTorch (PyTorch로 시작하는 딥 러닝 입문)

- Book: https://wikidocs.net/book/2788
- Chapter : https://wikidocs.net/53560



## 01. Linear Regression

Concepts

### Data Definition

#### Train-set, Test-set

- Every train-set must be in `torch.tensor` form

#### Hypothesis

- $y = Wx + b$ 
- $H(x)=Wx+b$

#### Compute Loss

- Cost function == loss function == error function == objective function
- MSE: Mean Squared Error
- Goal: find the line (the values of W and b) that minimizes the MSE
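As a quick check of the definition, the sketch below computes the MSE, $\frac{1}{n}\sum_i (H(x_i) - y_i)^2$, in plain Python for the toy data used later in these notes (x = 1, 2, 3 and y = 2, 4, 6) with W and b both set to 0. The helper name `mse` is made up for illustration.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
W, b = 0.0, 0.0

hypothesis = [W * xi + b for xi in x]   # H(x) = Wx + b
cost = mse(hypothesis, y)               # (4 + 16 + 36) / 3
print(f"{cost:.4f}")                    # prints 18.6667
```

This is the same value as the initial cost that the PyTorch implementation prints before any training.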

#### Optimizer - Gradient Descent

- Gradient descent is the most basic optimization algorithm
- Learning rate: determines how far W moves along the W-cost curve at each update
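To make the update rule concrete, here is a minimal sketch of a single gradient descent step written by hand in plain Python. It assumes the MSE cost, whose gradients are $\frac{\partial cost}{\partial W} = \frac{2}{n}\sum_i (H(x_i) - y_i)\,x_i$ and $\frac{\partial cost}{\partial b} = \frac{2}{n}\sum_i (H(x_i) - y_i)$, and uses the same toy data and learning rate (0.01) as the PyTorch code in these notes.

```python
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
W, b = 0.0, 0.0
lr = 0.01          # learning rate
n = len(x)

# Prediction errors H(x_i) - y_i at the current W and b
errors = [(W * xi + b) - yi for xi, yi in zip(x, y)]

# Gradients of the MSE with respect to W and b
grad_W = 2 / n * sum(e * xi for e, xi in zip(errors, x))
grad_b = 2 / n * sum(errors)

# Move against the gradient, scaled by the learning rate
W -= lr * grad_W
b -= lr * grad_b
print(f"W : {W:.3f}, b : {b:.3f}")   # prints W : 0.187, b : 0.080
```

These are the values that the training loop reports at epoch 0, since that line is printed right after the first update.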



### Implementing in PyTorch

#### Initial Setting

In [3]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(1)

<torch._C.Generator at 0x7f00f4e247b0>

In [4]:
import numpy as np
import numpy.random as rd

#### train-set

In [0]:
x_train = torch.FloatTensor(np.arange(1, 4).reshape(3,-1))
y_train = torch.FloatTensor(np.arange(1, 4).reshape(3,-1)*2)

x_train, y_train

(tensor([[1.],
         [2.],
         [3.]]), tensor([[2.],
         [4.],
         [6.]]))

#### Initialize weight W and bias b

`torch.zeros`

- `requires_grad=True`: marks the tensor as a variable whose value is updated through training

In [0]:
torch.zeros?

In [0]:
W = torch.zeros(1, requires_grad=True)
W

tensor([0.], requires_grad=True)

In [0]:
b = torch.zeros(1, requires_grad=True)
b

tensor([0.], requires_grad=True)

#### set Hypothesis

In [0]:
hypo = x_train * W + b
hypo

tensor([[0.],
        [0.],
        [0.]], grad_fn=<AddBackward0>)

#### declare cost function

In [0]:
cost = torch.mean( (hypo - y_train) ** 2 )
cost

tensor(18.6667, grad_fn=<MeanBackward0>)

#### implement Gradient Descent

```python
optim.SGD(
    params,
    lr,               # required in the PyTorch version used here (shown as a _RequiredParameter sentinel)
    momentum=0,
    dampening=0,
    weight_decay=0,
    nesterov=False,
)
```

- lr : learning rate

In [0]:
??optim.SGD

In [0]:
optimizer = optim.SGD([W, b], lr=0.01)
optimizer

SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)

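- `optimizer.zero_grad()`
    - Resets the gradients to 0
    - Needed because PyTorch adds each newly computed gradient to the value already stored in `.grad` instead of overwriting it (section 02 shows this accumulation)
- `cost.backward()`: differentiates the cost function to compute the gradients
- `optimizer.step()`: updates W and b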
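The three calls above can be compared with a hand-written update. The sketch below is an illustration rather than a description of PyTorch internals: it assumes plain SGD with no momentum or weight decay, and the helper `cost_fn` is a made-up name.

```python
import torch
import torch.optim as optim

x_train = torch.tensor([[1.0], [2.0], [3.0]])
y_train = torch.tensor([[2.0], [4.0], [6.0]])
lr = 0.01

def cost_fn(W, b):
    """MSE of the hypothesis x * W + b against y_train."""
    return torch.mean((x_train * W + b - y_train) ** 2)

# (a) one step using optim.SGD
W_a = torch.zeros(1, requires_grad=True)
b_a = torch.zeros(1, requires_grad=True)
optimizer = optim.SGD([W_a, b_a], lr=lr)
optimizer.zero_grad()
cost_fn(W_a, b_a).backward()
optimizer.step()

# (b) the same step written out by hand
W_b = torch.zeros(1, requires_grad=True)
b_b = torch.zeros(1, requires_grad=True)
cost_fn(W_b, b_b).backward()
with torch.no_grad():        # the update itself must not be tracked by autograd
    W_b -= lr * W_b.grad
    b_b -= lr * b_b.grad
W_b.grad.zero_()             # manual reset, the role zero_grad() plays in the loop
b_b.grad.zero_()

print(W_a.item(), W_b.item())
print(b_a.item(), b_b.item())
```

Both variants should end up with the same W and b after one step.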

In [0]:
epochs = 3000

for epoch in range(epochs + 1) :
    hypo = x_train * W + b
    cost = torch.mean((hypo - y_train) ** 2)

    optimizer.zero_grad()

    cost.backward()

    optimizer.step()

    if epoch % 100 == 0 :
        print (f"Epoch {epoch:4d} / {epochs} \nW : {W.item():.3f}, b : {b.item():.3f}, cost : {cost.item():.6f}")

Epoch    0 / 3000 
W : 0.187, b : 0.080, cost : 18.666666
Epoch  100 / 3000 
W : 1.746, b : 0.578, cost : 0.048171
Epoch  200 / 3000 
W : 1.800, b : 0.454, cost : 0.029767
Epoch  300 / 3000 
W : 1.843, b : 0.357, cost : 0.018394
Epoch  400 / 3000 
W : 1.876, b : 0.281, cost : 0.011366
Epoch  500 / 3000 
W : 1.903, b : 0.221, cost : 0.007024
Epoch  600 / 3000 
W : 1.924, b : 0.174, cost : 0.004340
Epoch  700 / 3000 
W : 1.940, b : 0.136, cost : 0.002682
Epoch  800 / 3000 
W : 1.953, b : 0.107, cost : 0.001657
Epoch  900 / 3000 
W : 1.963, b : 0.084, cost : 0.001024
Epoch 1000 / 3000 
W : 1.971, b : 0.066, cost : 0.000633
Epoch 1100 / 3000 
W : 1.977, b : 0.052, cost : 0.000391
Epoch 1200 / 3000 
W : 1.982, b : 0.041, cost : 0.000242
Epoch 1300 / 3000 
W : 1.986, b : 0.032, cost : 0.000149
Epoch 1400 / 3000 
W : 1.989, b : 0.025, cost : 0.000092
Epoch 1500 / 3000 
W : 1.991, b : 0.020, cost : 0.000057
Epoch 1600 / 3000 
W : 1.993, b : 0.016, cost : 0.000035
Epoch 1700 / 3000 
W : 1.995, 

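## 02. Automatic Differentiation (Autograd)

Covers `requires_grad=`, `backward()`, and related functionality.

- `requires_grad=True`: when `backward()` is called, the gradient is stored in the tensor's `.grad` attribute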


In [0]:
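auto_w = torch.tensor(2.0, requires_grad=True)

for i in range(4) : 
    auto_y = auto_w**2
    auto_z = 2 * auto_y + 5
    auto_z.backward()
    # auto_w.grad is never reset, so each backward() adds another 8 to it
    print (f"Gradient of auto_z with respect to auto_w : {auto_w.grad}")

Gradient of auto_z with respect to auto_w : 8.0
Gradient of auto_z with respect to auto_w : 16.0
Gradient of auto_z with respect to auto_w : 24.0
Gradient of auto_z with respect to auto_w : 32.0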
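The printed values grow by 8 on every pass because each `backward()` call adds the new gradient to the one already stored. Analytically, $z = 2w^2 + 5$ gives $\frac{dz}{dw} = 4w$, which is 8 at $w = 2$. The sketch below (variable names are illustrative) resets the stored gradient after each pass, so the same value comes out every time.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
grads = []

for i in range(4):
    z = 2 * w ** 2 + 5      # dz/dw = 4w, which is 8 at w = 2
    z.backward()
    grads.append(w.grad.item())
    w.grad.zero_()          # clear the stored gradient before the next pass

print(grads)                # prints [8.0, 8.0, 8.0, 8.0]
```

This manual reset plays the same role as `optimizer.zero_grad()` in the training loops.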


In [0]:
noauto_w = torch.tensor(2.0, requires_grad=False)

for i in range(4) : 
    # Note: the graph here is built from auto_w (defined above), so backward() succeeds.
    # Building it from noauto_w instead would raise a RuntimeError, because nothing would require grad.
    noauto_y = auto_w**2
    noauto_z = 2 * noauto_y + 5
    noauto_z.backward()
    print (f"noauto_w.grad : {noauto_w.grad}")

noauto_w.grad : None
noauto_w.grad : None
noauto_w.grad : None
noauto_w.grad : None


## 03. Multivariable Linear Regression

### 03-01. Data Definition

### 03-02. Implement in PyTorch

make dummy train-set

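_With three-digit values in the train-set, training cannot find the answer and only produces nan (the gradient grows with the scale of the inputs, so the learning rate used here becomes too large)._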
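A likely cause, offered here as an assumption rather than something these notes verify, is that `lr=1e-5` is too large once the inputs are in the hundreds, so every update overshoots and the cost overflows. The sketch below uses the three-digit values from the commented-out lines in the next cell; the helper `final_cost` and the alternative rate `1e-7` are illustrative choices.

```python
import math
import torch

# Three-digit inputs; y is the sum of the three features
x = torch.tensor([[776., 879., 313.],
                  [923., 887., 676.],
                  [555., 367., 265.],
                  [946., 482., 560.],
                  [920., 867., 534.]])
y = torch.tensor([[1968.], [2486.], [1187.], [1988.], [2321.]])

def final_cost(lr, epochs=100):
    """Train from zero-initialized parameters and return the last cost."""
    W = torch.zeros((3, 1), requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([W, b], lr=lr)
    for _ in range(epochs):
        cost = torch.mean((x.matmul(W) + b - y) ** 2)
        opt.zero_grad()
        cost.backward()
        opt.step()
    return cost.item()

cost_large_lr = final_cost(lr=1e-5)   # diverges: inf or nan
cost_small_lr = final_cost(lr=1e-7)   # stays finite
print(cost_large_lr, cost_small_lr)
```

Lowering the learning rate is only one option. Rescaling the inputs to a smaller range is another common remedy, though it is not covered in these notes.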

In [0]:
torch.manual_seed(0)  # seeds torch only; the np.random calls below are not affected by it

x1_train = torch.FloatTensor(np.random.randint(10, 100, (5, 1)))
print (x1_train)

x2_train = torch.FloatTensor(np.random.randint(10, 100, (5, 1)))
print (x2_train)

x3_train = torch.FloatTensor(np.random.randint(10, 100, (5, 1)))
print (x3_train)

# x1_train = torch.FloatTensor([[73], [93], [89], [96], [73]])
# x2_train = torch.FloatTensor([[80], [88], [91], [98], [66]])
# x3_train = torch.FloatTensor([[75], [93], [90], [100], [70]])
# y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

# x2_train = torch.randint(101, 1000, (5,1), dtype=float)
# x3_train = torch.randint(101, 1000, (5,1), dtype=float)

y_train = torch.FloatTensor (x1_train + x2_train + x3_train)

# x1_train = torch.FloatTensor([[776],         [923],         [555],         [946],         [920]])
# x2_train = torch.FloatTensor([[879],        [887],        [367],        [482],        [867]])
# x3_train = torch.FloatTensor([[313],        [676],        [265],        [560],        [534]])
# y_train = torch.FloatTensor ([[1968], [2486], [1187], [1988], [2321]])

print (y_train)

tensor([[11.],
        [97.],
        [54.],
        [31.],
        [75.]])
tensor([[38.],
        [13.],
        [25.],
        [46.],
        [77.]])
tensor([[16.],
        [80.],
        [96.],
        [83.],
        [34.]])
tensor([[ 65.],
        [190.],
        [175.],
        [160.],
        [186.]])


In [0]:
# print (x1_train, "\n", x2_train,"\n", x3_train,  "\n", y_train)

initialize w1, w2, w3, b

In [0]:
w1 = torch.zeros(1, requires_grad=True)
w2 = torch.zeros(1, requires_grad=True)
w3 = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

b

tensor([0.], requires_grad=True)

declare optimizer

In [0]:
opti = optim.SGD([w1,w2,w3,b], lr=1e-5)
opti

SGD (
Parameter Group 0
    dampening: 0
    lr: 1e-05
    momentum: 0
    nesterov: False
    weight_decay: 0
)

execute gradient descent 100,000 times

In [0]:
epochs = int(1e+5)

for epo in range(1, epochs + 1) :
    hypo = x1_train * w1 + x2_train * w2 + x3_train * w3 + b
    cost = torch.mean((hypo - y_train) ** 2)
    # print (cost)

    opti.zero_grad()
    cost.backward()
    opti.step()

    if epo % 100 == 0 or epo == 1: 
        print (f"Epoch : {epo:4d}/{epochs}\nw1 : {w1.item():.3f} w2 : {w2.item():.3f} w3 : {w3.item():.3f} b : {b.item():.3f} cost : {cost.item():.6f}")

Epoch :    1/100000
w1 : 0.190 w2 : 0.124 w3 : 0.211 b : 0.003 cost : 26229.199219
Epoch :  100/100000
w1 : 1.018 w2 : 0.938 w3 : 1.018 b : 0.019 cost : 3.952382
Epoch :  200/100000
w1 : 1.007 w2 : 0.988 w3 : 1.000 b : 0.019 cost : 0.140882
Epoch :  300/100000
w1 : 1.003 w2 : 0.997 w3 : 0.999 b : 0.019 cost : 0.008583
Epoch :  400/100000
w1 : 1.001 w2 : 0.999 w3 : 0.999 b : 0.019 cost : 0.000809
Epoch :  500/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000105
Epoch :  600/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000030
Epoch :  700/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000022
Epoch :  800/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000021
Epoch :  900/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000021
Epoch : 1000/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000021
Epoch : 1100/100000
w1 : 1.000 w2 : 1.000 w3 : 1.000 b : 0.019 cost : 0.000021
Epoch : 1200/100000
w1 : 1.000 w2 : 1.000 w3 : 1

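### 03-03. Switching to Vector and Matrix Operations

- The code above becomes very inefficient when there are thousands of input variables x and weights w
- The dot product of vectors solves this
- Size of the input matrix X = samples (rows of data) × features (independent variables x)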
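In matrix form the hypothesis becomes $H(X) = XW + b$, where $X$ has shape (samples, features) and $W$ has shape (features, 1). The sketch below checks that this gives the same result as summing one term per feature. The weights are arbitrary values chosen for illustration.

```python
import torch

x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
W = torch.FloatTensor([[0.5], [1.0], [1.5]])   # arbitrary illustrative weights, shape (3, 1)
b = torch.FloatTensor([2.0])

# Per-feature form: one product per column of x_train
x1, x2, x3 = x_train[:, 0:1], x_train[:, 1:2], x_train[:, 2:3]
w1, w2, w3 = W[0], W[1], W[2]
hypo_terms = x1 * w1 + x2 * w2 + x3 * w3 + b

# Matrix form: a single matmul covers all features at once
hypo_matrix = x_train.matmul(W) + b

print(hypo_matrix.shape)                         # torch.Size([5, 1])
print(torch.allclose(hypo_terms, hypo_matrix))   # True
```

With the matrix form, adding more features only changes the shapes of `x_train` and `W`, not the code.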


### 03-04. PyTorch Implementation with Matrix Operations

In [4]:
x_train  =  torch.FloatTensor([[73,  80,  75], 
                               [93,  88,  93], 
                               [89,  91,  90], 
                               [96,  98,  100],   
                               [73,  66,  70]])  
y_train  =  torch.FloatTensor([[152],  [185],  [180],  [196],  [142]])

x_train.shape, y_train.shape

(torch.Size([5, 3]), torch.Size([5, 1]))

In [0]:
W = torch.zeros((3,1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

In [11]:
opti = optim.SGD([W, b], lr=1e-5)
opti

SGD (
Parameter Group 0
    dampening: 0
    lr: 1e-05
    momentum: 0
    nesterov: False
    weight_decay: 0
)

In [12]:
epochs = int(1e+5)

for epo in range(1, epochs+1) :
    hypo = x_train.matmul(W) + b

    cost = torch.mean((hypo - y_train) ** 2)

    opti.zero_grad()
    cost.backward()
    opti.step()

    if epo % 100 == 0 :
        print (f"Epoch : {epo:4d}/{epochs}\nhypothesis : {hypo.squeeze().detach()} cost : {cost.item():.6f}")

Epoch :  100/100000
hypothesis : tensor([152.7695, 183.6982, 180.9592, 197.0628, 140.1332]) cost : 1.564299
Epoch :  200/100000
hypothesis : tensor([152.7277, 183.7271, 180.9466, 197.0518, 140.1727]) cost : 1.498234
Epoch :  300/100000
hypothesis : tensor([152.6870, 183.7551, 180.9344, 197.0410, 140.2112]) cost : 1.435647
Epoch :  400/100000
hypothesis : tensor([152.6474, 183.7825, 180.9225, 197.0305, 140.2487]) cost : 1.376296
Epoch :  500/100000
hypothesis : tensor([152.6089, 183.8091, 180.9109, 197.0202, 140.2852]) cost : 1.320047
Epoch :  600/100000
hypothesis : tensor([152.5714, 183.8349, 180.8997, 197.0102, 140.3208]) cost : 1.266736
Epoch :  700/100000
hypothesis : tensor([152.5350, 183.8601, 180.8888, 197.0004, 140.3554]) cost : 1.216203
Epoch :  800/100000
hypothesis : tensor([152.4995, 183.8846, 180.8781, 196.9908, 140.3891]) cost : 1.168279
Epoch :  900/100000
hypothesis : tensor([152.4651, 183.9085, 180.8678, 196.9815, 140.4220]) cost : 1.122853
Epoch : 1000/100000
hypothes

In [13]:
hypo

tensor([[151.5023],
        [184.6398],
        [180.6607],
        [196.1304],
        [141.9784]], grad_fn=<AddBackward0>)

In [14]:
W

tensor([[1.0357],
        [0.5156],
        [0.4611]], requires_grad=True)

In [15]:
b

tensor([0.0695], requires_grad=True)

In [16]:
cost

tensor(0.1663, grad_fn=<MeanBackward0>)

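## 04. Linear Regression with nn.Module

- Up to this point, the linear regression model was implemented by hand
- From here on, the functions built into PyTorch are used

### 04-01. Simple Linear Regression

A simple implementation of $y = 2x$
- _Here the cost blows up to nan as soon as x_train reaches two-digit values._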

In [21]:
x_train = torch.randint(1,10, (10,1), dtype=torch.float32)
y_train = x_train*2

print (x_train)

print (y_train)

tensor([[9.],
        [7.],
        [3.],
        [3.],
        [7.],
        [6.],
        [6.],
        [6.],
        [1.],
        [7.]])
tensor([[18.],
        [14.],
        [ 6.],
        [ 6.],
        [14.],
        [12.],
        [12.],
        [12.],
        [ 2.],
        [14.]])


In [16]:
# nn.Linear?

In [30]:
model = nn.Linear(in_features=1, out_features=1)

The model's W and b are initialized randomly

In [31]:
list (model.parameters())

[Parameter containing:
 tensor([[-0.9614]], requires_grad=True),
 Parameter containing:
 tensor([-0.4768], requires_grad=True)]
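The two parameters listed above are the layer's `weight` (shape `(out_features, in_features)`) and `bias`. A minimal sketch, using an illustrative input, confirming that `nn.Linear` computes $xW^T + b$:

```python
import torch
import torch.nn as nn

torch.manual_seed(1)
model = nn.Linear(in_features=1, out_features=1)
x = torch.FloatTensor([[1.0], [2.0], [3.0]])

weight, bias = model.weight, model.bias        # shapes (1, 1) and (1,)
manual = x.matmul(weight.t()) + bias           # x W^T + b written out by hand

print(torch.allclose(model(x), manual))        # True
```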

In [32]:
opti = torch.optim.SGD(model.parameters(), lr = 0.01)
opti

SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)

The training data must be passed in as float (`torch.float32`)
- Both long and double raise an error

> *Expected object of scalar type Float but got scalar type Double for argument #2 'mat1' in call to _th_addmm*
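The error comes from a dtype mismatch between the input and the layer's float32 parameters. The sketch below shows one way to avoid it by converting the input with `.float()`. The exact error message depends on the PyTorch version, so the `try` block only prints whatever is raised.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)                      # parameters are float32 by default
x_long = torch.randint(1, 10, (10, 1))       # dtype defaults to torch.int64 (long)

try:
    model(x_long)
except RuntimeError as err:
    print("dtype mismatch:", err)

x_float = x_long.float()                     # equivalent to .to(torch.float32)
pred = model(x_float)
print(x_float.dtype, pred.shape)
```

Passing `dtype=torch.float32` when creating the tensor, as the cells in these notes do, avoids the conversion altogether.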



In [33]:
epochs = int( 1e+3 ) * 3

for epo in range(1, epochs+1) :
    hypo = model(x_train)
    cost = F.mse_loss(hypo, y_train)

    opti.zero_grad()
    cost.backward()
    opti.step()

    if epo % 100 == 0 : 
        print (f"Epoch {epo:4d} / {epochs} \tCost : {cost.item():.6f}\nHypothesis : {hypo}")

Epoch  100 / 3000 	Cost : 0.000021
Hypothesis : tensor([[18.0047],
        [14.0011],
        [ 5.9937],
        [ 5.9937],
        [14.0011],
        [11.9992],
        [11.9992],
        [11.9992],
        [ 1.9900],
        [14.0011]], grad_fn=<AddmmBackward>)
Epoch  200 / 3000 	Cost : 0.000012
Hypothesis : tensor([[18.0035],
        [14.0008],
        [ 5.9953],
        [ 5.9953],
        [14.0008],
        [11.9994],
        [11.9994],
        [11.9994],
        [ 1.9925],
        [14.0008]], grad_fn=<AddmmBackward>)
Epoch  300 / 3000 	Cost : 0.000006
Hypothesis : tensor([[18.0026],
        [14.0006],
        [ 5.9965],
        [ 5.9965],
        [14.0006],
        [11.9996],
        [11.9996],
        [11.9996],
        [ 1.9944],
        [14.0006]], grad_fn=<AddmmBackward>)
Epoch  400 / 3000 	Cost : 0.000004
Hypothesis : tensor([[18.0020],
        [14.0004],
        [ 5.9974],
        [ 5.9974],
        [14.0004],
        [11.9997],
        [11.9997],
        [11.9997],
        

- Forward pass: feeds x into the hypothesis to produce y

- Backward pass: differentiates the cost function to obtain the gradients
    - `cost.backward()`


In [42]:
new_var = torch.FloatTensor([[4.0]])

print (f"Prediction for the input 4 : {model(new_var).item()} ")

Prediction for the input 4 : 7.999998569488525 


In [43]:
list(model.parameters())

[Parameter containing:
 tensor([[2.0000]], requires_grad=True),
 Parameter containing:
 tensor([-5.2424e-06], requires_grad=True)]

### 04-02. Multivariable Linear Regression

In [66]:
x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

In [71]:
model = nn.Linear(3, 1)

In [72]:
list (model.parameters())

[Parameter containing:
 tensor([[ 0.4202, -0.0856,  0.3247]], requires_grad=True),
 Parameter containing:
 tensor([0.1856], requires_grad=True)]

In [73]:
opti = torch.optim.SGD(model.parameters(), lr = 1e-6)

In [74]:
epochs = int(1e+5) * 5

for epo in range(1, epochs+1) :
    hypo = model(x_train)
    cost = F.mse_loss(hypo, y_train)

    opti.zero_grad()
    cost.backward()
    opti.step()

    if epo % 100 == 0 : 
        print (f"Epoch {epo:4d} / {epochs} \tCost : {cost.item():.6f}\nHypo : {hypo}\n")

      [184.6775],
        [180.5911],
        [196.2968],
        [141.8762]], grad_fn=<AddmmBackward>)

Epoch 488600 / 500000 	Cost : 0.178972
Hypo : tensor([[151.4186],
        [184.6775],
        [180.5912],
        [196.2967],
        [141.8762]], grad_fn=<AddmmBackward>)

Epoch 488700 / 500000 	Cost : 0.178962
Hypo : tensor([[151.4187],
        [184.6774],
        [180.5912],
        [196.2967],
        [141.8763]], grad_fn=<AddmmBackward>)

Epoch 488800 / 500000 	Cost : 0.178957
Hypo : tensor([[151.4187],
        [184.6774],
        [180.5912],
        [196.2966],
        [141.8763]], grad_fn=<AddmmBackward>)

Epoch 488900 / 500000 	Cost : 0.178945
Hypo : tensor([[151.4187],
        [184.6774],
        [180.5912],
        [196.2966],
        [141.8763]], grad_fn=<AddmmBackward>)

Epoch 489000 / 500000 	Cost : 0.178938
Hypo : tensor([[151.4187],
        [184.6774],
        [180.5912],
        [196.2965],
        [141.8763]], grad_fn=<AddmmBackward>)

Epoch 489100 / 500000 	Cost : 

In [79]:
new_val = torch.randint(11, 100, (1,3), dtype=torch.float32)
print (new_val)

print (f"Prediction for {new_val} after training : {model(new_val)}")

tensor([[46., 49., 87.]])
Prediction for tensor([[46., 49., 87.]]) after training : tensor([[116.4236]], grad_fn=<AddmmBackward>)


In [82]:
list (model.parameters())

[Parameter containing:
 tensor([[0.9783, 0.4874, 0.5438]], requires_grad=True),
 Parameter containing:
 tensor([0.2316], requires_grad=True)]

## 05. PyTorch Model in Class

url : https://wikidocs.net/60036

### 05-02. Simple Linear Regression as a Class

In [95]:
nn.Module?

Init signature: nn.Module()
Docstring:
Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing to nest them in
a tree structure. You can assign the submodules as regular attributes::

    import torch.nn as nn
    import torch.nn.functional as F

    class Model(nn.Module):
        def __init__(self):
            super(Model, self).__init__()
            self.conv1 = nn.Conv2d(1, 20, 5)
            self.conv2 = nn.Conv2d(20, 20, 5)

        def forward(self, x):
            x = F.relu(self.conv1(x))
            return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their
parameters converted too when you call :meth:`to`, etc.
Init docstring: Initializes internal Module state, shared by both nn.Module and ScriptModule.
File:           /opt/con

In [85]:
x_train = torch.randint(11, 50, (5,1), dtype = torch.float32)
y_train = x_train * 2

y_train

tensor([[60.],
        [92.],
        [28.],
        [50.],
        [60.]])

In [90]:
class SimpleLinearRegressionModel(nn.Module) :
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1,1)

    def forward(self, x) :
        return self.linear(x)

model_simple = SimpleLinearRegressionModel()

In [91]:
opti_simple = torch.optim.SGD(model_simple.parameters(), lr=1e-3)
opti_simple

SGD (
Parameter Group 0
    dampening: 0
    lr: 0.001
    momentum: 0
    nesterov: False
    weight_decay: 0
)

In [92]:
epochs_simple = int (1e+5)

for epo in range(1, epochs_simple+1) :
    hypo = model_simple(x_train)

    cost = F.mse_loss(hypo, y_train)

    opti_simple.zero_grad()
    cost.backward()
    opti_simple.step()

    if epo % 1000 == 0 : 
        print (f"Epoch {epo:4d} / {epochs_simple} \tCost : {cost.item():.6f}\nHypo : {hypo}\n")

Epoch 1000 / 100000 	Cost : 0.002441
Hypo : tensor([[60.0120],
        [91.9398],
        [28.0842],
        [50.0346],
        [60.0120]], grad_fn=<AddmmBackward>)

Epoch 2000 / 100000 	Cost : 0.001559
Hypo : tensor([[60.0096],
        [91.9519],
        [28.0673],
        [50.0276],
        [60.0096]], grad_fn=<AddmmBackward>)

Epoch 3000 / 100000 	Cost : 0.000995
Hypo : tensor([[60.0077],
        [91.9616],
        [28.0538],
        [50.0221],
        [60.0077]], grad_fn=<AddmmBackward>)

Epoch 4000 / 100000 	Cost : 0.000635
Hypo : tensor([[60.0061],
        [91.9693],
        [28.0430],
        [50.0176],
        [60.0061]], grad_fn=<AddmmBackward>)

Epoch 5000 / 100000 	Cost : 0.000405
Hypo : tensor([[60.0049],
        [91.9755],
        [28.0343],
        [50.0141],
        [60.0049]], grad_fn=<AddmmBackward>)

Epoch 6000 / 100000 	Cost : 0.000259
Hypo : tensor([[60.0039],
        [91.9804],
        [28.0274],
        [50.0113],
        [60.0039]], grad_fn=<AddmmBackward>)

Epoc

In [93]:
list(model_simple.parameters())

[Parameter containing:
 tensor([[2.0000]], requires_grad=True),
 Parameter containing:
 tensor([3.8145e-06], requires_grad=True)]

### 05-03. Multivariable Linear Regression as a Class

In [94]:
x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

In [98]:
class MultivariableLinearRegressionModel(nn.Module) :
    def __init__(self) :
        super().__init__()
        self.linear = nn.Linear(3, 1)

    def forward(self, x) :
        return self.linear(x)

model_multiple = MultivariableLinearRegressionModel()

opti_multiple = torch.optim.SGD(model_multiple.parameters(), lr = 1e-5)

epochs_multiple = int (1e+5)

for epo in range( 1, epochs_multiple +1 ) :
    hypo = model_multiple(x_train)

    cost = F.mse_loss(hypo, y_train)

    opti_multiple.zero_grad()
    cost.backward()
    opti_multiple.step()

    if epo % 1000 == 0 : 
        print (f"Epoch {epo:4d} / {epochs_multiple} \tCost : {cost.item():.6f}\nHypo : {hypo}\n")

Epoch 1000 / 100000 	Cost : 3.945061
Hypo : tensor([[153.8632],
        [182.9433],
        [181.3112],
        [197.3412],
        [139.0836]], grad_fn=<AddmmBackward>)

Epoch 2000 / 100000 	Cost : 2.433433
Hypo : tensor([[153.2261],
        [183.3821],
        [181.1185],
        [197.1817],
        [139.6765]], grad_fn=<AddmmBackward>)

Epoch 3000 / 100000 	Cost : 1.551108
Hypo : tensor([[152.7421],
        [183.7158],
        [180.9728],
        [197.0567],
        [140.1311]], grad_fn=<AddmmBackward>)

Epoch 4000 / 100000 	Cost : 1.034909
Hypo : tensor([[152.3750],
        [183.9695],
        [180.8627],
        [196.9579],
        [140.4801]], grad_fn=<AddmmBackward>)

Epoch 5000 / 100000 	Cost : 0.731808
Hypo : tensor([[152.0968],
        [184.1620],
        [180.7798],
        [196.8793],
        [140.7486]], grad_fn=<AddmmBackward>)

Epoch 6000 / 100000 	Cost : 0.552784
Hypo : tensor([[151.8866],
        [184.3080],
        [180.7177],
        [196.8162],
        [140.9556]], 

## 06. Mini Batch & Data Load

### 06-01. Mini Batch & Batch Size

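- Mini batch: the full dataset is split into smaller units (batches), and training proceeds one batch at a time
- In each epoch, gradient descent runs once per mini batch
- This is called mini-batch gradient descent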

### 06-02. Iteration

- With 2000 total samples and a batch size of 200, there are 10 iterations per epoch
- Iterations per epoch = total data / batch size
- The parameters are updated 10 times in each epoch
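The arithmetic above can be sketched in a few lines. The function name is made up, and rounding up assumes that a final, smaller batch is still used when the data does not divide evenly.

```python
import math

def iterations_per_epoch(total_samples, batch_size):
    """Number of parameter updates in one epoch, counting a final partial batch."""
    return math.ceil(total_samples / batch_size)

print(iterations_per_epoch(2000, 200))   # 10
print(iterations_per_epoch(5, 3))        # 2
print(iterations_per_epoch(5, 2))        # 3
```

The last two calls correspond to the 5-sample examples later in these notes, whose outputs report batches as `1/2` (batch size 3) and `1/3` (batch size 2).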

### 06-03. Data Load

- `TensorDataset`: wraps x_data and y_data
- `DataLoader`: 
    - dataset: the `TensorDataset`
    - batch_size: usually a power of 2
    - shuffle: `True` reshuffles the dataset every epoch (recommended)

```python
DataLoader(
    dataset,
    batch_size=1,
    shuffle=False,
    sampler=None,
    batch_sampler=None,
    num_workers=0,
    collate_fn=None,
    pin_memory=False,
    drop_last=False,
    timeout=0,
    worker_init_fn=None,
    multiprocessing_context=None,
)
```
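A small sketch of how `batch_size`, `shuffle`, and `drop_last` from the signature above interact. The data is a made-up 5-sample tensor chosen only to make the batch sizes easy to see.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

x = torch.arange(10, dtype=torch.float32).reshape(5, 2)   # 5 samples, 2 features
y = torch.arange(5, dtype=torch.float32).reshape(5, 1)

dataset = TensorDataset(x, y)
loader = DataLoader(dataset, batch_size=3, shuffle=True)

# Each iteration yields one (x_batch, y_batch) pair; the last batch holds the remainder
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(len(loader), batch_sizes)      # 2 [3, 2]

# drop_last=True discards the final incomplete batch
loader_drop = DataLoader(dataset, batch_size=3, shuffle=True, drop_last=True)
print(len(loader_drop))              # 1
```

Shuffling changes which samples land in each batch, but not the number or size of the batches.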


In [5]:
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

In [6]:
DataLoader?

In [7]:
x_train  =  torch.FloatTensor([[73,  80,  75], 
                               [93,  88,  93], 
                               [89,  91,  90], 
                               [96,  98,  100],   
                               [73,  66,  70]])  
y_train  =  torch.FloatTensor([[152],  [185],  [180],  [196],  [142]])

In [8]:
dataset = TensorDataset(x_train, y_train)

In [9]:
dataloader = DataLoader(dataset, batch_size=3, shuffle=True)

In [10]:
model = nn.Linear(3,1)
optimi = torch.optim.SGD(model.parameters(), lr = 1e-5)

epochs = int (1e+4)
for epo in range( 1, epochs+1 ) :
    for batch_idx, samples in enumerate (dataloader) :
        # print (batch_idx); print (samples)
        
        x_train, y_train = samples

        pred = model(x_train)

        cost = F.mse_loss(pred, y_train)

        optimi.zero_grad()
        cost.backward()
        optimi.step()

        if epo % 1000 == 0 : 
            print (f"Epoch {epo:4d} / {epochs} \tBatch : {batch_idx+1}/{len(dataloader)}\tCost : {cost.item():.6f}\nPrediction : {pred}\n")




Epoch 1000 / 10000 	Batch : 1/2	Cost : 0.238667
Prediction : tensor([[196.2700],
        [151.1987],
        [141.9671]], grad_fn=<AddmmBackward>)

Epoch 1000 / 10000 	Batch : 2/2	Cost : 0.165466
Prediction : tensor([[184.8323],
        [180.5503]], grad_fn=<AddmmBackward>)

Epoch 2000 / 10000 	Batch : 1/2	Cost : 0.127606
Prediction : tensor([[184.7809],
        [180.5775],
        [141.9638]], grad_fn=<AddmmBackward>)

Epoch 2000 / 10000 	Batch : 2/2	Cost : 0.293860
Prediction : tensor([[151.2896],
        [196.2883]], grad_fn=<AddmmBackward>)

Epoch 3000 / 10000 	Batch : 1/2	Cost : 0.181046
Prediction : tensor([[184.7761],
        [151.4037],
        [196.3708]], grad_fn=<AddmmBackward>)

Epoch 3000 / 10000 	Batch : 2/2	Cost : 0.226318
Prediction : tensor([[180.6728],
        [141.9970]], grad_fn=<AddmmBackward>)

Epoch 4000 / 10000 	Batch : 1/2	Cost : 0.168842
Prediction : tensor([[151.3706],
        [184.6863],
        [141.8903]], grad_fn=<AddmmBackward>)

Epoch 4000 / 10000 	Batc

In [12]:
list (model.parameters())

[Parameter containing:
 tensor([[0.9886, 0.4920, 0.5281]], requires_grad=True),
 Parameter containing:
 tensor([0.2905], requires_grad=True)]

## 07. Custom Dataset

### 07-01. Custom Dataset

- `torch.utils.data.Dataset` : abstract class
- `torch.utils.data.DataLoader`

basic structure

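```python
import torch

class CustomDataset(torch.utils.data.Dataset) :
    def __init__(self) :
        # load and preprocess the data here
        pass

    def __len__(self) : 
        # return the number of samples in the dataset
        raise NotImplementedError

    def __getitem__(self, idx) : 
        # return the sample at index idx
        raise NotImplementedError
```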


### 07-02. Linear Regression with Custom dataset



In [17]:
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset) :

    def __init__(self) :
        self.x_data = [[73, 80, 75],
                    [93, 88, 93],
                    [89, 91, 90],
                    [96, 98, 100],
                    [73, 66, 70]]
        self.y_data = [[152], [185], [180], [196], [142]]

    def __len__(self) : 
        return len(self.x_data)

    def __getitem__(self, idx) : 
        x = torch.FloatTensor(self.x_data[idx])
        y = torch.FloatTensor(self.y_data[idx])
        return x, y

In [18]:
dataset = CustomDataset()
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

model = torch.nn.Linear(3, 1)
opti = torch.optim.SGD(model.parameters(), lr = 1e-5)

epochs = int (1e+5)
for epo in range(1, epochs+1) :
    for batch_idx, samples in enumerate(dataloader) :
        x_train, y_train = samples
        
        pred = model(x_train)
        cost = F.mse_loss(pred, y_train)

        opti.zero_grad()
        cost.backward()
        opti.step()

        if epo % 1000 == 0 : 
            print (f"Epoch {epo:4d} / {epochs} \tBatch : {batch_idx+1}/{len(dataloader)}\tCost : {cost.item():.6f}\nPrediction : {pred}\n")


AddmmBackward>)

Epoch 47000 / 100000 	Batch : 1/3	Cost : 0.000388
Prediction : tensor([[196.0008],
        [141.9721]], grad_fn=<AddmmBackward>)

Epoch 47000 / 100000 	Batch : 2/3	Cost : 0.244522
Prediction : tensor([[184.5369],
        [151.4760]], grad_fn=<AddmmBackward>)

Epoch 47000 / 100000 	Batch : 3/3	Cost : 0.611950
Prediction : tensor([[180.7823]], grad_fn=<AddmmBackward>)

Epoch 48000 / 100000 	Batch : 1/3	Cost : 0.087570
Prediction : tensor([[151.6842],
        [196.2746]], grad_fn=<AddmmBackward>)

Epoch 48000 / 100000 	Batch : 2/3	Cost : 0.037366
Prediction : tensor([[142.1613],
        [184.7793]], grad_fn=<AddmmBackward>)

Epoch 48000 / 100000 	Batch : 3/3	Cost : 0.677929
Prediction : tensor([[180.8234]], grad_fn=<AddmmBackward>)

Epoch 49000 / 100000 	Batch : 1/3	Cost : 0.339987
Prediction : tensor([[180.7320],
        [151.6204]], grad_fn=<AddmmBackward>)

Epoch 49000 / 100000 	Batch : 2/3	Cost : 0.076501
Prediction : tensor([[184.6102],
        [142.0331]], grad_fn=<

In [19]:
list (model.parameters())

[Parameter containing:
 tensor([[1.0287, 0.5120, 0.4683]], requires_grad=True),
 Parameter containing:
 tensor([0.4940], requires_grad=True)]

In [20]:
new = [[73, 80, 75]]
new_var =  torch.FloatTensor(new) 
print (f"Prediction for {new} : {model(new_var)}")

Prediction for [[73, 80, 75]] : tensor([[151.6714]], grad_fn=<AddmmBackward>)
