# PyTorch Study

## Week 2
---

## Table of Contents
- Slicing 1D Array
- Slicing 2D Array
- Loading Data from .csv file
- Imports
- Low-level Implementation
- High-level Implementation with nn.Module
- Dataset and DataLoader

### Slicing 1D Array

In [1]:
nums = [0, 1, 2, 3, 4]

In [2]:
print(nums)

[0, 1, 2, 3, 4]


In [4]:
print(nums[2:4])

[2, 3]


In [5]:
print(nums[2:])

[2, 3, 4]


In [6]:
print(nums[:2])

[0, 1]


In [7]:
print(nums[:])

[0, 1, 2, 3, 4]


In [8]:
nums[2:4] = [8, 9]

In [9]:
print(nums)

[0, 1, 8, 9, 4]


### Slicing 2D Array

In [10]:
import numpy as np

In [11]:
b = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [12]:
print(b)


[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [13]:
b[:, 1]

array([ 2,  6, 10])

In [14]:
b[-1]

array([ 9, 10, 11, 12])

In [15]:
b[-1, :]

array([ 9, 10, 11, 12])

In [16]:
b[-1, ...]

array([ 9, 10, 11, 12])

In [17]:
b[0:2, :]

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

### Loading Data from .csv file

In [18]:
import numpy as np

In [19]:
xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32)

In [20]:
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]

In [21]:
print(x_data.shape) # shape of x_data
print(len(x_data))  # number of rows in x_data
print(x_data[:5])   # first five rows

(25, 3)
25
[[ 73.  80.  75.]
 [ 93.  88.  93.]
 [ 89.  91.  90.]
 [ 96.  98. 100.]
 [ 73.  66.  70.]]


In [23]:
print(y_data.shape) # shape of y_data
print(len(y_data))  # number of rows in y_data
print(y_data[:5])   # first five rows

(25, 1)
25
[[152.]
 [185.]
 [180.]
 [196.]
 [142.]]
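
The two index forms used for `x_data` and `y_data` behave differently, which is easy to miss: indexing a column with a list (`[-1]`) keeps the column axis, while a bare integer (`-1`) drops it. A minimal sketch, using the first three rows printed above as a stand-in for the CSV file (the names `sample`, `features`, `target_2d`, and `target_1d` are only for illustration):

```python
import numpy as np

# First three rows of the data shown above (three scores, then the target).
sample = np.array([[73, 80, 75, 152],
                   [93, 88, 93, 185],
                   [89, 91, 90, 180]], dtype=np.float32)

features = sample[:, 0:-1]     # every column except the last -> shape (3, 3)
target_2d = sample[:, [-1]]    # list index keeps the axis    -> shape (3, 1)
target_1d = sample[:, -1]      # integer index drops the axis -> shape (3,)

print(features.shape, target_2d.shape, target_1d.shape)
```

The 2-D form matters later: the model output has shape `(N, 1)`, so a target of shape `(N,)` would broadcast against it instead of being compared element by element.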


### Imports

In [25]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [26]:
# For reproducibility
torch.manual_seed(1)

<torch._C.Generator at 0x1051699d0>

### Low-level Implementation

In [27]:
# Data
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)
# Initialize the model parameters
W = torch.zeros((3, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# Set up the optimizer
optimizer = optim.SGD([W, b], lr=1e-5)

nb_epochs = 20
for epoch in range(nb_epochs + 1):

    # Compute H(x)
    hypothesis = x_train.matmul(W) + b # or .mm or @

    # Compute the cost (mean squared error)
    cost = torch.mean((hypothesis - y_train) ** 2)

    # Improve H(x) using the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    # Log every epoch
    print('Epoch {:4d}/{} Cost: {:.6f}'.format(
        epoch, nb_epochs, cost.item()
    ))

Epoch    0/20 Cost: 26811.960938
Epoch    1/20 Cost: 9920.530273
Epoch    2/20 Cost: 3675.298340
Epoch    3/20 Cost: 1366.260742
Epoch    4/20 Cost: 512.542236
Epoch    5/20 Cost: 196.896500
Epoch    6/20 Cost: 80.190880
Epoch    7/20 Cost: 37.038647
Epoch    8/20 Cost: 21.081310
Epoch    9/20 Cost: 15.178737
Epoch   10/20 Cost: 12.993670
Epoch   11/20 Cost: 12.183015
Epoch   12/20 Cost: 11.880560
Epoch   13/20 Cost: 11.765955
Epoch   14/20 Cost: 11.720859
Epoch   15/20 Cost: 11.701429
Epoch   16/20 Cost: 11.691514
Epoch   17/20 Cost: 11.685092
Epoch   18/20 Cost: 11.679989
Epoch   19/20 Cost: 11.675388
Epoch   20/20 Cost: 11.670947


### High-level Implementation with nn.Module

In [28]:
class MultivariateLinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 1)

    def forward(self, x):
        return self.linear(x)

In [29]:
# Data
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)
# Initialize the model
model = MultivariateLinearRegressionModel()
# Set up the optimizer
optimizer = optim.SGD(model.parameters(), lr=1e-5)

nb_epochs = 20
for epoch in range(nb_epochs+1):

    # Compute H(x)
    prediction = model(x_train)

    # Compute the cost (mean squared error)
    cost = F.mse_loss(prediction, y_train)

    # Improve H(x) using the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    # Log every epoch
    print('Epoch {:4d}/{} Cost: {:.6f}'.format(
        epoch, nb_epochs, cost.item()
    ))

Epoch    0/20 Cost: 28693.490234
Epoch    1/20 Cost: 10618.750000
Epoch    2/20 Cost: 3936.015625
Epoch    3/20 Cost: 1465.219727
Epoch    4/20 Cost: 551.693726
Epoch    5/20 Cost: 213.934692
Epoch    6/20 Cost: 89.052223
Epoch    7/20 Cost: 42.875988
Epoch    8/20 Cost: 25.799623
Epoch    9/20 Cost: 19.482416
Epoch   10/20 Cost: 17.143108
Epoch   11/20 Cost: 16.274498
Epoch   12/20 Cost: 15.949720
Epoch   13/20 Cost: 15.825976
Epoch   14/20 Cost: 15.776569
Epoch   15/20 Cost: 15.754656
Epoch   16/20 Cost: 15.742903
Epoch   17/20 Cost: 15.734902
Epoch   18/20 Cost: 15.728335
Epoch   19/20 Cost: 15.722226
Epoch   20/20 Cost: 15.716338
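
The two runs start from different costs because the low-level version initializes `W` and `b` to zeros, whereas `nn.Linear` initializes its parameters randomly. Apart from that, the layer computes the same expression as the hand-written one. A small check, assuming two sample rows from the data above as input (the variable names `linear`, `x`, and `manual` are illustrative only):

```python
import torch
import torch.nn as nn

torch.manual_seed(1)

linear = nn.Linear(3, 1)

# Two of the feature rows shown earlier.
x = torch.tensor([[73., 80., 75.],
                  [93., 88., 93.]])

# nn.Linear stores its weight as (out_features, in_features) = (1, 3),
# so it must be transposed to play the role of the (3, 1) matrix W.
manual = x.matmul(linear.weight.t()) + linear.bias

print(linear.weight.shape, linear.bias.shape)
print(torch.allclose(linear(x), manual))
```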


### Dataset and DataLoader

When the dataset is too large, we cannot load all of x_data and y_data at once; we have to fetch only the batches we need.

[PyTorch Data Loading and Processing tutorial](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html#iterating-through-the-dataset)
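
As a rough sketch of that idea, a `Dataset` describes how to fetch one sample and a `DataLoader` groups samples into batches. The class name `ScoreDataset` and the batch size are assumptions made for illustration, and the five rows printed earlier stand in for the CSV file. This toy version still holds everything in memory; for data that does not fit, `__getitem__` would instead read each sample from disk on demand.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ScoreDataset(Dataset):
    """Hypothetical dataset wrapping feature and target rows."""

    def __init__(self, x, y):
        self.x = torch.as_tensor(x, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32)

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.x)

    def __getitem__(self, idx):
        # Return one (features, target) pair.
        return self.x[idx], self.y[idx]


# The first five rows printed earlier, used in place of the full CSV.
x_rows = [[73, 80, 75], [93, 88, 93], [89, 91, 90], [96, 98, 100], [73, 66, 70]]
y_rows = [[152], [185], [180], [196], [142]]

dataset = ScoreDataset(x_rows, y_rows)
loader = DataLoader(dataset, batch_size=2, shuffle=True)

batch_shapes = []
for x_batch, y_batch in loader:
    # Each iteration yields one mini-batch; the last one may be smaller.
    batch_shapes.append((tuple(x_batch.shape), tuple(y_batch.shape)))
    print(x_batch.shape, y_batch.shape)
```

Inside a training loop, the body of this `for` loop would hold the same forward pass, `zero_grad()`, `backward()`, and `step()` calls used above, applied to `x_batch` and `y_batch` instead of the full `x_train` and `y_train`.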