# 2. Linear Regression
## 2-1 Linear Regression
1. Data Definition
2. Hypothesis
3. Compute Loss
4. Gradient Descent

### 1. Data Definition
* $X_{train}$: features (the inputs)
* $Y_{train}$: labels (the target outputs)

### 2. Hypothesis

A hypothesis is an expression that models $y$ in terms of $x$.
* It describes the overall structure that maps the input to the output.

*example*

$y=Wx+b$

### 3. Loss

The terms cost function, loss function, and objective function are used interchangeably here.

Example: mean squared error (MSE)
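As a small worked check of MSE, the sketch below computes the cost by hand in plain Python, assuming the toy data $x = 1, 2, 3$, $y = 2, 4, 6$ and the starting parameters $W = 0$, $b = 0$. The helper name `mse` is an illustrative assumption, not a library function.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
W, b = 0.0, 0.0

predictions = [W * xi + b for xi in x]   # hypothesis y = Wx + b
cost = mse(predictions, y)               # (4 + 16 + 36) / 3
print(round(cost, 4))                    # 18.6667
```

This agrees with the `tensor(18.6667, ...)` value that the PyTorch cell in the implementation section prints for the same data.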

### 4. Gradient Descent

Gradient descent computes the gradient of the cost with respect to each parameter and subtracts it, scaled by the learning rate, from that parameter. Each update moves the parameters toward values with a lower cost, so repeating it yields a better model than before.
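To make the update concrete, here is a minimal sketch of one gradient descent step in plain Python, assuming the MSE cost, the hypothesis $y = Wx + b$, and the toy data $x = 1, 2, 3$, $y = 2, 4, 6$. The function name `gradient_step` and the hand-derived gradient formulas are illustrative; in PyTorch the gradients come from autograd instead.

```python
def gradient_step(W, b, x, y, lr):
    """One gradient descent update for cost = mean((W*x + b - y)**2)."""
    n = len(x)
    errors = [W * xi + b - yi for xi, yi in zip(x, y)]
    grad_W = 2.0 / n * sum(e * xi for e, xi in zip(errors, x))  # d(cost)/dW
    grad_b = 2.0 / n * sum(errors)                              # d(cost)/db
    return W - lr * grad_W, b - lr * grad_b

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

W, b = gradient_step(0.0, 0.0, x, y, lr=0.01)
print(round(W, 3), round(b, 3))  # 0.187 0.08
```

Starting from $W = 0$, $b = 0$ with a learning rate of 0.01, a single step gives $W \approx 0.187$ and $b = 0.08$, which matches the first line of the training log in the implementation section.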

### 5. Implementation

In [18]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [19]:
torch.manual_seed(42)

<torch._C.Generator at 0x1064d3e30>

In [20]:
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])

In [21]:
W = torch.zeros(1, requires_grad=True) # requires_grad: whether autograd should track this tensor
print(W)

tensor([0.], requires_grad=True)


In [22]:
b = torch.zeros(1, requires_grad=True)
print(b)

tensor([0.], requires_grad=True)


In [23]:
hypothesis = x_train * W + b
print(hypothesis)

tensor([[0.],
        [0.],
        [0.]], grad_fn=<AddBackward0>)


In [24]:
cost = torch.mean((hypothesis - y_train)**2)
print(cost)

tensor(18.6667, grad_fn=<MeanBackward0>)


In [25]:
optimizer = optim.SGD([W, b], lr=0.01)

In [26]:
optimizer.zero_grad() # reset the gradients to 0

cost.backward() # compute the gradient of the cost function

optimizer.step() # update the parameters

In [27]:
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])

W = torch.zeros(1, requires_grad=True) # requires_grad: whether autograd should track this tensor
b = torch.zeros(1, requires_grad=True)
optimizer = optim.SGD([W, b], lr=0.01)
num_epochs = 2000
for epoch in range(num_epochs+1):
  hypothesis = x_train * W + b
  cost = torch.mean((hypothesis - y_train) ** 2)

  optimizer.zero_grad()
  cost.backward()
  optimizer.step()

  if epoch % 100 == 0:
    print('Epoch {:4d}/{} W: {:.3f}, b: {:.3f} Cost: {:.6f}'.format(
        epoch, num_epochs, W.item(), b.item(), cost.item()))

Epoch    0/2000 W: 0.187, b: 0.080 Cost: 18.666666
Epoch  100/2000 W: 1.746, b: 0.578 Cost: 0.048171
Epoch  200/2000 W: 1.800, b: 0.454 Cost: 0.029767
Epoch  300/2000 W: 1.843, b: 0.357 Cost: 0.018394
Epoch  400/2000 W: 1.876, b: 0.281 Cost: 0.011366
Epoch  500/2000 W: 1.903, b: 0.221 Cost: 0.007024
Epoch  600/2000 W: 1.924, b: 0.174 Cost: 0.004340
Epoch  700/2000 W: 1.940, b: 0.136 Cost: 0.002682
Epoch  800/2000 W: 1.953, b: 0.107 Cost: 0.001657
Epoch  900/2000 W: 1.963, b: 0.084 Cost: 0.001024
Epoch 1000/2000 W: 1.971, b: 0.066 Cost: 0.000633
Epoch 1100/2000 W: 1.977, b: 0.052 Cost: 0.000391
Epoch 1200/2000 W: 1.982, b: 0.041 Cost: 0.000242
Epoch 1300/2000 W: 1.986, b: 0.032 Cost: 0.000149
Epoch 1400/2000 W: 1.989, b: 0.025 Cost: 0.000092
Epoch 1500/2000 W: 1.991, b: 0.020 Cost: 0.000057
Epoch 1600/2000 W: 1.993, b: 0.016 Cost: 0.000035
Epoch 1700/2000 W: 1.995, b: 0.012 Cost: 0.000022
Epoch 1800/2000 W: 1.996, b: 0.010 Cost: 0.000013
Epoch 1900/2000 W: 1.997, b: 0.008 Cost: 0.000008

> Why `optimizer.zero_grad()` is needed: PyTorch accumulates gradients across `backward()` calls, so the gradients must be reset to 0 before each new backward pass.
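The accumulation can be observed directly. The following is a minimal sketch, assuming a single scalar parameter `w` and the function $w^2$ (both chosen only for illustration): calling `backward()` twice without resetting adds the second gradient to the first.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# First backward pass: d(w**2)/dw = 2*w = 4
(w ** 2).backward()
first = w.grad.item()

# Second backward pass without resetting: the new gradient is added to the old one
(w ** 2).backward()
accumulated = w.grad.item()

# Reset the stored gradient in place, then recompute
w.grad.zero_()
(w ** 2).backward()
after_reset = w.grad.item()

print(first, accumulated, after_reset)  # 4.0 8.0 4.0
```

Inside a training loop, a stale gradient like the `8.0` above would make each update step larger than intended, which is why the loop resets the gradients on every iteration.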

## 2-2 Autograd

* Goal: understand the following concepts (`requires_grad`, `backward()`, etc.)
* Autograd makes gradient descent easy to run by automatically computing the gradient of a given function

In [28]:
import torch
w = torch.tensor(2.0, requires_grad= True)
print(w)

y = w**2
z = 2*y + 5

tensor(2., requires_grad=True)


In [29]:
print(w, y ,z)

tensor(2., requires_grad=True) tensor(4., grad_fn=<PowBackward0>) tensor(13., grad_fn=<AddBackward0>)


In [30]:
z.backward() # compute the gradient of z with respect to each parameter

In [31]:
print(w.grad) # holds the derivative of the expression containing w, taken with respect to w (the value computed by backward)

tensor(8.)
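The value can be checked by hand with the chain rule, using the definitions $y = w^2$ and $z = 2y + 5$ from the cells above:

$$
\frac{dz}{dw} = \frac{dz}{dy} \cdot \frac{dy}{dw} = 2 \cdot 2w = 4w, \qquad \left.\frac{dz}{dw}\right|_{w=2} = 8
$$

This matches the `tensor(8.)` stored in `w.grad`.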


## 2-3 Multivariable Linear Regression


In [32]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [33]:
torch.manual_seed(1)

<torch._C.Generator at 0x1064d3e30>

$H(x) = w_1x_1 + w_2x_2 + w_3x_3 + b$

In [34]:
x1_train = torch.FloatTensor([[73], [93], [89], [96], [73]])
x2_train = torch.FloatTensor([[80], [88], [91], [98], [66]])
x3_train = torch.FloatTensor([[75], [93], [90], [100], [70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

In [35]:
w1 = torch.zeros(1, requires_grad= True)
w2 = torch.zeros(1, requires_grad= True)
w3 = torch.zeros(1, requires_grad= True)
b = torch.zeros(1, requires_grad= True)

In [36]:
optimizer = optim.SGD([w1, w2, w3, b], lr=1e-5)
num_epochs = 2000
for epoch in range(num_epochs+1):
  hypothesis = x1_train*w1+x2_train*w2+x3_train*w3+b
  cost = torch.mean((hypothesis - y_train)**2)
  optimizer.zero_grad()
  cost.backward()
  optimizer.step()

  if epoch % 100 == 0:
    print('Epoch {:4d}/{} w1: {:.3f} w2: {:.3f} w3: {:.3f} b: {:.3f} Cost: {:.6f}'.format(
        epoch, num_epochs, w1.item(), w2.item(), w3.item(), b.item(), cost.item()))

Epoch    0/2000 w1: 0.294 w2: 0.294 w3: 0.297 b: 0.003 Cost: 29661.800781
Epoch  100/2000 w1: 0.674 w2: 0.661 w3: 0.676 b: 0.008 Cost: 1.563628
Epoch  200/2000 w1: 0.679 w2: 0.655 w3: 0.677 b: 0.008 Cost: 1.497595
Epoch  300/2000 w1: 0.684 w2: 0.649 w3: 0.677 b: 0.008 Cost: 1.435044
Epoch  400/2000 w1: 0.689 w2: 0.643 w3: 0.678 b: 0.008 Cost: 1.375726
Epoch  500/2000 w1: 0.694 w2: 0.638 w3: 0.678 b: 0.009 Cost: 1.319507
Epoch  600/2000 w1: 0.699 w2: 0.633 w3: 0.679 b: 0.009 Cost: 1.266222
Epoch  700/2000 w1: 0.704 w2: 0.627 w3: 0.679 b: 0.009 Cost: 1.215703
Epoch  800/2000 w1: 0.709 w2: 0.622 w3: 0.679 b: 0.009 Cost: 1.167810
Epoch  900/2000 w1: 0.713 w2: 0.617 w3: 0.680 b: 0.009 Cost: 1.122429
Epoch 1000/2000 w1: 0.718 w2: 0.613 w3: 0.680 b: 0.009 Cost: 1.079390
Epoch 1100/2000 w1: 0.722 w2: 0.608 w3: 0.680 b: 0.009 Cost: 1.038574
Epoch 1200/2000 w1: 0.727 w2: 0.603 w3: 0.681 b: 0.010 Cost: 0.999884
Epoch 1300/2000 w1: 0.731 w2: 0.599 w3: 0.681 b: 0.010 Cost: 0.963217
Epoch 1400/2000 

So far the hypothesis has been written out explicitly, with one multiplication per feature. From here on, matrix operations are used to write it much more concisely.
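In matrix form, the hypothesis $H(x) = w_1x_1 + w_2x_2 + w_3x_3 + b$ for all five samples becomes a single product. The notation below is a sketch under assumed symbols: $X$ is the $5 \times 3$ input matrix with entries $x_{ij}$ (sample $i$, feature $j$), $W$ is the $3 \times 1$ weight vector, and $b$ is broadcast to every row.

$$
H(X) = XW + b =
\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ \vdots & \vdots & \vdots \\ x_{51} & x_{52} & x_{53} \end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} + b
=
\begin{pmatrix} w_1x_{11} + w_2x_{12} + w_3x_{13} + b \\ \vdots \\ w_1x_{51} + w_2x_{52} + w_3x_{53} + b \end{pmatrix}
$$

Each row of the result is the hypothesis for one sample, so the output has shape $5 \times 1$, the same shape as `y_train`.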

In [37]:
x_train  =  torch.FloatTensor([[73,  80,  75], 
                               [93,  88,  93], 
                               [89,  91,  80], 
                               [96,  98,  100],   
                               [73,  66,  70]])  
y_train  =  torch.FloatTensor([[152],  [185],  [180],  [196],  [142]])

In [38]:
print(x_train.shape)
print(y_train.shape)

torch.Size([5, 3])
torch.Size([5, 1])


In [39]:
W = torch.zeros((3, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

In [40]:
hypothesis = x_train @ W + b

In [41]:
optimizer = optim.SGD([W, b], lr=1e-5)
num_epochs = 20
for epoch in range(num_epochs+1):
  hypothesis = x_train @ W + b
  cost = torch.mean((hypothesis - y_train) ** 2)
  optimizer.zero_grad()
  cost.backward()
  optimizer.step()
  print('Epoch {:4d}/{} hypothesis: {} Cost: {:.6f}'.format(
        epoch, num_epochs, hypothesis.squeeze().detach(), cost.item()))

Epoch    0/20 hypothesis: tensor([0., 0., 0., 0., 0.]) Cost: 29661.800781
Epoch    1/20 hypothesis: tensor([66.7178, 80.1701, 76.1025, 86.0194, 61.1565]) Cost: 9537.694336
Epoch    2/20 hypothesis: tensor([104.5421, 125.6208, 119.2478, 134.7861,  95.8280]) Cost: 3069.590332
Epoch    3/20 hypothesis: tensor([125.9858, 151.3882, 143.7087, 162.4333, 115.4844]) Cost: 990.670715
Epoch    4/20 hypothesis: tensor([138.1429, 165.9963, 157.5768, 178.1071, 126.6283]) Cost: 322.481903
Epoch    5/20 hypothesis: tensor([145.0350, 174.2780, 165.4395, 186.9928, 132.9461]) Cost: 107.717003
Epoch    6/20 hypothesis: tensor([148.9423, 178.9731, 169.8976, 192.0301, 136.5279]) Cost: 38.687401
Epoch    7/20 hypothesis: tensor([151.1574, 181.6347, 172.4254, 194.8856, 138.5585]) Cost: 16.499033
Epoch    8/20 hypothesis: tensor([152.4131, 183.1435, 173.8590, 196.5043, 139.7097]) Cost: 9.365660
Epoch    9/20 hypothesis: tensor([153.1250, 183.9988, 174.6723, 197.4217, 140.3625]) Cost: 7.071114
Epoch   10/20 hyp

## 2-4 Linear Regression using `nn.Module`

In [42]:
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(42)

<torch._C.Generator at 0x1064d3e30>

In [43]:
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])

In [44]:
print(x_train.shape)
print(y_train.shape)

torch.Size([3, 1])
torch.Size([3, 1])


In [45]:
model = nn.Linear(1, 1) #input_dim:1, output_dim:1

In [46]:
print(list(model.parameters()))

[Parameter containing:
tensor([[0.7645]], requires_grad=True), Parameter containing:
tensor([0.8300], requires_grad=True)]


In [47]:
optimizer = torch.optim.SGD(model.parameters(), lr= 0.01)
num_epochs = 2000
for epoch in range(num_epochs+1):
  prediction = model(x_train)
  cost = F.mse_loss(prediction, y_train)
  
  optimizer.zero_grad()
  cost.backward()
  optimizer.step()
  if epoch % 100 == 0:
    # log every 100 epochs
    print('Epoch {:4d}/{} Cost: {:.6f}'.format(
        epoch, num_epochs, cost.item()))

Epoch    0/2000 Cost: 3.710179
Epoch  100/2000 Cost: 0.117398
Epoch  200/2000 Cost: 0.072545
Epoch  300/2000 Cost: 0.044828
Epoch  400/2000 Cost: 0.027701
Epoch  500/2000 Cost: 0.017118
Epoch  600/2000 Cost: 0.010578
Epoch  700/2000 Cost: 0.006536
Epoch  800/2000 Cost: 0.004039
Epoch  900/2000 Cost: 0.002496
Epoch 1000/2000 Cost: 0.001542
Epoch 1100/2000 Cost: 0.000953
Epoch 1200/2000 Cost: 0.000589
Epoch 1300/2000 Cost: 0.000364
Epoch 1400/2000 Cost: 0.000225
Epoch 1500/2000 Cost: 0.000139
Epoch 1600/2000 Cost: 0.000086
Epoch 1700/2000 Cost: 0.000053
Epoch 1800/2000 Cost: 0.000033
Epoch 1900/2000 Cost: 0.000020
Epoch 2000/2000 Cost: 0.000013


In [48]:
new_var = torch.FloatTensor([[4.0]])

pred_y = model(new_var)

print(pred_y)

tensor([[7.9929]], grad_fn=<AddmmBackward0>)


In [49]:
print(list(model.parameters()))

[Parameter containing:
tensor([[1.9959]], requires_grad=True), Parameter containing:
tensor([0.0093], requires_grad=True)]


Multivariable linear regression can also be implemented with `nn.Module`.

In [50]:
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(42)

<torch._C.Generator at 0x1064d3e30>

In [51]:
x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

print(x_train.shape)
print(y_train.shape)

torch.Size([5, 3])
torch.Size([5, 1])


In [52]:
model = nn.Linear(3, 1)
print(list(model.parameters())) # 3 weights (W) and 1 bias (b)

[Parameter containing:
tensor([[ 0.4414,  0.4792, -0.1353]], requires_grad=True), Parameter containing:
tensor([0.5304], requires_grad=True)]


In [53]:
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)

num_epochs = 2000
for epoch in range(num_epochs+1):
  pred = model(x_train)
  cost = F.mse_loss(pred, y_train)

  optimizer.zero_grad()
  cost.backward()
  optimizer.step()

  if epoch % 100 == 0:
    # log every 100 epochs
    print('Epoch {:4d}/{} Cost: {:.6f}'.format(
        epoch, num_epochs, cost.item()))

Epoch    0/2000 Cost: 10995.318359
Epoch  100/2000 Cost: 2.533235
Epoch  200/2000 Cost: 2.409113
Epoch  300/2000 Cost: 2.291542
Epoch  400/2000 Cost: 2.180169
Epoch  500/2000 Cost: 2.074663
Epoch  600/2000 Cost: 1.974701
Epoch  700/2000 Cost: 1.880022
Epoch  800/2000 Cost: 1.790340
Epoch  900/2000 Cost: 1.705370
Epoch 1000/2000 Cost: 1.624879
Epoch 1100/2000 Cost: 1.548624
Epoch 1200/2000 Cost: 1.476396
Epoch 1300/2000 Cost: 1.407958
Epoch 1400/2000 Cost: 1.343144
Epoch 1500/2000 Cost: 1.281701
Epoch 1600/2000 Cost: 1.223541
Epoch 1700/2000 Cost: 1.168417
Epoch 1800/2000 Cost: 1.116197
Epoch 1900/2000 Cost: 1.066716
Epoch 2000/2000 Cost: 1.019860


In [54]:
new_var = torch.FloatTensor([[73, 80, 75]])
pred_y = model(new_var)
print(pred_y)

tensor([[153.0205]], grad_fn=<AddmmBackward0>)


In [55]:
print(list(model.parameters()))

[Parameter containing:
tensor([[0.9541, 0.7449, 0.3099]], requires_grad=True), Parameter containing:
tensor([0.5356], requires_grad=True)]


## 2-5 Implementing a PyTorch Model Using a Class

In [56]:
import torch
import torch.nn as nn
# using a built-in nn.Module (nn.Linear) directly
model = nn.Linear(1, 1)

In [57]:
class LinearRegressionModel(nn.Module):
  def __init__(self):
    super().__init__()
    self.linear = nn.Linear(1, 1)

  def forward(self, x):
    return self.linear(x)

In [58]:
model_lrm = LinearRegressionModel()

In [59]:
class MultivariateLinearRegressionModel(nn.Module):
  def __init__(self):
    super().__init__()
    self.linear = nn.Linear(3, 1)

  def forward(self, x):
    return self.linear(x)

In [60]:
model_mlrm = MultivariateLinearRegressionModel()
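A class-based model can be trained with the same loop as before, because calling the model invokes `forward`. The following is a minimal sketch, assuming the five-sample dataset used elsewhere in this section, a learning rate of `1e-5`, and 100 epochs (the epoch count is an arbitrary choice for illustration).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultivariateLinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 1)  # 3 input features, 1 output

    def forward(self, x):
        return self.linear(x)

torch.manual_seed(42)
x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

model = MultivariateLinearRegressionModel()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)

# Cost before any update, kept for comparison
initial_cost = F.mse_loss(model(x_train), y_train).item()

for epoch in range(100):
    cost = F.mse_loss(model(x_train), y_train)  # model(x) calls forward(x)
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

final_cost = F.mse_loss(model(x_train), y_train).item()
print(final_cost < initial_cost)
```

Wrapping `nn.Linear` in a class adds nothing for a single layer, but the same pattern extends to models with several layers, where `forward` defines how they are chained.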

## 2-6 Mini Batch and Data Load

* Learn how to load data
* Minibatch Gradient Descent

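A minibatch is one of the chunks obtained when the training data is split into smaller groups for training.

When every minibatch has been used once, so that the whole dataset has been trained on one time, 1 epoch is complete.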

### Batch size, Iteration, and epoch

* Batch size: the number of samples used in one training step, i.e. the size of a minibatch
* Epoch: the number of complete passes over the entire training data
* Iteration: the number of batches (parameter updates) needed to finish 1 epoch

![batch, iteration, and epoch](https://wikidocs.net/images/page/36033/batchandepochiteration.PNG)
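The relationship between these three quantities can be sketched in plain Python. The example assumes 5 samples and a batch size of 2, and keeps the last incomplete batch, which is also what `DataLoader` does by default (`drop_last=False`). The variable names are illustrative.

```python
import math
import random

num_samples = 5   # assumed dataset size
batch_size = 2

# Number of iterations (parameter updates) needed to finish one epoch
iterations_per_epoch = math.ceil(num_samples / batch_size)

# Shuffle the sample indices and split them into minibatches
random.seed(0)  # arbitrary seed so the shuffle is repeatable
indices = list(range(num_samples))
random.shuffle(indices)
batches = [indices[i:i + batch_size] for i in range(0, num_samples, batch_size)]

print(iterations_per_epoch)        # 3
print([len(b) for b in batches])   # [2, 2, 1]
```

With 5 samples and a batch size of 2, one epoch therefore takes 3 iterations, and the last minibatch holds a single sample.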

### Data Load

PyTorch provides `Dataset` and `DataLoader`, which support minibatch training, shuffling, and parallel data loading.

In [61]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [62]:
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

In [63]:
x_train  =  torch.FloatTensor([[73,  80,  75], 
                               [93,  88,  93], 
                               [89,  91,  90], 
                               [96,  98,  100],   
                               [73,  66,  70]])  
y_train  =  torch.FloatTensor([[152],  [185],  [180],  [196],  [142]])

In [64]:
dataset = TensorDataset(x_train, y_train)

In [65]:
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

In [66]:
model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)

In [67]:
num_epochs = 20
for epoch in range(num_epochs+1):
  for batch_idx, samples in enumerate(dataloader):
    print("batch_idx: {}, samples: {}".format(batch_idx, samples))
    x_train, y_train = samples
    pred = model(x_train)
    cost = F.mse_loss(pred, y_train)
    
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    
    print('Epoch {:4d}/{} Batch {}/{} Cost: {:.6f}'.format(
      epoch, num_epochs, batch_idx+1, len(dataloader),
      cost.item()
      ))

batch_idx: 0, samples: [tensor([[73., 66., 70.],
        [89., 91., 90.]]), tensor([[142.],
        [180.]])]
Epoch    0/20 Batch 1/3 Cost: 9767.699219
batch_idx: 1, samples: [tensor([[ 96.,  98., 100.],
        [ 93.,  88.,  93.]]), tensor([[196.],
        [185.]])]
Epoch    0/20 Batch 2/3 Cost: 5038.088379
batch_idx: 2, samples: [tensor([[73., 80., 75.]]), tensor([[152.]])]
Epoch    0/20 Batch 3/3 Cost: 723.783325
batch_idx: 0, samples: [tensor([[ 96.,  98., 100.],
        [ 73.,  66.,  70.]]), tensor([[196.],
        [142.]])]
Epoch    1/20 Batch 1/3 Cost: 361.084839
batch_idx: 1, samples: [tensor([[93., 88., 93.],
        [73., 80., 75.]]), tensor([[185.],
        [152.]])]
Epoch    1/20 Batch 2/3 Cost: 125.787079
batch_idx: 2, samples: [tensor([[89., 91., 90.]]), tensor([[180.]])]
Epoch    1/20 Batch 3/3 Cost: 34.142281
batch_idx: 0, samples: [tensor([[ 96.,  98., 100.],
        [ 73.,  66.,  70.]]), tensor([[196.],
        [142.]])]
Epoch    2/20 Batch 1/3 Cost: 11.149569
batch_i

In [68]:
new_var = torch.FloatTensor([[73, 80, 75]])

pred_y = model(new_var)
print(pred_y)

tensor([[151.6379]], grad_fn=<AddmmBackward0>)


## 2-7 Custom Dataset

* By inheriting from `torch.utils.data.Dataset`, you can define your own custom dataset

```python
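import torch

class CustomDataset(torch.utils.data.Dataset):
  def __init__(self):
    # preprocess the dataset here
    pass

  def __len__(self):
    # return the length of the dataset, i.e. the total number of samples
    raise NotImplementedError

  def __getitem__(self, idx):
    # return the single sample at index idx
    raise NotImplementedError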
```

In [69]:
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

In [70]:
class CustomDataset(Dataset):
  def __init__(self):
    self.x_data = [[73, 80, 75],
                   [93, 88, 93],
                   [89, 91, 90],
                   [96, 98, 100],
                   [73, 66, 70]]
    self.y_data = [[152], [185], [180], [196], [142]]
  def __len__(self):
    return len(self.x_data)
  def __getitem__(self, idx):
    x = torch.FloatTensor(self.x_data[idx])
    y = torch.FloatTensor(self.y_data[idx])
    return x, y

In [71]:
dataset = CustomDataset()
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

In [72]:
model = torch.nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(),lr=1e-5)

In [73]:
num_epochs = 20
for epoch in range(num_epochs+1):
  for batch_idx, samples in enumerate(dataloader):
    x_train, y_train = samples
    pred = model(x_train)

    cost = F.mse_loss(pred, y_train)

    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    print('Epoch {:4d}/{} Batch {}/{} Cost: {:.6f}'.format(
        epoch, num_epochs, batch_idx+1, len(dataloader),
        cost.item()
        ))

Epoch    0/20 Batch 1/3 Cost: 39138.378906
Epoch    0/20 Batch 2/3 Cost: 10714.405273
Epoch    0/20 Batch 3/3 Cost: 4158.664551
Epoch    1/20 Batch 1/3 Cost: 1113.740479
Epoch    1/20 Batch 2/3 Cost: 341.212341
Epoch    1/20 Batch 3/3 Cost: 91.544449
Epoch    2/20 Batch 1/3 Cost: 41.474979
Epoch    2/20 Batch 2/3 Cost: 26.065870
Epoch    2/20 Batch 3/3 Cost: 0.137971
Epoch    3/20 Batch 1/3 Cost: 12.251788
Epoch    3/20 Batch 2/3 Cost: 6.731195
Epoch    3/20 Batch 3/3 Cost: 1.802672
Epoch    4/20 Batch 1/3 Cost: 1.375394
Epoch    4/20 Batch 2/3 Cost: 7.725240
Epoch    4/20 Batch 3/3 Cost: 16.293694
Epoch    5/20 Batch 1/3 Cost: 9.256533
Epoch    5/20 Batch 2/3 Cost: 4.672593
Epoch    5/20 Batch 3/3 Cost: 13.019217
Epoch    6/20 Batch 1/3 Cost: 3.796662
Epoch    6/20 Batch 2/3 Cost: 15.531879
Epoch    6/20 Batch 3/3 Cost: 4.352408
Epoch    7/20 Batch 1/3 Cost: 5.348221
Epoch    7/20 Batch 2/3 Cost: 3.886833
Epoch    7/20 Batch 3/3 Cost: 19.480984
Epoch    8/20 Batch 1/3 Cost: 4.417746
E