# Lab 4-1: Multivariate Linear Regression

## Theoretical Overview

$$ H(x_1, x_2, x_3) = x_1w_1 + x_2w_2 + x_3w_3 + b $$

$$ cost(W, b) = \frac{1}{m} \sum^m_{i=1} \left( H(x^{(i)}) - y^{(i)} \right)^2 $$

 - $H(x)$: how the model predicts an output for a given input $x$
 - $cost(W, b)$: how well $H(x)$ predicts $y$ (lower is better)
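
As a quick check of the two formulas, the sketch below evaluates them in plain Python on the five samples used later in this lab. The helper names `hypothesis` and `cost` are chosen here for illustration and are not part of the lab code.

```python
# Quiz scores (x1, x2, x3) and final-exam scores used later in this lab.
X = [[73, 80, 75], [93, 88, 93], [89, 91, 90], [96, 98, 100], [73, 66, 70]]
y = [152, 185, 180, 196, 142]

def hypothesis(x, w, b):
    """H(x1, x2, x3) = x1*w1 + x2*w2 + x3*w3 + b for one sample."""
    return sum(x_j * w_j for x_j, w_j in zip(x, w)) + b

def cost(X, y, w, b):
    """Mean squared error between predictions and targets."""
    m = len(y)
    return sum((hypothesis(x_i, w, b) - y_i) ** 2 for x_i, y_i in zip(X, y)) / m

# With every parameter at zero, each prediction is 0 and the cost is mean(y^2).
initial_cost = cost(X, y, w=[0.0, 0.0, 0.0], b=0.0)
print(initial_cost)  # 29661.8
```

This is the same value the training logs report as the cost at epoch 0, before any update has been applied.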

## Imports

In [None]:
import torch
import torch.optim as optim

In [None]:
# For reproducibility
torch.manual_seed(1)

<torch._C.Generator at 0x7f01a0305950>

## Naive Data Representation

We will use fake data for this example.

In [None]:
# Data: predict the final exam score from a student's quiz 1, quiz 2, and quiz 3 scores
x1_train = torch.FloatTensor([[73], [93], [89], [96], [73]])
x2_train = torch.FloatTensor([[80], [88], [91], [98], [66]])
x3_train = torch.FloatTensor([[75], [93], [90], [100], [70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

In [None]:
# Initialize the model parameters
w1 = torch.zeros(1, requires_grad=True) # initialized to 0; requires_grad=True so it is learned during training
w2 = torch.zeros(1, requires_grad=True)
w3 = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

# Set up the optimizer
optimizer = optim.SGD([w1, w2, w3, b], lr=1e-5)
# lr = 1e-5 (0.00001): each update to w is very small
# lr = 1e-3 (0.001): the learning rate is too large, so training diverges and the values become nan
# lr = 1e-4 (0.0001) also diverges, so lr = 1e-5 is the value to use here

nb_epochs = 10000
for epoch in range(nb_epochs + 1):
    
    # Compute H(x)
    hypothesis = x1_train * w1 + x2_train * w2 + x3_train * w3 + b

    # Compute the cost
    cost = torch.mean((hypothesis - y_train) ** 2)

    # Improve H(x) using the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    # Print a log line every 100 epochs
    if epoch % 100 == 0:
        print('Epoch {:4d}/{} w1: {:.3f} w2: {:.3f} w3: {:.3f} b: {:.3f} Cost: {:.6f}'.format(
            epoch, nb_epochs, w1.item(), w2.item(), w3.item(), b.item(), cost.item()
        ))

Output of a run with `lr=1e-3` and `nb_epochs = 1000`, which diverges to `nan` within the first 100 epochs:

Epoch    0/1000 w1: 29.401 w2: 29.360 w3: 29.738 b: 0.342 Cost: 29661.800781
Epoch  100/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  200/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  300/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  400/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  500/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  600/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  700/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  800/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch  900/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
Epoch 1000/1000 w1: nan w2: nan w3: nan b: nan Cost: nan
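
The learning-rate notes in the optimizer comments can be checked without PyTorch. The following is a minimal sketch in plain Python, assuming hand-written gradients of the mean-squared-error cost. The names `mse` and `run_gd` are illustrative and not part of the lab. It runs 20 gradient-descent steps from zero weights for each learning rate and reports whether the cost ended below or above its starting value.

```python
# Quiz scores (x1, x2, x3) and final-exam scores from this lab.
X = [[73, 80, 75], [93, 88, 93], [89, 91, 90], [96, 98, 100], [73, 66, 70]]
y = [152, 185, 180, 196, 142]

def residuals(w, b):
    """Prediction error H(x) - y for every sample."""
    return [sum(wj * xj for wj, xj in zip(w, row)) + b - target
            for row, target in zip(X, y)]

def mse(w, b):
    errors = residuals(w, b)
    return sum(e * e for e in errors) / len(errors)

def run_gd(lr, steps=20):
    """Plain gradient descent from zero weights; returns the final cost."""
    w, b = [0.0, 0.0, 0.0], 0.0
    m = len(y)
    for _ in range(steps):
        errors = residuals(w, b)
        # d(cost)/d(w_j) = (2/m) * sum(error_i * x_ij); d(cost)/d(b) = (2/m) * sum(error_i)
        grad_w = [2.0 / m * sum(e * row[j] for e, row in zip(errors, X))
                  for j in range(3)]
        grad_b = 2.0 / m * sum(errors)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return mse(w, b)

initial_cost = mse([0.0, 0.0, 0.0], 0.0)
final_costs = {lr: run_gd(lr) for lr in (1e-3, 1e-4, 1e-5)}
for lr, final in final_costs.items():
    status = "shrinks" if final < initial_cost else "grows"
    print(f"lr={lr:g}: cost {status} ({initial_cost:.1f} -> {final:.3g})")
```

Only `lr=1e-5` should reduce the cost here. The quiz scores are large (around 70 to 100), so the gradients are large too, and a bigger step overshoots the minimum on every update.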


## Matrix Data Representation

$$
\begin{pmatrix}
x_1 & x_2 & x_3
\end{pmatrix}
\cdot
\begin{pmatrix}
w_1 \\
w_2 \\
w_3 \\
\end{pmatrix}
=
\begin{pmatrix}
x_1w_1 + x_2w_2 + x_3w_3
\end{pmatrix}
$$

$$ H(X) = XW $$
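
To make the equivalence concrete, the sketch below compares the column-by-column form with the matrix product on the lab's quiz scores. The weight values in `W` are hypothetical, picked only so the comparison is not trivially zero.

```python
import torch

x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])

# Hypothetical weights chosen only to make the comparison non-trivial.
W = torch.FloatTensor([[0.1], [0.2], [0.3]])

# Column-by-column form, as in the naive representation.
naive = (x_train[:, 0:1] * W[0]
         + x_train[:, 1:2] * W[1]
         + x_train[:, 2:3] * W[2])

# Three equivalent ways to write the matrix product XW.
via_matmul = x_train.matmul(W)
via_mm = x_train.mm(W)
via_operator = x_train @ W

print(via_matmul.shape)                    # one prediction per sample
print(torch.allclose(naive, via_matmul))   # same values as the naive form
```

The matrix form keeps the code unchanged when the number of features grows. Only the shapes of `x_train` and `W` change.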

In [None]:
x_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
y_train = torch.FloatTensor([[152], [185], [180], [196], [142]])

In [None]:
print(x_train.shape)
print(y_train.shape)

torch.Size([5, 3])
torch.Size([5, 1])


In [None]:
# Initialize the model parameters

W = torch.zeros((3, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# Set up the optimizer
optimizer = optim.SGD([W, b], lr=1e-5)

nb_epochs = 20
for epoch in range(nb_epochs + 1):
    
    # Compute H(x)
    # Matrix operation
    hypothesis = x_train.matmul(W) + b # or .mm or @

    # Compute the cost
    cost = torch.mean((hypothesis - y_train) ** 2)

    # Improve H(x) using the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    # Print a log line every epoch
    print('Epoch {:4d}/{} hypothesis: {} Cost: {:.6f}'.format(
        epoch, nb_epochs, hypothesis.squeeze().detach(), cost.item()
    ))

Epoch    0/20 hypothesis: tensor([0., 0., 0., 0., 0.]) Cost: 29661.800781
Epoch    1/20 hypothesis: tensor([67.2578, 80.8397, 79.6523, 86.7394, 61.6605]) Cost: 9298.520508
Epoch    2/20 hypothesis: tensor([104.9128, 126.0990, 124.2466, 135.3015,  96.1821]) Cost: 2915.712402
Epoch    3/20 hypothesis: tensor([125.9942, 151.4381, 149.2133, 162.4896, 115.5097]) Cost: 915.040527
Epoch    4/20 hypothesis: tensor([137.7967, 165.6247, 163.1911, 177.7112, 126.3307]) Cost: 287.936096
Epoch    5/20 hypothesis: tensor([144.4044, 173.5674, 171.0168, 186.2332, 132.3891]) Cost: 91.371063
Epoch    6/20 hypothesis: tensor([148.1035, 178.0143, 175.3980, 191.0042, 135.7812]) Cost: 29.758249
Epoch    7/20 hypothesis: tensor([150.1744, 180.5042, 177.8509, 193.6753, 137.6805]) Cost: 10.445267
Epoch    8/20 hypothesis: tensor([151.3336, 181.8983, 179.2240, 195.1707, 138.7440]) Cost: 4.391237
Epoch    9/20 hypothesis: tensor([151.9824, 182.6789, 179.9928, 196.0079, 139.3396]) Cost: 2.493121
Epoch   10/20 hypo
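
Each pass through the training loop applies one gradient-descent step. For the mean-squared-error cost defined earlier, the gradients that `cost.backward()` computes can be written in matrix form. Here $Y$ (the column of targets `y_train`) and $\alpha$ (the learning rate `lr`) are symbols introduced for this sketch, and $b$ is added to every row of $XW$:

$$ \nabla_W cost = \frac{2}{m} X^\top \left( XW + b - Y \right) $$

$$ \frac{\partial cost}{\partial b} = \frac{2}{m} \sum^m_{i=1} \left( H(x^{(i)}) - y^{(i)} \right) $$

$$ W \leftarrow W - \alpha \nabla_W cost, \qquad b \leftarrow b - \alpha \frac{\partial cost}{\partial b} $$

`optimizer.step()` performs the last line, and `optimizer.zero_grad()` clears the stored gradients so that they do not accumulate across epochs.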

In [None]:
print(W)
print(b)

tensor([[0.6691],
        [0.6659],
        [0.6758]], requires_grad=True)
tensor([0.0078], requires_grad=True)
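
As a sanity check on the learned parameters, the sketch below plugs the printed values of `W` and `b` (rounded to four decimals) into the hypothesis for the first training row. The helper name `predict` is illustrative only.

```python
# Values printed for W and b above, rounded to four decimals.
W = [0.6691, 0.6659, 0.6758]
b = 0.0078

def predict(scores):
    """H(x) = x1*w1 + x2*w2 + x3*w3 + b for one student's quiz scores."""
    return sum(s * w for s, w in zip(scores, W)) + b

first_student = [73, 80, 75]  # first training row; actual final score is 152
prediction = predict(first_student)
print(round(prediction, 2))
```

The prediction comes out at about 152.8, close to the actual score of 152 after only 20 epochs.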
