# Lab-05 Logistic Regression

- Reminder
- Computing Hypothesis
- Computing Cost Function
- Evaluation
- Higher-Level Implementation

Binary Classification Problem

**Hypothesis** $H(X) = \frac{1}{1+e^{-W^{T}X}}$

**Cost** $cost(W) = -\frac{1}{m} \sum \left[ y \log(H(x)) + (1-y) \log(1-H(x)) \right]$

- if $H(x) \approx y$, the cost is near 0.
- if $H(x)$ is far from $y$, the cost is high.
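
To make the two cases above concrete, here is a minimal sketch that evaluates the per-sample loss for a few hand-picked predictions. The probabilities are illustrative values chosen for this example, not outputs from this lab.

```python
import torch

# Illustrative (made-up) predictions for samples whose true label is y = 1
y = torch.tensor([1.0, 1.0, 1.0])
h = torch.tensor([0.99, 0.50, 0.01])   # confident and right, unsure, confident and wrong

# Per-sample binary cross entropy: -(y*log(h) + (1-y)*log(1-h))
losses = -(y * torch.log(h) + (1 - y) * torch.log(1 - h))
print(losses)
```

The loss grows from about 0.01 for the confident correct prediction to about 4.6 for the confident wrong one.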

$H(x) = P(y=1 \mid x; W) = 1 - P(y=0 \mid x; W)$

## Weight Update via Gradient Descent

$W := W - \alpha \frac{\partial}{\partial W} cost(W) (= W - \alpha \nabla_W cost(W))$
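
The update rule can be sketched by hand before handing it over to an optimizer. The snippet below is a minimal illustration, assuming the same toy data used later in this lab and an arbitrary learning rate `alpha = 0.1`; the helper `cost_fn` is a hypothetical name introduced only for this example.

```python
import torch

# Same toy data as used later in this lab
x = torch.tensor([[1., 2.], [2., 3.], [3., 1.], [4., 3.], [5., 3.], [6., 2.]])
y = torch.tensor([[0.], [0.], [0.], [1.], [1.], [1.]])

W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
alpha = 0.1  # learning rate; an arbitrary value chosen for this sketch

def cost_fn(W, b):
    # Hypothetical helper: mean binary cross entropy of the sigmoid hypothesis
    h = torch.sigmoid(x.matmul(W) + b)
    return -(y * torch.log(h) + (1 - y) * torch.log(1 - h)).mean()

cost_before = cost_fn(W, b)
cost_before.backward()        # fills W.grad and b.grad

with torch.no_grad():         # the update itself should not be tracked by autograd
    W -= alpha * W.grad       # W := W - alpha * d cost / dW
    b -= alpha * b.grad

cost_after = cost_fn(W, b)
print(cost_before.item(), cost_after.item())
```

With this small step size, a single update already lowers the cost on the toy data.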

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [3]:
# For reproducibility: fix the seed so that every run reproduces the same results
torch.manual_seed(1)

<torch._C.Generator at 0x7fd7a01e9f10>

In [8]:
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]  # |x_data| = (6, 2)
y_data = [[0], [0], [0], [1], [1], [1]]                    # |y_data| = (6, 1)

In [6]:
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)

In [7]:
print(x_train.shape)
print(y_train.shape)

torch.Size([6, 2])
torch.Size([6, 1])


In [9]:
print("e^1 equals: ", torch.exp(torch.FloatTensor([1])))

e^1 equals:  tensor([2.7183])


In [10]:
W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

In [11]:
hypothesis = 1 / (1 + torch.exp(-(x_train.matmul(W) + b)))

In [12]:
print(hypothesis)
print(hypothesis.shape)

tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000]], grad_fn=<MulBackward0>)
torch.Size([6, 1])


## Computing the Hypothesis

Alternatively, we can use the `torch.sigmoid()` function, which computes the same sigmoid:

In [13]:
print("1/(1+e^{-1}) equals: ", torch.sigmoid(torch.FloatTensor([1])))

1/(1+e^{-1}) equals:  tensor([0.7311])


In [24]:
# Now, the code for hypothesis function is cleaner
hypothesis = torch.sigmoid(x_train.matmul(W) + b)

In [15]:
print(hypothesis)
print(hypothesis.shape)

tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000]], grad_fn=<SigmoidBackward0>)
torch.Size([6, 1])
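
As a quick sanity check, the hand-written formula and `torch.sigmoid()` can be compared directly. This is a small sketch on arbitrary test inputs, not part of the original lab.

```python
import torch

z = torch.tensor([-3.0, -1.0, 0.0, 1.0, 3.0])   # arbitrary test inputs

manual = 1 / (1 + torch.exp(-z))    # sigmoid written out by hand
builtin = torch.sigmoid(z)          # built-in sigmoid

print(torch.allclose(manual, builtin))
```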


We want to measure the difference between `hypothesis` and `y_train`.

In [17]:
print(hypothesis)
print(y_train)

tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000]], grad_fn=<SigmoidBackward0>)
tensor([[0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.]])


```
-(y_train[0] * torch.log(hypothesis[0]) + (1 - y_train[0]) * torch.log(1 - hypothesis[0]))
```

In [19]:
losses = -(y_train * torch.log(hypothesis) + (1 - y_train) * torch.log(1 - hypothesis))
print(losses)

tensor([[0.6931],
        [0.6931],
        [0.6931],
        [0.6931],
        [0.6931],
        [0.6931]], grad_fn=<NegBackward0>)


Then, we call `.mean()` to average these individual losses.

In [20]:
cost = losses.mean()
print(cost)

tensor(0.6931, grad_fn=<MeanBackward0>)


In [22]:
F.binary_cross_entropy(hypothesis, y_train)  # built-in binary cross entropy (BCE); same value as the manual cost

tensor(0.6931, grad_fn=<BinaryCrossEntropyBackward0>)
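
The sketch below checks, on made-up values, that the manual formula and `F.binary_cross_entropy` agree. It also shows `F.binary_cross_entropy_with_logits`, a related PyTorch function that takes the pre-sigmoid values directly; the lab itself does not use it, so treat it as an optional alternative.

```python
import torch
import torch.nn.functional as F

# Illustrative (made-up) pre-sigmoid values and labels
logits = torch.tensor([[-2.0], [0.5], [3.0]])
y = torch.tensor([[0.0], [1.0], [1.0]])
h = torch.sigmoid(logits)

manual = -(y * torch.log(h) + (1 - y) * torch.log(1 - h)).mean()
builtin = F.binary_cross_entropy(h, y)                        # expects probabilities
from_logits = F.binary_cross_entropy_with_logits(logits, y)   # expects pre-sigmoid values

print(manual.item(), builtin.item(), from_logits.item())
```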

## Whole Training Procedure

In [23]:
# Initialize the model parameters
W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# Set up the optimizer
optimizer = optim.SGD([W, b], lr=1)

nb_epochs = 1000
for epoch in range(nb_epochs + 1):

    # Compute the cost
    hypothesis = torch.sigmoid(x_train.matmul(W) + b)  # or .mm or @
    cost = F.binary_cross_entropy(hypothesis, y_train)

    # Improve H(x) using the cost
    optimizer.zero_grad()   # reset any previously accumulated gradients to 0
    cost.backward()         # backpropagate the cost (fills W.grad and b.grad)
    optimizer.step()        # update W and b in the direction that minimizes the cost

    # Print a log every 100 epochs
    if epoch % 100 == 0:
        print("Epoch {:4d}/{} Cost: {:.6f}".format(epoch, nb_epochs, cost.item()))
    

Epoch    0/1000 Cost: 0.693147
Epoch  100/1000 Cost: 0.134722
Epoch  200/1000 Cost: 0.080643
Epoch  300/1000 Cost: 0.057900
Epoch  400/1000 Cost: 0.045300
Epoch  500/1000 Cost: 0.037261
Epoch  600/1000 Cost: 0.031673
Epoch  700/1000 Cost: 0.027556
Epoch  800/1000 Cost: 0.024394
Epoch  900/1000 Cost: 0.021888
Epoch 1000/1000 Cost: 0.019852


## Evaluation
After we finish training the model, we want to check how well our model fits the training set.

In [28]:
hypothesis = torch.sigmoid(x_train.matmul(W) + b)   # in practice, evaluate on x_test instead of x_train
print(hypothesis[:5])

tensor([[2.7648e-04],
        [3.1608e-02],
        [3.8977e-02],
        [9.5622e-01],
        [9.9823e-01]], grad_fn=<SliceBackward0>)


We can change **hypothesis** (real number from 0 to 1) to **binary predictions** (either 0 or 1) by comparing them to 0.5.

In [35]:
prediction = hypothesis >= torch.FloatTensor([0.5])   # BoolTensor
print(prediction[:5])

tensor([[False],
        [False],
        [False],
        [ True],
        [ True]])


In [39]:
# How accurate is it?
print(prediction[:5])
print(y_train[:5])

tensor([[False],
        [False],
        [False],
        [ True],
        [ True]])
tensor([[0.],
        [0.],
        [0.],
        [1.],
        [1.]])


In [45]:
correct_prediction = prediction.float() == y_train
print(correct_prediction[:5])
# The predictions are very good!
# Taking the mean gives the accuracy of the model (correct_prediction.float().mean())

tensor([[True],
        [True],
        [True],
        [True],
        [True]])
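
The accuracy computation mentioned in the comment can be sketched as follows. The probabilities here are made-up stand-ins, with one sample misclassified on purpose so that the result is not trivially 100%.

```python
import torch

# Illustrative (made-up) probabilities and labels; the second sample is wrong on purpose
hypothesis = torch.tensor([[0.1], [0.7], [0.2], [0.9], [0.8], [0.6]])
y = torch.tensor([[0.], [0.], [0.], [1.], [1.], [1.]])

prediction = hypothesis >= 0.5                    # boolean predictions
correct_prediction = prediction.float() == y      # True where the prediction matches the label
accuracy = correct_prediction.float().mean()      # fraction of correct predictions
print("Accuracy: {:.2f}%".format(accuracy.item() * 100))
```

Five of the six stand-in predictions match their labels, so the accuracy is 5/6.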


## Higher-Level Implementation with Class

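We have now looked at training and evaluation, but the code above is a rather naive implementation; in practice, the model is usually written as a class like the following.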

In [58]:
class BinaryClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # A linear layer that holds both W and b (self.linear = {W, b})
        # W ∈ R^(2x1), b ∈ R^1
        self.linear = nn.Linear(2, 1)     # d=2 input features (m=6 training samples)
        self.sigmoid = nn.Sigmoid()
        
    def forward(self, x):
        return self.sigmoid(self.linear(x))

In [59]:
model = BinaryClassifier()
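
To see what the class holds, the parameters registered by `nn.Linear(2, 1)` can be listed by name. The sketch below repeats the class definition so that it runs on its own; the all-zeros input is just a placeholder with the same shape as `x_train`.

```python
import torch
import torch.nn as nn

class BinaryClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.linear(x))

model = BinaryClassifier()

# nn.Linear stores its weight with shape (out_features, in_features)
shapes = {name: tuple(param.shape) for name, param in model.named_parameters()}
print(shapes)

x = torch.zeros(6, 2)      # placeholder input with the same shape as x_train
out = model(x)
print(out.shape)
```

Note that `nn.Linear` stores its weight as `(1, 2)`, the transpose of the `(2, 1)` matrix `W` used in the manual version.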

In [64]:
# Set up the optimizer
optimizer = optim.SGD(model.parameters(), lr=1)

nb_epochs = 100
for epoch in range(nb_epochs + 1):
    
    # Compute H(x)
    hypothesis = model(x_train)      # H(x) = P(y=1 | x), i.e. the probability of class 1

    # Compute the cost
    cost = F.binary_cross_entropy(hypothesis, y_train)

    # Improve H(x) using the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    
    # Print a log every 10 epochs
    if epoch % 10 == 0:
        prediction = hypothesis >= torch.FloatTensor([0.5])
        correct_prediction = prediction.float() == y_train
        accuracy = correct_prediction.sum().item() / len(correct_prediction)
        print("Epoch {:4d}/{} Cost: {:.6f} Accuracy {:2.2f}%".format(
            epoch, nb_epochs, cost.item(), accuracy * 100,
        ))

Epoch    0/100 Cost: 0.057607 Accuracy 100.00%
Epoch   10/100 Cost: 0.056049 Accuracy 100.00%
Epoch   20/100 Cost: 0.054574 Accuracy 100.00%
Epoch   30/100 Cost: 0.053176 Accuracy 100.00%
Epoch   40/100 Cost: 0.051850 Accuracy 100.00%
Epoch   50/100 Cost: 0.050590 Accuracy 100.00%
Epoch   60/100 Cost: 0.049391 Accuracy 100.00%
Epoch   70/100 Cost: 0.048248 Accuracy 100.00%
Epoch   80/100 Cost: 0.047158 Accuracy 100.00%
Epoch   90/100 Cost: 0.046117 Accuracy 100.00%
Epoch  100/100 Cost: 0.045122 Accuracy 100.00%
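
Once a model is trained, predictions for new inputs are typically made with gradient tracking switched off. The following is a minimal sketch under stated assumptions: it uses `nn.Sequential` as a stand-in for the class above, retrains on the same toy data, and scores two made-up points (`x_new`) that are not part of the lab.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(1)

x_train = torch.FloatTensor([[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]])
y_train = torch.FloatTensor([[0], [0], [0], [1], [1], [1]])

# Stand-in for BinaryClassifier: a linear layer followed by a sigmoid
model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
optimizer = optim.SGD(model.parameters(), lr=1)

for epoch in range(100):
    cost = F.binary_cross_entropy(model(x_train), y_train)
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

# Made-up inputs, used only to illustrate inference
x_new = torch.FloatTensor([[1, 1], [6, 4]])

model.eval()                  # switch to evaluation mode (no effect on these layers, but a common habit)
with torch.no_grad():         # no gradients are needed for inference
    probs = model(x_new)      # estimated P(y=1 | x) for each new input
    labels = probs >= 0.5     # boolean class predictions

print(probs)
print(labels)
```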
