<a href="https://colab.research.google.com/github/AlpacaJake/AI-python-connect/blob/master/Pytorch_logistic_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Logistic Regression**


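*   In binomial logistic regression, the dependent variable has exactly two possible outcomes, such as success or failure. Multinomial logistic regression covers the case where the dependent variable falls into more than two categories, such as (sunny, cloudy, rainy). In binomial logistic regression the two categories are coded as 0 and 1, and the probabilities of the two categories sum to 1.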

+   The hypothesis uses the sigmoid function  

    $H(x) = \dfrac{1}{1 + e^{-W^{T}x}}$  

    ![alt text](https://mblogthumb-phinf.pstatic.net/20150612_50/2feelus_14340467064157goJq_PNG/2015-06-12_at_3.21.28_AM.png?type=w2)  

    ![alt text](https://mblogthumb-phinf.pstatic.net/20150612_71/2feelus_14340466751522xoTj_PNG/2015-06-12_at_3.20.33_AM.png?type=w2)

     
+   The cost function uses cross entropy  

    $cost(W) = -\dfrac{1}{m} \sum \left[ y \log H(x) + (1 - y) \log\left(1 - H(x)\right) \right]$  

    ![alt text](https://wikimedia.org/api/rest_v1/media/math/render/svg/80f87a71d3a616a0939f5360cec24d702d2593a2)
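
The two formulas above can be checked by hand before bringing in PyTorch. The following is a minimal sketch in plain Python; the helper names `sigmoid` and `binary_cross_entropy` are illustrative and not part of the notebook. It assumes all weights are zero, so every input gives $W^{T}x = 0$.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(h_values, y_values):
    """Mean of -(y*log(h) + (1-y)*log(1-h)) over all samples."""
    total = 0.0
    for h, y in zip(h_values, y_values):
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(h_values)

# With W = 0, z = W^T x = 0 for every sample, so H(x) = 0.5.
h = [sigmoid(0.0)] * 6
y = [0, 0, 0, 1, 1, 1]
cost = binary_cross_entropy(h, y)
print(round(cost, 4))  # 0.6931, i.e. ln 2
```

An untrained model that outputs 0.5 for everything therefore starts at a cost of ln 2, regardless of the labels.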

     

In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [3]:
torch.manual_seed(1)

<torch._C.Generator at 0x7f9fa975d570>

x_data shape is (6,2)  
y_data shape is (6,1)  
There are 6 samples, each with 2 inputs and 1 output.  
In this case W has shape (2,1) and b has shape (1,).
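
As a quick shape check, here is a small sketch with placeholder zero tensors (not the real data). It shows a (6,2) input multiplied by a (2,1) weight, with a bias of shape (1,) broadcast across all rows.

```python
import torch

x = torch.zeros(6, 2)   # 6 samples, 2 features (placeholder values)
W = torch.zeros(2, 1)   # one weight per input feature
b = torch.zeros(1)      # a single bias, broadcast across all 6 rows

out = x.matmul(W) + b
print(out.shape)  # torch.Size([6, 1])
```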

In [4]:
x_data = [[1,2], [2,3], [3,1], [4,3], [5,3],[6,2]]
y_data = [[0], [0], [0], [1], [1], [1]]

x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)


In [5]:
print(x_train.shape)
print(y_train.shape)

torch.Size([6, 2])
torch.Size([6, 1])


In [6]:
print('e^1 equals: ', torch.exp(torch.FloatTensor([1])))

e^1 equals:  tensor([2.7183])


In [7]:
W = torch.zeros((2,1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)


x_train is (6,2) and W is (2,1), so their matmul has shape (6,1), the same shape as y_train.

In [None]:
# Manual form of the sigmoid, equivalent to torch.sigmoid below:
# hypothesis = 1 / (1 + torch.exp(-(x_train.matmul(W) + b)))
hypothesis = torch.sigmoid(x_train.matmul(W) + b)


In [None]:
print(hypothesis)
print(hypothesis.shape)

tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000]], grad_fn=<SigmoidBackward>)
torch.Size([6, 1])


In [None]:
print('1 / (1 + e^(-1)) equals: ', torch.sigmoid(torch.FloatTensor([1])))


1 / (1 + e^(-1)) equals:  tensor([0.7311])


In [None]:
losses = -(y_train * torch.log(hypothesis) + (1 - y_train) * torch.log(1 - hypothesis))
print(losses)
#cost = losses.mean()

tensor([[0.6931],
        [0.6931],
        [0.6931],
        [0.6931],
        [0.6931],
        [0.6931]], grad_fn=<NegBackward>)


In [None]:

cost = F.binary_cross_entropy(hypothesis, y_train)

print(cost)

tensor(0.6931, grad_fn=<BinaryCrossEntropyBackward>)
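
The element-wise `losses` and the built-in call should agree once the losses are averaged, since `F.binary_cross_entropy` uses mean reduction by default. The sketch below checks this. It also compares against `F.binary_cross_entropy_with_logits`, which takes the pre-sigmoid values directly. The notebook does not use that function; it is included here only as a commonly used, numerically more stable alternative.

```python
import torch
import torch.nn.functional as F

x_train = torch.FloatTensor([[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]])
y_train = torch.FloatTensor([[0], [0], [0], [1], [1], [1]])
W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

logits = x_train.matmul(W) + b       # raw scores before the sigmoid
hypothesis = torch.sigmoid(logits)   # probabilities in (0, 1)

# Manual cross entropy, averaged over the 6 samples.
manual_cost = -(y_train * torch.log(hypothesis)
                + (1 - y_train) * torch.log(1 - hypothesis)).mean()
# Built-in version operating on probabilities.
builtin_cost = F.binary_cross_entropy(hypothesis, y_train)
# Built-in version operating on logits (applies the sigmoid internally).
logits_cost = F.binary_cross_entropy_with_logits(logits, y_train)

print(manual_cost.item(), builtin_cost.item(), logits_cost.item())
```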


# Class-based implementation

In [16]:
class BinaryClassifier(nn.Module):
  def __init__(self):
    super().__init__()
    self.linear = nn.Linear(2,1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, x):
    return self.sigmoid(self.linear(x))
    

In [18]:
model = BinaryClassifier()

optimizer = optim.SGD(model.parameters(), lr=0.1)

In [19]:
nb_epochs = 1000

for epoch in range(nb_epochs + 1):
  hypothesis = model(x_train)
  cost = F.binary_cross_entropy(hypothesis, y_train)

  optimizer.zero_grad()
  cost.backward()
  optimizer.step()

  if epoch % 100 == 0:
    print('Epoch {:4d}/{} Cost: {:.6f}'.format(epoch, nb_epochs, cost.item()))

Epoch    0/1000 Cost: 0.539713
Epoch  100/1000 Cost: 0.407688
Epoch  200/1000 Cost: 0.345649
Epoch  300/1000 Cost: 0.298323
Epoch  400/1000 Cost: 0.261179
Epoch  500/1000 Cost: 0.231633
Epoch  600/1000 Cost: 0.207779
Epoch  700/1000 Cost: 0.188230
Epoch  800/1000 Cost: 0.171976
Epoch  900/1000 Cost: 0.158282
Epoch 1000/1000 Cost: 0.146605
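
Once a module like this is trained, predictions are usually made with gradient tracking switched off. The sketch below is one way to do that, not necessarily how the notebook intends it. It redefines a classifier with the same layers so the block runs on its own, repeats the training loop above, and then thresholds the output at 0.5 inside `torch.no_grad()`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(1)

x_train = torch.FloatTensor([[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]])
y_train = torch.FloatTensor([[0], [0], [0], [1], [1], [1]])

class BinaryClassifier(nn.Module):
    """Linear layer (2 inputs, 1 output) followed by a sigmoid."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.linear(x))

model = BinaryClassifier()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Same training loop as above, without the progress printing.
for epoch in range(1001):
    cost = F.binary_cross_entropy(model(x_train), y_train)
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

# Inference: gradients are not needed, so disable autograd tracking.
with torch.no_grad():
    probs = model(x_train)
    preds = (probs >= 0.5).float()

print(preds.view(-1))
```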


#  Implementation without a class (using W and b directly)

In [8]:
optimizer = optim.SGD([W, b], lr = 0.1)

Training

In [9]:
nb_epochs = 1000
for epoch in range(nb_epochs +1) :
  hypothesis = torch.sigmoid(x_train.matmul(W)+b)
  cost = F.binary_cross_entropy(hypothesis, y_train)

  optimizer.zero_grad()
  cost.backward()
  optimizer.step()

  if epoch % 100 == 0:
    print('Epoch {:4d}/{} Cost: {:.6f}'.format(epoch, nb_epochs, cost.item()))

Epoch    0/1000 Cost: 0.693147
Epoch  100/1000 Cost: 0.414327
Epoch  200/1000 Cost: 0.349521
Epoch  300/1000 Cost: 0.301302
Epoch  400/1000 Cost: 0.263532
Epoch  500/1000 Cost: 0.233518
Epoch  600/1000 Cost: 0.209313
Epoch  700/1000 Cost: 0.189496
Epoch  800/1000 Cost: 0.173035
Epoch  900/1000 Cost: 0.159179
Epoch 1000/1000 Cost: 0.147375
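
To see what `optimizer.zero_grad()` and `optimizer.step()` do in this loop, the same training can be written with an explicit gradient-descent update. This is a sketch of plain SGD without momentum or weight decay, which matches `optim.SGD([W, b], lr=0.1)` with its default arguments. The variable `lr` is introduced here for illustration.

```python
import torch
import torch.nn.functional as F

x_train = torch.FloatTensor([[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]])
y_train = torch.FloatTensor([[0], [0], [0], [1], [1], [1]])

W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

for epoch in range(1001):
    hypothesis = torch.sigmoid(x_train.matmul(W) + b)
    cost = F.binary_cross_entropy(hypothesis, y_train)

    # Counterpart of optimizer.zero_grad(): clear gradients from the previous step.
    if W.grad is not None:
        W.grad.zero_()
        b.grad.zero_()

    cost.backward()

    # Counterpart of optimizer.step() for plain SGD: move against the gradient.
    with torch.no_grad():
        W -= lr * W.grad
        b -= lr * b.grad

# The final cost should be close to the value logged at epoch 1000 above.
print(cost.item())
```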


Evaluation  
 After we finish training the model, we want to check how well it fits the training set.

In [20]:
hypothesis = torch.sigmoid(x_train.matmul(W) + b)
print(hypothesis[:5])

tensor([[0.0298],
        [0.1576],
        [0.3004],
        [0.7834],
        [0.9409]], grad_fn=<SliceBackward>)


We can convert the hypothesis (a real number between 0 and 1) into binary predictions (either 0 or 1) by comparing it to 0.5.

In [21]:
prediction = hypothesis >= torch.FloatTensor([0.5])
print(prediction[:5].float())

tensor([[0.],
        [0.],
        [0.],
        [1.],
        [1.]])


In [22]:
correct_prediction = prediction.float() == y_train
print(correct_prediction[:5].float())

tensor([[1.],
        [1.],
        [1.],
        [1.],
        [1.]])
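
The per-sample comparison can be reduced to a single accuracy figure by averaging it. The sketch below uses only the five probabilities printed above, with their labels, so it reports accuracy for those five samples rather than for the full training set.

```python
import torch

# First five probabilities and labels, copied from the outputs above.
hypothesis = torch.FloatTensor([[0.0298], [0.1576], [0.3004], [0.7834], [0.9409]])
y_true = torch.FloatTensor([[0], [0], [0], [1], [1]])

prediction = hypothesis >= 0.5                      # boolean predictions
correct_prediction = prediction.float() == y_true   # True where the prediction matches
accuracy = correct_prediction.float().mean().item() # fraction of correct predictions

print('Accuracy on these five samples: {:.2f}%'.format(accuracy * 100))
```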
