
Logistic Regression
Hypothesis
$$ H(X) = \frac{1}{1+e^{-W^T X}} $$
Cost
$$ cost(W) = -\frac{1}{m} \sum \left[ y \log\left(H(x)\right) + (1-y) \log\left(1-H(x)\right) \right] $$
If $y \simeq H(x)$, cost is near 0.
If $y \neq H(x)$, cost is high.
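To make these two cases concrete, the per-sample term of the cost can be evaluated by hand. The sketch below uses plain Python; the helper name `bce_term` and the probabilities 0.99 and 0.01 are illustrative choices, not part of the original lab.

```python
import math

def bce_term(y, h):
    """Per-sample binary cross-entropy for label y and predicted probability h."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

# Prediction close to the label: the cost is small.
good = bce_term(1, 0.99)
# Prediction far from the label: the cost is large.
bad = bce_term(1, 0.01)

print(f"y=1, H(x)=0.99 -> {good:.4f}")
print(f"y=1, H(x)=0.01 -> {bad:.4f}")
```

The cost grows without bound as H(x) approaches the wrong extreme, which is what pushes the model away from confident mistakes.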
Weight Update via Gradient Descent
$$ W := W - \alpha \frac{\partial}{\partial W} cost(W) $$
$\alpha$: Learning rate
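
As a rough illustration of the update rule, a single gradient descent step can be written out by hand with autograd instead of an optimizer. This is only a sketch: the learning rate 0.1 and the helper name `cost_fn` are assumptions made for the example, and the data is the six-student toy set used later in this lab.

```python
import torch

# Toy data (the same six students used later in this lab).
x = torch.tensor([[1., 2.], [2., 3.], [3., 1.], [4., 3.], [5., 3.], [6., 2.]])
y = torch.tensor([[0.], [0.], [0.], [1.], [1.], [1.]])

W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
alpha = 0.1  # learning rate (illustrative value)

def cost_fn(W, b):
    """Mean binary cross-entropy of the sigmoid hypothesis."""
    h = torch.sigmoid(x.matmul(W) + b)
    return -(y * torch.log(h) + (1 - y) * torch.log(1 - h)).mean()

cost_before = cost_fn(W, b)
cost_before.backward()          # fills W.grad and b.grad

with torch.no_grad():           # the update itself must not be tracked by autograd
    W -= alpha * W.grad         # W := W - alpha * d(cost)/dW
    b -= alpha * b.grad

cost_after = cost_fn(W, b)
print(cost_before.item(), cost_after.item())
```

With this data and step size the cost drops after one step. `optim.SGD` performs the same arithmetic inside `optimizer.step()`.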

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


In [2]:
#for reproducibility
torch.manual_seed(1)

<torch._C.Generator at 0x7f40e7e45b58>

In [3]:
x_data=[[1,2],[2,3],[3,1],[4,3],[5,3],[6,2]]
y_data=[[0],[0],[0],[1],[1],[1]]

Consider the following classification problem: given the number of hours each student spent watching the lecture and working in the code lab, predict whether the student passed or failed a course. For example, the first (index 0) student watched the lecture for 1 hour and spent 2 hours in the lab session ([1, 2]), and ended up failing the course ([0]).

In [4]:
x_train=torch.FloatTensor(x_data)
y_train=torch.FloatTensor(y_data)

In [5]:
print(x_train.shape)
print(y_train.shape)

torch.Size([6, 2])
torch.Size([6, 1])


Computing the Hypothesis
$$ H(X) = \frac{1}{1+e^{-W^T X}} $$
PyTorch provides torch.exp(), which computes the exponential $e^x$ element-wise, so the hypothesis can be written directly from the formula.

In [6]:
W=torch.zeros((2,1),requires_grad=True)
b=torch.zeros(1,requires_grad=True)

In [7]:
hypothesis=1/(1+torch.exp(-(x_train.matmul(W)+b)))
hypothesis,hypothesis.shape

(tensor([[0.5000],
         [0.5000],
         [0.5000],
         [0.5000],
         [0.5000],
         [0.5000]], grad_fn=<MulBackward0>), torch.Size([6, 1]))

In [8]:
hypothesis1=torch.sigmoid(x_train.matmul(W)+b)
hypothesis1,hypothesis1.shape

(tensor([[0.5000],
         [0.5000],
         [0.5000],
         [0.5000],
         [0.5000],
         [0.5000]], grad_fn=<SigmoidBackward>), torch.Size([6, 1]))
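
The hand-written formula and `torch.sigmoid` agree on the all-zero inputs above. A quick check over a wider range of values confirms that they compute the same function. The input range chosen here is arbitrary.

```python
import torch

z = torch.linspace(-5.0, 5.0, steps=11)   # illustrative inputs

manual = 1 / (1 + torch.exp(-z))          # formula written out with torch.exp
builtin = torch.sigmoid(z)                # built-in equivalent

print(torch.allclose(manual, builtin))
print(builtin[5].item())                  # value at the midpoint z = 0
```

In practice `torch.sigmoid` is the more readable choice, and it is the one used in the rest of this lab.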

Computing the Cost Function (Low-level)
$$ cost(W) = -\frac{1}{m} \sum \left[ y \log\left(H(x)\right) + (1-y) \log\left(1-H(x)\right) \right] $$
We want to measure the difference between hypothesis and y_train.

In [9]:
print(y_train)

tensor([[0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.]])


In [10]:
loss_0=-(y_train[0]*torch.log(hypothesis1[0])+(1-y_train[0])*torch.log(1-hypothesis1[0]))
loss_0

tensor([0.6931], grad_fn=<NegBackward>)

In [11]:
losses=-(y_train*torch.log(hypothesis1)+(1-y_train)*torch.log(1-hypothesis1))
print(losses)

tensor([[0.6931],
        [0.6931],
        [0.6931],
        [0.6931],
        [0.6931],
        [0.6931]], grad_fn=<NegBackward>)


In [None]:
cost=losses.mean()
print(cost)

tensor(0.6931, grad_fn=<MeanBackward0>)
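
The value 0.6931 follows directly from the zero initialisation: with W and b both zero, every H(x) is 0.5, so each per-sample loss is $-\log(0.5) = \log 2$ regardless of the label. The short check below reproduces this with a constant tensor standing in for the hypothesis.

```python
import math
import torch

h = torch.full((6, 1), 0.5)                       # H(x) when W = 0 and b = 0
y = torch.tensor([[0.], [0.], [0.], [1.], [1.], [1.]])

cost = -(y * torch.log(h) + (1 - y) * torch.log(1 - h)).mean()
print(cost.item(), math.log(2))
```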


Computing the Cost Function with F.binary_cross_entropy
Binary classification is common enough that PyTorch provides F.binary_cross_entropy, which computes the same cost in a single call.

In [12]:
# Another way to compute the same cost as the manual formula
F.binary_cross_entropy(hypothesis1,y_train)

tensor(0.6931, grad_fn=<BinaryCrossEntropyBackward>)
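
`F.binary_cross_entropy` averages over all samples by default. Its `reduction` argument can return the per-sample losses instead, which makes it easy to compare against the hand-written formula. The probabilities and labels below are made up for the comparison.

```python
import torch
import torch.nn.functional as F

h = torch.tensor([[0.9], [0.2], [0.6]])   # illustrative predicted probabilities
y = torch.tensor([[1.], [0.], [0.]])      # illustrative labels

manual = -(y * torch.log(h) + (1 - y) * torch.log(1 - h))

per_sample = F.binary_cross_entropy(h, y, reduction='none')  # same shape as h
mean_cost = F.binary_cross_entropy(h, y)                     # default: reduction='mean'

print(torch.allclose(per_sample, manual))
print(torch.allclose(mean_cost, manual.mean()))
```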

In [13]:
x_data=[[1,2],[2,3],[3,1],[4,3],[5,3],[6,2]]
y_data=[[0],[0],[0],[1],[1],[1]]
x_train=torch.FloatTensor(x_data)
y_train=torch.FloatTensor(y_data)

In [14]:
# initialize the model parameters
W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# set up the optimizer
optimizer = optim.SGD([W, b], lr=1)
# training loop
nb_epochs = 1000
for epoch in range(nb_epochs + 1):
    # compute the hypothesis and the cost
    hypothesis2 = torch.sigmoid(x_train.matmul(W) + b)
    cost = -(y_train * torch.log(hypothesis2) + (1 - y_train) * torch.log(1 - hypothesis2)).mean()
    # update W and b using the gradient of the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    # print the cost every 100 epochs
    if epoch % 100 == 0:
        print('Epoch:{:4d}/{},Cost:{:.6f}'.format(epoch, nb_epochs, cost.item()))


Epoch:   0/1000,Cost:0.693147
Epoch: 100/1000,Cost:0.134722
Epoch: 200/1000,Cost:0.080643
Epoch: 300/1000,Cost:0.057900
Epoch: 400/1000,Cost:0.045300
Epoch: 500/1000,Cost:0.037261
Epoch: 600/1000,Cost:0.031673
Epoch: 700/1000,Cost:0.027556
Epoch: 800/1000,Cost:0.024394
Epoch: 900/1000,Cost:0.021888
Epoch:1000/1000,Cost:0.019852
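
The loop above calls `optimizer.zero_grad()` before every `cost.backward()`. The reason is that PyTorch adds new gradients to whatever is already stored in `.grad`. The minimal sketch below uses a made-up one-element parameter `w` to show the accumulation and a manual reset.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(w * 3).sum().backward()
first = w.grad.clone()        # gradient of 3*w with respect to w

(w * 3).sum().backward()      # second backward pass without clearing
accumulated = w.grad.clone()  # the new gradient was added to the old one

w.grad.zero_()                # manual reset of the stored gradient
print(first.item(), accumulated.item(), w.grad.item())
```

Without the reset, each training step would use the sum of all earlier gradients rather than the gradient of the current cost.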


Training with F.binary_cross_entropy

In [15]:
# initialize the model parameters
W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# set up the optimizer
optimizer = optim.SGD([W, b], lr=1)
# training loop
nb_epochs = 1000
for epoch in range(nb_epochs + 1):
    # compute the hypothesis and the cost
    hypothesis2 = torch.sigmoid(x_train.matmul(W) + b)
    cost = F.binary_cross_entropy(hypothesis2, y_train)
    # update W and b using the gradient of the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    # print the cost every 100 epochs
    if epoch % 100 == 0:
        print('Epoch:{:4d}/{},Cost:{:.6f}'.format(epoch, nb_epochs, cost.item()))

Epoch:   0/1000,Cost:0.693147
Epoch: 100/1000,Cost:0.134722
Epoch: 200/1000,Cost:0.080643
Epoch: 300/1000,Cost:0.057900
Epoch: 400/1000,Cost:0.045300
Epoch: 500/1000,Cost:0.037261
Epoch: 600/1000,Cost:0.031672
Epoch: 700/1000,Cost:0.027556
Epoch: 800/1000,Cost:0.024394
Epoch: 900/1000,Cost:0.021888
Epoch:1000/1000,Cost:0.019852
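
A related function that this lab does not use is `F.binary_cross_entropy_with_logits`, which takes the raw value `x.matmul(W) + b` and applies the sigmoid internally. The sketch below, with made-up logits, suggests why it can be the safer choice: for moderate inputs it matches sigmoid followed by `F.binary_cross_entropy`, and for an extreme logit it stays finite where the hand-written formula does not.

```python
import torch
import torch.nn.functional as F

z = torch.tensor([[2.0], [-1.0], [0.5]])   # illustrative logits, i.e. x.matmul(W) + b
y = torch.tensor([[1.], [0.], [0.]])

from_probs = F.binary_cross_entropy(torch.sigmoid(z), y)
from_logits = F.binary_cross_entropy_with_logits(z, y)
print(torch.allclose(from_probs, from_logits))

# An extreme logit: sigmoid saturates to exactly 1.0 in float32, so the
# hand-written formula evaluates log(0) and becomes infinite.
z_big = torch.tensor([[200.0]])
y_big = torch.tensor([[0.]])
h_big = torch.sigmoid(z_big)
manual = -(y_big * torch.log(h_big) + (1 - y_big) * torch.log(1 - h_big)).mean()
stable = F.binary_cross_entropy_with_logits(z_big, y_big)
print(manual.item(), stable.item())
```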


Loading Real Data

In [18]:
import numpy as np

In [21]:
import pandas as pd
data1=pd.read_csv('data-03_2-diabetes.csv', on_bad_lines='skip')  # pandas >= 1.3; older versions used error_bad_lines=False

data=pd.DataFrame(data1)
data

Unnamed: 0,0,1,2,3,4,5,6,7,8
0,-0.294118,0.487437,0.180328,-0.292929,0.000000,0.001490,-0.531170,-0.033333,0
1,-0.882353,-0.145729,0.081967,-0.414141,0.000000,-0.207153,-0.766866,-0.666667,1
2,-0.058824,0.839196,0.049180,0.000000,0.000000,-0.305514,-0.492741,-0.633333,0
3,-0.882353,-0.105528,0.081967,-0.535354,-0.777778,-0.162444,-0.923997,0.000000,1
4,0.000000,0.376884,-0.344262,-0.292929,-0.602837,0.284650,0.887276,-0.600000,0
...,...,...,...,...,...,...,...,...,...
754,0.176471,0.015075,0.245902,-0.030303,-0.574468,-0.019374,-0.920581,0.400000,1
755,-0.764706,0.226131,0.147541,-0.454545,0.000000,0.096870,-0.776260,-0.800000,1
756,-0.411765,0.216080,0.180328,-0.535354,-0.735225,-0.219076,-0.857387,-0.700000,1
757,-0.882353,0.266332,-0.016393,0.000000,0.000000,-0.102832,-0.768574,-0.133333,0


In [22]:
df=np.array(data)
df

array([[-0.294118 ,  0.487437 ,  0.180328 , ..., -0.53117  , -0.0333333,
         0.       ],
       [-0.882353 , -0.145729 ,  0.0819672, ..., -0.766866 , -0.666667 ,
         1.       ],
       [-0.0588235,  0.839196 ,  0.0491803, ..., -0.492741 , -0.633333 ,
         0.       ],
       ...,
       [-0.411765 ,  0.21608  ,  0.180328 , ..., -0.857387 , -0.7      ,
         1.       ],
       [-0.882353 ,  0.266332 , -0.0163934, ..., -0.768574 , -0.133333 ,
         0.       ],
       [-0.882353 , -0.0653266,  0.147541 , ..., -0.797609 , -0.933333 ,
         1.       ]])

In [23]:
x_data=df[:,0:-1]
y_data=df[:,[-1]]
x_train=torch.FloatTensor(x_data)
y_train=torch.FloatTensor(y_data)


In [24]:
x_train.shape, y_train.shape

(torch.Size([759, 8]), torch.Size([759, 1]))
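
The label column is selected with `[-1]` inside a list rather than a bare `-1`, which is what keeps `y_train` two-dimensional. The small NumPy array below is a stand-in for the loaded data and shows the difference in shapes.

```python
import numpy as np

arr = np.arange(12, dtype=np.float32).reshape(4, 3)  # stand-in for the loaded data

features = arr[:, 0:-1]    # every column except the last
labels_2d = arr[:, [-1]]   # list index keeps the column axis
labels_1d = arr[:, -1]     # integer index drops it

print(features.shape, labels_2d.shape, labels_1d.shape)
```

The (N, 1) shape matters because the hypothesis also has shape (N, 1). Combining it with labels of shape (N,) would broadcast to an (N, N) matrix in the hand-written cost instead of producing one loss per sample.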

In [25]:
print(x_train[:5])
print(y_train[:5])

tensor([[-0.2941,  0.4874,  0.1803, -0.2929,  0.0000,  0.0015, -0.5312, -0.0333],
        [-0.8824, -0.1457,  0.0820, -0.4141,  0.0000, -0.2072, -0.7669, -0.6667],
        [-0.0588,  0.8392,  0.0492,  0.0000,  0.0000, -0.3055, -0.4927, -0.6333],
        [-0.8824, -0.1055,  0.0820, -0.5354, -0.7778, -0.1624, -0.9240,  0.0000],
        [ 0.0000,  0.3769, -0.3443, -0.2929, -0.6028,  0.2846,  0.8873, -0.6000]])
tensor([[0.],
        [1.],
        [0.],
        [1.],
        [0.]])


In [26]:
# initialize the model parameters (8 input features)
W = torch.zeros((8, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# set up the optimizer
optimizer = optim.SGD([W, b], lr=1)
nb_epochs = 100
for epoch in range(nb_epochs + 1):
    # compute the hypothesis and the cost
    hypothesis = torch.sigmoid(x_train.matmul(W) + b)
    cost = -(y_train * torch.log(hypothesis) + (1 - y_train) * torch.log(1 - hypothesis)).mean()
    # update W and b using the gradient of the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    # print the cost every 10 epochs
    if epoch % 10 == 0:
        print("Epoch:{:4d}/{}, Cost:{:.6f}".format(epoch, nb_epochs, cost.item()))

Epoch:   0/100, Cost:0.693147
Epoch:  10/100, Cost:0.572727
Epoch:  20/100, Cost:0.539493
Epoch:  30/100, Cost:0.519708
Epoch:  40/100, Cost:0.507066
Epoch:  50/100, Cost:0.498539
Epoch:  60/100, Cost:0.492549
Epoch:  70/100, Cost:0.488209
Epoch:  80/100, Cost:0.484985
Epoch:  90/100, Cost:0.482543
Epoch: 100/100, Cost:0.480661


Training with Real Data using F.binary_cross_entropy

In [27]:
# initialize the model parameters (8 input features)
W = torch.zeros((8, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# set up the optimizer
optimizer = optim.SGD([W, b], lr=1)
nb_epochs = 100
for epoch in range(nb_epochs + 1):
    # compute the hypothesis and the cost
    hypothesis = torch.sigmoid(x_train.matmul(W) + b)
    cost = F.binary_cross_entropy(hypothesis, y_train)
    # update W and b using the gradient of the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    # print the cost every 10 epochs
    if epoch % 10 == 0:
        print("Epoch:{:4d}/{}, Cost:{:.6f}".format(epoch, nb_epochs, cost.item()))

Epoch:   0/100, Cost:0.693147
Epoch:  10/100, Cost:0.572727
Epoch:  20/100, Cost:0.539493
Epoch:  30/100, Cost:0.519708
Epoch:  40/100, Cost:0.507066
Epoch:  50/100, Cost:0.498539
Epoch:  60/100, Cost:0.492549
Epoch:  70/100, Cost:0.488209
Epoch:  80/100, Cost:0.484985
Epoch:  90/100, Cost:0.482543
Epoch: 100/100, Cost:0.480661


In [28]:
hypothesis = torch.sigmoid(x_train.matmul(W)+b)
print(hypothesis[:5])

tensor([[0.4103],
        [0.9242],
        [0.2300],
        [0.9411],
        [0.1772]], grad_fn=<SliceBackward>)


In [36]:
prediction= hypothesis >= torch.FloatTensor([0.5])
 
print(prediction[:5])

tensor([[False],
        [ True],
        [False],
        [ True],
        [False]])


In [39]:
prediction=prediction.int()  # convert True/False to 1/0
print(prediction[:5])

tensor([[0],
        [1],
        [0],
        [1],
        [0]], dtype=torch.int32)


In [40]:
print(prediction[:5])
print(y_train[:5])

tensor([[0],
        [1],
        [0],
        [1],
        [0]], dtype=torch.int32)
tensor([[0.],
        [1.],
        [0.],
        [1.],
        [0.]])


In [43]:
# 1 where the prediction matches the label, 0 otherwise
correct_prediction = (prediction.float() == y_train).int()
print(correct_prediction[:5])

tensor([[1],
        [1],
        [1],
        [1],
        [1]], dtype=torch.int32)


In [44]:
accuracy=correct_prediction.sum().item()/len(correct_prediction)
print('The model has an accuracy of {:2.2f}% on the training set.'.format(accuracy*100))

The model has an accuracy of 76.68% on the training set.
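
The thresholding, comparison, and averaging steps above can also be folded into one small helper. The sketch below is one possible way to write it. The function name `binary_accuracy` and the sample values are hypothetical and not taken from the diabetes data.

```python
import torch

def binary_accuracy(probs, targets, threshold=0.5):
    """Fraction of samples whose thresholded probability equals the 0/1 target."""
    preds = (probs >= threshold).float()
    return (preds == targets).float().mean().item()

# Illustrative values, not taken from the diabetes data.
probs = torch.tensor([[0.41], [0.92], [0.23], [0.94], [0.60]])
targets = torch.tensor([[0.], [1.], [0.], [1.], [0.]])

print(binary_accuracy(probs, targets))
```

Note that the figure reported in this lab is measured on the same data the model was trained on, so it says little about performance on unseen samples.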


Optional: High-level Implementation with nn.Module

In [51]:
class BinaryClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 1)   # 8 input features, 1 output
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.linear(x))
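
To see what this module holds and returns, it can be run on a random batch. The sketch below repeats the class definition so that it is self-contained. The batch of 4 random samples is a hypothetical stand-in for real inputs.

```python
import torch
import torch.nn as nn

class BinaryClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.linear(x))

torch.manual_seed(1)
model = BinaryClassifier()
dummy = torch.randn(4, 8)          # hypothetical batch of 4 samples with 8 features

out = model(dummy)
print(out.shape)                   # one probability per sample
for name, p in model.named_parameters():
    print(name, tuple(p.shape))    # linear.weight is (1, 8), linear.bias is (1,)
```

`nn.Linear(8, 1)` plays the role of the explicit `W` and `b` tensors from the low-level version, and `model.parameters()` hands both to the optimizer.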

In [52]:
model=BinaryClassifier()

In [53]:
# the model was created in the previous cell
# set up the optimizer
optimizer = optim.SGD(model.parameters(), lr=1)
nb_epochs = 100
for epoch in range(nb_epochs + 1):
    # compute the hypothesis and the cost
    hypothesis = model(x_train)
    cost = F.binary_cross_entropy(hypothesis, y_train)
    # update the model parameters using the gradient of the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    # print the cost every 10 epochs
    if epoch % 10 == 0:
        print("Epoch:{:4d}/{}, Cost:{:.6f}".format(epoch, nb_epochs, cost.item()))

Epoch:   0/100, Cost:0.670614
Epoch:  10/100, Cost:0.576439
Epoch:  20/100, Cost:0.540851
Epoch:  30/100, Cost:0.520186
Epoch:  40/100, Cost:0.507220
Epoch:  50/100, Cost:0.498581
Epoch:  60/100, Cost:0.492561
Epoch:  70/100, Cost:0.488220
Epoch:  80/100, Cost:0.485006
Epoch:  90/100, Cost:0.482574
Epoch: 100/100, Cost:0.480701
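
The first cost printed here (0.670614) differs from the 0.693147 seen in the low-level runs. A likely explanation is that `nn.Linear` starts from random weights, whereas `W` and `b` were initialised to zero. The sketch below uses hypothetical random inputs and labels to show that zeroing the layer's parameters brings the initial cost back to log 2.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

linear = nn.Linear(8, 1)               # randomly initialised weight and bias
x = torch.randn(16, 8)                 # hypothetical inputs
y = torch.randint(0, 2, (16, 1)).float()

random_init_cost = F.binary_cross_entropy(torch.sigmoid(linear(x)), y)

with torch.no_grad():                  # overwrite the parameters in place
    linear.weight.zero_()
    linear.bias.zero_()

zero_init_cost = F.binary_cross_entropy(torch.sigmoid(linear(x)), y)
print(random_init_cost.item(), zero_init_cost.item(), math.log(2))
```

Both starting points reach nearly the same cost after 100 epochs in the runs above, so the difference matters mainly for the first few steps.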
