
# Logistic Regression

Let X, W, and b be matrices of shape $[m, d]$, $[d,1]$, and $[1,1]$, respectively.<br>
For classification problems, y is a matrix of class labels.<br>
To begin with, let's consider a simple binary classification problem.<br>
Here y is a matrix of shape $[m,1]$.

# Hypothesis
For a single-layer logistic regression model,<br>
Hypothesis:<br>
$$H(x^{(i)}) = {1\over 1 + e^{-f(x^{(i)})}}$$<br>
where <br>

$$f(x^{(i)}) = x^{(i)} \times W + b$$<br>

$H(x^{(i)})$ has the form of the sigmoid function, so its value lies between 0 and 1 and can be interpreted as a probability:<br><br>
$$H(x^{(i)}) \approx P(y^{(i)}=1; W) = 1- P(y^{(i)} = 0; W)$$
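
As a quick numerical check of the sigmoid form above, here is a minimal sketch. The values in `z` are made up for illustration and stand in for $f(x^{(i)})$; they are not taken from any dataset in this notebook.

```python
import torch

# Hypothetical values of f(x) = xW + b, chosen only for illustration
z = torch.tensor([-4.0, -1.0, 0.0, 1.0, 4.0])

# Sigmoid written out from the definition H = 1 / (1 + e^{-f})
h = 1 / (1 + torch.exp(-z))
print(h)

# The hand-written form matches the built-in torch.sigmoid
print(torch.allclose(h, torch.sigmoid(z)))

# P(y=0) = 1 - H equals the sigmoid evaluated at -f
print(torch.allclose(1 - h, torch.sigmoid(-z)))
```

Every output lies strictly between 0 and 1, and $f = 0$ maps to exactly 0.5, which is why a model with all-zero weights predicts 0.5 for every sample.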

# Cost 
Binary Cross Entropy:
$$Cost = - {1\over m} \sum _{i=1} ^m \left[ y^{(i)}\log(H(x^{(i)})) + (1-y^{(i)})\log(1-H(x^{(i)})) \right]$$<br>
If $Cost\approx0$, then $y^{(i)}\approx H(x^{(i)})$ for every sample,<br>
and if the cost is high, then $H(x^{(i)})$ is far from $y^{(i)}$ for at least some samples.
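
To make the relationship between the cost and the prediction concrete, here is a small sketch that evaluates the standard per-sample binary cross entropy, $-[y\log H + (1-y)\log(1-H)]$, for a few made-up predictions. The helper name `bce_single` is hypothetical and only used for this illustration.

```python
import math

def bce_single(y, h):
    """Binary cross entropy for one sample with label y and prediction h."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

# Label is 1 and the prediction is close to 1: small cost
print(f"{bce_single(1, 0.9):.4f}")  # 0.1054
# Label is 1 but the prediction is close to 0: large cost
print(f"{bce_single(1, 0.1):.4f}")  # 2.3026
# Label is 0 and the prediction is close to 0: small cost
print(f"{bce_single(0, 0.1):.4f}")  # 0.1054
```

Only one of the two terms is active for a given label, so the cost penalizes confident wrong predictions much more heavily than it rewards confident correct ones.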

# Update by Gradient Descent

$$W := W - lr {\partial\over \partial W}Cost(W) $$
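
For a sigmoid hypothesis with binary cross entropy, the gradient has the well-known closed form ${\partial Cost \over \partial W} = {1\over m} X^T (H - y)$. The sketch below compares that expression with what autograd computes and then applies a single update step by hand. The tiny dataset and the learning rate `lr = 0.1` are made up for illustration.

```python
import torch

# Small made-up dataset: 4 samples, 2 features
X = torch.tensor([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])
y = torch.tensor([[0.], [0.], [1.], [1.]])

W = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1  # illustrative learning rate

# Forward pass and cost
H = torch.sigmoid(X.mm(W) + b)
cost = -(y * torch.log(H) + (1 - y) * torch.log(1 - H)).mean()

# Gradient from autograd
cost.backward()

# Closed-form gradient: (1/m) * X^T (H - y)
grad_W = X.t().mm(H.detach() - y) / X.shape[0]
print(torch.allclose(W.grad, grad_W, atol=1e-6))

# One gradient descent step, W := W - lr * dCost/dW (and the same for b)
with torch.no_grad():
    W -= lr * W.grad
    b -= lr * b.grad
print(W, b)
```

An optimizer such as `torch.optim.SGD` performs the same subtraction inside `optimizer.step()`.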

# Logistic Regression from Scratch

Let's import the modules needed to build a logistic regression model, and set the random seed.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

#For reproducibility
torch.manual_seed(1)

<torch._C.Generator at 0x7f5e140edb28>

Let's make a dataset. We can pretend that each row of x holds the hours a student spent on the course's homework and the hours spent studying on their own, and that y indicates whether the student passed or failed the course.

In [None]:
x_data = [[1,2], [2,3], [3,1], [4,3], [5,3], [6,2]]
y_data = [[0], [0], [0], [1], [1], [1]]
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)
print(x_train, '\n', y_train)
print(x_train.shape, y_train.shape)

tensor([[1., 2.],
        [2., 3.],
        [3., 1.],
        [4., 3.],
        [5., 3.],
        [6., 2.]]) 
 tensor([[0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.]])
torch.Size([6, 2]) torch.Size([6, 1])


Set the weights and bias.

In [None]:
W = torch.zeros((2,1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

Let's see the hypothesis values at the initial state.

In [None]:
# Sigmoid written out by hand; the minus sign applies to the whole of (xW + b)
hypothesis = 1/(1 + torch.exp(-(x_train.mm(W)+b)))
print(hypothesis)
print(hypothesis.shape)

tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000]], grad_fn=<MulBackward0>)
torch.Size([6, 1])


We could use ```torch.sigmoid()``` instead.

In [None]:
hypothesis = torch.sigmoid(x_train.mm(W)+b)
print(hypothesis)
print(hypothesis.shape)

tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000],
        [0.5000]], grad_fn=<SigmoidBackward>)
torch.Size([6, 1])


In [None]:
# Binary cross entropy written out by hand; the second term uses log(1 - H)
cost = -(y_train*torch.log(hypothesis) + (1-y_train)*torch.log(1-hypothesis)).mean()
print(cost)

tensor(0.6931, grad_fn=<NegBackward>)


This is the Binary Cross Entropy, and we can use ```F.binary_cross_entropy()``` to simplify the code.

In [None]:
cost = F.binary_cross_entropy(hypothesis, y_train)
print(cost)

tensor(0.6931, grad_fn=<BinaryCrossEntropyBackward>)
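
As a side note, PyTorch also provides `F.binary_cross_entropy_with_logits()`, which takes the values before the sigmoid and applies the sigmoid internally; it is generally recommended for numerical stability. The sketch below, using made-up logits and labels, shows that the two routes give the same cost.

```python
import torch
import torch.nn.functional as F

# Made-up logits (values of xW + b before the sigmoid) and labels
logits = torch.tensor([[-2.0], [0.5], [3.0]])
targets = torch.tensor([[0.0], [1.0], [1.0]])

# Two-step version: sigmoid first, then binary cross entropy
cost_two_step = F.binary_cross_entropy(torch.sigmoid(logits), targets)

# Fused version: takes the logits directly
cost_fused = F.binary_cross_entropy_with_logits(logits, targets)

print(cost_two_step, cost_fused)
print(torch.allclose(cost_two_step, cost_fused))
```

The rest of this notebook keeps the explicit sigmoid so that `hypothesis` can be read directly as a probability.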


In [None]:

x_data = [[1,2], [2,3], [3,1], [4,3], [5,3], [6,2]]
y_data = [[0], [0], [0], [1], [1], [1]]
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)

W = torch.zeros((2,1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

optimizer = torch.optim.SGD([W, b], lr = 1)
nb_epoch = 1000
for epoch in range(nb_epoch+1):
  # Calculate H(x)
  hypothesis = torch.sigmoid(x_train.mm(W)+b)
  # Calculate cost
  cost = F.binary_cross_entropy(hypothesis, y_train)
  
  # Initialize all the gradients to zero
  optimizer.zero_grad()
  # Backward Propagation
  cost.backward()
  # Update
  optimizer.step()
  if epoch % 100 == 0:
    print(f'Epoch: {epoch:4d}\t|hypothesis: {hypothesis.squeeze().detach()}\t|cost: {cost.item():.4f}')
print('train finished')

Epoch:    0	|hypothesis: tensor([0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000])	|cost: 0.6931
Epoch:  100	|hypothesis: tensor([0.0245, 0.1484, 0.2770, 0.7954, 0.9484, 0.9834])	|cost: 0.1347
Epoch:  200	|hypothesis: tensor([0.0080, 0.1065, 0.1632, 0.8566, 0.9769, 0.9931])	|cost: 0.0806
Epoch:  300	|hypothesis: tensor([0.0037, 0.0822, 0.1161, 0.8888, 0.9869, 0.9965])	|cost: 0.0579
Epoch:  400	|hypothesis: tensor([0.0021, 0.0669, 0.0902, 0.9090, 0.9916, 0.9979])	|cost: 0.0453
Epoch:  500	|hypothesis: tensor([0.0013, 0.0564, 0.0739, 0.9229, 0.9941, 0.9986])	|cost: 0.0373
Epoch:  600	|hypothesis: tensor([8.7256e-04, 4.8759e-02, 6.2629e-02, 9.3312e-01, 9.9567e-01, 9.9906e-01])	|cost: 0.0317
Epoch:  700	|hypothesis: tensor([6.2087e-04, 4.2945e-02, 5.4368e-02, 9.4091e-01, 9.9668e-01, 9.9932e-01])	|cost: 0.0276
Epoch:  800	|hypothesis: tensor([4.6039e-04, 3.8371e-02, 4.8050e-02, 9.4706e-01, 9.9737e-01, 9.9949e-01])	|cost: 0.0244
Epoch:  900	|hypothesis: tensor([3.5258e-04, 3.4679e-02, 4.3059e

In [None]:
prediction = hypothesis >= 0.5
print(prediction)

tensor([[False],
        [False],
        [False],
        [ True],
        [ True],
        [ True]])


In [None]:
accuracy = (prediction == y_train).float().mean()
print(accuracy)

tensor(1.)
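
The two cells above can be folded into one helper. This is only a sketch: the function name `accuracy_at_threshold` and the sample probabilities are made up for illustration, and 0.5 is simply the default cut-off used in this notebook.

```python
import torch

def accuracy_at_threshold(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    prediction = (probs >= threshold).float()
    return (prediction == labels).float().mean().item()

# Made-up probabilities and labels, for illustration only
probs = torch.tensor([[0.1], [0.4], [0.6], [0.9]])
labels = torch.tensor([[0.], [1.], [1.], [1.]])

print(accuracy_at_threshold(probs, labels))       # 0.75
print(accuracy_at_threshold(probs, labels, 0.3))  # 1.0
```

Lowering the threshold turns the borderline 0.4 prediction into a positive, which is why the accuracy changes between the two calls.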


Let's try this with a larger dataset that has 8 input features.

In [None]:
import numpy as np

In [None]:
xy = np.loadtxt('https://raw.githubusercontent.com/deeplearningzerotoall/PyTorch/master/data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
x_train = torch.FloatTensor(x_data)
y_train = torch.FloatTensor(y_data)

# Implementing nn.Module

In [None]:
class BinaryClassifier(nn.Module):
  def __init__(self):
    super().__init__()
    self.linear = nn.Linear(8,1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, x):
    return self.sigmoid(self.linear(x))

In [None]:
model = BinaryClassifier()
optimizer = torch.optim.SGD(model.parameters(), lr = 1)
nb_epoch = 100
for epoch in range(nb_epoch+1):
  # Calculate H(x)
  hypothesis = model(x_train)
  # Calculate cost
  cost = F.binary_cross_entropy(hypothesis, y_train)
  # Update Weights and Bias
  optimizer.zero_grad()
  cost.backward()
  optimizer.step()
  if epoch % 10 == 0:
    prediction = hypothesis >= 0.5
    accuracy = (prediction == y_train).float().mean()
    print(f'Epoch: {epoch:4d}\t|accuracy: {accuracy * 100:.2f}%\t|cost: {cost.item():.4f}')
print('train finished')

Epoch:    0	|accuracy: 40.18%	|cost: 0.7176
Epoch:   10	|accuracy: 66.14%	|cost: 0.5884
Epoch:   20	|accuracy: 70.88%	|cost: 0.5501
Epoch:   30	|accuracy: 75.10%	|cost: 0.5269
Epoch:   40	|accuracy: 76.68%	|cost: 0.5121
Epoch:   50	|accuracy: 77.34%	|cost: 0.5021
Epoch:   60	|accuracy: 77.21%	|cost: 0.4951
Epoch:   70	|accuracy: 77.34%	|cost: 0.4901
Epoch:   80	|accuracy: 77.21%	|cost: 0.4864
Epoch:   90	|accuracy: 76.81%	|cost: 0.4837
Epoch:  100	|accuracy: 76.81%	|cost: 0.4815
train finished
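
The same pattern generalizes to any number of input features. Below is a minimal sketch that passes the input size as a constructor argument and evaluates the trained model under `torch.no_grad()`. The `in_features` argument and the synthetic data are assumptions made for this illustration; the data is random and is not the diabetes dataset used above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(1)

class BinaryClassifier(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        self.linear = nn.Linear(in_features, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))

# Synthetic stand-in data: 200 samples, 8 features,
# labeled 1 when the features sum to a positive number
x = torch.randn(200, 8)
y = (x.sum(dim=1, keepdim=True) > 0).float()

model = BinaryClassifier(in_features=8)
optimizer = torch.optim.SGD(model.parameters(), lr=1)
for epoch in range(100):
    cost = F.binary_cross_entropy(model(x), y)
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

# Evaluate without tracking gradients
model.eval()
with torch.no_grad():
    prediction = (model(x) >= 0.5).float()
    accuracy = (prediction == y).float().mean().item()
print(f"accuracy: {accuracy * 100:.2f}%")
```

Note that this accuracy is measured on the same data the model was trained on, just like in the cells above, so it says nothing about performance on unseen samples.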
