In [27]:
import numpy as np
import torch
import torch.nn as nn
torch.__version__

'1.6.0'


![Logisticr](./figures/1.8.logistic.png)

import matplotlib.pyplot as pyplot
x =10.*(torch.rand(500)-0.5).numpy()
y = (torch.rand(500)).numpy()

line_x = 5 * torch.arange(-500,500).float()/500.0
sigmoid_y = torch.sigmoid(line_x)

#print(x[x > 0.5])
pyplot.xlabel('X')
pyplot.ylabel('Y')
pyplot.plot(x, y, 'bo')
pyplot.plot(line_x , sigmoid_y,'r',linewidth=4, markersize=12)
pyplot.show()



# 1.8 Introduction to logistic regression
In this chapter, we will deal with structured data and use logistic regression to classify structured data simply.
Logistic regression is a statistical model that adopts a logistic function to model a binary dependent
variable, although many more complex extensions exist. In regression analysis, logistic regression
is estimating the parameters of a logistic model in the form of binary regression.
Mathematically, a binary logistic model has a dependent variable with two possible values,
such as "True/False" or "Yes/No" which is represented by an indicator variable $p$ that is labeled with "0" or "1".
Logistic regression is a kind of generalized linear regression (generalized linear model), which has many similarities
with multiple linear regression analysis. Their model formulas are basically the same, both have $wx + b$, where $w$ and
$b$ are the parameters to be learned. The major difference lies in their different dependent variables, multiple linear
regression directly uses $wx+b$ as the dependent variable, that is, $y = wx+b$. However, logistic regression uses the
function $L$ to correspond $wx+b$ to a hidden state $p$, $p = L(wx+b)$, and then determine the value of the
dependent variable according
to the size of $p$ and $1-p$. If $L$ is a logistic function, it is logistic regression while if $L$ is a polynomial
function,
it is polynomial regression.
In general, logistic regression will add a layer of a non-linear function on top of a linear regression layer.
Logistic regression is mainly for two-class prediction. We talked about the Sigmoid function in the activation function.
The Sigmoid function is the most common logistic function, because the output of the Sigmoid function is the probability
value between 0 and 1, when the probability is greater than 0.5 is predicted as 1, and less than 0.5 is predicted as 0.


## 1.8.2 UCI German Credit Data Set

UCI German Credit is UCI's German credit data set, which contains original data and numerical data.
The German Credit data is a data set that predicts the risk on loans based on personal
information and overdue loan applications from customers. The data set contains 1000 pieces of data in 24 dimensions.
This dataset classifies people described by a set of attributes as good or bad credit risks.
Comes in two formats (one all numeric).
Here we directly use the processed numerical data as a display.

[Address](https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/)



## Data loading and normalization
Now we can load the data and we arte going to normalize each attribute by calculating
the mean and standard deviation of each feature.


In [28]:
data=np.loadtxt("./data/german.data-numeric")
targets = torch.from_numpy(data)[:,-1][:,None].float()-1.
data_tensor = torch.from_numpy(data)[:,:-1]
means = torch.mean(data_tensor,dim=0)
stds = torch.std(data_tensor,dim=0)
data_tensor = ((data_tensor-means)/stds).float()
print(targets.shape)
print(data_tensor.shape)






torch.Size([1000, 1])
torch.Size([1000, 24])


Distinguish between the training set and the test set. Since there is no official verification set here, we can
directly use the accuracy of the test set as the criterion to classify  a sample as  good or bad.

Split dataset: 900 for training and 100 for testing

The format of the dataset is that the first 24 columns are the attributes while the last one is the label
(1 or 2).



In [29]:

train_data = data_tensor[:900,:]
train_targets = targets[:900,:]
test_data = data_tensor[900:,:]
test_targets = targets[900:,:]
batch_size = 32
from torch.utils.data import TensorDataset,DataLoader
training_dataset = TensorDataset(train_data,train_targets)
train_loader = DataLoader(training_dataset, batch_size, shuffle=True)
test_dataset = TensorDataset(test_data,test_targets)

test_loader = DataLoader(test_dataset, batch_size, shuffle=True)
#print(targets)

Below we define the model, the model is very simple, a Linear layer (`nn.Linear()`)
 with a Sigmoid activation function (`nn.Sigmoid()`).

In [30]:
class LogisticRegressionModel(nn.Module):
    def __init__(self):
        super(LogisticRegressionModel,self).__init__()
        self.model = nn.Linear(24,1) # Since 24 dimensions have been fixed, write 24 here
        self.L_function = nn.Sigmoid()
    def forward(self,x):
        y = self.model(x)
        p = self.L_function(y)
        return p




Define our model, optimizer and loss functions. It is a binary classification problem so we'll use `nn.BCELoss`
$BCELoss(y,\hat{y}) = -\frac{1}{N}\sum_{i=0}^{N}(y*\log{\hat{y}}+(1-y)*\log{(1-\hat{y})})$

In [31]:
net = LogisticRegressionModel()
criterion = nn.BCELoss(size_average=True)# Use CrossEntropyLoss loss
optimizer = torch.optim.Adam(net.parameters(),lr=0.001) # Adam optimization
epochs = 1000 # Train 1000 times


Let’s start training

In [32]:
epochs = 10




for i in range(epochs):
    # Specify the model as training mode and calculate the gradient
    net.train()
    # Input values need to be converted into torch Tensor
    avg_loss = 0.
    total_samples =0.
    correct_predictions = 0.
    for batch_idx, (data, target) in enumerate(train_loader):
         #Clear the loss of the previous step
        optimizer.zero_grad()
        y_hat=net(data)
        loss=criterion(y_hat,target) # calculate loss
        avg_loss+=loss.item()
        pred_label = (y_hat>0.5).int()
        #print(pred_label)
        total_samples += data.size()[0]
        correct_predictions += (pred_label == target).sum()
        loss.backward() # Backpropagation
        optimizer.step() # optimization
        #if (batch_idx+1) % 100 == 0: # Here we output relevant information every 100 times
    print(f'Training Epoch {i} Loss {avg_loss/batch_idx:.2f} Batch = {batch_idx} {correct_predictions/total_samples:.2f}')
        # Specify the model as calculation mode
    net.eval()
    avg_loss = 0.
    total_samples =0.
    correct_predictions = 0.
    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(test_loader):
            y_hat=net(data)
            loss=criterion(y_hat,target) # calculate loss
            avg_loss+=loss.item()
            pred_label = (y_hat>0.5).int()
            total_samples += data.size()[0]
            correct_predictions += (pred_label == target).sum()
        print(f'Test Epoch {i} Loss {avg_loss/len(test_loader):.2f} {(correct_predictions/total_samples):.2f} ')
        # test_in = torch.from_numpy(test_data).float()
        # test_l = torch.from_numpy(test_lab).long()
        # test_out = net(test_in)
        # # Use our test function to calculate accuracy
        # accuracy = test(test_out,test_l)
        # print("Epoch:{},Loss:{:.4f},Accuracy:{:.2f}".format(i+1,loss.item(),accuracy))

Training Epoch 0 Loss 0.76 Batch = 28 0.53
Test Epoch 0 Loss 0.77 0.47 
Training Epoch 1 Loss 0.73 Batch = 28 0.57
Test Epoch 1 Loss 0.78 0.49 
Training Epoch 2 Loss 0.70 Batch = 28 0.60
Test Epoch 2 Loss 0.75 0.54 
Training Epoch 3 Loss 0.68 Batch = 28 0.63
Test Epoch 3 Loss 0.70 0.56 
Training Epoch 4 Loss 0.66 Batch = 28 0.67
Test Epoch 4 Loss 0.69 0.56 
Training Epoch 5 Loss 0.64 Batch = 28 0.69
Test Epoch 5 Loss 0.63 0.60 
Training Epoch 6 Loss 0.63 Batch = 28 0.71
Test Epoch 6 Loss 0.64 0.65 
Training Epoch 7 Loss 0.62 Batch = 28 0.73
Test Epoch 7 Loss 0.64 0.67 
Training Epoch 8 Loss 0.60 Batch = 28 0.74
Test Epoch 8 Loss 0.58 0.71 
Training Epoch 9 Loss 0.60 Batch = 28 0.75
Test Epoch 9 Loss 0.63 0.73 


The training is complete, our accuracy reached ~75%