# 05_Logistic_Regression_Models
In this notebook, we will see how to define, train, and evaluate a simple logistic regression model.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms

import matplotlib.pyplot as plt
%matplotlib inline

torch.manual_seed(777)  # reproducibility

## Logistic Regression Models
A logistic regression model applies the same linear transformation as a linear regression model, then passes the result through the logistic sigmoid function (or softmax, when there are more than two classes) to obtain the probabilities of the target classes.
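
As a rough illustration of the sentence above, the sketch below feeds a random batch through a linear layer and then applies softmax. The sizes (4 features, 3 classes) and the names `linear`, `x`, `logits`, and `probs` are made up for this example and are not used elsewhere in the notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy sizes: 4 input features, 3 target classes
linear = nn.Linear(4, 3)
x = torch.randn(2, 4)               # a batch of 2 examples

logits = linear(x)                  # the same computation as linear regression
probs = F.softmax(logits, dim=1)    # softmax turns each row into class probabilities

print(probs)
print(probs.sum(dim=1))             # each row sums to 1 (up to rounding)
```

Each row of `probs` is non-negative and sums to one, so it can be read as a distribution over the classes.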

You can define and train logistic regression models in the same way as linear regression models.

We will walk through the whole process with a concrete example, MNIST.
The MNIST database is a large collection of handwritten digits, each stored as a 28x28 grayscale image.
The training set consists of 60,000 images and the test set consists of 10,000 images.

<img src="images/MnistExamples.png" width="500">

### DataLoader

In [None]:
batch_size = 100

# MNIST dataset 
train_dataset = torchvision.datasets.MNIST(root='./data', 
                                           train=True, 
                                           transform=transforms.ToTensor(),  
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='./data', 
                                          train=False, 
                                          transform=transforms.ToTensor())

# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

# Plot one example
# (`data` and `targets` replace the deprecated `train_data` and `train_labels` attributes)
print(train_dataset.data.size())                       # torch.Size([60000, 28, 28])
print(train_dataset.targets.size())                    # torch.Size([60000])

idx = 0
plt.title('%d' % train_dataset.targets[idx].item())
plt.imshow(train_dataset.data[idx, :, :].numpy(), cmap='gray')
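
To see what a `DataLoader` yields without downloading anything, here is a small sketch that uses random tensors as a stand-in for MNIST. The names `fake_images`, `fake_labels`, `fake_loader`, `batch_images`, `batch_labels`, and `flat` are hypothetical and exist only in this example. The dataset size of 250 is arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in for MNIST: 250 single-channel "images" of size 28x28
fake_images = torch.rand(250, 1, 28, 28)
fake_labels = torch.randint(0, 10, (250,))
fake_loader = DataLoader(TensorDataset(fake_images, fake_labels),
                         batch_size=100, shuffle=True)

batch_images, batch_labels = next(iter(fake_loader))
print(batch_images.shape)    # torch.Size([100, 1, 28, 28])
print(batch_labels.shape)    # torch.Size([100])

# Flatten each image into a vector of 784 values, as the linear model expects
flat = batch_images.reshape(-1, 28 * 28)
print(flat.shape)            # torch.Size([100, 784])

print(len(fake_loader))      # 3 batches: 100, 100, and 50 examples
```

The last batch is smaller than `batch_size` because 250 is not a multiple of 100. With MNIST and `batch_size = 100`, every batch is full.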

### Define Logistic Regression Models

In [None]:
# Hyper-parameters 
input_size = 784
num_classes = 10

# Device configuration
# device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device = torch.device('cpu')

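# Logistic regression model: a single linear layer that outputs one raw score (logit) per class.
# The softmax is not part of the model; it is applied inside the loss function.
model = nn.Linear(input_size, num_classes).to(device)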

### Loss function and Optimizer
A loss function takes the (output, target) pair of inputs, and computes a value that estimates how far away the output is from the target.

We use `nn.CrossEntropyLoss()` for logistic regression.

In [None]:
# nn.CrossEntropyLoss() computes softmax internally
criterion = nn.CrossEntropyLoss()  
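
The comment about the internal softmax can be checked on random data. The sketch below compares `nn.CrossEntropyLoss()` with an explicit log-softmax followed by the negative log-likelihood, and with a manual computation. The names `toy_logits`, `toy_targets`, `loss_a`, `loss_b`, and `loss_c` are made up for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

toy_logits = torch.randn(5, 10)              # raw scores: 5 examples, 10 classes
toy_targets = torch.randint(0, 10, (5,))     # integer class labels

# 1) The loss used in this notebook, applied directly to raw scores
loss_a = nn.CrossEntropyLoss()(toy_logits, toy_targets)

# 2) Explicit log-softmax followed by the negative log-likelihood loss
log_probs = F.log_softmax(toy_logits, dim=1)
loss_b = F.nll_loss(log_probs, toy_targets)

# 3) By hand: mean negative log-probability assigned to the correct class
loss_c = -log_probs[torch.arange(5), toy_targets].mean()

print(loss_a.item(), loss_b.item(), loss_c.item())
```

All three values agree up to floating-point error, which is why the model itself should output raw scores rather than probabilities.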

Furthermore, PyTorch provides several optimizers in `torch.optim`.
We use the Adam optimizer.

In [None]:
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

## TensorBoard
To visualize model training, you can use [TensorBoard](https://www.tensorflow.org/tensorboard).
It was originally developed for TensorFlow, but you can also use it with PyTorch via [tensorboardX](https://github.com/lanpa/tensorboardX).

In [None]:
!pip install tensorboardX==1.2
from tensorboardX import SummaryWriter
summary = SummaryWriter("runs/experiment")
%load_ext tensorboard

### Train the network

In [None]:
num_epochs = 5
# Train the model
total_step = len(train_loader)
step_i = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Reshape images to (batch_size, input_size) and move the batch to the device
        images = images.reshape(-1, input_size).to(device)
        labels = labels.to(device)
        
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        step_i += 1
               
        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, i+1, total_step, loss.item()))
            
            # tensorboard logging
            summary.add_scalar("loss", loss.item(), step_i)
            summary.add_histogram("weight", model.weight.clone().detach().cpu().numpy(), step_i)
            summary.add_histogram("bias", model.bias.clone().detach().cpu().numpy(), step_i)
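
The forward/backward/step pattern of the loop can be tried in isolation. The following is a minimal sketch on one fixed random batch, not the MNIST training itself. All `toy_*` names, `weight_before`, and `losses` are hypothetical, and the 20 iterations are an arbitrary choice.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

toy_model = nn.Linear(784, 10)
toy_criterion = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)

# One fixed random batch: 100 flattened "images" with random labels
toy_x = torch.randn(100, 784)
toy_y = torch.randint(0, 10, (100,))

weight_before = toy_model.weight.detach().clone()
losses = []
for _ in range(20):
    loss = toy_criterion(toy_model(toy_x), toy_y)   # forward pass
    toy_optimizer.zero_grad()                       # clear gradients from the previous step
    loss.backward()                                 # compute new gradients
    toy_optimizer.step()                            # update weight and bias
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Because the same batch is reused, the loss should fall as the model memorizes it. This says nothing about generalization; it only shows that the three optimizer calls update the parameters.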

### TensorBoard log

In [None]:
%tensorboard --logdir runs/experiment

### Test the network

In [None]:
# Test the model
# In the test phase, we don't need to compute gradients, which saves memory
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score = predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))
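
The prediction and accuracy steps are easier to follow on a tiny hand-written example. The values and the `toy_*` names below are made up for illustration only.

```python
import torch

# Raw scores for 4 examples and 3 classes
toy_outputs = torch.tensor([[0.1, 2.0, -1.0],
                            [1.5, 0.2, 0.3],
                            [0.0, 0.1, 3.0],
                            [2.0, 1.0, 0.0]])
toy_labels = torch.tensor([1, 0, 2, 1])

# torch.max along dim 1 returns (max values, indices); the indices are the predicted classes
_, toy_predicted = torch.max(toy_outputs, 1)
toy_correct = (toy_predicted == toy_labels).sum().item()
toy_accuracy = 100 * toy_correct / toy_labels.size(0)

print(toy_predicted)    # tensor([1, 0, 2, 0])
print(toy_accuracy)     # 75.0
```

Softmax is not needed at prediction time: it preserves the ordering of the scores, so the largest raw score already identifies the most probable class.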

### Save/Load the network parameters

In [None]:
# Save the model checkpoint
torch.save(model.state_dict(), './data/logistic_regression_model.ckpt')

# Load the model checkpoint if needed
# new_model = nn.Linear(input_size, num_classes).to(device)
# new_model.load_state_dict(torch.load('./data/logistic_regression_model.ckpt', map_location=device))
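
A save/load round trip can be checked with a temporary file, so that nothing is written to `./data`. This is a sketch under that assumption. The names `src_model`, `dst_model`, and `ckpt_path`, and the file name `toy_model.ckpt`, are hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

src_model = nn.Linear(784, 10)

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = os.path.join(tmp_dir, 'toy_model.ckpt')

    # Save only the parameters (weight and bias), not the whole module object
    torch.save(src_model.state_dict(), ckpt_path)

    # Recreate a model with the same architecture, then load the parameters into it
    dst_model = nn.Linear(784, 10)
    dst_model.load_state_dict(torch.load(ckpt_path, map_location='cpu'))

same_weight = torch.equal(src_model.weight, dst_model.weight)
same_bias = torch.equal(src_model.bias, dst_model.bias)
print(same_weight, same_bias)    # True True
```

Since `state_dict()` holds only tensors, the model passed to `load_state_dict` must be constructed with the same `input_size` and `num_classes` as the one that was saved.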