# Logistic Regression with PyTorch

In this notebook, I explore the MNIST dataset and build a logistic regression model in PyTorch to classify its handwritten digits.


The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28×28 grayscale image labeled with a digit from 0 to 9.

In [4]:
# You might need to run this if you have not installed PyTorch and torchvision yet
# pip install torchvision


In [5]:
import torch
from torch.autograd import Variable
import torchvision.transforms as transforms
import torchvision.datasets as dsets


### Here are the steps: 

 Step 1. Load Dataset
 
 Step 2. Make Dataset Iterable
 
 Step 3. Create Model Class
 
 Step 4. Instantiate Model Class
 
 Step 5. Instantiate Loss Class
 
 Step 6. Instantiate Optimizer Class
 
 Step 7. Train Model

In [14]:
# Load the MNIST training and test sets; ToTensor() converts each image to a float tensor with values in [0, 1]

train_dataset = dsets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = dsets.MNIST(root='./data', train=False, transform=transforms.ToTensor())


Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw


In [15]:
train_dataset

Dataset MNIST
    Number of datapoints: 60000
    Root location: ./data
    Split: Train
    StandardTransform
Transform: ToTensor()

In [16]:
test_dataset

Dataset MNIST
    Number of datapoints: 10000
    Root location: ./data
    Split: Test
    StandardTransform
Transform: ToTensor()

In [18]:
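# No batch_size is passed, so DataLoader uses its default of 1 (60,000 iterations per epoch);
# pass batch_size=100 to both loaders to train on the mini-batches the hyperparameters below assume.
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, shuffle=False)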
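
For comparison, here is a minimal sketch of how the batch size changes what a loader yields. It assumes a batch size of 100 (the value this notebook sets in its hyperparameter cell) and uses a small random `TensorDataset` as a hypothetical stand-in for MNIST, so it runs without a download. The names `fake_dataset`, `batched_loader`, and `single_loader` are illustrative only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 1,000 random 1x28x28 "images" with labels 0-9
fake_images = torch.rand(1000, 1, 28, 28)
fake_labels = torch.randint(0, 10, (1000,))
fake_dataset = TensorDataset(fake_images, fake_labels)

batch_size = 100  # assumed value, matching the hyperparameter cell
batched_loader = DataLoader(fake_dataset, batch_size=batch_size, shuffle=True)
single_loader = DataLoader(fake_dataset)  # default batch_size=1

images, labels = next(iter(batched_loader))
print(images.shape)         # torch.Size([100, 1, 28, 28])
print(len(batched_loader))  # 10 iterations per epoch
print(len(single_loader))   # 1000 iterations per epoch
```

The length of a loader is the number of iterations in one epoch, which is why the iteration counter grows so much faster when no batch size is given.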

In [19]:
class LogisticRegression(torch.nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        # A single linear layer: one weight vector and one bias per class
        self.linear = torch.nn.Linear(input_dim, output_dim)

    def forward(self, x):
        # Return raw class scores (logits); the softmax is applied inside the loss
        outputs = self.linear(x)
        return outputs
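
Because the model returns raw scores rather than probabilities, it can help to see how those scores become a prediction. The sketch below is an illustration under assumptions: it uses a bare `torch.nn.Linear(784, 10)` as a stand-in for the model's single layer and a batch of four random tensors instead of real images.

```python
import torch

torch.manual_seed(0)
linear = torch.nn.Linear(784, 10)   # stand-in for the model's single layer
batch = torch.rand(4, 1, 28, 28)    # four random "images" (assumption, not MNIST)

# Flatten each 1x28x28 image to a 784-vector, then compute one score per class
logits = linear(batch.view(-1, 28 * 28))
probs = torch.softmax(logits, dim=1)   # each row sums to 1
predicted = probs.argmax(dim=1)        # same result as logits.argmax(dim=1)

print(logits.shape)      # torch.Size([4, 10])
print(probs.sum(dim=1))
print(predicted)
```

Softmax preserves the ordering of the scores, so taking the largest logit gives the same predicted class as taking the largest probability.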

In [20]:
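batch_size = 100
n_iters = 3000
epochs = n_iters / (len(train_dataset) / batch_size)  # number of full passes over the training set
input_dim = 784   # 28 * 28 pixels per image
output_dim = 10   # one class per digit, 0-9
lr_rate = 0.001   # learning rate for SGD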
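
The epoch count follows from simple arithmetic, worked through below. This sketch assumes the 60,000 training examples stated at the top of the notebook and hard-codes that number as `train_size` (a name introduced here) instead of calling `len(train_dataset)`.

```python
train_size = 60_000   # MNIST training examples
batch_size = 100
n_iters = 3000

iters_per_epoch = train_size / batch_size   # 600.0 mini-batches per pass
epochs = n_iters / iters_per_epoch          # 5.0 passes over the data

print(iters_per_epoch, epochs)
print(int(epochs))  # range() needs an integer, hence int(epochs) in the training loop
```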

In [21]:
model = LogisticRegression(input_dim, output_dim)


In [22]:
criterion = torch.nn.CrossEntropyLoss() # applies log-softmax to the logits, then computes the negative log-likelihood
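
To make concrete what this loss computes, the following sketch checks on random inputs (an assumption, not MNIST data) that `CrossEntropyLoss` matches a log-softmax followed by the negative log-likelihood loss, and also matches the same quantity computed by hand.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)            # random scores for 5 samples, 10 classes
targets = torch.randint(0, 10, (5,))   # random class labels

loss_builtin = torch.nn.CrossEntropyLoss()(logits, targets)

# Equivalent two-step version
log_probs = F.log_softmax(logits, dim=1)
loss_two_step = F.nll_loss(log_probs, targets)

# By hand: mean of -log p(correct class) over the batch
loss_by_hand = -log_probs[torch.arange(5), targets].mean()

print(loss_builtin.item(), loss_two_step.item(), loss_by_hand.item())
```

This is why the model's `forward` does not apply a softmax itself: doing so would apply it twice.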

In [23]:
optimizer = torch.optim.SGD(model.parameters(), lr=lr_rate)
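
As a rough illustration of what one optimizer step does, the sketch below applies plain SGD (no momentum or weight decay, as in the cell above) to a toy two-element parameter with a toy quadratic loss. Both are assumptions made for illustration. The update is `w <- w - lr * grad`.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)   # toy parameter
optimizer = torch.optim.SGD([w], lr=0.001)

loss = (w ** 2).sum()   # toy loss; its gradient is 2 * w
optimizer.zero_grad()
loss.backward()

# Expected result of the update, computed by hand before stepping
expected = w.detach() - 0.001 * w.grad
optimizer.step()

print(w.grad)   # tensor([ 2., -4.])
print(w)        # values close to [0.998, -1.996]
```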


In [24]:
iteration = 0  # named to avoid shadowing the built-in iter()

for epoch in range(int(epochs)):
    for i, (images, labels) in enumerate(train_loader):
        # Flatten each 1x28x28 image into a 784-dimensional vector
        images = images.view(-1, 28 * 28)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        iteration += 1
        if iteration % 500 == 0:
            # Calculate accuracy on the test set; no gradients are needed here
            correct = 0
            total = 0
            with torch.no_grad():
                for test_images, test_labels in test_loader:
                    test_images = test_images.view(-1, 28 * 28)
                    outputs = model(test_images)
                    _, predicted = torch.max(outputs, 1)
                    total += test_labels.size(0)
                    # On a GPU, move predicted and labels back to the CPU before using them in Python operations
                    correct += (predicted == test_labels).sum()
            accuracy = 100 * correct / total
            print("Iteration: {}. Loss: {}. Accuracy: {}.".format(iteration, loss.item(), accuracy))

Iteration: 500. Loss: 1.3536701202392578. Accuracy: 63.529998779296875.
Iteration: 1000. Loss: 1.5278326272964478. Accuracy: 72.86000061035156.
Iteration: 1500. Loss: 2.356659412384033. Accuracy: 78.0.
Iteration: 2000. Loss: 1.3369340896606445. Accuracy: 80.18000030517578.
Iteration: 2500. Loss: 0.6764758825302124. Accuracy: 80.47000122070312.
Iteration: 3000. Loss: 1.3610490560531616. Accuracy: 80.83000183105469.
Iteration: 3500. Loss: 0.48414960503578186. Accuracy: 80.44000244140625.
Iteration: 4000. Loss: 1.146453857421875. Accuracy: 82.97000122070312.
Iteration: 4500. Loss: 1.0295732021331787. Accuracy: 84.55999755859375.
Iteration: 5000. Loss: 2.2862539291381836. Accuracy: 84.5199966430664.
Iteration: 5500. Loss: 0.5406784415245056. Accuracy: 85.05999755859375.
Iteration: 6000. Loss: 0.2617846131324768. Accuracy: 85.38999938964844.
Iteration: 6500. Loss: 0.5688148736953735. Accuracy: 85.52999877929688.
Iteration: 7000. Loss: 0.4263997972011566. Accuracy: 85.70999908447266.
Iterati

Iteration: 57000. Loss: 0.33701372146606445. Accuracy: 90.19999694824219.
Iteration: 57500. Loss: 0.11711926758289337. Accuracy: 90.33000183105469.
Iteration: 58000. Loss: 0.4243853986263275. Accuracy: 90.33999633789062.
Iteration: 58500. Loss: 0.029139718040823936. Accuracy: 90.38999938964844.
Iteration: 59000. Loss: 0.11002454906702042. Accuracy: 90.30000305175781.
Iteration: 59500. Loss: 0.06259342283010483. Accuracy: 90.33999633789062.
Iteration: 60000. Loss: 0.03636601194739342. Accuracy: 90.27999877929688.
Iteration: 60500. Loss: 1.1344830989837646. Accuracy: 90.36000061035156.
Iteration: 61000. Loss: 0.004178246948868036. Accuracy: 90.33999633789062.
Iteration: 61500. Loss: 0.6885717511177063. Accuracy: 90.30000305175781.
Iteration: 62000. Loss: 0.00625663623213768. Accuracy: 90.30999755859375.
Iteration: 62500. Loss: 0.029521390795707703. Accuracy: 90.33999633789062.
Iteration: 63000. Loss: 0.019837552681565285. Accuracy: 90.37000274658203.
Iteration: 63500. Loss: 0.01874938979

KeyboardInterrupt: 

In [25]:
print("Accuracy: {0:.2f}".format(accuracy))

Accuracy: 91.13
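
The accuracy calculation can also be pulled out of the training loop into a reusable function. The sketch below is one possible way to do that, not the notebook's own code. `evaluate`, `toy_model`, and `toy_loader` are hypothetical names, and the toy model is rigged to always predict class 3 so the result is easy to check by hand.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader):
    """Return the classification accuracy in percent as a Python float."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for images, labels in loader:
            outputs = model(images.view(-1, 28 * 28))
            predicted = outputs.argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return 100.0 * correct / total

# Hypothetical stand-in model: zero weights and a bias that always favors class 3
toy_model = torch.nn.Linear(784, 10)
with torch.no_grad():
    toy_model.weight.zero_()
    toy_model.bias.zero_()
    toy_model.bias[3] = 1.0

# 150 samples labeled 3 (predicted correctly) and 50 labeled 7 (predicted wrongly)
toy_images = torch.rand(200, 1, 28, 28)
toy_labels = torch.tensor([3] * 150 + [7] * 50)
toy_loader = DataLoader(TensorDataset(toy_images, toy_labels), batch_size=50)

acc = evaluate(toy_model, toy_loader)
print("Accuracy: {0:.2f}".format(acc))  # Accuracy: 75.00
```

Calling `.item()` on each batch's count keeps `correct` a plain Python integer, so the returned accuracy is a plain float rather than a tensor.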
