<a href="https://colab.research.google.com/github/JerryKurata/colab-pytorch/blob/master/Fashion_MNIST_Torch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fashion-MNIST in Pytorch 

This notebook demonstrates how we can implement Fashion-MNIST in Pytorch.  This implementation illustrates how we use:


*   torch
*   torch.nn - Neural Network
*   torch.optim - Optimizers
*   torchvision - Neural Netwoks for Computer Vision

*This code is largely based on this tutorial:https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html* and the code in https://github.com/pytorch/examples/blob/master/mnist/main.py

In [0]:
# Imports

# pyplot is plotting.  numpy is our best friend
import matplotlib.pyplot as plt
import numpy as np

# torch is general torch, torchvision is vision NN layers and utilities
#   .transforms is routine that transform vision data
import torch
import torchvision
import torchvision.transforms as transforms

# We are going to use torch NN libraries, functional API (keras-like), and optimizer
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim



In [0]:
# Parameters
num_epochs = 25
batch_size = 100
learning_rate = 0.001


## Load the Data

We first need to load the data.  We could do this from files.  But **torchvision** has a dataset class that supports loading of data for specific well-know models like Fashion-MNIST. And by using this dataset library we do not have to update this code if the data location changes.

Notice that we set the batch size in the DataLoader

In [3]:
# Define transform with to tensor and normalizer.  We normalize each channel
#  values to -1.0 to 1.0 via image = (image - mean)/std.  
transform = transforms.Compose(
    [transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))])
# Download and transform the Fashion-MNIST training and testing datasets
train_data = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=True,    # true for training data
    transform=transform)
test_data = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=False,   # false for test data
    transform=transform)

# Define Loaders for training and evaluating with the training and test datasets
#  num_workers = 2 runs 2 subprocesses to speed the loading
train_loader = torch.utils.data.DataLoader(train_data, batch_size= batch_size, 
                                           shuffle=True, num_workers=2)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, 
                                           shuffle=False, num_workers=2)


Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz


HBox(children=(IntProgress(value=1, bar_style='info', max=1), HTML(value='')))


Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


HBox(children=(IntProgress(value=1, bar_style='info', max=1), HTML(value='')))


Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


HBox(children=(IntProgress(value=1, bar_style='info', max=1), HTML(value='')))


Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


HBox(children=(IntProgress(value=1, bar_style='info', max=1), HTML(value='')))


Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Processing...
Done!


In [0]:
# Define class label names for displaying.  Class labels are [0,1,2,...,9] and these
#  labels are match class to position.  So class=0 => 'T-shirt/top', class=1 => 'Trousers'
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
        'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')


# Define a helper function to show the image
# helper function to show an image
# (used in the `plot_classes_preds` function below)
def matplotlib_imshow(img, one_channel=False):
    if one_channel:
        img = img.mean(dim=0)
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    if one_channel:   # do we want grayscale?
        plt.imshow(npimg, cmap="Greys")
    else:             # rgb
        plt.imshow(np.transpose(npimg, (1, 2, 0)))




## Define the Model

We define our model here.  Feel free to experiment with the model structure.  Even a linear model of 2 layers will work with the MNIST data.  But, it probably will not perform that well.  But give it a try!! 

Notice the forward() method connects each layer to the next.  Read up on nn.Sequeunce and see if that works.  For Keras
users, nn.Sequence is like the Keras Sequential modeand this code is like the Keras API.

In [0]:
#  Define NN Model class
class Model(nn.Module):

  # Define the storage for the weights
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

  # Hook up the 
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Create instance of NN Model
model = Model()



When we train we use the loss criterion to measure loss, and the optimizer method to reduce loss.

Our items are one of the 10 classes of fashion items.  CrossEntropyLoss shows how poorly our model is doing at predicting each of the classes.

The optimizer will adjust parameters (weights) in the model to minimuze this loss.

In [0]:
# Define loss criterion and optimizer method
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

##  Train the model 

We finally get around to training the model.  We can log data using the *.add_scalar* method.  Also, we can add associated data with the *.add_figure* method.


In [13]:
for epoch in range(num_epochs):  # loop over the dataset multiple times

    for batch_idx, data in enumerate(train_loader, 0):

        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if batch_idx % 1000 == 99:   
          print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(inputs), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
print('Finished Training')



Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 240, in _feed
    send_bytes(obj)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe


KeyboardInterrupt: ignored

### Evaluate the trained model's performance on Testing Data

Of course, once we have the model trained, we want to evaluate it's performance. That is why separate training from testing/evaluation data.  And we never train with the testing/evaluation data.

So now we will use this testing/evaluation data to see how well our trained model does on data it was **not** trained on.

Notice we first use model.eval().  This will model.eval() will notify all layers that you are in eval mode, and that way, batchnorm or dropout layers will work in eval mode instead of training mode.  (https://discuss.pytorch.org/t/model-eval-vs-with-torch-no-grad/19615)

In [12]:
#   
  model.eval()
  test_loss = 0
  correct = 0
  with torch.no_grad():
    for data, target in test_loader:
      output = model(data)
      test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
      pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
      correct += pred.eq(target.view_as(pred)).sum().item()

  test_loss /= len(test_loader.dataset)

  print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))



Test set: Average loss: -9.6719, Accuracy: 8760/10000 (88%)

Test Accuracy of the model on the 10000 test images: 87.0000 %
