# PyTorch 1.2 Quickstart with Google Colab
In this code tutorial we will learn how to quickly train a model to understand some of PyTorch's basic building blocks to train a deep learning model. This notebook is inspired by the ["Tensorflow 2.0 Quickstart for experts"](https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/advanced.ipynb#scrollTo=DUNzJc4jTj6G) notebook. 

After completion of this tutorial, you should be able to import data, transform it, and efficiently feed the data in batches to a convolution neural network (CNN) model for image classification.

**Author:** [Elvis Saravia](https://twitter.com/omarsar0)

**Complete Code Walkthrough:** [Blog post](https://medium.com/dair-ai/pytorch-1-2-quickstart-with-google-colab-6690a30c38d)

In [1]:
!pip3 install torch==1.2.0+cu92 torchvision==0.4.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html


Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.2.0+cu92
  Downloading https://download.pytorch.org/whl/cu92/torch-1.2.0%2Bcu92-cp37-cp37m-manylinux1_x86_64.whl (663.1 MB)
[K     |████████████████████████████████| 663.1 MB 1.7 kB/s 
[?25hCollecting torchvision==0.4.0+cu92
  Downloading https://download.pytorch.org/whl/cu92/torchvision-0.4.0%2Bcu92-cp37-cp37m-manylinux1_x86_64.whl (8.8 MB)
[K     |████████████████████████████████| 8.8 MB 44.9 MB/s 
Installing collected packages: torch, torchvision
  Attempting uninstall: torch
    Found existing installation: torch 1.9.0+cu111
    Uninstalling torch-1.9.0+cu111:
      Successfully uninstalled torch-1.9.0+cu111
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.10.0+cu111
    Uninstalling torchvision-0.10.0+cu111:
      Successfully uninstalled torchvision-0.10.0+cu111
[31mERROR: pip's dependency resolver does not currently take into account all the packages tha

Note: We will be using the latest stable version of PyTorch so be sure to run the command above to install the latest version of PyTorch, which as the time of this tutorial was 1.2.0. We PyTorch belowing using the `torch` module. 

In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms

In [3]:
print(torch.__version__)

1.2.0+cu92


In [7]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Import The Data
The first step before training the model is to import the data. We will use the [MNIST dataset](http://yann.lecun.com/exdb/mnist/) which is like the Hello World dataset of machine learning. 

Besides importing the data, we will also do a few more things:
- We will tranform the data into tensors using the `transforms` module
- We will use `DataLoader` to build convenient data loaders or what are referred to as iterators, which makes it easy to efficiently feed data in batches to deep learning models. 
- As hinted above, we will also create batches of the data by setting the `batch` parameter inside the data loader. Notice we use batches of `32` in this tutorial but you can change it to `64` if you like. I encourage you to experiment with different batches.

In [16]:
#Configure data correctly
#make rgb images and segmentation images share a file name
#get image in the right format

import os
from numpy import asarray
import numpy as np
from PIL import Image


def remove_chars(filename):
  new_filename = ""
  for i in range(0, len(filename)):
    char = filename[i]
    if char.isnumeric():
      new_filename += char
  return new_filename + ".png"

def config_data(rgb_img, seg_img):
  i = 0



  for filename in os.listdir(rgb_img):
      old_path = os.path.join(rgb_img, filename)
      if not os.path.isdir(old_path):
        i += 1
        new_filename = remove_chars(filename)
        os.rename(os.path.join(rgb_img, filename), os.path.join(rgb_img, new_filename))

  i = 0
  for filename in os.listdir(seg_img):
      old_path = os.path.join(seg_img, filename)
      if not os.path.isdir(old_path):
        i += 1
        new_filename = remove_chars(filename)
        new_name = os.path.join(seg_img,new_filename)
        os.rename(os.path.join(seg_img, filename), os.path.join(seg_img, new_name))
        
        
        #Normalize image to fit model system: https://machinelearningmastery.com/how-to-manually-scale-image-pixel-data-for-deep-learning/ 
        image = Image.open(new_name)
        pixels = asarray(image)
        # convert from integers to floats
        pixels = pixels.astype('float32')
        # normalize to the range 0-1 then to 9
        pixels /= 255.0
        pixels *= 9.0
        im = Image.fromarray(pixels.astype(np.uint8))
        im.save(os.path.join(seg_img, new_name)) 

PATH = '/content/drive/myDrive/Highway_Dataset'
rgb_img = '/content/drive/MyDrive/Highway_Dataset/Train/TrainSeq04/image'
seg_img = '/content/drive/MyDrive/Highway_Dataset/Train/TrainSeq04/label'
config_data(rgb_img, seg_img)
rgb_img_t = '/content/drive/MyDrive/Highway_Dataset/Test/TestSeq04/image'
seg_img_t = '/content/drive/MyDrive/Highway_Dataset/Test/TestSeq04/label'
config_data(rgb_img_t, seg_img_t)

In [17]:
from torch.utils.data import Dataset
from natsort import natsorted
class CustomDataSet(Dataset):
    def __init__(self, main_dir, label_dir, transform):
        self.main_dir = main_dir
        self.label_dir = label_dir
        self.transform = transform
        all_imgs = os.listdir(main_dir)
        all_segs = os.listdir(main_dir)
        self.total_imgs = natsorted(all_imgs)
        self.total_segs = natsorted(all_segs)

    def __len__(self):
        return len(self.total_imgs)

    def __getitem__(self, idx):
        img_loc = os.path.join(self.main_dir, self.total_imgs[idx])
        image = Image.open(img_loc).convert("RGB")
        tensor_image = self.transform(image)
        seg_loc = os.path.join(self.label_dir, self.total_segs[idx])
        labeled_image = Image.open(seg_loc).convert("RGB")
        transform = transforms.Compose([transforms.Resize((12,12)),
                                transforms.ToTensor()])
        labeled_image = transform(labeled_image)
        labeled_image = labeled_image.float()
        return tensor_image, labeled_image

BATCH_SIZE = 32

## transformations
transform = transforms.Compose([transforms.Resize((28,28)),
                                transforms.ToTensor()])

rgb_img = '/content/drive/MyDrive/Highway_Dataset/Train/TrainSeq04/image'
seg_img = '/content/drive/MyDrive/Highway_Dataset/Train/TrainSeq04/label'
## download and load training dataset
imagenet_data = CustomDataSet(rgb_img, seg_img, transform=transform)
trainloader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=BATCH_SIZE,
                                          shuffle=True,
                                          num_workers=2)
rgb_img_t = '/content/drive/MyDrive/Highway_Dataset/Test/TestSeq04/image'
seg_img_t = '/content/drive/MyDrive/Highway_Dataset/Test/TestSeq04/label'
## download and load training dataset
imagenet_data_test = CustomDataSet(rgb_img_t, seg_img_t, transform=transform)
testloader = torch.utils.data.DataLoader(imagenet_data_test,
                                          batch_size=BATCH_SIZE,
                                          shuffle=False,
                                          num_workers=2)
#trainset = torchvision.datasets.MNIST(root='./data', train=True,
#                                        download=True, transform=transform)
#trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE,
#                                          shuffle=True, num_workers=2)

## download and load testing dataset
#testset = torchvision.datasets.MNIST(root='./data', train=False,
#                                       download=True, transform=transform)
#testloader = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE,
                                         #shuffle=False, num_workers=2)

## Exploring the Data
As a practioner and researcher, I am always spending a bit of time and effort exploring and understanding the dataset. It's fun and this is a good practise to ensure that everything is in order. 

Let's check what the train and test dataset contains. I will use `matplotlib` to print out some of the images from our dataset. 

In [18]:
import matplotlib.pyplot as plt
import numpy as np

## functions to show an image
def imshow(img):
    #img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))

## get some random training images
dataiter = iter(trainloader)
images, labels = dataiter.next()

## show images
#imshow(torchvision.utils.make_grid(images))

KeyboardInterrupt: ignored

**EXERCISE:** Try to understand what the code above is doing. This will help you to better understand your dataset before moving forward. 

Let's check the dimensions of a batch.

In [19]:
for images, labels in trainloader:
    print("Image batch dimensions:", images.shape)
    print("Image label dimensions:", labels.shape)
    break

Image batch dimensions: torch.Size([32, 3, 28, 28])
Image label dimensions: torch.Size([32, 3, 12, 12])


## The Model
Now using the classical deep learning framework pipeline, let's build the 1 convolutional layer model. 

Here are a few notes for those who are beginning with PyTorch:
- The model below consists of an `__init__()` portion which is where you include the layers and components of the neural network. In our model, we have a convolutional layer denoted by `nn.Conv2d(...)`. We are dealing with an image dataset that is in a grayscale so we only need one channel going in, hence `in_channels=1`. We hope to get a nice representation of this layer, so we use `out_channels=32`. Kernel size is 3, and for the rest of parameters we use the default values which you can find [here](https://pytorch.org/docs/stable/nn.html?highlight=conv2d#conv2d). 
- We use 2 back to back dense layers or what we refer to as linear transformations to the incoming data. Notice for `d1` I have a dimension which looks like it came out of nowhere. 128 represents the size we want as output and the (`26*26*32`) represents the dimension of the incoming data. If you would like to find out how to calculate those numbers refer to the [PyTorch documentation](https://pytorch.org/docs/stable/nn.html?highlight=linear#conv2d). In short, the convolutional layer transforms the input data into a specific dimension that has to be considered in the linear layer. The same applies for the second linear transformation (`d2`) where the dimension of the output of the previous linear layer was added as `in_features=128`, and `10` is just the size of the output which also corresponds to the number of classes.
- After each one of those layers, we also apply an activation function such as `ReLU`. For prediction purposes, we then apply a `softmax` layer to the last transformation and return the output of that.  

In [20]:
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        # 28x28x1 => 26x26x32
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3)
        self.drop1 = nn.Dropout(0.2)
        self.pool1 = nn.MaxPool2d((2,2))

    def forward(self, x):
        # 32x1x28x28 => 32x32x26x26
        x = self.conv1(x)
        x = self.drop1(x)
        x = self.conv1(x)
        x = self.pool1(x)
        x = F.relu(x)

        # flatten => 32 x (32*26*26)
        #x = x.flatten(start_dim = 1)

        #32 x (32*26*26) => 32x128
        #x = self.d1(x)
        #x = F.relu(x)
        #x = x.reshape([32, 3, 28, 28])
        #x = self.d2(x)
        #x = F.relu(x)

        # logits => 32x10
        logits = x
        out = F.softmax(logits, dim=1)
        return out

As I have done in my previous tutorials, I always encourage to test the model with 1 batch to ensure that the output dimensions are what we expect. 

In [21]:
## test the model with 1 batch
model = MyModel()
for images, labels in trainloader:
    print("batch size:", images.shape)
    out = model(images)
    print(out.shape, labels.shape)
    break

Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f959b15b3b0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 926, in __del__
    self._shutdown_workers()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 906, in _shutdown_workers
    w.join()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 138, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f959b15b3b0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 926, in __del__
    self._shutdown_workers()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 906, in _shutdown_workers
    w.join()
  File "/usr/lib/python3.7/multiprocessing/pro

batch size: torch.Size([32, 3, 28, 28])
torch.Size([32, 3, 12, 12]) torch.Size([32, 3, 12, 12])


## Training the Model
Now we are ready to train the model but before that we are going to setup a loss function, an optimizer and a function to compute accuracy of the model. 

In [30]:
learning_rate = 0.001
num_epochs = 10

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = MyModel()
model = model.to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

In [28]:
## compute accuracy
def compute_accuracy(hypothesis, Y_target):
    Y_prediction = hypothesis.data.max(dim=1)[1]
    accuracy = torch.mean((Y_prediction.data == Y_target.data.max(dim=1)[1].data).float() )     
    return accuracy.item()

Now it's time for training.

In [31]:
for epoch in range(num_epochs):
    train_running_loss = 0.0
    train_acc = 0.0

    model = model.train()

    ## training step
    for i, (images, labels) in enumerate(trainloader):
        
        images = images.to(device)
        labels = labels.to(device)
        ## forward + backprop + loss
        logits = model(images)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()

        ## update model params
        optimizer.step()

        train_running_loss += loss.detach().item()
        train_acc += compute_accuracy(logits, labels)
    
    model.eval()
    print('Epoch: %d | Loss: %.4f | Train Accuracy: %.2f' \
          %(epoch, train_running_loss / i, train_acc/i))        

Epoch: 0 | Loss: 0.1214 | Train Accuracy: 0.98
Epoch: 1 | Loss: 0.1213 | Train Accuracy: 0.95
Epoch: 2 | Loss: 0.1213 | Train Accuracy: 0.92
Epoch: 3 | Loss: 0.1213 | Train Accuracy: 0.92
Epoch: 4 | Loss: 0.1211 | Train Accuracy: 0.93
Epoch: 5 | Loss: 0.1213 | Train Accuracy: 0.95
Epoch: 6 | Loss: 0.1214 | Train Accuracy: 0.96
Epoch: 7 | Loss: 0.1212 | Train Accuracy: 0.96
Epoch: 8 | Loss: 0.1213 | Train Accuracy: 0.97
Epoch: 9 | Loss: 0.1212 | Train Accuracy: 0.97


We can also compute accuracy on the testing dataset to see how well the model performs on the image classificaiton task. As you can see below, our basic CNN model is performing very well on the MNIST classification task.

In [33]:
test_acc = 0.0
for i, (images, labels) in enumerate(testloader, 0):
    images = images.to(device)
    labels = labels.to(device)
    outputs = model(images)
    test_acc += compute_accuracy(outputs, labels)

   
print('Test Accuracy: %.2f'%( test_acc/i))

Test Accuracy: 2.00


In [57]:
!pip3 install opencv-python
import cv2
outputs
for i, (images, labels) in enumerate(testloader, 0):
    images = images.to(device)
    labels = labels.to(device)
    outputs = model(images)


tensor  = outputs.cpu().detach().numpy() # make sure tensor is on cpu
cv2.imwrite("image.png" ,tensor )



False

**EXERCISE:** As a way to practise, try to include the testing part inside the code where I was outputing the training accuracy, so that you can also keep testing the model on the testing data as you proceed with the training steps. This is useful as sometimes you don't want to wait until your model has completed training to actually test the model with the testing data.

## Final Words
That's it for this tutorial! Congratulations! You are now able to implement a basic CNN model in PyTorch for image classification. If you would like, you can further extend the CNN model by adding more convolution layers and max pooling, but as you saw, you don't really need it here as results look good. If you are interested in implementing a similar image classification model using RNNs see the references below. 

## References
- [Building RNNs is Fun with PyTorch and Google Colab](https://colab.research.google.com/drive/1NVuWLZ0cuXPAtwV4Fs2KZ2MNla0dBUas)
- [CNN Basics with PyTorch by Sebastian Raschka](https://github.com/rasbt/deeplearning-models/blob/master/pytorch_ipynb/cnn/cnn-basic.ipynb)
- [Tensorflow 2.0 Quickstart for experts](https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/advanced.ipynb#scrollTo=DUNzJc4jTj6G) 