# Course 2: Introduction to TorchVision

`torchvision` is a library that provides a collection of datasets and models for computer vision. It facilitates the process of loading and preprocessing data, and provides a collection of models that can be used for common tasks in computer vision.

In this notebook, we'll learn how to use `torchvision` to load and preprocess data, and how to use pre-trained models for tasks like image classification.

Then, we'll return to the previous task of classifying images of handwritten digits from the MNIST dataset, using a convolutional neural network.

We'll also take a look at the CIFAR-10 dataset, which consists of 60000 32x32 px colour images in 10 classes. We'll create a convolutional neural network that can predict the labels of these images with a reasonably high accuracy.

Let's start by installing and importing the required libraries.

## Transforms and Datasets

`torchvision` provides a module called `transforms`, which contains many image transformations that can be chained together with `transforms.Compose`. Some popular transforms are:

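- `ToTensor`: Converts a PIL image or NumPy array into a PyTorch tensor in channels x height x width layout; `uint8` pixel values in [0, 255] are scaled to floats in [0.0, 1.0].
- `Normalize`: Normalizes a tensor channel by channel using a given mean and standard deviation: `output = (input - mean) / std`.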
- `Resize`: Resizes the input PIL image to the given size.
- `RandomCrop`: Crops the input PIL image at a random location.
- `CenterCrop`: Crops the input PIL image at the center.
- `RandomHorizontalFlip`: Randomly flips the input PIL image horizontally.
- `RandomVerticalFlip`: Randomly flips the input PIL image vertically.
- etc.
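To make the effect of `Normalize` concrete, here is a minimal pure-Python sketch of its arithmetic. It is not torchvision code: `normalize` and `unnormalize` are hypothetical helper names, and the pixel values are made up for illustration. With a mean and standard deviation of 0.5, values in [0.0, 1.0] are mapped to [-1.0, 1.0], and the inverse is the same `img / 2 + 0.5` step used by the `imshow` helper in this notebook.

```python
# Pure-Python sketch of the arithmetic behind Normalize (not torchvision code).
# `normalize` and `unnormalize` are hypothetical helper names.

def normalize(values, mean, std):
    """Apply output = (input - mean) / std to each value."""
    return [(v - mean) / std for v in values]

def unnormalize(values, mean, std):
    """Invert normalize: input = output * std + mean."""
    return [v * std + mean for v in values]

# Pixel intensities as ToTensor would produce them, in [0.0, 1.0].
pixels = [0.0, 0.25, 0.5, 1.0]

normalized = normalize(pixels, mean=0.5, std=0.5)
restored = unnormalize(normalized, mean=0.5, std=0.5)

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```

In a real pipeline the same computation is applied per channel to a whole tensor, which is why the mean and standard deviation are passed as one value per channel.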

`torchvision` also provides a module called `datasets`, a collection of ready-made datasets for training and testing machine learning models. Some popular datasets are:

- `MNIST`: A dataset of 28x28 px grayscale images of handwritten digits.
- `CIFAR10`: A dataset of 32x32 px colour images in 10 classes.
- `ImageNet`: A massive dataset of colour images in 1000 classes. The images vary in size and are commonly resized and cropped to 224x224 px.
- etc.

Let's start by importing `torch`, `torchvision`, `torchvision.transforms` and `DataLoader`. The datasets themselves are accessed through `torchvision.datasets`.

In [1]:
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

device = torch.device("mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu")

# Convert each PIL image to a tensor and normalize it.
# MNIST images are grayscale, so mean and std each take one value (one channel).
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Download and load the training data
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True)

# Download and load the test data
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = DataLoader(testset, batch_size=4, shuffle=False)

# Classes
classes = tuple(str(i) for i in range(10))

# Let's visualize some of the training data
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()



AttributeError: '_SingleProcessDataLoaderIter' object has no attribute 'next'
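
This `AttributeError` is what recent PyTorch releases raise when code calls `dataiter.next()` on a `DataLoader` iterator. The lines that triggered it are not shown in the cell, so the following is a sketch under the assumption that the cell fetched a batch in that way. The built-in `next()` works on any Python iterator, including this one. The tensors below are synthetic placeholders rather than MNIST data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 8 blank 1x28x28 images with labels 0..7.
images = torch.zeros(8, 1, 28, 28)
labels = torch.arange(8) % 10
loader = DataLoader(TensorDataset(images, labels), batch_size=4, shuffle=False)

dataiter = iter(loader)

# Use the built-in next() instead of the removed dataiter.next() method.
batch_images, batch_labels = next(dataiter)

print(batch_images.shape)     # torch.Size([4, 1, 28, 28])
print(batch_labels.tolist())  # [0, 1, 2, 3]
```

With the real `trainloader`, the same `next(iter(trainloader))` call should return a batch of four images and their labels, which can then be passed to `torchvision.utils.make_grid` and `imshow`.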