## Convolutional Neural Networks
CNNs are well suited to problems that require classification (and sometimes regression) from visual data. They are useful because *they* learn the relevant features themselves, rather than relying on us to choose the features by hand. We just give the CNN labeled, image-like data and let it learn.
The general flow for this notebook will be:
1. Explore tools from PyTorch that allow for the import/transformation of different types of visual data
2. Import image data from a clothing dataset
3. Design a Multi-Class NN (MCNN) model that will be used to classify the different types of clothing in the images
4. Examine the results, then create a CNN model and compare its performance to the MCNN
5. Save the weights from the model with the best results so that it can be used elsewhere

In [2]:
# Let's start by importing everything we will need

import torch
from torch import nn

import torchvision
from torchvision import datasets # contains pre-built datasets that can be used to test models
from torchvision.transforms import ToTensor # transform that converts common image formats to tensors

import matplotlib.pyplot as plt
print(f'PyTorch Version: {torch.__version__}, Torchvision Version: {torchvision.__version__}')

PyTorch Version: 1.12.1, Torchvision Version: 0.13.1


In [9]:
# Let's get our training and testing data. It turns out torchvision has many built-in datasets already.
# We will be using the FashionMNIST dataset for this classification problem. It's basically the fashion
# version of the original MNIST dataset, which used handwritten digits. There are 10 classes, but they're clothes, not numbers.

DATA_DIR = "../../data/"

# Many of the datasets in this module have the same arguments.
train_data = datasets.FashionMNIST(
    DATA_DIR,
    train=True,
    transform=ToTensor(),
    target_transform=None,
    download=True
)

test_data = datasets.FashionMNIST(
    DATA_DIR,
    train=False,
    transform=ToTensor(),
    target_transform=None,
    download=True
)

In [11]:
# With the data downloaded, let's explore
print(type(train_data))

# Let's get the first image/label pair from the test data
image, label = test_data[0]
print(f'Label Type: {type(label)}, Label: {label}')
print(f'Image Type: {type(image)}, Image Shape: {image.shape}')



<class 'torchvision.datasets.mnist.FashionMNIST'>
Label Type: <class 'int'>, Label: 9
Image Type: <class 'torch.Tensor'>, Image Shape: torch.Size([1, 28, 28])
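
The integer label is an index into the dataset's class names. As a minimal sketch, assuming the standard FashionMNIST class order (torchvision also exposes this list as the `classes` attribute of the dataset), the label can be turned into a readable name. `CLASS_NAMES` and `label_to_name` are illustrative names, not part of the notebook.

```python
# Class names in the standard FashionMNIST label order (index == label).
# Assumption: this matches the `classes` attribute of the torchvision dataset.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label: int) -> str:
    """Map an integer class label (0-9) to its clothing category."""
    return CLASS_NAMES[label]

# The label printed above was 9
print(label_to_name(9))
```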


#### So it looks like the image is a 3D tensor and the label is just an integer giving the class. The 1 in the first dimension of the image is the number of channels, which shows that it's a greyscale image. This means the values in the 28x28 2D tensor just represent the intensity of each pixel. If this were a color image, there would be 3 "channels" representing the intensities of red, green, and blue respectively.
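
To make the channel dimension concrete, here is a small sketch using random tensors as stand-ins for real images (the variable names are illustrative only). It assumes the channels-first layout `[C, H, W]` seen in the output above, and shows how each case could be rearranged into the `[H, W]` or `[H, W, C]` layout that `matplotlib`'s `imshow` expects.

```python
import torch

# Stand-ins for real images: random values, channels-first layout [C, H, W]
grey_image = torch.rand(1, 28, 28)   # 1 channel  -> greyscale
color_image = torch.rand(3, 28, 28)  # 3 channels -> red, green, blue

# Greyscale: drop the single channel dimension to get a plain [H, W] grid
grey_hw = grey_image.squeeze(0)

# Color: move the channel dimension to the end to get [H, W, C]
color_hwc = color_image.permute(1, 2, 0)

print(grey_image.shape, "->", grey_hw.shape)
print(color_image.shape, "->", color_hwc.shape)
```

Either result could then be passed to `plt.imshow`, with `cmap="gray"` for the greyscale case.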