<a href="https://colab.research.google.com/github/christophergaughan/PyTorch/blob/main/ComputerVision_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Computer Vision Using PyTorch

**Basis**

Pixels are read as RGB color values --> turned into numbers (tensors), i.e. a `numerical encoding` --> model (algorithm) --> output probabilities that the image is X or Y or Z
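
The pipeline above can be sketched end to end with made-up data. This is only an illustration: the random "image", the single `torch.nn.Linear` layer standing in for a model, and the three classes are all assumptions, not part of the notebook's actual workflow.

```python
import torch

torch.manual_seed(0)

# A fake 4x4 RGB image: integer pixel values in [0, 255], laid out as (height, width, channels).
pixels = torch.randint(0, 256, size=(4, 4, 3), dtype=torch.uint8)

# Numerical encoding: scale to floats in [0, 1] and flatten into one feature vector.
features = pixels.float().div(255).flatten()  # shape: [48]

# Hypothetical stand-in model: one linear layer mapping 48 features to 3 classes (X, Y, Z).
model = torch.nn.Linear(in_features=48, out_features=3)

# Raw scores (logits) --> probabilities that sum to 1.
with torch.no_grad():
    probs = torch.softmax(model(features), dim=0)

print(probs)
```

The model is untrained, so the probabilities are meaningless; the point is only the shape of the data flow.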

**Details**
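 The shape of an image tensor encodes the following information:
 1. Width of the image
 2. Height of the image
 3. Number of color channels (3 for RGB, 1 for greyscale)

 Depending on the library or model you're working with, a batch of images is stored as a tensor whose dimensions are ordered in one of two ways: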

 [batch_size, height, width, color_channels] OR [batch_size, color_channels, height, width]
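
Converting between the two layouts is a matter of reordering dimensions. A minimal sketch with a random batch (the batch size and image size are arbitrary choices for illustration):

```python
import torch

# A batch of 32 RGB images, 64x64 pixels, in channels-last layout:
# [batch_size, height, width, color_channels]
nhwc = torch.rand(32, 64, 64, 3)

# Reorder to channels-first: [batch_size, color_channels, height, width]
nchw = nhwc.permute(0, 3, 1, 2)

print(nhwc.shape)  # torch.Size([32, 64, 64, 3])
print(nchw.shape)  # torch.Size([32, 3, 64, 64])
```

`permute` returns a view with reordered dimensions, so the pixel values themselves are unchanged.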

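 The models here will mainly be convolutional neural networks (CNNs).

 We will be working with `torch.nn.Conv2d`, which expects the channels-first layout `[batch_size, color_channels, height, width]`.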
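
As a quick sketch of what `torch.nn.Conv2d` does to tensor shapes, the example below passes a random batch of greyscale images through one convolutional layer. The hyperparameters (10 output channels, 3x3 kernel, padding of 1) are arbitrary values chosen for illustration.

```python
import torch
from torch import nn

# A batch of 8 greyscale images, 28x28 pixels: [batch_size, color_channels, height, width]
images = torch.rand(8, 1, 28, 28)

# One convolutional layer: 1 input channel -> 10 feature maps, 3x3 kernel.
# padding=1 with a 3x3 kernel and stride 1 keeps height and width unchanged.
conv = nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=1, padding=1)

out = conv(images)
print(out.shape)  # torch.Size([8, 10, 28, 28])
```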

## Computer vision libraries in PyTorch

* `torchvision` - base domain library for PyTorch computer vision:
  https://pytorch.org/vision/stable/index.html
* `torchvision.datasets` - datasets and data-loading functions:
  https://pytorch.org/vision/stable/datasets.html#built-in-datasets
* `torchvision.models` - pretrained computer vision models (i.e. with trained weights) that you can leverage for your own problems.
* `torchvision.transforms` - functions for manipulating your vision data (images) so it is suitable for use with an ML model.
* `torch.utils.data.Dataset` - base dataset class for PyTorch.
* `torch.utils.data.DataLoader` - creates a Python iterable over a dataset.

Torchvision supports common computer vision transformations in the torchvision.transforms and torchvision.transforms.v2 modules. Transforms can be used to transform or augment data for training or inference of different tasks (image classification, detection, segmentation, video classification).

* PIL is the Python Imaging Library by Fredrik Lundh and contributors; many `torchvision.transforms` accept PIL images as input.

### torchvision.datasets

All datasets are subclasses of `torch.utils.data.Dataset`, i.e. they implement the `__getitem__` and `__len__` methods. Hence, they can all be passed to a `torch.utils.data.DataLoader`, which can load multiple samples in parallel using `torch.multiprocessing` workers. For example:
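```
import torch
import torchvision

# Assumes ImageNet has already been downloaded to this (placeholder) path.
imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=2)  # number of worker processes
```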
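
To make the `__getitem__` / `__len__` contract concrete, here is a sketch of a custom dataset that needs no download. `RandomImageDataset` is a hypothetical class invented for this example; it just serves random greyscale "images" with random labels.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomImageDataset(Dataset):
    """Hypothetical toy dataset: random 28x28 greyscale images with integer labels."""

    def __init__(self, num_samples=10, num_classes=3):
        generator = torch.Generator().manual_seed(0)
        self.images = torch.rand(num_samples, 1, 28, 28, generator=generator)
        self.labels = torch.randint(0, num_classes, (num_samples,), generator=generator)

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.images)

    def __getitem__(self, index):
        # One (image, label) pair, just like the built-in torchvision datasets return.
        return self.images[index], self.labels[index]


dataset = RandomImageDataset()
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

# 10 samples with batch_size=4 gives 3 batches; the last one holds the remaining 2 samples.
for batch_images, batch_labels in loader:
    print(batch_images.shape, batch_labels.shape)
```

Because the class implements both methods, `DataLoader` can batch and shuffle it exactly as it would a built-in dataset.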

In [None]:
import torch
import torchvision
from torchvision import datasets
from torchvision import transforms
from torchvision.transforms import ToTensor
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import numpy as np
import matplotlib.pyplot as plt

print(torch.__version__)
print(torchvision.__version__)

## Getting a dataset

We will be using the `FashionMNIST` dataset: greyscale images of clothing. It is a basic dataset, well suited to a first implementation.

Be aware that ImageNet is a standard benchmark for evaluating computer vision models.

`torchvision.datasets.FashionMNIST(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None`

### Fashion-MNIST Dataset.

Parameters:
* **root (string)** – Root directory of dataset where FashionMNIST/processed/training.pt and FashionMNIST/processed/test.pt exist.
* **train (bool, optional)** – If True, creates dataset from training.pt, otherwise from test.pt.
* **download (bool, optional)** – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
* **transform (callable, optional)** – A function/transform that takes in a PIL image and returns a transformed version, e.g. `transforms.RandomCrop`.
* **target_transform (callable, optional)** – A function/transform that takes in the target and transforms it.
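
As an illustration of what a `target_transform` might look like, the sketch below turns an integer class label into a one-hot vector. The `one_hot` helper is a hypothetical function written for this example; this notebook itself passes `target_transform=None`.

```python
import torch

NUM_CLASSES = 10  # FashionMNIST has 10 clothing categories


def one_hot(label: int) -> torch.Tensor:
    """Hypothetical target transform: integer label -> one-hot float vector."""
    return torch.nn.functional.one_hot(torch.tensor(label), num_classes=NUM_CLASSES).float()


print(one_hot(9))  # tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])
```

A function like this could be supplied as `target_transform=one_hot` when constructing the dataset, so every label comes back already encoded.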

In [None]:
# Setup Training data
train_data = datasets.FashionMNIST(
    root="data", # where to download data to
    train=True, # do we want the training dataset?
    download=True, # do we want to download?
    transform=torchvision.transforms.ToTensor(), # how to transform the data
    target_transform=None # how do we want to transform the labels/target
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=torchvision.transforms.ToTensor(),
    target_transform=None
)



In [None]:
len(train_data), len(test_data)

In [None]:
# See the first training sample; the image is a tensor of shape (C x H x W). NOTE: greyscale images only have 1 color channel
image, label = train_data[0]
image, label

In [None]:
class_names = train_data.classes
class_names

In [None]:
class_to_idx = train_data.class_to_idx
class_to_idx

In [None]:
train_data.targets

In [None]:
# Check shape of our image
print(f"Image Shape: {image.shape} --> [color_channels, height, width], Image Label: {class_names[label]}")

## Visualizing our data

In [None]:
image, label = train_data[0]
print(f"Image Shape: {image.shape}")
# squeeze() drops the single channel dimension ([1, 28, 28] -> [28, 28]) so imshow can plot it
plt.imshow(image.squeeze(), cmap="gray")
plt.title(class_names[label])
plt.axis("off")
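
The dimension handling needed for plotting can be checked without drawing anything. In this sketch the tensors are random stand-ins for real images: matplotlib's `imshow` accepts `[height, width]` for greyscale and `[height, width, 3]` for RGB, whereas PyTorch stores images channels-first.

```python
import torch

# Greyscale: [color_channels=1, height, width] -> drop the channel dimension.
grey = torch.rand(1, 28, 28)
grey_for_plot = grey.squeeze()
print(grey_for_plot.shape)  # torch.Size([28, 28])

# RGB: [color_channels=3, height, width] -> move channels to the last dimension.
rgb = torch.rand(3, 28, 28)
rgb_for_plot = rgb.permute(1, 2, 0)
print(rgb_for_plot.shape)  # torch.Size([28, 28, 3])
```

`squeeze()` with no argument removes every dimension of size 1; `squeeze(dim=0)` is a stricter alternative when only the channel dimension should go.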

In [None]:
# Plot more images
torch.manual_seed(42)
fig = plt.figure(figsize=(9, 9))
row, cols = 4, 4
for i in range(1, row * cols + 1):
    random_idx = torch.randint(0, len(train_data), size=[1]).item()
    img, label = train_data[random_idx]
    fig.add_subplot(row, cols, i)
    plt.imshow(img.squeeze(), cmap="gray")
    plt.title(class_names[label])
    plt.axis(False)