<a href="https://colab.research.google.com/github/crispu93/Pytorch_Pocket_Reference/blob/main/3_Deep_Learning_development_with_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Outline

Now we can start developing and deploying deep learning models. The overall process is:
1. Load the data and transform it into suitable inputs for the model
2. Build the deep learning model
3. Train the model
4. Test the model's performance and tweak hyperparameters to improve results and training speed
5. Deploy the model to prototype systems or production

(Insert image of the whole process)

# Data preparation 
Deep learning development starts with data preparation.

## Data loading
PyTorch provides powerful built-in classes and utilities, such as the *Dataset*, *DataLoader*, and *Sampler* classes, for loading various types of data.

- *Dataset* class: accesses and preprocesses data from a file or other data source.
- *Sampler* class: samples data from a dataset in order to create batches.
- *DataLoader* class: combines a dataset with a sampler and lets you iterate over a set of batches.
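
To see how the three classes fit together, here is a minimal sketch that uses synthetic tensors instead of a real dataset, so it runs without any download. The all-zero "images", the batch size of 4, and the variable names are assumptions chosen for illustration only.

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

# Synthetic stand-in data (an assumption for this sketch):
# 10 "images" of shape 3x32x32 with integer labels 0..9.
images = torch.zeros(10, 3, 32, 32)
labels = torch.arange(10)

# Dataset: gives indexed access to (image, label) pairs.
dataset = TensorDataset(images, labels)

# Sampler: decides the order in which indices are drawn.
sampler = RandomSampler(dataset)

# DataLoader: uses the sampler to pull items from the dataset
# and groups them into batches.
loader = DataLoader(dataset, batch_size=4, sampler=sampler)

batch_shapes = []
seen_labels = []
for batch_images, batch_labels in loader:
    batch_shapes.append(tuple(batch_images.shape))
    seen_labels.extend(batch_labels.tolist())

print(batch_shapes)
print(sorted(seen_labels))
```

With 10 items and a batch size of 4, the last batch holds the 2 remaining items, and every label is visited exactly once per pass even though the order is random.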

In [1]:
from torchvision.datasets import CIFAR10

train_data = CIFAR10(root="./train/",
                     train=True,
                     download=True)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./train/cifar-10-python.tar.gz
Extracting ./train/cifar-10-python.tar.gz to ./train/


In [3]:
print(train_data.data.shape)

(50000, 32, 32, 3)


## Data transforms
The data might need to be adjusted before it is passed into the neural network (NN) model for training and testing.
Data values may be **normalized** to assist training, **augmented** to create larger datasets, or **converted** from another type of object into a tensor.
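
As a rough illustration of what conversion and normalization do to the values, the sketch below applies both by hand to a tiny made-up image. The 2x2 all-white image is an assumption for illustration; the mean and standard deviation are the CIFAR10 values used elsewhere in this notebook.

```python
import torch

# Hypothetical 2x2 RGB image, stored height x width x channels
# with uint8 values in [0, 255] (here: all white).
hwc_uint8 = torch.full((2, 2, 3), 255, dtype=torch.uint8)

# Conversion: reorder to channels x height x width and scale to [0, 1].
chw_float = hwc_uint8.permute(2, 0, 1).float() / 255.0

# Normalization: subtract the per-channel mean and divide by the
# per-channel standard deviation.
mean = torch.tensor([0.4914, 0.4822, 0.4465]).view(3, 1, 1)
std = torch.tensor([0.2023, 0.1994, 0.2010]).view(3, 1, 1)
normalized = (chw_float - mean) / std

print(chw_float.shape)
print(normalized[:, 0, 0])
```

After normalization each channel is roughly centered on zero with unit spread across the dataset, which tends to make optimization better behaved.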

The training data is going to be randomly cropped, horizontally flipped, and normalized.
These random augmentations show the model slightly different versions of each image in every epoch, which usually helps it generalize and classify unseen images better.

In [5]:
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=(0.4914, 0.4822, 0.4465),
        std=(0.2023, 0.1994, 0.2010))
    ])

train_data = CIFAR10(root="./train/",
                     train=True,
                     download=True,
                     transform=train_transforms)
print(train_data)

Files already downloaded and verified
Dataset CIFAR10
    Number of datapoints: 50000
    Root location: ./train/
    Split: Train
    StandardTransform
Transform: Compose(
               RandomCrop(size=(32, 32), padding=4)
               RandomHorizontalFlip(p=0.5)
               ToTensor()
               Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2023, 0.1994, 0.201))
           )
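
To make the two augmentations concrete, here is a hand-written sketch of a padded random crop followed by a random horizontal flip, using plain tensor operations. It only approximates what the torchvision transforms do; the synthetic input image, the fixed seed, and the variable names are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# Synthetic 3x32x32 "image" (an assumption for this sketch).
image = torch.arange(3 * 32 * 32, dtype=torch.float32).view(3, 32, 32)

# Zero-pad 4 pixels on every side: 32x32 becomes 40x40.
padded = F.pad(image, (4, 4, 4, 4))

# Pick a random 32x32 window inside the padded image.
# Valid offsets are 0..8 because 40 - 32 = 8.
top = int(torch.randint(0, 9, (1,)))
left = int(torch.randint(0, 9, (1,)))
cropped = padded[:, top:top + 32, left:left + 32]

# Flip left-to-right with probability 0.5.
if torch.rand(1).item() < 0.5:
    cropped = torch.flip(cropped, dims=[2])

print(padded.shape, cropped.shape)
```

The output keeps the original 32x32 size, but the image content is shifted by up to 4 pixels in each direction and may be mirrored, so the model rarely sees exactly the same pixels twice.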


For the test data we do not want to crop or flip the images, but we still need to convert them to tensors and normalize the tensor values, as shown in the following code:

In [9]:
test_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(
        (0.4914, 0.4822, 0.4465),
        (0.2023, 0.1994, 0.2010)
    )])

test_data = CIFAR10(
    root="./test/",
    train=False,
    download=True,
    transform=test_transforms
)

print(test_data)

Files already downloaded and verified
Dataset CIFAR10
    Number of datapoints: 10000
    Root location: ./test/
    Split: Test
    StandardTransform
Transform: Compose(
               ToTensor()
               Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2023, 0.1994, 0.201))
           )
