# Fashion MNIST with PyTorch Lightning and neptune.ai

This notebook accompanies the code in the [repository](https://github.com/jeremyfix/deeplearning-lectures/tree/lightning/LabsSolutions/00-pytorch-FashionMNISTLightning). It is a tutorial illustrating deep learning training with [PyTorch](https://pytorch.org), more specifically with the higher-level framework [PyTorch Lightning](https://www.pytorchlightning.ai/). We monitor the training with [neptune.ai](https://app.neptune.ai/).

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# Set up the environment

# For CPU only
#!pip install -r requirements.txt

# For GPU 
!pip install -r requirements_cuda.txt

## Dataloader illustration

The first step of every deep learning script is data loading. This means writing a [dataset object](https://pytorch.org/docs/stable/data.html#dataset-types), or using an existing one, and plugging it into a dataloader. A dataloader lets you iterate over the dataset as a sequence of mini-batches. As we are dealing with images, our dataloader provides 4D tensors of shape (B, C, H, W), following the PyTorch convention of Batch, Channel, Height, Width.
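
To make the (B, C, H, W) convention concrete, here is a minimal sketch that does not depend on the repository's `data` module. It assumes a synthetic stand-in for Fashion MNIST (random 28x28 grayscale images wrapped in a `TensorDataset`); the sizes are illustrative only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in (assumption): 1000 grayscale 28x28 images, 10 classes
images = torch.rand(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))
dataset = TensorDataset(images, labels)

# The dataloader groups samples into shuffled mini-batches
loader = DataLoader(dataset, batch_size=128, shuffle=True)

X, y = next(iter(loader))
print(X.shape)      # torch.Size([128, 1, 28, 28]) -> (B, C, H, W)
print(y.shape)      # torch.Size([128])
print(len(loader))  # 8 mini-batches, the last one holding the remaining 104 samples
```

The real dataloaders built below follow the same iteration pattern, with the actual Fashion MNIST images in place of the random tensors.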

In [None]:
import data
import random
import torch

We first build our dataloaders to look at the raw data.

In [None]:
num_threads = 4
valid_ratio = 0.2
batch_size = 128
classes_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

loaders, fnorm = data.make_dataloaders(valid_ratio,
                                  batch_size,
                                  num_threads,
                                  normalize=False,
                                  dataaugment_train=False,
                                  dataset_dir=None,
                                  normalizing_tensor_path=None)
train_loader, valid_loader, test_loader = loaders

data.display_samples(train_loader, 10, 'fashionMNIST_samples.png', classes_names)

X, y = next(iter(train_loader))
print(f"A minibatch of data is a tensor X of shape {X.shape} and y of shape {y.shape}")

We now play around with data augmentation. Data augmentation is an important regularization technique: it generates many variations of each input for which the target is still known. A slightly shifted image of a sneaker, for instance, is still a sneaker.
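
The exact transforms used by `data.make_dataloaders` are not shown in this notebook. As a rough sketch of the idea, the hypothetical `augment` function below applies a random horizontal flip and a small random shift in plain PyTorch. The shift uses `torch.roll`, which wraps pixels around the border; that is a simplification of a true translation, and the choice of transforms is an assumption, not the repository's.

```python
import torch

def augment(img, max_shift=2):
    """Return a randomly flipped and shifted copy of a (C, H, W) image.

    Hypothetical helper for illustration; the label of the image is unchanged.
    """
    # Horizontal flip with probability 0.5
    if torch.rand(1).item() < 0.5:
        img = torch.flip(img, dims=[-1])
    # Circular shift of up to max_shift pixels along height and width
    dy, dx = torch.randint(-max_shift, max_shift + 1, (2,)).tolist()
    return torch.roll(img, shifts=(dy, dx), dims=(-2, -1))

img = torch.rand(1, 28, 28)  # stand-in for one Fashion MNIST image
label = 7                    # the target stays the same for every variant

variants = [augment(img) for _ in range(10)]
batch = torch.stack(variants)  # shape (10, 1, 28, 28)
print(batch.shape, label)
```

Every call draws new random parameters, so the same input yields a different variant each time while its label stays fixed.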

In [None]:
loaders, fnorm = data.make_dataloaders(valid_ratio,
                                  batch_size,
                                  num_threads,
                                  normalize=False,
                                  dataaugment_train=True,
                                  dataset_dir=None,
                                  normalizing_tensor_path=None)
# Pick one random sample of the dataset and draw it several times,
# so that each draw goes through the random augmentation
train_loader, _, _ = loaders
# randrange excludes the upper bound, so the index is always valid
sample_idx = random.randrange(len(train_loader.dataset))
samples = [train_loader.dataset[sample_idx][0] for _ in range(10)]
label = train_loader.dataset[sample_idx][1]

# Build a torch tensor from the list of samples
samples = torch.cat(samples, dim=0).unsqueeze(dim=1) # to add C=1

data.display_tensor_samples(samples, label,
                       'fashionMNIST_sample_aug.png',
                        classes_names)

## Training

In [None]:
import train

In [None]:
config = {
    "model": "linear",
    "num_workers": 2,
    "normalize": True,
    "data_augment": True,
    "dataset_dir": None,
    "weight_decay": 0.0    
}

train.main(config)
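
The internals of `train.main` are not shown in this notebook. As a rough sketch of what a `"linear"` model with a `weight_decay` setting could look like in plain PyTorch, the example below runs one optimization step on a random mini-batch. The class name `LinearClassifier`, the Adam optimizer and the learning rate are assumptions made for illustration, not taken from the repository.

```python
import torch
import torch.nn as nn

class LinearClassifier(nn.Module):
    """Hypothetical linear model: flatten the image, then one linear layer."""

    def __init__(self, input_size=28 * 28, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(input_size, num_classes))

    def forward(self, x):
        return self.net(x)  # raw class scores (logits)

model = LinearClassifier()
# weight_decay adds an L2 penalty on the parameters; 0.0 disables it
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
loss_fn = nn.CrossEntropyLoss()

# Random mini-batch standing in for Fashion MNIST data
X = torch.rand(128, 1, 28, 28)
y = torch.randint(0, 10, (128,))

# One training step: forward pass, loss, backward pass, parameter update
logits = model(X)
loss = loss_fn(logits, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(logits.shape, loss.item())
```

PyTorch Lightning wraps this loop for you: the forward pass and loss typically live in a `training_step` method and the optimizer in `configure_optimizers`.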