This tutorial walks through the YAMLF pipeline using the CIFAR10 dataset. First, enable automatic reloading in the Jupyter notebook.

In [2]:
# To automatically reload the functions
%load_ext autoreload
%autoreload 2

Use `pip install yamlf` to install the package.

In [3]:
!ls ../../

dev  dist  LICENSE  MANIFEST  README.md  setup.cfg  setup.py  tutorials  yamlf


In [4]:
import sys
sys.path.append('../../')

# Setting Hyperparameters
There are two ways to set hyperparameters:
 1. Create a Python dict:
 `{"batchsize": 64, "device": "cuda", "chkptdir": "chkpts"}`
 2. Use the `DefaultSettings` class from `yamlf.default_settings`:
        from yamlf.default_settings import DefaultSettings
        defs = DefaultSettings.init("data", "chkpts")
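
The two approaches can be combined by starting from a set of defaults and overriding only the values that differ. The sketch below uses a plain dict with a few keys copied from the `DefaultSettings` output in this tutorial; the `defaults` and `settings` names are illustrative and not part of YAMLF.

```python
# Illustrative defaults: a few keys copied from the DefaultSettings output.
# This is a plain dict, not the object returned by DefaultSettings.init().
defaults = {"batchsize": 64, "epochs": 5, "lr": 0.001, "device": "cuda"}

# Build a new dict in which later entries override earlier ones.
# The original `defaults` dict is left untouched.
settings = {**defaults, "batchsize": 128, "lr": 3e-4}

print(settings["batchsize"], settings["lr"], settings["epochs"])
```

Building a new dict instead of mutating the defaults keeps the baseline values available for later experiments.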

In [5]:
from yamlf.default_settings import DefaultSettings

defs = DefaultSettings.init()
defs

{'low_storage': False,
 'DN': 'inputs',
 'LN': 'targets',
 'scpc': None,
 'ltoc': None,
 'ctol': None,
 'datadir': PosixPath('data'),
 'chkptdir': PosixPath('chkpts'),
 'chkpt_filename': 'chkpt.pt',
 'wts_filename': 'wts.pt',
 'tbwriter': <torch.utils.tensorboard.writer.SummaryWriter at 0x7f0e04528ca0>,
 'num_folds': 5,
 'batchsize': 64,
 'num_workers': 8,
 'epochs': 5,
 'init_epoch': 0,
 'lr': 0.001,
 'moms': (0.95, 0.85),
 'wd': 0.0,
 'dropout': 0.1,
 'train_iters': None,
 'val_iters': None,
 'device': device(type='cuda')}

# Load Data
Next, set up the datasets and the dataloaders. This tutorial loads CIFAR10 through torchvision, so there is no need to define a custom dataset class.

The `yamlf.vision` module contains classes and functions for computer vision tasks such as classification, localization, and segmentation.
There are two ways to define the dataloaders that are passed to the trainer:
 1. Create a dict of dataloaders, for example
 `dls = {
    "train": torch.utils.data.DataLoader(trainset, ...),
    "val": torch.utils.data.DataLoader(valset, ...),
    "test": torch.utils.data.DataLoader(testset, ...)
    }`
 2. Use the `yamlf.vision.LoadData` class, as in the cells below.

In [6]:
from yamlf.vision import *

In [7]:
train_tsfms = tv.transforms.Compose([
    tv.transforms.ToTensor(),
])
test_tsfms  = tv.transforms.Compose([
    tv.transforms.ToTensor(),
])

trainset = tv.datasets.CIFAR10(root='./data', train=True, download=True, transform=train_tsfms)
testset  = tv.datasets.CIFAR10(root='./data', train=False, download=True, transform=test_tsfms)
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

defs["num_cls"] = len(classes)

Files already downloaded and verified
Files already downloaded and verified


In [8]:
# defs is the default settings dict; it holds batchsize,
# num_workers, and other parameters
dls = LoadData(trainset, testset, defs=defs)
dls

yamlf.vision.LoadData class
trainset -> num samples: 50000, num batches: 782
valset -> num samples: 10000, num batches: 157
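
The batch counts in this output follow from the dataset sizes and the default `batchsize` of 64. The sketch below reproduces them under the assumption that the last, partially filled batch is kept, which is the default behaviour of `torch.utils.data.DataLoader` (`drop_last=False`); the `num_batches` helper is hypothetical and not part of YAMLF.

```python
import math

def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches a DataLoader-style iterator would yield."""
    if drop_last:
        # Incomplete final batch is discarded.
        return num_samples // batch_size
    # Incomplete final batch is kept as a smaller batch.
    return math.ceil(num_samples / batch_size)

print(num_batches(50000, 64))  # 782, matches the trainset line above
print(num_batches(10000, 64))  # 157, matches the valset line above
```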

# Model or Network
Define a model to train.

`yamlf.utils.net_stats` is a handy function that prints a brief summary of the network.

In [9]:
from yamlf.utils import net_stats

In [10]:
# By convention this framework calls the model `net`; you can name it `model` if you prefer.
net = tv.models.resnet18(num_classes=10)
net_stats(net)

NETWORK STATS:
20 convs
20 batchnorms
1 dense
# parameters: 11.182M
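
The reported parameter count can be cross-checked by hand. The sketch below assumes the figure of 11,689,512 parameters that the torchvision documentation lists for the standard 1000-class ResNet-18, and swaps the 1000-way dense layer for the 10-way one used here. It is an arithmetic check, not how `net_stats` computes its output.

```python
# Assumption: torchvision's 1000-class ResNet-18 has 11,689,512 parameters.
RESNET18_PARAMS_1000 = 11_689_512
FC_IN_FEATURES = 512  # size of the feature vector feeding the final dense layer

def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

# Remove the 1000-way classifier head and add a 10-way head instead.
params_10 = (RESNET18_PARAMS_1000
             - dense_params(FC_IN_FEATURES, 1000)
             + dense_params(FC_IN_FEATURES, 10))

print(f"{params_10 / 1e6:.3f}M")  # 11.182M
```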


# Training and validation
The *Trainer* class is inspired by fast.ai.

Import the `Trainer` and provide the data, the network, and the loss. Other arguments, such as the optimizer and the learning rate scheduler, have default values.

In [11]:
from yamlf.nn_trainer import Trainer
import torch.nn as nn

Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex


In [12]:
model = Trainer(dls, net, nn.CrossEntropyLoss(), metrics='acc')

After initializing the `Trainer`, we can check that everything works by training for a few iterations rather than running whole epochs. This is a quick way to confirm that the code runs end to end before starting full training. It can be done by calling `model.fit(..., train_iters=5, val_iters=5)`, as shown below:

In [13]:
model.fit(2, 1e-3, train_iters=5, val_iters=5)

epoch,train_loss,train_acc,val_loss,val_acc,time
0,2.65403,0.10312,2.31296,0.09688,0.8922524452209473 secs
1,2.45588,0.15937,2.32984,0.10312,0.578148365020752 secs
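
Conceptually, capping the number of iterations means stopping each epoch after a fixed number of batches. The sketch below shows one way to do this with `itertools.islice`. It illustrates the idea only; `run_epoch` and `fake_loader` are hypothetical names, and YAMLF's `Trainer` may implement `train_iters` and `val_iters` differently.

```python
from itertools import islice

def run_epoch(batches, step, max_iters=None):
    """Call `step` on each batch, stopping after `max_iters` batches if set.

    Returns the number of batches processed.
    """
    limited = batches if max_iters is None else islice(batches, max_iters)
    seen = 0
    for batch in limited:
        step(batch)
        seen += 1
    return seen

fake_loader = range(782)  # stands in for a dataloader with 782 batches
processed = []

print(run_epoch(fake_loader, processed.append, max_iters=5))  # 5
print(run_epoch(fake_loader, lambda batch: None))             # 782
```

Because only five batches are processed per phase, each epoch in the run above finishes in under a second.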


After that, we can run full training as:

In [None]:
model.fit(epochs=5, lr=1e-3)

If `low_storage` is not True in the default settings dict, a checkpoint folder is created that contains the TensorBoard logs and the checkpoint from the last training epoch.
Model weights can also be saved manually by calling `model.save_weights()`. Similarly, `model.load_weights()` loads the saved weights.

In [None]:
model.save_weights()

In [None]:
# Create a freshly initialized network and test it before loading any weights
net = tv.models.resnet18(num_classes=10)
model = Trainer(dls, net, nn.CrossEntropyLoss(), metrics='acc')
model.test_dl = model.val_dl
model.test()
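
As a rough baseline for the untrained network tested above, a 10-class classifier that predicts every class with equal probability reaches 10% accuracy and a cross-entropy loss of ln 10. The sketch below computes both values. A randomly initialized ResNet will only land near them, as the `val_loss` and `val_acc` columns of the short two-epoch run earlier suggest.

```python
import math

num_classes = 10  # CIFAR10 has ten classes

# Uniform guessing picks the right class one time in `num_classes`.
chance_acc = 1 / num_classes

# Cross-entropy of a uniform prediction: -ln(1 / num_classes) = ln(num_classes).
chance_loss = math.log(num_classes)

print(f"chance accuracy: {chance_acc:.2f}")   # 0.10
print(f"chance loss:     {chance_loss:.4f}")  # 2.3026
```

Test results well above this baseline after `model.load_weights()` indicate that the saved weights were restored.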

In [None]:
# Load saved weights and test
model.load_weights()
model.test_dl = model.val_dl
model.test()