# Getting started with Classy Vision

Classy Vision is an end-to-end framework for image and video classification. It makes it easy to write and launch distributed training jobs without having to implement checkpointing, Tensorboard logging and similar features yourself.

In this tutorial, we will cover: (1) how to start a new project; (2) how to launch a single node training run; (3) how to launch a distributed training run; (4) how to visualize results with Tensorboard; (5) how to load checkpoints and interact with the trained model; (6) how to resume training from a checkpoint; (7) how to start training from a Jupyter notebook.

## 0. Setup

Make sure you have Classy Vision installed:

In [None]:
! pip install classy_vision

If you would like to use GPUs for training, make sure your environment has a working version of PyTorch with CUDA:

In [None]:
import torch
torch.cuda.is_available()

The cell above should output `True`. For this tutorial, we will also need two additional dependencies: `tensorboard` and `tensorboardX`. Install them with the following:

In [None]:
! pip install tensorboard tensorboardX
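
Before moving on, it can help to confirm that everything this tutorial relies on is importable. The sketch below uses only the standard library. The helper name `find_missing` is ours, and the import names in `required` are assumptions about how each package is exposed (for instance, that `pip install classy_vision` provides a module named `classy_vision`).

```python
import importlib.util


def find_missing(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names assumed for the packages installed above.
required = ["classy_vision", "torch", "tensorboard", "tensorboardX"]
missing = find_missing(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All tutorial dependencies are importable.")
```

An empty result only means the packages can be located; it does not check versions or CUDA support.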

## 1. Start a new project

To start, let's create a new project:

In [2]:
! classy-project my-project

INFO:root:Running classy project!
INFO:root:
    Successfully generated template project at '/Users/vini/fb/vreis/ClassyVision/tutorials/my-project'.
    To get started, run:
        $ cd my-project
        $ ./classy_train.py --config templates/config.json


In [3]:
import os
os.chdir('my-project')

To launch a training run on the current machine, run the following:

In [None]:
! ./classy_train.py --config configs/template_config.json

That's it! You've launched your first training run. This trained a small MLP model on a dataset made of random noise, which is not that useful. The `classy-project` utility creates the scaffolding for your project, and you should modify it according to your needs. We'll learn how to customize your runs in the next few tutorials.

Let's take a look at what classy-project has created for us:

In [20]:
! find . | grep -v '\.pyc' | sort

.
./classy_train.py
./configs
./configs/template_config.json
./datasets
./datasets/__init__.py
./datasets/my_dataset.py
./losses
./losses/__init__.py
./losses/my_loss.py
./models
./models/__init__.py
./models/my_model.py


Here's what each folder and file is for:

 * `configs`: stores your experiment configurations. Keeping all your experiments as separate configuration files helps make your research reproducible;
 * `models`: code for your custom model architectures;
 * `losses`: code for your custom loss functions;
 * `datasets`: code for your custom datasets;
 * `classy_train.py`: script to execute a training job. It uses the Classy Vision library to configure and run the job, and you can change it according to your needs;
 * `template_config.json`: experiment configuration file. This file is read by `classy_train.py` to configure your training job and launch it.

Let's take a peek at the configuration file:

In [13]:
! cat configs/template_config.json

{
    "name": "classification_task",
    "num_epochs": 2,
    "loss": {
        "name": "my_loss"
    },
    "dataset": {
        "train": {
            "name": "my_dataset",
            "split": "train",
            "crop_size": 224,
            "class_ratio": 0.5,
            "num_samples": 320,
            "seed": 0,
            "batchsize_per_replica": 32,
            "use_shuffle": true
        },
        "test": {
            "name": "my_dataset",
            "split": "val",
            "crop_size": 224,
            "class_ratio": 0.5,
            "num_samples": 100,
            "seed": 1,
            "batchsize_per_replica": 32,
            "use_shuffle": false
        }
    },
    "meters": {
        "accuracy": {
            "topk": [1, 5]
        }
    },
    "model": {
        "name": "my_model"
    },
    "optimizer": {
        "name": "sgd",
        "lr": {
            "name": "step",
            "values": [0.1, 0.01]
        },
  

That file can be shared with other researchers whenever you want them to reproduce your experiments. We generate JSON files by default, but YAML is also supported. Take a look at the [Hydra tutorial](TODO) for an example of how to use YAML files within Classy Vision.
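
Since each experiment lives in its own configuration file, a new experiment is often just a copy of an existing config with a few fields changed. Here is a minimal sketch of that workflow using only the standard library. The helper `make_variant`, the trimmed `base_config` dictionary and the output file name are ours, chosen for illustration; only the key names come from the template shown above.

```python
import copy
import json
import tempfile
from pathlib import Path


def make_variant(config, num_epochs, train_batchsize):
    """Return a deep copy of config with a few fields overridden."""
    variant = copy.deepcopy(config)
    variant["num_epochs"] = num_epochs
    variant["dataset"]["train"]["batchsize_per_replica"] = train_batchsize
    return variant


# A trimmed stand-in for configs/template_config.json (only the keys used here).
base_config = {
    "name": "classification_task",
    "num_epochs": 2,
    "dataset": {"train": {"name": "my_dataset", "batchsize_per_replica": 32}},
}

variant = make_variant(base_config, num_epochs=10, train_batchsize=64)

# A temporary directory is used so the sketch runs anywhere;
# the file name is hypothetical.
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / "long_run_config.json"
out_path.write_text(json.dumps(variant, indent=4))

print(out_path.read_text())
```

In a real project you would load the base with `json.load` from `configs/template_config.json`, save the variant under `configs/`, and pass it to `classy_train.py` through `--config`.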

## 2. Distributed training

`classy_train.py` can also be called from `torch.distributed.launch`, similar to regular PyTorch distributed scripts:

In [None]:
! python -m torch.distributed.launch --use_env --nproc_per_node=2 ./classy_train.py --config configs/template_config.json --distributed_backend ddp

With two GPUs on your current machine, that command launches one process per GPU (`--nproc_per_node=2`) and starts a [DistributedDataParallel](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) training run.
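
One practical consequence: the `batchsize_per_replica` setting in the config applies to each process, so the number of samples consumed per optimizer step grows with the number of processes. Here is a back-of-the-envelope sketch, assuming every process draws its own batch at each step (the usual DistributedDataParallel setup) and that a trailing partial batch still counts as a step. The function names are ours.

```python
import math


def global_batch_size(batchsize_per_replica, num_replicas):
    """Samples consumed per optimizer step across all processes."""
    return batchsize_per_replica * num_replicas


def steps_per_epoch(num_samples, batchsize_per_replica, num_replicas):
    """Optimizer steps per epoch, counting a final partial batch as a step."""
    return math.ceil(num_samples / global_batch_size(batchsize_per_replica, num_replicas))


# Values from the "train" section of template_config.json, on one and two GPUs.
for gpus in (1, 2):
    print(gpus, global_batch_size(32, gpus), steps_per_epoch(320, 32, gpus))
```

Doubling the number of processes halves the steps per epoch, which is worth keeping in mind when reusing a learning rate tuned for a single GPU.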

## 3. Tensorboard integration

[Tensorboard](https://www.tensorflow.org/tensorboard) is a very useful tool for visualizing training progress. Classy Vision works with Tensorboard out of the box; just make sure you have it installed as described in the Setup section. By default, `classy_train.py` writes Tensorboard data to a subdirectory of your project directory (typically named `output_<timestamp>/tensorboard`), so in our case we can just launch Tensorboard in the current working directory:

In [10]:
%load_ext tensorboard
%tensorboard --logdir .

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


Reusing TensorBoard on port 6006 (pid 58199), started 1:08:48 ago. (Use '!kill 58199' to kill it.)

You can also customize the tensorboard output directory by editing `classy_train.py`.

## 4. Loading checkpoints

Now that we've run `classy_train.py`, let's see how to load the resulting model. At the end of execution, `classy_train.py` will print the checkpoint directory used for that run. Each run will output to a different directory, typically named `output_<timestamp>/checkpoints`.

In [16]:
from classy_vision.generic.util import load_checkpoint
from classy_vision.models import ClassyModel

# This is important: importing models here will register your custom models with Classy Vision
# so that it can instantiate them appropriately from the checkpoint file
import models

# Update this with your actual directory:
checkpoint_dir = './output_<timestamp>/checkpoints'
checkpoint_data = load_checkpoint(checkpoint_dir)
model = ClassyModel.from_checkpoint(checkpoint_data)
model

MyModel(
  (_heads): ModuleDict()
  (model): Sequential(
    (0): AdaptiveAvgPool2d(output_size=(20, 20))
    (1): Flatten()
    (2): Linear(in_features=1200, out_features=2, bias=True)
    (3): Sigmoid()
  )
)

That's it! You can now use that model for inference as usual.

## 5. Resuming from checkpoints

Resuming from a checkpoint is as simple as training: `classy_train.py` takes a `--checkpoint_folder` argument, which specifies the checkpoint to resume from:

In [None]:
! ./classy_train.py --config configs/template_config.json --checkpoint_folder ./output_<timestamp>/checkpoints
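
Because every run writes to a fresh `output_<timestamp>` directory, it can be handy to locate the most recent checkpoint folder programmatically instead of copying the path by hand. This is a minimal sketch, assuming the `output_*/checkpoints` layout described above and using modification time as a proxy for "most recent". The helper `latest_checkpoint_dir` is ours and not part of Classy Vision.

```python
import os
import tempfile
from pathlib import Path


def latest_checkpoint_dir(project_root):
    """Return the newest output_*/checkpoints directory under project_root, or None."""
    candidates = [
        path / "checkpoints"
        for path in Path(project_root).glob("output_*")
        if (path / "checkpoints").is_dir()
    ]
    if not candidates:
        return None
    # Pick the run whose checkpoints directory was modified most recently.
    return max(candidates, key=lambda path: path.stat().st_mtime)


# Demo on a throwaway directory tree with two fake runs.
root = Path(tempfile.mkdtemp())
for name, mtime in [("output_old", 1_000_000_000), ("output_new", 2_000_000_000)]:
    checkpoints = root / name / "checkpoints"
    checkpoints.mkdir(parents=True)
    os.utime(checkpoints, (mtime, mtime))

print(latest_checkpoint_dir(root))
```

Inside the project directory, `latest_checkpoint_dir(".")` would give a path suitable for `--checkpoint_folder` or for `load_checkpoint`.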

## 6. Interactive development

Training scripts and configuration files are useful for running large training jobs on a GPU cluster (see our [AWS tutorial](TODO)), but a lot of day-to-day work happens interactively within Jupyter notebooks. Classy Vision is designed as a library that can be used without our built-in training scripts. Let's take a look at how to do the same training run as before, but within Jupyter instead of using `classy_train.py`:

In [4]:
import classy_vision

In [7]:
from datasets.my_dataset import MyDataset
from models.my_model import MyModel
from losses.my_loss import MyLoss
from classy_vision.dataset.transforms import GenericImageTransform
from torchvision import transforms

train_dataset = MyDataset(
    batchsize_per_replica=32,
    shuffle=False,
    transform=GenericImageTransform(
        transform=transforms.Compose(
            [
                transforms.RandomResizedCrop(224),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize(
                    mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                ),
            ]
        )
    ),
    num_samples=100,
    crop_size=224,
    class_ratio=0.5,
    seed=0,
)

test_dataset = MyDataset(
    batchsize_per_replica=32,
    shuffle=False,
    transform=GenericImageTransform(
        transform=transforms.Compose(
            [
                transforms.Resize(256),
                transforms.CenterCrop(224),
                transforms.ToTensor(),
                transforms.Normalize(
                    mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                ),
            ]
        )
    ),
    num_samples=100,
    crop_size=224,
    class_ratio=0.5,
    seed=0,
)
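
Both pipelines end with the same `Normalize` step, which rescales each channel as `(value - mean) / std` after `ToTensor` has mapped pixel values to the range [0, 1]. The plain-Python sketch below reproduces that arithmetic for a single RGB pixel. It is an illustration of the formula only, not torchvision's implementation, and the names `MEAN`, `STD` and `normalize_pixel` are ours.

```python
# Per-channel statistics copied from the transforms above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A pixel equal to the per-channel mean maps to zero in every channel.
print(normalize_pixel(MEAN))
# A pure white pixel, i.e. 1.0 in every channel after ToTensor.
print(normalize_pixel([1.0, 1.0, 1.0]))
```

The model therefore sees inputs roughly centered at zero rather than raw values in [0, 1].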


In [8]:
from classy_vision.optim import SGD
from classy_vision.optim.param_scheduler import ConstantParamScheduler
from classy_vision.tasks import ClassificationTask
from classy_vision.trainer import LocalTrainer

model = MyModel()
loss = MyLoss()

# SGD with a constant learning rate of 0.01
optimizer = SGD(
    lr_scheduler=ConstantParamScheduler(0.01)
)

# Assemble the task: model, one dataset per phase, loss, optimizer and epoch count
task = (
    ClassificationTask()
    .set_model(model)
    .set_dataset(train_dataset, "train")
    .set_dataset(test_dataset, "test")
    .set_loss(loss)
    .set_optimizer(optimizer)
    .set_num_epochs(1)
)

# Run the task on the current machine
trainer = LocalTrainer()
trainer.train(task)


That's it! Your model is trained now and ready for inference:

In [62]:
import torch
x = torch.randn((1, 3, 224, 224))
with torch.no_grad():
    y_hat = model(x)

y_hat.shape

torch.Size([1, 2])

## 7. Conclusion

In this tutorial, we learned how to start a new project using Classy Vision, how to perform training locally and how to do multi-GPU training on a single machine. We also saw how to use Tensorboard to visualize training progress, how to load models from checkpoints and how to resume training from a checkpoint file. In the next tutorials, we'll look into how to add custom datasets, models and loss functions to Classy Vision so you can adapt it to your needs, and how to launch distributed training on multiple nodes.