# CLOUD GPU TRAINING 

This notebook provides a minimal example of how to configure and run a training job on a cloud GPU (e.g., Kaggle, Colab, or Deepnote).

## 1. Setting up the environment

In [1]:
import os

In [5]:
!git clone https://github.com/the-dharma-bum/Pytorch-Vision-Transformers-CIFAR10

Cloning into 'Pytorch-Vision-Transformers-CIFAR10'...
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (15/15), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 15 (delta 4), reused 15 (delta 4), pack-reused 0
Unpacking objects: 100% (15/15), done.


In [7]:
os.chdir('Pytorch-Vision-Transformers-CIFAR10')

In [9]:
!pip install -r requirements.txt

Collecting einops==0.3.0
  Downloading einops-0.3.0-py2.py3-none-any.whl (25 kB)
Collecting pytorch_lightning_bolts==0.2.5
  Downloading pytorch_lightning_bolts-0.2.5-py3-none-any.whl (190 kB)
Collecting dataclasses==0.6
  Downloading dataclasses-0.6-py3-none-any.whl (14 kB)
Collecting pytorch_lightning==1.0.5
  Downloading pytorch_lightning-1.0.5-py3-none-any.whl (559 kB)
Installing collected packages: einops, pytorch-lightning, pytorch-lightning-bolts, dataclasses
  Attempting uninstall: pytorch-lightning
    Found existing installation: pytorch-lightning 1.0.4
    Uninstalling pytorch-lightning-1.0.4:
      Successfully uninstalled pytorch-lightning-1.0.4
Successfully installed dataclasses-0.6 einops-0.3.0 pytorch-lightning-1.0.5 pytorch-lightning-bolts-0.2.5
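
Before configuring training, it can be worth confirming that the runtime actually exposes a GPU. The cell below is a minimal sketch, assuming PyTorch is preinstalled on the hosted runtime (it is not listed in the install log above); `describe_device` is a hypothetical helper name, not part of the repository.

```python
import importlib.util


def describe_device() -> str:
    """Return a short description of the compute device PyTorch can see."""
    # Guard the import so the cell still runs if torch is missing.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    if torch.cuda.is_available():
        # Index 0 is the first visible CUDA device.
        return f"cuda: {torch.cuda.get_device_name(0)}"
    return "cpu only"


print(describe_device())
```

If this reports "cpu only", the accelerator is probably not enabled in the notebook settings, and `Trainer(gpus=1)` would be expected to fail later.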


## 2. Configure training

In [16]:
from pl_bolts.datamodules import CIFAR10DataModule
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import LearningRateMonitor, EarlyStopping
from config import TrainConfig
from model import LightningModel

In [14]:
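# Field comments assume the usual ViT naming; config.py is the authoritative reference.
cfg = TrainConfig(
    rootdir          = './cifar10/',  # dataset directory
    train_batch_size = 32,
    val_batch_size   = 32,
    num_workers      = 4,             # dataloader worker processes
    patch_size       = 4,             # side length of each square patch, in pixels
    dim              = 512,           # token embedding dimension
    depth            = 6,             # number of transformer blocks
    heads            = 8,             # attention heads per block
    mlp_dim          = 1024,          # hidden size of each feed-forward layer
    dropout_rate     = 0.1,           # dropout inside the transformer
    emb_dropout_rate = 0.1            # dropout applied to the patch embeddings
)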
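
To give a feel for what `patch_size = 4` implies, the arithmetic below works out how many patches a CIFAR-10 image is split into. This is a sketch that assumes the model follows the standard ViT patching scheme (non-overlapping square patches, each flattened before projection to `dim`); the repository's `model.py` is the reference for what actually happens.

```python
image_size = 32  # CIFAR-10 images are 32x32 pixels
channels = 3     # RGB
patch_size = 4   # same value as cfg.patch_size

# Non-overlapping patches only tile the image if the sizes divide evenly.
assert image_size % patch_size == 0

num_patches = (image_size // patch_size) ** 2  # patches per image
patch_dim = channels * patch_size ** 2         # values in one flattened patch

print(f"{num_patches} patches of {patch_dim} values each")
```

Under these assumptions each image becomes a sequence of 64 tokens (plus one if the model prepends a class token), so a smaller `patch_size` lengthens the sequence and raises the attention cost.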

In [20]:
data = CIFAR10DataModule(cfg.rootdir)

In [21]:
model = LightningModel(cfg)

In [22]:
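# Log the learning rate so schedule changes are visible in the logger.
lr_logger      = LearningRateMonitor()
# Stop once val_loss has failed to improve by at least min_delta
# for `patience` consecutive validation checks.
early_stopping = EarlyStopping(monitor   = 'val_loss',
                               mode      = 'min',
                               min_delta = 0.001,
                               patience  = 10,
                               verbose   = True)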
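
The interaction between `min_delta` and `patience` can be illustrated with a small stand-alone simulation. This is a simplified re-implementation for illustration only, not Lightning's own code; `early_stop_epoch` is a hypothetical helper and the loss values are made up.

```python
def early_stop_epoch(val_losses, min_delta=0.001, patience=10):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    wait = 0  # consecutive checks without a sufficient improvement
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:
            # Improvement is large enough: record it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Three improving epochs followed by a plateau (illustrative values).
losses = [1.0, 0.9, 0.8] + [0.8] * 10
print(early_stop_epoch(losses))
```

With the settings used in this notebook, a run that plateaus therefore continues for ten more validation checks before it is stopped.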

> TPUs should be usable by changing gpus=1 to tpu_cores=1, but I haven't tested this yet.

In [23]:
trainer = Trainer(gpus=1, callbacks = [lr_logger, early_stopping])

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
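
One way to make the GPU/TPU switch explicit is to build the device arguments separately and unpack them into `Trainer`. This is a sketch only: `accelerator_kwargs` is a hypothetical helper, it relies on the `gpus` and `tpu_cores` arguments of `Trainer` in pytorch_lightning 1.0.5, and the TPU path is just as untested as the note above says.

```python
def accelerator_kwargs(use_tpu: bool = False) -> dict:
    """Return the device-related keyword arguments for Trainer."""
    if use_tpu:
        # Untested assumption: a single TPU core.
        return {"tpu_cores": 1}
    return {"gpus": 1}


print(accelerator_kwargs())

# Intended use in this notebook:
# trainer = Trainer(callbacks=[lr_logger, early_stopping], **accelerator_kwargs())
```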


## 3. Run training

In [None]:
trainer.fit(model, data)