## Getting Started with PyTorch-Ignite on Cloud TPUs

This notebook is based on ["Getting Started with PyTorch on Cloud TPUs"](https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/getting-started.ipynb#scrollTo=RKLajLqUni6H) and will show you how to:

- Install PyTorch/XLA on Colab, which lets you use PyTorch with TPUs.
- Train a basic model on MNIST with PyTorch-Ignite.

PyTorch/XLA is a package that lets PyTorch connect to Cloud TPUs and use TPU cores as devices. Colab provides a free Cloud TPU system (a remote CPU host plus four TPU chips with two cores each), and installing PyTorch/XLA takes only a couple of minutes.


<h3>  &nbsp;&nbsp;Use Colab Cloud TPU&nbsp;&nbsp; <a href="https://cloud.google.com/tpu/"><img valign="middle" src="https://raw.githubusercontent.com/GoogleCloudPlatform/tensorflow-without-a-phd/master/tensorflow-rl-pong/images/tpu-hexagon.png" width="50"></a></h3>

* On the main menu, click Runtime and select **Change runtime type**. Set "TPU" as the hardware accelerator.
* The cell below makes sure you have access to a TPU on Colab.


In [None]:
import os
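# .get() returns None when the variable is missing, so the assertion message is shown instead of a KeyError
assert os.environ.get('COLAB_TPU_ADDR'), 'Make sure to select TPU from Runtime > Change runtime type > Hardware accelerator'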

## Installing PyTorch/XLA

Run the following cell (or copy it into your own notebook!) to install PyTorch, Torchvision, and PyTorch/XLA. It will take a couple of minutes to run.

The PyTorch/XLA package lets PyTorch connect to Cloud TPUs. (It's named PyTorch/XLA, not PyTorch/TPU, because XLA is the name of the TPU compiler.) In particular, PyTorch/XLA makes TPU cores available as PyTorch devices. This lets PyTorch create and manipulate tensors on TPUs.

In [None]:
VERSION = !curl -s https://api.github.com/repos/pytorch/xla/releases/latest | grep -Po '"tag_name": "v\K.*?(?=")'
VERSION = VERSION[0].rstrip('0').rstrip('.') # remove trailing zero
!pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-{VERSION}-cp37-cp37m-linux_x86_64.whl
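
The cell above relies on IPython shell magics, so the string handling is easy to miss. As a minimal sketch in plain Python, the snippet below shows how a release tag could be turned into the wheel URL. The helper names `normalize_version` and `wheel_url` are hypothetical and only for illustration; the URL pattern and the `cp37` tag are copied from the install cell.

```python
def normalize_version(tag_version: str) -> str:
    """Mirror the install cell: drop trailing zeros, then a trailing dot."""
    return tag_version.rstrip("0").rstrip(".")


def wheel_url(tag_version: str) -> str:
    """Build the wheel URL using the pattern from the install cell."""
    version = normalize_version(tag_version)
    return (
        "https://storage.googleapis.com/tpu-pytorch/wheels/"
        f"torch_xla-{version}-cp37-cp37m-linux_x86_64.whl"
    )


# A three-part tag such as "1.13.0" loses its patch-level zero.
print(normalize_version("1.13.0"))  # 1.13
print(wheel_url("1.13.0"))
```

Note that the stripping assumes a three-part tag: a two-part string such as `"1.10"` would be shortened to `"1.1"`. The `cp37` part of the file name means the wheel targets CPython 3.7.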

## Required Dependencies

PyTorch is assumed to be installed already. PyTorch-Ignite can be installed with `pip`:

In [None]:
!pip install pytorch-ignite

## Train a basic model on MNIST with PyTorch-Ignite

The PyTorch/XLA API is simple: PyTorch uses Cloud TPUs just like it uses CPU or CUDA devices, so we can train models with PyTorch and Ignite with only minor changes. We will use the code from this example: https://github.com/pytorch/ignite/blob/master/examples/mnist/mnist_with_tensorboard_on_tpu.py
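
To illustrate the "just like CPU or CUDA" idea, here is a hedged sketch of choosing a device name at runtime. The helper `pick_device_name` is hypothetical and not part of PyTorch, PyTorch/XLA, or Ignite. It only checks which packages are importable, so it also runs on a machine without a TPU.

```python
import importlib.util


def pick_device_name() -> str:
    """Return "xla", "cuda", or "cpu" depending on what is available."""
    # find_spec only looks the package up; it does not import it.
    if importlib.util.find_spec("torch_xla") is not None:
        return "xla"
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


print(pick_device_name())
```

On a TPU runtime this notebook obtains the actual device object with `xm.xla_device()` and moves the model to it with `model.to(device)`, the same call used for CPU or CUDA devices.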


### Import libraries

In [None]:
import torch
print("PyTorch version:", torch.__version__)

# imports the torch_xla package
import torch_xla
import torch_xla.core.xla_model as xm
print("PyTorch xla version:", torch_xla.__version__)

In [None]:
# Import PyTorch, Torchvision and Tensorboard
from torch.utils.data import DataLoader
from torch import nn
import torch.nn.functional as F
from torch.optim import SGD
from torchvision.datasets import MNIST
from torchvision.transforms import Compose, ToTensor, Normalize

from torch.utils.tensorboard import SummaryWriter

# Import PyTorch-Ignite
from ignite.engine import Events, create_supervised_trainer, create_supervised_evaluator
from ignite.metrics import Accuracy, Loss, RunningAverage
from ignite.contrib.handlers import ProgressBar

### Data processing

In [None]:
# Dataloaders
def get_data_loaders(train_batch_size, val_batch_size):
    data_transform = Compose([ToTensor(), Normalize((0.1307,), (0.3081,))])

    train_loader = DataLoader(
        MNIST(download=True, root=".", transform=data_transform, train=True), batch_size=train_batch_size, shuffle=True
    )

    val_loader = DataLoader(
        MNIST(download=False, root=".", transform=data_transform, train=False), batch_size=val_batch_size, shuffle=False
    )
    return train_loader, val_loader
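
As a small worked example of the transform above: `ToTensor()` scales pixel values to the range [0, 1], and `Normalize((0.1307,), (0.3081,))` then computes `(pixel - mean) / std` for the single MNIST channel. The sketch below reproduces that arithmetic in plain Python; the `normalize` helper is hypothetical and only for illustration.

```python
MNIST_MEAN = 0.1307  # mean passed to Normalize in the cell above
MNIST_STD = 0.3081   # standard deviation passed to Normalize


def normalize(pixel: float, mean: float = MNIST_MEAN, std: float = MNIST_STD) -> float:
    """Apply the same per-pixel formula as Normalize: (x - mean) / std."""
    return (pixel - mean) / std


# Black pixel, mean-valued pixel, and white pixel after ToTensor scaling.
for pixel in (0.0, MNIST_MEAN, 1.0):
    print(f"{pixel:.4f} -> {normalize(pixel):+.4f}")
```

A pixel equal to the mean maps to zero, darker pixels become negative, and white pixels end up a little below 3.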

In [None]:
train_batch_size = 64
val_batch_size = train_batch_size * 2

train_loader, val_loader = get_data_loaders(train_batch_size, val_batch_size)
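
To get a feel for these batch sizes, the sketch below works out how many iterations one pass over each split takes. It assumes the standard MNIST split sizes (60,000 training and 10,000 test images) and the `DataLoader` default of keeping the last, smaller batch.

```python
import math

TRAIN_SAMPLES = 60_000  # size of the MNIST training split
VAL_SAMPLES = 10_000    # size of the MNIST test split

train_batch_size = 64
val_batch_size = train_batch_size * 2

# The last partial batch is kept, so round up.
train_iters = math.ceil(TRAIN_SAMPLES / train_batch_size)
val_iters = math.ceil(VAL_SAMPLES / val_batch_size)

print(train_iters, val_iters)  # 938 79
```

With 938 training iterations per epoch, a handler that fires every 10 iterations would log roughly 93 points per epoch.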

### Create a model

In [None]:
# Setup a basic CNN
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=-1)
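
The `320` in `nn.Linear(320, 50)` and `x.view(-1, 320)` follows from the layer sizes. As a sketch of that arithmetic, assuming 28x28 MNIST inputs, no padding, and stride 1 for the convolutions (the hypothetical helpers `conv_out` and `pool_out` are only for illustration):

```python
def conv_out(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool_out(size: int, kernel_size: int) -> int:
    """Spatial output size of max pooling whose stride equals its kernel size."""
    return size // kernel_size


size = 28                              # MNIST images are 28x28
size = pool_out(conv_out(size, 5), 2)  # conv1 + pool: 28 -> 24 -> 12
size = pool_out(conv_out(size, 5), 2)  # conv2 + pool: 12 -> 8 -> 4
flat_features = 20 * size * size       # conv2 has 20 output channels

print(size, flat_features)  # 4 320
```

Changing the kernel sizes or the input resolution would change this number, and `fc1` would need to be resized to match.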

In [None]:
model = Net()
device = xm.xla_device()
model = model.to(device)  # Move model before creating optimizer

### Optimizer and trainers

In [None]:
optimizer = SGD(model.parameters(), lr=0.01, momentum=0.9)

# Create trainer and evaluator
trainer = create_supervised_trainer(
    model, 
    optimizer, 
    F.nll_loss, 
    device=device, 
    output_transform=lambda x, y, y_pred, loss: [loss.item(), ]
)

evaluator = create_supervised_evaluator(
    model, metrics={"accuracy": Accuracy(), "nll": Loss(F.nll_loss)}, device=device
)

### Handlers

In [None]:
# Set up event handlers
log_interval = 10

log_dir = "/tmp/tb_logs"

# writer
writer = SummaryWriter(log_dir=log_dir)

tracker = xm.RateTracker()

# Add RateTracker as an output of the training step
@trainer.on(Events.ITERATION_COMPLETED)
def add_rate_tracker(engine):
    # engine.state.batch is an (inputs, targets) pair, so count the inputs to get the batch size
    tracker.add(len(engine.state.batch[0]))
    engine.state.output.append(tracker.global_rate())

# Set up output values of the training step as EMA metrics
RunningAverage(output_transform=lambda x: x[0]).attach(trainer, "batch_loss")
RunningAverage(output_transform=lambda x: x[1]).attach(trainer, "global_rate")

# Let's log the EMA metrics every `log_interval` iterations
@trainer.on(Events.ITERATION_COMPLETED(every=log_interval))
def log_training_loss(engine):
    writer.add_scalar("training/batch_loss", engine.state.metrics["batch_loss"], engine.state.iteration)
    writer.add_scalar("training/global_rate", engine.state.metrics["global_rate"], engine.state.iteration)

# Set up a progress bar (tqdm) and display the batch loss and global rate metrics in the bar
pbar = ProgressBar()
pbar.attach(trainer, ["batch_loss", "global_rate"])

# Let's compute training metrics: average accuracy and loss
@trainer.on(Events.EPOCH_COMPLETED)
def log_training_results(engine):
    evaluator.run(train_loader)
    metrics = evaluator.state.metrics
    avg_accuracy = metrics["accuracy"]
    avg_nll = metrics["nll"]
    pbar.log_message(
        f"Training Results - Epoch: {engine.state.epoch}  Avg accuracy: {avg_accuracy:.2f} Avg loss: {avg_nll:.2f}"
    )
    writer.add_scalar("training/avg_loss", avg_nll, engine.state.epoch)
    writer.add_scalar("training/avg_accuracy", avg_accuracy, engine.state.epoch)

# Let's compute validation metrics: average accuracy and loss
@trainer.on(Events.EPOCH_COMPLETED)
def log_validation_results(engine):
    evaluator.run(val_loader)
    metrics = evaluator.state.metrics
    avg_accuracy = metrics["accuracy"]
    avg_nll = metrics["nll"]
    print(
        f"Validation Results - Epoch: {engine.state.epoch}  Avg accuracy: {avg_accuracy:.2f} Avg loss: {avg_nll:.2f}"
    )
    writer.add_scalar("validation/avg_loss", avg_nll, engine.state.epoch)
    writer.add_scalar("validation/avg_accuracy", avg_accuracy, engine.state.epoch)
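
The `batch_loss` and `global_rate` metrics above are exponential moving averages (EMA). As a minimal sketch of that smoothing in plain Python, assuming the usual EMA update with the first value used as the initial state and a default `alpha` of 0.98 (believed to match Ignite's `RunningAverage`, but treat it as an assumption):

```python
def running_average(values, alpha=0.98):
    """Return the exponential moving average after each value."""
    smoothed = []
    current = None
    for value in values:
        if current is None:
            # The first value initializes the average.
            current = value
        else:
            current = alpha * current + (1.0 - alpha) * value
        smoothed.append(current)
    return smoothed


# alpha=0.5 is used here only to make the numbers easy to follow.
losses = [2.0, 1.0, 1.0, 1.0]
print(running_average(losses, alpha=0.5))  # [2.0, 1.5, 1.25, 1.125]
```

A larger `alpha` reacts more slowly to new values, which is why the logged batch loss looks smoother than the raw per-iteration loss.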

In [None]:
# Display in Firefox may not work properly. Use Chrome.
%load_ext tensorboard

%tensorboard --logdir="/tmp/tb_logs"

In [None]:
# kick everything off
!rm -rf /tmp/tb_logs/*

trainer.run(train_loader, max_epochs=10)