# trixi PyTorch Example

This example shows some basic logging and visualization features for a simple workflow in PyTorch.  
It assumes you have a visdom server running on port 8080 (the server is accessed over the network, so in principle you can also connect to it from a remote device):  
`python -m visdom.server -port 8080`  
Navigate your browser to `localhost:8080` to view the visualizations.
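Before running the cells below, it can help to confirm that something is actually listening on that port. The following is a minimal sketch using only the standard library. The helper name `is_port_open` is hypothetical and not part of trixi or visdom, and a successful TCP connection only shows that the port is open, not that the process behind it is visdom.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if is_port_open("localhost", 8080):
    print("Port 8080 is open; the visdom server is probably running.")
else:
    print("Nothing is listening on port 8080; start the server first.")
```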


In [1]:
from trixi import PytorchExperimentLogger
from trixi import PytorchVisdomLogger
from trixi import Config

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.optim as optim
from torchvision import datasets, transforms

Let's start by creating a `PytorchExperimentLogger` and a `PytorchVisdomLogger`.

In [6]:
Exp = PytorchExperimentLogger(base_dir="./experiment_dir", 
                              experiment_name="test-experiment",
                              folder_format="{experiment_name}")

Viz = PytorchVisdomLogger(name="main", port=8080)

The experiment logger creates a convenient folder structure for a given experiment (with name `experiment_name`) in the base directory `base_dir`. To better keep track of when a given experiment was done, `folder_format` allows using [Python's strftime directives](http://strftime.org/). An example folder format using the current date would be:  
`folder_format="%Y-%m-%d_%H-%M_{experiment_name}"`

For the sake of this basic example we'll stick to the simple folder format only using the experiment's name.
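To make the expansion of such a format string concrete, here is a small sketch using only the standard library. It assumes the pattern is expanded in two steps, `strftime` for the date directives and `str.format` for `{experiment_name}`. The helper `expand_folder_format` is hypothetical, and trixi's internal implementation may differ.

```python
from datetime import datetime


def expand_folder_format(folder_format, experiment_name, now=None):
    """Expand strftime directives, then fill in the experiment name."""
    now = now or datetime.now()
    # strftime leaves the "{experiment_name}" placeholder untouched.
    dated = now.strftime(folder_format)
    return dated.format(experiment_name=experiment_name)


fixed_time = datetime(2018, 5, 17, 14, 30)  # arbitrary timestamp for a reproducible result
print(expand_folder_format("%Y-%m-%d_%H-%M_{experiment_name}", "test-experiment", fixed_time))
# 2018-05-17_14-30_test-experiment
print(expand_folder_format("{experiment_name}", "test-experiment", fixed_time))
# test-experiment
```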

In [7]:
!ls experiment_dir

test-experiment


We see the base directory has been created containing our test experiment.

In [8]:
!ls experiment_dir/test-experiment/

checkpoint  config  img  log  plot  result  save


Let's now create a config for our experiment which basically works like a `dict`.
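As a rough illustration of what "works like a `dict`" means here, the sketch below defines a tiny stand-in class with both attribute and key access. `AttrDict` is a hypothetical name used only for this illustration. It is not trixi's `Config` implementation, which may behave differently in detail.

```python
class AttrDict(dict):
    """A dict whose keys can also be read and written as attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value


cfg = AttrDict()
cfg.batch_size = 64            # attribute-style assignment
cfg["learning_rate"] = 0.01    # key-style assignment

print(cfg["batch_size"], cfg.learning_rate)
print(sorted(cfg.keys()))
```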

In [9]:
config_dict = Config()

config_dict.batch_size = 64
config_dict.batch_size_test = 1000
config_dict.n_epochs = 10
config_dict.learning_rate = 0.01
config_dict.momentum = 0.9
config_dict.use_cuda = torch.cuda.is_available()
config_dict.rnd_seed = 1
config_dict.log_interval = 200

Time to save this config.

In [10]:
Exp.save_config(config_dict, name="test-config")

The experiment logger's `save_config` method automatically encodes our config in JSON and saves the file in our config directory.
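Because the config is stored as plain JSON, it can be read back with nothing but the standard library. The sketch below assumes a plain `dict` standing in for the config and writes to a temporary directory instead of the experiment folder.

```python
import json
import os
import tempfile

# Plain dict standing in for the config object (an assumption of this sketch).
config = {"batch_size": 64, "learning_rate": 0.01, "use_cuda": False}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test-config.json")

    # Write the config with indentation so the file is human-readable.
    with open(path, "w") as f:
        json.dump(config, f, indent=4)

    # Read it back; JSON booleans and numbers map to Python bool, int, and float.
    with open(path) as f:
        restored = json.load(f)

print(restored == config)
```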

In [12]:
!ls experiment_dir/test-experiment/config/

test-config.json


In [13]:
!tail -n 10 experiment_dir/test-experiment/config/test-config.json

{
    "batch_size": 64,
    "batch_size_test": 1000,
    "n_epochs": 10,
    "learning_rate": 0.01,
    "momentum": 0.9,
    "use_cuda": true,
    "rnd_seed": 1,
    "log_interval": 200
}

Now let's continue with our PyTorch workflow - in this case a simple ConvNet trained on MNIST.

In [14]:
kwargs = {'num_workers': 1, 'pin_memory': True} if config_dict.use_cuda else {}

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('experiment_dir/data', train=True, download=True, transform=transforms.ToTensor()),
    batch_size=config_dict.batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('experiment_dir/data', train=False, transform=transforms.ToTensor()),
    batch_size=config_dict.batch_size_test, shuffle=True, **kwargs)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Processing...
Done!


In [15]:
# build a simple cnn model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
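The `320` input features of `fc1` follow from the layer shapes. As a worked check, assuming 28x28 MNIST inputs and the defaults of `nn.Conv2d` (stride 1, no padding) and `F.max_pool2d` (stride equal to the kernel size), the spatial size can be traced in plain Python. The helper names are illustrative only.

```python
def conv_out(size, kernel_size):
    """Output size of a convolution with stride 1 and no padding."""
    return size - kernel_size + 1


def pool_out(size, kernel_size):
    """Output size of max pooling with stride equal to the kernel size."""
    return size // kernel_size


size = 28                              # MNIST images are 28x28
size = pool_out(conv_out(size, 5), 2)  # conv1 + pooling: 28 -> 24 -> 12
size = pool_out(conv_out(size, 5), 2)  # conv2 + pooling: 12 -> 8 -> 4

channels = 20                          # output channels of conv2
print(channels * size * size)          # 320
```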

In [16]:
model = Net()

if config_dict.use_cuda:
    model.cuda()
    
optimizer = optim.SGD(model.parameters(), lr=config_dict.learning_rate,
                     momentum=config_dict.momentum)

During training we'd like a simple visualization of the training loss, and every `log_interval` batches we also want to save our model.  
Our visdom logger's `show_value` creates a line plot and appends each new value to it on every call. We could also set an appendix to the environment name here so the plot is displayed in a different environment on the visdom server, but we'll stick to the default.  
Using our experiment logger, we can save the model by providing a name and an iteration number. The files are automatically saved in the checkpoint directory of our experiment.
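Conceptually, each call to `show_value` adds one point to the line plot identified by `name`. The sketch below mimics that bookkeeping in memory, without a visdom server. `ValueRecorder` is a hypothetical stand-in and not part of trixi, and the per-name x-axis counter is an assumption of this sketch.

```python
from collections import defaultdict


class ValueRecorder:
    """Collects values per plot name, one point per call."""

    def __init__(self):
        self.series = defaultdict(list)

    def show_value(self, value, name):
        # The x-coordinate is the number of values seen so far for this name.
        x = len(self.series[name])
        self.series[name].append((x, value))


rec = ValueRecorder()
for loss in [2.3, 1.1, 0.7]:
    rec.show_value(loss, name="Training Loss")
rec.show_value(0.14, name="Test Loss")

print(rec.series["Training Loss"])   # [(0, 2.3), (1, 1.1), (2, 0.7)]
print(len(rec.series["Test Loss"]))  # 1
```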

In [17]:
# train
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        if config_dict.use_cuda:
            data, target = data.cuda(), target.cuda()
        # since PyTorch 0.4 tensors track gradients directly, so no Variable wrapper is needed
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        # plot the training loss (.item() converts the 0-dim loss tensor to a Python float)
        Viz.show_value(loss.item(), name='Training Loss')
        loss.backward()
        optimizer.step()
        if batch_idx % config_dict.log_interval == 0:
            print('Train Epoch: {} [{}/{} samples ({:.0f}%)]\t Batch Loss: {:.6f}'
                  .format(epoch, batch_idx * len(data), 
                          len(train_loader.dataset),
                          100. * batch_idx / len(train_loader), 
                          loss.item()))
            # save the current model weights
            Exp.save_model(model, name="MNIST_ConvNet", n_iter=batch_idx)

For every epoch we also want to keep track of our model's performance on the test set. Again we can use `show_value` to display a line plot of the test loss for every epoch.

In [18]:
# test
def test():
    model.eval()
    test_loss = 0
    correct = 0
    # disable gradient tracking during evaluation (replaces the old volatile=True flag)
    with torch.no_grad():
        for data, target in test_loader:
            if config_dict.use_cuda:
                data, target = data.cuda(), target.cuda()
            output = model(data)
            # sum the per-sample losses so we can average over the whole test set below
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            # index of the highest log-probability is the predicted class
            pred = output.max(1, keepdim=True)[1]
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    # plot the test loss
    Viz.show_value(test_loss, name='Test Loss')
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
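The test loop sums the per-sample losses and divides by the dataset size once at the end, rather than averaging per-batch means. The two agree when all batches have the same size, as they do here (10000 test samples in batches of 1000), but not in general. A small sketch with made-up loss values shows the difference:

```python
# Made-up per-sample losses, split into two batches of unequal size.
batches = [[0.2, 0.4, 0.6, 0.8], [1.0]]
n_samples = sum(len(b) for b in batches)

# Sum over all samples, then divide once (what the test loop does).
mean_over_samples = sum(sum(b) for b in batches) / n_samples

# Average the per-batch means instead; the small batch gets too much weight.
mean_of_batch_means = sum(sum(b) / len(b) for b in batches) / len(batches)

print(round(mean_over_samples, 4))    # 0.6
print(round(mean_of_batch_means, 4))  # 0.75
```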

Let's go ahead and train our net for the number of epochs saved in the config.

In [19]:
for epoch in range(1, config_dict.n_epochs+1):
    train(epoch)
    test()


Test set: Average loss: 0.1445, Accuracy: 9553/10000 (96%)


Test set: Average loss: 0.1007, Accuracy: 9667/10000 (97%)


Test set: Average loss: 0.0750, Accuracy: 9760/10000 (98%)


Test set: Average loss: 0.0621, Accuracy: 9805/10000 (98%)


Test set: Average loss: 0.0653, Accuracy: 9813/10000 (98%)


Test set: Average loss: 0.0542, Accuracy: 9827/10000 (98%)


Test set: Average loss: 0.0568, Accuracy: 9835/10000 (98%)


Test set: Average loss: 0.0490, Accuracy: 9838/10000 (98%)


Test set: Average loss: 0.0458, Accuracy: 9851/10000 (99%)


Test set: Average loss: 0.0446, Accuracy: 9862/10000 (99%)



Now we should also see our saved model checkpoints.

In [20]:
!ls experiment_dir/test-experiment/checkpoint/

MNIST_ConvNet_00000.pth  MNIST_ConvNet_00400.pth  MNIST_ConvNet_00800.pth
MNIST_ConvNet_00200.pth  MNIST_ConvNet_00600.pth
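The five file names can be reproduced from the settings above. With 60000 training images and a batch size of 64 there are 938 batches per epoch, so `batch_idx % log_interval == 0` holds for indices 0, 200, 400, 600 and 800. Since `n_iter` is the batch index within an epoch, later epochs reuse the same five names, which is why only five files remain after ten epochs. The naming pattern in this sketch is inferred from the listing above, not taken from trixi's source:

```python
import math

n_train = 60000      # size of the MNIST training set
batch_size = 64
log_interval = 200

batches_per_epoch = math.ceil(n_train / batch_size)
logged = [i for i in range(batches_per_epoch) if i % log_interval == 0]

# Pattern inferred from the directory listing: name, underscore, 5-digit iteration.
names = ["MNIST_ConvNet_{:05d}.pth".format(i) for i in logged]

print(batches_per_epoch)  # 938
print(names)
```

Passing a counter that keeps increasing across epochs as `n_iter` would give every checkpoint its own file name.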


Great! That's it for a very basic use case. Going forward, one could, for example, combine the two loggers used in this example using the `CombinedLogger` class. The benefit lies in the methods supported by both parts of the combined logger: plots, for example, can simultaneously be displayed on the visdom server and saved in the image directory of the experiment. Convenient, right?
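The idea behind combining loggers can be sketched in a few lines: a wrapper forwards each method call to every member logger that implements it. The classes below are hypothetical stand-ins written for this illustration. They are not trixi's `CombinedLogger`, whose constructor and behavior should be checked in the trixi documentation.

```python
class FanOutLogger:
    """Forwards a method call to every wrapped logger that provides it."""

    def __init__(self, *loggers):
        self.loggers = loggers

    def __getattr__(self, method_name):
        # Only reached for names not found on FanOutLogger itself.
        targets = [lg for lg in self.loggers if hasattr(lg, method_name)]
        if not targets:
            raise AttributeError(method_name)

        def call_all(*args, **kwargs):
            return [getattr(lg, method_name)(*args, **kwargs) for lg in targets]

        return call_all


class ScreenLogger:
    def show_value(self, value, name):
        return "screen: {}={}".format(name, value)


class FileLogger:
    def show_value(self, value, name):
        return "file: {}={}".format(name, value)

    def save_model(self, name):
        return "saved {}".format(name)


combined = FanOutLogger(ScreenLogger(), FileLogger())
print(combined.show_value(0.5, name="loss"))  # handled by both loggers
print(combined.save_model("net"))             # handled only by FileLogger
```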