# What is TensorBoard?
TensorBoard is a visualization tool for machine learning. It can be used to track metrics such as loss and accuracy, visualize the model graph, and much more. Let's see it in action.

The very first thing we do is import all the libraries we will need for this tutorial. To be able to use TensorBoard we first need to install it, which can be done with this command:
`pip3 install tensorboard`

In [None]:
import matplotlib.pyplot as plt
import numpy as np

import torch
import torchvision
import torchvision.transforms as transforms

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

device = ("cuda" if torch.cuda.is_available() else "cpu") # Use GPU or CPU for training

For this tutorial we are going to use the LeNet5 architecture that we defined in a previous post.

First we create a transform variable, which is used to transform the data as we load it. It resizes each image to 32x32 and converts it into a tensor.

Next we select our train and test data and apply the transformation directly.

Finally we wrap each dataset in a data loader. We split the data into batches of size 64 and enable shuffling.

In [None]:
# Used to transform our data
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor()])

# Load our datasets
trainset = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=True,
    transform=transform)
testset = torchvision.datasets.FashionMNIST('./data',
    download=True,
    train=False,
    transform=transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)

# Our classes
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
        'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz



Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz



Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz



Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz



Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Processing...
Done!




For this post we are going to use the same LeNet5 that we covered in a previous post. If you want to learn more about it, check that post out.

In [None]:
class LeNet5(nn.Module):

    def __init__(self):
        super(LeNet5, self).__init__()
        
        self.convolutional_layer = nn.Sequential(            
            nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1),
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
            nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1),
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
            nn.Conv2d(in_channels=16, out_channels=120, kernel_size=5, stride=1),
            nn.Tanh()
        )

        self.linear_layer = nn.Sequential(
            nn.Linear(in_features=120, out_features=84),
            nn.Tanh(),
            nn.Linear(in_features=84, out_features=10),
        )


    def forward(self, x):
        x = self.convolutional_layer(x)
        x = torch.flatten(x, 1)
        # Return raw scores (logits): nn.CrossEntropyLoss applies
        # log-softmax itself, so no softmax is applied here
        x = self.linear_layer(x)
        return x

model = LeNet5().to(device)
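
The resize to 32x32 matters for this network: the first linear layer expects exactly 120 features, which only works if the last convolution outputs a 1x1 map. The following sketch traces the spatial size through the layers using the standard output-size formula for convolution and pooling; the helper names `conv_out` and `lenet5_feature_size` are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet5_feature_size(size):
    """Trace the side length of the feature map through the LeNet5 layers."""
    size = conv_out(size, kernel=5)            # Conv2d(1, 6, kernel_size=5)
    size = conv_out(size, kernel=2, stride=2)  # AvgPool2d(2, 2)
    size = conv_out(size, kernel=5)            # Conv2d(6, 16, kernel_size=5)
    size = conv_out(size, kernel=2, stride=2)  # AvgPool2d(2, 2)
    size = conv_out(size, kernel=5)            # Conv2d(16, 120, kernel_size=5)
    return size


# 32 -> 28 -> 14 -> 10 -> 5 -> 1, so flattening gives 120 * 1 * 1 = 120 features
print(lenet5_feature_size(32))

# 28 -> 24 -> 12 -> 8 -> 4, and the last 5x5 kernel no longer fits
print(lenet5_feature_size(28))
```

A result below 1 for the unresized 28x28 input means the final kernel is larger than its input, which is why the images are resized before being fed to the model.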





For the optimizer we use Adam, and for the loss function we use cross-entropy (`nn.CrossEntropyLoss`).

In [None]:
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
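
One detail worth knowing about `nn.CrossEntropyLoss` is that it expects raw, unnormalized scores (logits) and applies log-softmax internally. The sketch below checks this on random numbers by comparing it with an explicit log-softmax followed by the negative log-likelihood loss; the tensors `logits` and `targets` are made up for the check.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up scores for 4 samples and 10 classes, plus made-up labels
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 9, 1])

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, targets)

# The same value computed by hand: log-softmax, then negative log-likelihood
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(loss, manual))
```

Because of this, a model trained with this loss should return the output of its last linear layer directly rather than applying a softmax first.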

Now for the interesting part: using TensorBoard. We import "SummaryWriter" from `torch.utils.tensorboard`, which is the main entry point for logging data.

For this example we take the first batch of images and labels, which is used only here. The torchvision library has a useful function for creating an image grid, which we will use.

To add our image to TensorBoard we simply call the "add_image" function with a title. Just like that, we have added our first image.

We will also plot our model graph in TensorBoard using the "add_graph" function.

In [None]:
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()

images, labels = next(iter(trainloader))
images, labels = images.to(device), labels.to(device)

img_grid = torchvision.utils.make_grid(images) # Create a grid of images

writer.add_image('Fashion_mnist_images', img_grid) # Used to add images to tensorboard

writer.add_graph(model, images) # Draw our graph in tensorboard
writer.close()

To view what we have logged, run the command below. By default, all the data is logged inside a folder called "runs". To open TensorBoard, go to this URL in your browser: `http://localhost:6006/`

In [None]:
!tensorboard --logdir=runs

2020-08-19 10:36:22.907564: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.3.0 at http://localhost:6006/ (Press CTRL+C to quit)
^C


We can see that an image was added; it represents the first batch of images. On the left side, TensorBoard gives us two sliders. The first one changes the brightness of the images and the second one changes the contrast.

We can also check what our model looks like. In the top part of TensorBoard there is a navigation button called "GRAPHS". If we click on it, it takes us to our model. By default we see an op-level graph, which shows the operations that make up our model and how they are connected. Examining the op-level graph can give us insight into how to change our model.

The direction of the arrows shows which way the tensors are flowing. To inspect a layer more closely, expand it by double-clicking on it.

The next step is to measure our training loss and accuracy. We are going to train for 20 epochs; to see how the model is trained in more detail, check out the previous [post](https://).

We write to TensorBoard every second epoch, saving the loss and the accuracy for both training and testing.

A lot of information can be logged for one experiment. To avoid cluttering the UI and to group results better, we can name plots hierarchically. For example, "Accuracy/Training" and "Accuracy/Testing" will be grouped together.
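
As a minimal sketch of hierarchical naming with "add_scalar", the snippet below logs two accuracy curves under a shared "Accuracy/" prefix. The numbers in `fake_history` are invented for illustration, and the writer logs to a temporary directory so it does not touch the "runs" folder; `demo_writer` is a separate writer used only here.

```python
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

# Assumption: a throwaway directory, so this demo does not mix with real runs
log_dir = tempfile.mkdtemp()
demo_writer = SummaryWriter(log_dir=log_dir)

# Invented (training, testing) accuracy values, one pair per epoch
fake_history = [(0.62, 0.60), (0.71, 0.68), (0.78, 0.73)]

for epoch, (train_acc, test_acc) in enumerate(fake_history):
    # Tags that share the "Accuracy/" prefix are grouped in the same section
    demo_writer.add_scalar('Accuracy/Training', train_acc, epoch)
    demo_writer.add_scalar('Accuracy/Testing', test_acc, epoch)

demo_writer.close()

# The scalars end up in an event file inside the log directory
event_files = [f for f in os.listdir(log_dir) if 'tfevents' in f]
print(event_files)
```

Pointing TensorBoard at that directory with `--logdir` would show both plots under one "Accuracy" section.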

Now, by refreshing the page, we can see that a new tab has appeared: "SCALARS". If we click on it we see all the plots that we have added. Hovering the mouse over a plot shows a value and the epoch at which that value was captured. To change how smooth a line is, move the smoothing slider.

This way each plot is separate. If we want to see how they overlap, for example the training and testing accuracy, we can draw them on one plot. This is done by replacing the "add_scalar" function with the "add_scalars" function and passing a dictionary of the values we want to show on that plot.

In [None]:
for epoch in range(20):  # loop over the dataset multiple times
    print("Epoch: ", epoch)

    # ---- Training ----
    model.train()
    total_train_loss = 0.0
    correct = 0
    for image, label in trainloader:
        image, label = image.to(device), label.to(device)
        optimizer.zero_grad()

        pred = model(image)
        loss = criterion(pred, label)

        loss.backward()
        optimizer.step()

        total_train_loss += loss.item()  # accumulate once per batch
        # The predicted class is the index of the highest score
        correct += (pred.argmax(dim=1) == label).sum().item()

    total_train_acc = correct / len(trainset)
    total_train_loss = total_train_loss / len(trainloader)  # mean loss per batch

    # ---- Evaluation ----
    model.eval()
    total_test_loss = 0.0
    correct = 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for image, label in testloader:
            image, label = image.to(device), label.to(device)

            pred = model(image)
            loss = criterion(pred, label)

            total_test_loss += loss.item()
            correct += (pred.argmax(dim=1) == label).sum().item()

    total_test_acc = correct / len(testset)
    total_test_loss = total_test_loss / len(testloader)

    if epoch % 2 == 0:  # log every second epoch
        writer.add_scalars('Loss',
                           {'Training': total_train_loss,
                            'Testing': total_test_loss},
                           epoch)

        writer.add_scalars('Accuracy',
                           {'Training': total_train_acc,
                            'Testing': total_test_acc},
                           epoch)

writer.close()  # make sure all pending events are written to disk
print('Finished Training')

Epoch:  0
Epoch:  1
Epoch:  2
Epoch:  3
Epoch:  4
Epoch:  5
Epoch:  6
Epoch:  7
Epoch:  8
Epoch:  9
Epoch:  10
Epoch:  11
Epoch:  12
Epoch:  13
Epoch:  14
Epoch:  15
Epoch:  16
Epoch:  17
Epoch:  18
Epoch:  19
Finished Training


Now we can clearly see how our training and testing loss and accuracy compare.
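
The accuracy shown in these plots is simply the fraction of samples whose highest-scoring class matches the label. Here is a minimal sketch of that count on hand-written tensors; the values in `scores` and `labels` are invented for illustration.

```python
import torch

# Invented model outputs for 4 samples and 3 classes
scores = torch.tensor([[0.1, 2.0, -1.0],
                       [1.5, 0.2, 0.3],
                       [0.0, 0.1, 3.0],
                       [0.9, 0.8, 0.7]])
labels = torch.tensor([1, 0, 2, 1])

# Index of the largest score in each row is the predicted class
predicted = scores.argmax(dim=1)

# Count the matches for the whole batch at once
correct = (predicted == labels).sum().item()
accuracy = correct / len(labels)

print(predicted.tolist())  # [1, 0, 2, 0]
print(correct, accuracy)   # 3 0.75
```

Applying a softmax before taking the argmax would not change the predicted classes, because softmax preserves the ordering of the scores.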

In [None]:
!zip -r /content/runs.zip /content/runs

  adding: content/runs/ (stored 0%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/ (stored 0%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/events.out.tfevents.1597833482.512ecdee9637.100.1 (deflated 5%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/Loss_Testing/ (stored 0%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/Loss_Testing/events.out.tfevents.1597833482.512ecdee9637.100.3 (deflated 45%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/Accuracy_Testing/ (stored 0%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/Accuracy_Testing/events.out.tfevents.1597833482.512ecdee9637.100.5 (deflated 49%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/Loss_Training/ (stored 0%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/Loss_Training/events.out.tfevents.1597833482.512ecdee9637.100.2 (deflated 46%)
  adding: content/runs/Aug19_10-36-13_512ecdee9637/events.out.tfevents.1597833374.512ecdee9637.100.0 (deflated 16%)
  adding: content/runs/Aug19_10-36-13_512