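TensorBoard is a handy tool for visualizing training results, and it is a common companion when working with TensorFlow or Keras. Let's start with the `logger.py` file, which defines a `Logger` class with the following methods:
 - `__init__`: the constructor; takes `log_dir` as its argument and creates that directory if it does not exist
 - `scalar_summary`: writes a scalar value; takes `tag`, `value`, and `step` as arguments
 - `image_summary`: writes a list of images; takes `tag`, `images`, and `step` as arguments
 - `histo_summary`: writes a histogram of a tensor's values; mostly used to visualize parameters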
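
To make the last item more concrete: a histogram summary is essentially a set of per-bucket counts paired with each bucket's upper edge. The following is a minimal NumPy sketch of that pairing; the sample `values` array is made up for illustration and is not taken from the network in this notebook.

```python
import numpy as np

# Hypothetical sample data standing in for a layer's weights.
values = np.array([0.0, 0.1, 0.2, 0.9, 1.0])

# np.histogram returns `bins` counts and `bins + 1` edges.
counts, bin_edges = np.histogram(values, bins=2)

# Dropping the first edge leaves exactly one upper limit per bucket,
# so counts and limits can be stored as two parallel lists.
bucket_limits = bin_edges[1:]

print(counts.tolist())         # [3, 2]
print(bucket_limits.tolist())  # [0.5, 1.0]
```

This is why `histo_summary` discards `bin_edges[0]` before filling `bucket_limit` and `bucket`.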

In [1]:
# Code referenced from https://gist.github.com/gyglim/1f8dfb1b5c82627ae3efcfbbadb9f514
import tensorflow as tf
import numpy as np
import scipy.misc 
try:
    from StringIO import StringIO  # Python 2.7
except ImportError:
    from io import BytesIO         # Python 3.x


class Logger(object):
    
    def __init__(self, log_dir):
        """Create a summary writer logging to log_dir."""
        self.writer = tf.summary.FileWriter(log_dir)

    def scalar_summary(self, tag, value, step):
        """Log a scalar variable."""
        summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)])
        self.writer.add_summary(summary, step)

    def image_summary(self, tag, images, step):
        """Log a list of images."""

        img_summaries = []
        for i, img in enumerate(images):
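            # Write the image to an in-memory buffer
            # (StringIO only exists on Python 2; Python 3 falls back to BytesIO)
            try:
                s = StringIO()
            except NameError:
                s = BytesIO()
            # Note: toimage was deprecated in SciPy 1.0.0 and removed in 1.2.0
            scipy.misc.toimage(img).save(s, format="png")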

            # Create an Image object
            img_sum = tf.Summary.Image(encoded_image_string=s.getvalue(),
                                       height=img.shape[0],
                                       width=img.shape[1])
            # Create a Summary value
            img_summaries.append(tf.Summary.Value(tag='%s/%d' % (tag, i), image=img_sum))

        # Create and write Summary
        summary = tf.Summary(value=img_summaries)
        self.writer.add_summary(summary, step)
        
    def histo_summary(self, tag, values, step, bins=1000):
        """Log a histogram of the tensor of values."""

        # Create a histogram using numpy
        counts, bin_edges = np.histogram(values, bins=bins)

        # Fill the fields of the histogram proto
        hist = tf.HistogramProto()
        hist.min = float(np.min(values))
        hist.max = float(np.max(values))
        hist.num = int(np.prod(values.shape))
        hist.sum = float(np.sum(values))
        hist.sum_squares = float(np.sum(values**2))

        # Drop the start of the first bin
        bin_edges = bin_edges[1:]

        # Add bin edges and counts
        for edge in bin_edges:
            hist.bucket_limit.append(edge)
        for c in counts:
            hist.bucket.append(c)

        # Create and write Summary
        summary = tf.Summary(value=[tf.Summary.Value(tag=tag, histo=hist)])
        self.writer.add_summary(summary, step)
        self.writer.flush()
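
`image_summary` relies on `scipy.misc.toimage`, which is no longer available from SciPy 1.2.0 onward. One possible replacement, sketched here under the assumption that Pillow is installed, is to scale the array yourself and encode it with `Image.fromarray`. The helper name `encode_png` is hypothetical, and the min/max scaling only roughly mimics what `toimage` did by default.

```python
from io import BytesIO

import numpy as np
from PIL import Image


def encode_png(img):
    """Scale a 2-D array to 0..255 and return PNG bytes (hypothetical helper)."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi > lo:
        scaled = (img - lo) / (hi - lo) * 255.0
    else:
        # A constant image has no range to stretch; encode it as all zeros.
        scaled = np.zeros_like(img)
    buf = BytesIO()
    Image.fromarray(scaled.astype(np.uint8)).save(buf, format="PNG")
    return buf.getvalue()


# Random 28x28 array standing in for one MNIST digit.
png_bytes = encode_png(np.random.rand(28, 28))
```

The returned bytes could then be passed as `encoded_image_string` in place of `s.getvalue()`.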

Let's use a simple network to see how TensorBoard is used. As usual, start by importing the required packages.

In [3]:
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms


# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Load the MNIST dataset and create a DataLoader for it.

In [4]:
dataset = torchvision.datasets.MNIST(root='../../data', 
                                     train=True, 
                                     transform=transforms.ToTensor(),  
                                     download=True)


data_loader = torch.utils.data.DataLoader(dataset=dataset, 
                                          batch_size=100, 
                                          shuffle=True)

Define a fully connected network with a single hidden layer.

In [6]:
class NeuralNet(nn.Module):
    def __init__(self, input_size=784, hidden_size=500, num_classes=10):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)  
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
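
As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below uses plain Python and a hypothetical helper `linear_params`; it assumes each fully connected layer has a weight matrix plus one bias per output, which is the default for `nn.Linear`.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


input_size, hidden_size, num_classes = 784, 500, 10

fc1 = linear_params(input_size, hidden_size)   # 784*500 + 500
fc2 = linear_params(hidden_size, num_classes)  # 500*10 + 10
total = fc1 + fc2

print(fc1, fc2, total)  # 392500 5010 397510
```

These roughly 400k values, spread over four tensors (`fc1.weight`, `fc1.bias`, `fc2.weight`, `fc2.bias`), are what the parameter histograms summarize.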

Create an instance of the network.

In [7]:
model = NeuralNet().to(device)

**Create a Logger object.**

In [8]:
logger = Logger('./logs')

Define the loss function and the optimizer.

In [9]:
criterion = nn.CrossEntropyLoss()  
optimizer = torch.optim.Adam(model.parameters(), lr=0.00001)  

data_iter = iter(data_loader)
iter_per_epoch = len(data_loader)
total_step = 50000
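
An iterator obtained from `iter(data_loader)` covers the dataset only once, so training for `total_step` steps means it has to be restarted along the way. As an alternative sketch (not what this notebook does), a small generator can hide the restart. The name `cycle_batches` is hypothetical, and a plain list stands in for the DataLoader so the example runs on its own.

```python
def cycle_batches(loader):
    """Yield batches forever, starting a new pass whenever one finishes."""
    while True:
        for batch in loader:
            yield batch


# A plain list stands in for a DataLoader here (assumption for illustration).
fake_loader = [[0, 1], [2, 3], [4]]

batches = cycle_batches(fake_loader)
first_five = [next(batches) for _ in range(5)]

print(first_five)  # [[0, 1], [2, 3], [4], [0, 1], [2, 3]]
```

With a real DataLoader created with `shuffle=True`, each new pass would also draw a fresh shuffle, because every `for` loop over the loader creates a new iterator.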

Start training.

In [None]:
for step in range(total_step):
    
    # Restart the iterator once per epoch
    if (step+1) % iter_per_epoch == 0:
        data_iter = iter(data_loader)

    # Fetch a batch of training data
    images, labels = next(data_iter)
    images, labels = images.view(images.size(0), -1).to(device), labels.to(device)
    
    # Forward pass and loss computation
    outputs = model(images)
    loss = criterion(outputs, labels)
    
    # Backward pass and parameter update
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Compute the accuracy on the current batch
    _, argmax = torch.max(outputs, 1)
    accuracy = (labels == argmax.squeeze()).float().mean()

    if (step+1) % 100 == 0:
        print ('Step [{}/{}], Loss: {:.4f}, Acc: {:.2f}' 
               .format(step+1, total_step, loss.item(), accuracy.item()))

        # ================================================================== #
        #                        Tensorboard Logging                         #
        # ================================================================== #

        # 1. Collect the scalar values to write to TensorBoard
        info = { 'loss': loss.item(), 'accuracy': accuracy.item() }

        # 2. Log each scalar with logger.scalar_summary
        for tag, value in info.items():
            logger.scalar_summary(tag, value, step+1)

        # 3. Log histograms of parameter values and gradients with logger.histo_summary
        for tag, value in model.named_parameters():
            tag = tag.replace('.', '/')
            logger.histo_summary(tag, value.data.cpu().numpy(), step+1)
            logger.histo_summary(tag+'/grad', value.grad.data.cpu().numpy(), step+1)

        # 4. Log the first 10 input images with logger.image_summary
        info = { 'images': images.view(-1, 28, 28)[:10].cpu().numpy() }

        for tag, images in info.items():
            logger.image_summary(tag, images, step+1)

Step [100/50000], Loss: 2.2100, Acc: 0.41


`toimage` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use Pillow's ``Image.fromarray`` directly instead.


Step [200/50000], Loss: 2.1023, Acc: 0.63
Step [300/50000], Loss: 1.9762, Acc: 0.75
Step [400/50000], Loss: 1.8005, Acc: 0.82
Step [500/50000], Loss: 1.7678, Acc: 0.77
Step [600/50000], Loss: 1.5943, Acc: 0.81
Step [700/50000], Loss: 1.4049, Acc: 0.84
Step [800/50000], Loss: 1.2297, Acc: 0.80
Step [900/50000], Loss: 1.2623, Acc: 0.83
Step [1000/50000], Loss: 1.1100, Acc: 0.86
Step [1100/50000], Loss: 1.0732, Acc: 0.80
Step [1200/50000], Loss: 1.1394, Acc: 0.81
Step [1300/50000], Loss: 1.1252, Acc: 0.76
Step [1400/50000], Loss: 0.8352, Acc: 0.85
Step [1500/50000], Loss: 0.9541, Acc: 0.77
Step [1600/50000], Loss: 0.7346, Acc: 0.93
Step [1700/50000], Loss: 0.8345, Acc: 0.83
Step [1800/50000], Loss: 0.7983, Acc: 0.83
Step [1900/50000], Loss: 0.6986, Acc: 0.90
Step [2000/50000], Loss: 0.7485, Acc: 0.80
Step [2100/50000], Loss: 0.6486, Acc: 0.85
Step [2200/50000], Loss: 0.5744, Acc: 0.87
Step [2300/50000], Loss: 0.6069, Acc: 0.88
Step [2400/50000], Loss: 0.4930, Acc: 0.94
Step [2500/50000], 