## TensorBoard Basics
Visualize the graph and the loss with TensorBoard. This example uses the MNIST database of handwritten digits
(http://yann.lecun.com/exdb/mnist/).

- Author: Aymeric Damien
- Project: https://github.com/aymericdamien/TensorFlow-Examples/

#### 1. Import the MNIST dataset

In [1]:
from __future__ import print_function

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("data/", one_hot=True)

Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz


#### 2. Define parameters

In [2]:
# Hyperparameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_epoch = 1
logs_path = 'events/'   # TensorBoard event files go into an events folder created under the current directory

# TensorFlow graph inputs
# MNIST image shape: 28 * 28 = 784
x = tf.placeholder(tf.float32, [None, 784], name='InputData')
# The digits 0-9 are the 10 class labels
y = tf.placeholder(tf.float32, [None, 10], name='LabelData')

# Model parameters
W = tf.Variable(tf.zeros([784, 10]), name='Weights')
b = tf.Variable(tf.zeros([10]), name='Bias')

#### 3. Build the model

In [3]:
# Build the model and wrap each group of operations in a named scope,
# which makes the graph visualization in TensorBoard easier to read
with tf.name_scope('Model'):
    # Model: softmax regression
    pred = tf.nn.softmax(tf.matmul(x, W) + b)
with tf.name_scope('Loss'):
    # Minimize the loss using cross-entropy
    cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), axis=1))
with tf.name_scope('SGD'):
    # Gradient descent
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
with tf.name_scope('Accuracy'):
    # Accuracy: fraction of examples whose predicted class matches the label
    acc = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    acc = tf.reduce_mean(tf.cast(acc, tf.float32))

    
# Initialize the variables
init = tf.global_variables_initializer()

# Create a summary to monitor the cost
tf.summary.scalar('loss', cost)
# Create a summary to monitor the accuracy
tf.summary.scalar('accuracy', acc)
# Merge all summaries into a single operation
merged_summary_op = tf.summary.merge_all()
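
As a sanity check on the `Loss` scope, the cross-entropy can be worked out by hand for the starting point of training. The sketch below is plain Python rather than TensorFlow, and `softmax` and `cross_entropy` are illustrative helpers, not part of the notebook. Because `W` and `b` start at zero, every logit is 0, the prediction is uniform over the 10 classes, and the loss on the first batch should be close to ln 10 ≈ 2.3026. The per-epoch averages printed during training are lower because they average over a whole epoch of updates.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot_label, probs):
    """-sum(y * log(p)) for a single example, mirroring the Loss scope."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))

# With W and b initialised to zeros, every logit is 0 whatever the input is.
logits = [0.0] * 10
label = [0.0] * 10
label[3] = 1.0  # arbitrary class, chosen only for illustration

probs = softmax(logits)
loss = cross_entropy(label, probs)

print("uniform probability:", probs[0])
print("initial loss:", loss, "ln(10):", math.log(10))
```

Note that `tf.log(pred)` yields `-inf` if a predicted probability underflows to exactly 0; if that ever shows up as a NaN loss, computing the loss from the logits with `tf.nn.softmax_cross_entropy_with_logits` is the usual workaround.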
    

#### 4. Train the model

In [4]:
# Start training
with tf.Session() as sess:
    sess.run(init)
    
    # Write logs for TensorBoard
    summary_writer = tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())
    
    # Training loop
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples / batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run the optimization, loss, and summary operations
            _, c, summary = sess.run([optimizer, cost, merged_summary_op],
                                     feed_dict={x: batch_xs, y: batch_ys})
            # Write logs at every iteration
            summary_writer.add_summary(summary, epoch * total_batch + i)
            # Accumulate the average loss
            avg_cost += c / total_batch
        # Display logs per epoch
        if (epoch+1) % display_epoch == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Training finished!")

    # Flush pending events to disk and close the writer
    summary_writer.close()

    # Test the model
    # Compute accuracy on the test set
    print("Accuracy:", acc.eval({x: mnist.test.images, y: mnist.test.labels}))

    print("Run the command line:\n" \
          "--> tensorboard --logdir=events/ " \
          "\nThen open http://0.0.0.0:6006/ in your web browser")

Epoch: 0001 cost= 1.183988077
Epoch: 0002 cost= 0.665380242
Epoch: 0003 cost= 0.552933572
Epoch: 0004 cost= 0.498747118
Epoch: 0005 cost= 0.465548007
Epoch: 0006 cost= 0.442622204
Epoch: 0007 cost= 0.425540524
Epoch: 0008 cost= 0.412230093
Epoch: 0009 cost= 0.401390533
Epoch: 0010 cost= 0.392406998
Epoch: 0011 cost= 0.384808946
Epoch: 0012 cost= 0.378173454
Epoch: 0013 cost= 0.372396935
Epoch: 0014 cost= 0.367290784
Epoch: 0015 cost= 0.362745545
Epoch: 0016 cost= 0.358620725
Epoch: 0017 cost= 0.354896377
Epoch: 0018 cost= 0.351507486
Epoch: 0019 cost= 0.348323458
Epoch: 0020 cost= 0.345402256
Epoch: 0021 cost= 0.342800388
Epoch: 0022 cost= 0.340259916
Epoch: 0023 cost= 0.337950383
Epoch: 0024 cost= 0.335762308
Epoch: 0025 cost= 0.333713121
Training finished!
Accuracy: 0.9132
Run the command line:
--> tensorboard --logdir=events/ 
Then open http://0.0.0.0:6006/ in your web browser
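
The step index passed to `add_summary` in the training loop is `epoch * total_batch + i`, which gives every batch a unique, increasing position on the TensorBoard x-axis. The sketch below replays that arithmetic in plain Python. It assumes 55,000 training examples, the size of the training split that `read_data_sets` produces by default; the notebook itself does not print this number, so treat it as an assumption.

```python
# Assumed value: 55,000 examples in mnist.train (default split size).
num_examples = 55000
# These two mirror the hyperparameters defined in the notebook.
batch_size = 100
training_epochs = 25

total_batch = num_examples // batch_size  # batches per epoch

# Global step for every (epoch, batch) pair, in training order.
steps = [epoch * total_batch + i
         for epoch in range(training_epochs)
         for i in range(total_batch)]

print("batches per epoch:", total_batch)
print("first step:", steps[0], "last step:", steps[-1])
print("unique and increasing:", steps == list(range(len(steps))))
```

Under these assumptions each of the two scalar curves holds 13,750 points, one per batch, because the summary is written at every iteration rather than once per epoch.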


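### Supplement 1:
When the code above is run in IPython or Jupyter, the following error may appear:  
`[[Node: inputs/x_input = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]`  
The error has nothing to do with dtype. My current understanding is that the log directory must not contain more than one events file. Another likely cause is that re-running the cells in the same kernel adds the operations to the default graph a second time, so `tf.summary.merge_all()` also collects the summary ops left over from the earlier run, whose placeholders are never fed. Possible fixes:  
1. Delete the previously generated files from the log directory;
2. Call `tf.reset_default_graph()` before the code that defines the operations.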
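
Option 1 can be scripted so that stale files are cleared before each run. This is a minimal sketch using only the standard library; `clear_event_files` is a hypothetical helper, and it assumes the event files sit directly under the log directory and follow the `events.out.tfevents.*` naming that TensorFlow's file writer normally uses.

```python
import glob
import os

def clear_event_files(logs_path):
    """Delete TensorBoard event files directly under logs_path.

    Hypothetical helper: assumes the usual 'events.out.tfevents.*'
    file naming. Returns the list of removed paths.
    """
    pattern = os.path.join(logs_path, "events.out.tfevents.*")
    removed = []
    for path in glob.glob(pattern):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed

# 'events/' is the logs_path used in the notebook; if the folder does not
# exist yet, nothing matches and nothing is deleted.
removed = clear_event_files("events/")
print("removed", len(removed), "event file(s)")
```

Other files in the directory are left alone, so the helper can be called at the top of the notebook before the `FileWriter` is created.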

### Supplement 2:
When running `tensorboard --logdir='PATH'` from the command line, use an absolute path, and make sure the path contains no Chinese characters.
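
Both conditions can be checked from Python before launching TensorBoard. The sketch below is one way to do it with the standard library; `has_non_ascii` is a hypothetical helper, and it treats any character outside plain ASCII as a problem, which is a broader test than Chinese characters alone.

```python
import os

def has_non_ascii(path):
    """Return True if path contains any character outside plain ASCII."""
    return any(ord(ch) > 127 for ch in path)

logs_path = "events/"  # same relative path as in the notebook

# Resolve the relative path against the current working directory.
abs_logs_path = os.path.abspath(logs_path)

if has_non_ascii(abs_logs_path):
    print("Warning: the path contains non-ASCII characters:", abs_logs_path)

# Command to paste into a terminal (quote the path if it contains spaces).
print("tensorboard --logdir=" + abs_logs_path)
```

Printing the command from the same notebook that wrote the logs avoids guessing which working directory the relative `events/` path was resolved against.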