### TensorFlow MNIST
---
Build a prediction model with `tensorflow` using `Softmax Regression`.

* Load the dataset
    
    * `mnist.train` : 55,000 rows of training data
    * `mnist.validation` : 5,000 rows of validation data
    * `mnist.test` : 10,000 rows of test data
   
    * Each data unit has two parts: the image and the label for that image
        * `mnist.train.images`
        * `mnist.train.labels`
        
* `Softmax Regression`
    
    [Softmax for MNIST in tensorflow](http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_beginners.html)
    
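* Background notes
    
    1. Python is not good at running large-scale computations quickly, so `tensorflow` provides a mechanism for defining a graph in Python and then executing it outside Python in a more efficient language
    
    2. Once the computation graph is defined, the model can also be run on other devices, even on a mobile phone
    
    3. Notes on the key nodes used in the code
        
        1. `tf.nn.softmax` : computes the `softmax` of the input tensor
        
        2. `tf.log` : computes the natural log of **each element of the tensor**
        
        3. `tf.train.GradientDescentOptimizer(0.01).minimize(loss)` : uses gradient descent with a learning rate of `0.01` to minimize the `loss` function, so that the predicted distribution moves as close as possible to the true distribution
        
        4. `tf.argmax(data, axis)` : returns the index of the largest value in `data` along `axis`; for a 2-D tensor, `axis = 0` searches down each column (one index per column) and `axis = 1` searches across each row (one index per row)
        
        5. `tf.cast(data, dtype, name)` : casts the data to a new data type
    
* Code walkthrough and explanation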

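As a rough illustration of what the `tf.nn.softmax` and `tf.log` nodes described above compute, the sketch below reimplements them in NumPy together with the summed cross-entropy. The helper names `softmax` and `cross_entropy` and the sample numbers are made up for this example; they are not part of the TensorFlow API.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(y_true, y_pred):
    """Summed cross-entropy, the NumPy analogue of -tf.reduce_sum(y_ * tf.log(y))."""
    return -np.sum(y_true * np.log(y_pred))

# Two hypothetical samples with three classes each.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = cross_entropy(labels, probs)

print(probs.sum(axis=1))  # every row is a probability distribution
print(loss)
```

Because the labels are one-hot, only the log-probability of the true class contributes to the loss for each sample.
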

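The `axis` argument of `tf.argmax` and the role of `tf.cast` in the accuracy computation are easier to see on a tiny array. The sketch below uses NumPy's `np.argmax` and `astype` as stand-ins, on the assumption that they behave like the TensorFlow ops in this simple case; the arrays are invented for illustration.

```python
import numpy as np

# Hypothetical predictions for 3 samples over 3 classes.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1],
                   [0.3, 0.6, 0.1]])
# Hypothetical one-hot labels: true classes 1, 0, 2.
one_hot = np.array([[0, 1, 0],
                    [1, 0, 0],
                    [0, 0, 1]])

col_argmax = np.argmax(scores, axis=0)  # one index per column
row_argmax = np.argmax(scores, axis=1)  # one index per row: the predicted class

# Compare predicted and true classes, then cast booleans to floats and average.
correct = row_argmax == np.argmax(one_hot, axis=1)
accuracy = correct.astype(np.float32).mean()

print(col_argmax, row_argmax)
print(correct, accuracy)  # 2 of the 3 samples are classified correctly
```

For classification, `axis = 1` is the useful choice because each row holds the scores of one sample.
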
In [1]:
# ---------- Data input ---------- #
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""Softmax regression on MNIST (TensorFlow 1.x graph API)."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets

dataset = read_data_sets('./MNIST_data', one_hot=True)

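# ---------- Define the computation graph (added to the default dataflow graph) ---------- #
# Placeholder node for the input
x = tf.placeholder(tf.float32, [None, 784])    # any number of images, each flattened to 784 pixels

# Model parameters; 10 is the number of output nodes, one probability per digit 0-9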
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Broadcasting (the same rule NumPy uses) applies here:
# tf.matmul(x, w) has shape [None, 10]
# b has shape [10] and is broadcast to [None, 10]
# y = softmax(tf.matmul(x, w) + b) has shape [None, 10]
y = tf.nn.softmax(tf.matmul(x, w) + b)

# ---------- Training ---------- #
# Cross-entropy is the loss function; it measures how far the predicted distribution is from the true one
y_ = tf.placeholder(tf.float32, [None, 10])    # true distribution (one-hot labels)
loss = - tf.reduce_sum(y_ * tf.log(y))

train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)    # initialize the variables
    for i in range(1000):
        batch_x, batch_y = dataset.train.next_batch(100)    # fetch a random batch of 100 training samples: mini-batch stochastic gradient descent
        sess.run(train, feed_dict={x: batch_x, y_:batch_y})
        print("Loop %d ... " % (i + 1), end = '\r')
        
    # ---------- Evaluation ---------- #
    correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, "float"))    # mean of the 0/1 correctness values, i.e. the accuracy
    xs, ys_ = dataset.test.images, dataset.test.labels
    print(' ' * 20)    # overwrite the leftover progress line
    print('Accuracy:', sess.run(accuracy, feed_dict={x: xs, y_:ys_}))

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./MNIST_data/train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./MNIST_data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting ./MNIST_data/t10k-images-idx3-ubyte.gz
Extracting ./MNIST_data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
                    
Accuracy: 0.915
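
The broadcasting step noted in the cell's comments (adding a length-10 bias to a `[None, 10]` matrix) can be checked with a small NumPy sketch. The batch size of 4 and the bias values are arbitrary choices for this example, not values from the notebook.

```python
import numpy as np

batch = 4                                    # stands in for the None dimension
x = np.ones((batch, 784), dtype=np.float32)  # a fake batch of flattened images
w = np.zeros((784, 10), dtype=np.float32)    # zero-initialized weights, as in the cell
b = np.arange(10, dtype=np.float32)          # distinct bias values so the effect is visible

# x @ w has shape (4, 10); b has shape (10,) and is added to every row.
logits = x @ w + b

print(logits.shape)  # (4, 10)
print(logits[0])     # identical to b, because x @ w is all zeros here
```
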

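`tf.train.GradientDescentOptimizer(0.01).minimize(loss)` hides the actual parameter update. The following is a minimal NumPy sketch of what such updates look like for softmax regression, assuming the summed cross-entropy loss from the cell, for which the gradient with respect to the logits is `y - y_`. The data is random and synthetic, and the names (`forward`, `delta`, `losses`) are hypothetical; this is not TensorFlow's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)                  # fixed seed, synthetic data
n, d, k = 8, 5, 3                               # samples, features, classes
x = rng.random((n, d))
y_true = np.eye(k)[rng.integers(0, k, size=n)]  # one-hot labels

w = np.zeros((d, k))                            # zero initialization, as in the cell
b = np.zeros(k)
lr = 0.01                                       # same step size as the notebook

def forward(x, w, b):
    """Softmax of x @ w + b, computed row by row in a numerically stable way."""
    logits = x @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

losses = []
for _ in range(50):
    y = forward(x, w, b)
    losses.append(-np.sum(y_true * np.log(y)))  # summed cross-entropy
    delta = y - y_true                          # gradient of the loss w.r.t. the logits
    w -= lr * (x.T @ delta)                     # gradient step on the weights
    b -= lr * delta.sum(axis=0)                 # gradient step on the bias

print(losses[0], losses[-1])                    # the loss goes down over the steps
```

With zero-initialized parameters every class starts at probability 1/3, so the first loss value equals `n * log(k)`.
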