## Tensor

In TensorFlow, data isn't stored as integers, floats, or strings. These values are encapsulated in an object called a tensor.<br>
In `hello_constant = tf.constant('Hello World!')`, `hello_constant` is a 0-dimensional string tensor.

In [45]:
import tensorflow as tf

import warnings
warnings.filterwarnings('ignore')

# Create TensorFlow object called hello_constant
hello_constant = tf.constant('Hello World!')

with tf.Session() as sess:
    # Run the tf.constant operation in the session
    output = sess.run(hello_constant)
    print(output)

b'Hello World!'


The tensor returned by `tf.constant()` is called a constant tensor because its value never changes.

In [46]:
# A is a 0-dimensional int32 tensor
A = tf.constant(1234)

# B is a 1-dimensional int32 tensor
B = tf.constant([123,456,789]) 

# C is a 2-dimensional int32 tensor
C = tf.constant([ [123,456,789], [222,333,444] ])

## Session
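TensorFlow's API is built around the idea of a computational graph, a way of visualizing a mathematical process (more details in MiniFlow).<br>
A TensorFlow session is an environment for running a graph. The session is in charge of allocating the operations to GPU(s) and/or CPU(s), including on remote machines.<br>
<img src="image/session.png">
`tf.Session` creates a session instance, here named `sess`. The `sess.run()` function then evaluates the tensor and returns the result.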

## Input
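What if you want to use a non-constant? This is where `tf.placeholder()` and `feed_dict` come in. Below are the basics of feeding data into TensorFlow.<br>
Unfortunately, you can't simply assign a dataset to `x` and pass it to TensorFlow, because over time you'll want your TensorFlow model to take in different datasets with different parameters. What you need is `tf.placeholder()`.<br>
<br>
`tf.placeholder()` returns a tensor that gets its value from the data passed to `tf.Session.run()`, which lets you set the input right before the session runs.<br>
<br>
Use the `feed_dict` parameter of `tf.Session.run()` to set the value of the placeholder tensor.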

In [47]:
x = tf.placeholder(tf.string)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Hello World'})

In [48]:
x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32)
z = tf.placeholder(tf.float32)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Test String', y: 123, z: 45.67})

## TensorFlow Math

In [49]:
x = tf.add(5, 2)  
y = tf.subtract(10, 4) 
z = tf.multiply(2, 5)  

#m = tf.subtract(tf.constant(2.0), tf.constant(1))  # error, value type mismatch
m = tf.subtract(tf.cast(tf.constant(2.0), tf.int32), tf.constant(1))  # success, type conversion

Quiz:

In [50]:
x = tf.placeholder(tf.int32) # 10
y = tf.placeholder(tf.int32) # 2
z = tf.placeholder(tf.int32) # 1

m = tf.subtract(tf.divide(x, y), tf.cast(z, tf.float64))

with tf.Session() as sess:
    output = sess.run(m, feed_dict={x:10, y:2, z:1})
    print(output)

4.0


## Linear Function

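The goal of training a neural network is to update the weights and biases so that it predicts the targets better. To use weights and biases, you need a tensor that can be modified. That rules out `tf.placeholder()` and `tf.constant()`, since those tensors can't be modified. This is where `tf.Variable` comes in.<br>
<br>
A `tf.Variable` tensor stores its state in the session, so its state must be initialized manually. Use the `tf.global_variables_initializer()` function to initialize all variable tensors.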

In [51]:
x = tf.Variable(5)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

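`tf.global_variables_initializer()` returns an operation that initializes all TensorFlow variables in the graph. Run that operation in a session to initialize the variables defined above.<br>
<br>
The `tf.Variable` class lets us change the weights and biases of the network, but an initial value still has to be chosen. Initializing the weights with random numbers from a normal distribution is good practice. Randomizing the weights helps keep the model from getting stuck in the same place every time you train it.<br>
<br>
Choosing weights from a normal distribution also prevents any single weight from overwhelming the others. You can use the `tf.truncated_normal()` function to generate random numbers from a normal distribution whose magnitude is no more than 2 standard deviations from the mean.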
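
To make the sampling idea concrete, here is a minimal pure-Python sketch of a truncated normal draw: any value more than two standard deviations from the mean is discarded and drawn again. The helper names `truncated_normal_sample` and `truncated_normal_matrix` are illustrative assumptions, not TensorFlow APIs; `tf.truncated_normal()` handles this internally.

```python
import random

def truncated_normal_sample(mean=0.0, stddev=1.0, rng=random):
    """Draw one value from a normal distribution, re-drawing any value
    that lies more than 2 standard deviations from the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value

def truncated_normal_matrix(n_rows, n_cols, mean=0.0, stddev=1.0, seed=None):
    """Build an n_rows x n_cols list of lists of truncated-normal values."""
    rng = random.Random(seed)
    return [[truncated_normal_sample(mean, stddev, rng) for _ in range(n_cols)]
            for _ in range(n_rows)]

# Hypothetical stand-in for a (n_features, n_labels) weight matrix.
weights_sketch = truncated_normal_matrix(120, 5, seed=0)
largest = max(abs(w) for row in weights_sketch for w in row)
print(len(weights_sketch), len(weights_sketch[0]), largest <= 2.0)
```

Because outliers are re-drawn rather than clipped, no initial weight starts far larger in magnitude than the rest.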

In [52]:
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))

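Since the weights are already randomized to help the model avoid getting stuck, the bias doesn't need to be randomized as well. Simply set the bias to 0.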

In [53]:
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))
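
With weights and a bias in hand, the linear function used in the quiz that follows is xW + b. The sketch below works through that arithmetic on plain Python lists so the shapes are easy to follow. `linear_sketch` and the small example values are hypothetical; they only mirror what `tf.add(tf.matmul(input, w), b)` computes.

```python
def linear_sketch(inputs, weights, bias):
    """Compute inputs @ weights + bias for nested Python lists.

    inputs:  list of rows, each with n_features values
    weights: n_features rows, each with n_labels values
    bias:    n_labels values, added to every output row
    """
    outputs = []
    for row in inputs:
        out_row = []
        for j in range(len(bias)):
            total = sum(row[i] * weights[i][j] for i in range(len(row)))
            out_row.append(total + bias[j])
        outputs.append(out_row)
    return outputs

x_demo = [[1.0, 2.0]]                      # 1 sample, 2 features
w_demo = [[0.5, -1.0, 0.0],
          [0.25, 0.0, 2.0]]                # 2 features, 3 labels
b_demo = [0.0, 0.0, 0.0]                   # zero bias, as in the text
logits_demo = linear_sketch(x_demo, w_demo, b_demo)
print(logits_demo)  # [[1.0, -1.0, 4.0]]
```

Each output row has one value per label, which is why the bias has `n_labels` entries.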

Quiz:

In [54]:
from tensorflow.examples.tutorials.mnist import input_data
import warnings
warnings.filterwarnings('ignore')

import tensorflow as tf

def get_weights(n_features, n_labels):
    return tf.Variable(tf.truncated_normal((n_features, n_labels)))

def get_biases(n_labels):
    return tf.Variable(tf.zeros(n_labels))

def linear(input, w, b):
    return tf.add(tf.matmul(input, w), b)

def mnist_features_labels(n_labels):
    mnist_features = []
    mnist_labels = []

    mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])

    return mnist_features, mnist_labels


n_features = 784
n_labels = 3

features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)

w = get_weights(n_features, n_labels)
b = get_biases(n_labels)

logits = linear(features, w, b)


train_features, train_labels = mnist_features_labels(n_labels)

with tf.Session() as session:
    # Initialize session variables
    #init = tf.global_variables_initializer()
    session.run(tf.global_variables_initializer())
    
    # Softmax
    prediction = tf.nn.softmax(logits)

    # Cross entropy
    cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)

    # Training loss
    loss = tf.reduce_mean(cross_entropy)

    # Rate at which the weights are changed
    learning_rate = 0.08

    # Gradient Descent
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Run optimizer and get loss
    _, l = session.run(
        [optimizer, loss],
        feed_dict={features: train_features, labels: train_labels})

print('Loss: {}'.format(l))


Extracting /datasets/ud730/mnist\train-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\train-labels-idx1-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-labels-idx1-ubyte.gz
Loss: 2.8693926334381104


## Softmax
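The softmax function squashes its inputs, typically called logits or logit scores, into values between 0 and 1, and normalizes the outputs so that they sum to 1. This means the output of the softmax function is equivalent to a categorical probability distribution. That makes it a natural choice for the output activation of a network that predicts multiple classes.<br>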
<img src="image/softmax.png">
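
As a rough illustration, softmax can be written in a few lines of plain Python: exponentiate each logit, then divide by the sum of the exponentials. The function name `softmax_sketch` is a hypothetical helper. Subtracting the largest logit first is a common numerical-stability trick that does not change the result.

```python
import math

def softmax_sketch(logits):
    """Return softmax probabilities for a flat list of logits."""
    shift = max(logits)                       # stability: largest exponent is 0
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_sketch([2.0, 1.0, 0.1])
print(probs)        # roughly 0.66, 0.24, 0.10
print(sum(probs))   # 1.0 up to floating-point rounding
```

The largest logit receives the largest probability, and the ordering of the inputs is preserved.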

In [55]:
x = tf.nn.softmax([2.0, 1.0, 0.2])

In [56]:
def run():
    output = None
    logit_data = [2.0, 1.0, 0.1]
    logits = tf.placeholder(tf.float32)
    
    softmax = tf.nn.softmax(logits)
    
    with tf.Session() as sess:
        output = sess.run(softmax, feed_dict={logits: logit_data})

    return output

## Cross Entropy
<img src="image/crossentropy.png"> <br>
To create a cross entropy function in TensorFlow, you need two new functions: `tf.reduce_sum()` and `tf.log()`.
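
Assuming the usual definition of cross entropy between a one-hot label vector and a softmax output (the negative sum of each label times the log of its predicted probability), the computation can be sketched in plain Python. `cross_entropy_sketch` is a hypothetical helper name, not a TensorFlow function.

```python
import math

def cross_entropy_sketch(probabilities, one_hot_labels):
    """Return -sum(label * log(probability)) over all classes."""
    return -sum(label * math.log(p)
                for p, label in zip(probabilities, one_hot_labels))

ce_sketch = cross_entropy_sketch([0.7, 0.2, 0.1], [1.0, 0.0, 0.0])
print(ce_sketch)  # about 0.3567, i.e. -ln(0.7)
```

With a one-hot label, only the term for the true class survives, so the loss is simply the negative log of the probability assigned to the correct class.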

Quiz:

In [57]:
softmax_data = [0.7, 0.2, 0.1]
one_hot_data = [1.0, 0.0, 0.0]

softmax = tf.placeholder(tf.float32)
one_hot = tf.placeholder(tf.float32)

cross_entropy = -tf.reduce_sum(tf.multiply(one_hot, tf.log(softmax)))

with tf.Session() as sess:
    ce = sess.run(cross_entropy, feed_dict={softmax:softmax_data, one_hot:one_hot_data})
    print(ce)
    

0.35667497


## Mini-batching
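Mini-batching is a technique for training on a small subset of the dataset at a time instead of on the whole dataset at once. It lets a computer that lacks the memory to hold the entire dataset still train the model.<br>
<br>
Mini-batching is computationally inefficient, since you can't calculate the loss across all samples simultaneously. However, that is a small price to pay for being able to run the model at all.<br>
<br>
It is also quite useful in combination with stochastic gradient descent (SGD). The idea is to randomly shuffle the data at the start of each epoch, then create the mini-batches. For each mini-batch, you train the network weights with gradient descent. Because the batches are random, you are in effect performing SGD with each batch.<br>
<br>
Unfortunately, it is sometimes impossible to divide the data into batches of exactly equal size. For example, suppose you have 1000 data points and want 128 data points per batch. Since 1000 is not divisible by 128, you end up with 7 batches of 128 data points and 1 batch of 104 data points (7 × 128 + 1 × 104 = 1000).<br>
<br>
Because the number of data points per batch can vary, use TensorFlow's `tf.placeholder()` function to receive the varying batch sizes.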
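
The shuffle-then-split procedure and the uneven last batch can be sketched in plain Python. The helper `shuffled_batches` and the stand-in data (`range(1000)`) are assumptions made for illustration, not part of the course code.

```python
import random

def shuffled_batches(samples, batch_size, seed=None):
    """Shuffle a copy of the samples, then slice it into consecutive batches.
    The final batch is smaller when len(samples) is not a multiple of batch_size."""
    order = list(samples)
    random.Random(seed).shuffle(order)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

batches_sketch = shuffled_batches(range(1000), 128, seed=0)
sizes = [len(batch) for batch in batches_sketch]
print(len(sizes), sizes[0], sizes[-1])  # 8 128 104
```

Every sample still appears exactly once per pass; only the order and the grouping change.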

In [58]:
# Features and Labels
n_input = 784
n_classes = 10

features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32,   [None, n_classes])

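The `None` dimension is a placeholder for the batch size. At runtime, TensorFlow accepts any batch size greater than 0. Going back to the earlier example, this setup lets you feed `features` and `labels` into the model whether a batch contains 128 or 104 data points.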

Quiz:

In [59]:
import math
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)
    output_batches = []

    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)

    return output_batches


learning_rate = 0.001
n_input = 784  
n_classes = 10  

mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

train_features = mnist.train.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

logits = tf.add(tf.matmul(features, weights), bias)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


batch_size = 32

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    
    for batch_features, batch_labels in batches(batch_size, train_features, train_labels):
        sess.run(optimizer, feed_dict={features: batch_features, labels: batch_labels})

    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

print('Test Accuracy: {}'.format(test_accuracy))


Extracting /datasets/ud730/mnist\train-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\train-labels-idx1-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-labels-idx1-ubyte.gz
Test Accuracy: 0.21119999885559082


## Epochs
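An epoch is a single forward and backward pass over the whole dataset. Training for multiple epochs is used to improve the model's accuracy without requiring more data.<br>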
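
In bookkeeping terms, each mini-batch triggers one weight update, so the total number of updates is the number of epochs times the number of batches per epoch. The sketch below assumes a training set of 55,000 examples, which is an assumption about the MNIST split and is not stated in this notebook. `total_updates` is a hypothetical helper.

```python
import math

def total_updates(n_samples, batch_size, epochs):
    """Number of optimizer steps when every batch triggers one update."""
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * epochs

n_train_assumed = 55000  # assumed training-set size, for illustration only
updates = total_updates(n_train_assumed, batch_size=128, epochs=10)
print(updates)  # 4300
```

Raising the number of epochs increases the number of updates without needing any new data.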


In [60]:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np



def print_epoch_stats(epoch_i, sess, last_features, last_labels):
    """
    Print cost and validation accuracy of an epoch
    """
    current_cost = sess.run(
        cost,
        feed_dict={features: last_features, labels: last_labels})
    valid_accuracy = sess.run(
        accuracy,
        feed_dict={features: valid_features, labels: valid_labels})
    print('Epoch: {:<4} - Cost: {:<8.3} Valid Accuracy: {:<5.3}'.format(
        epoch_i,
        current_cost,
        valid_accuracy))

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

# The features are already scaled and the data is shuffled
train_features = mnist.train.images
valid_features = mnist.validation.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
valid_labels = mnist.validation.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
learning_rate = tf.placeholder(tf.float32)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()

batch_size = 128
epochs = 10
learn_rate = 0.001

train_batches = batches(batch_size, train_features, train_labels)

with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch_i in range(epochs):

        # Loop over all batches
        for batch_features, batch_labels in train_batches:
            train_feed_dict = {
                features: batch_features,
                labels: batch_labels,
                learning_rate: learn_rate}
            sess.run(optimizer, feed_dict=train_feed_dict)

        # Print cost and validation accuracy of an epoch
        print_epoch_stats(epoch_i, sess, batch_features, batch_labels)

    # Calculate accuracy for test dataset
    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

print('Test Accuracy: {}'.format(test_accuracy))

Extracting /datasets/ud730/mnist\train-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\train-labels-idx1-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-labels-idx1-ubyte.gz
Epoch: 0    - Cost: 11.6     Valid Accuracy: 0.144
Epoch: 1    - Cost: 10.7     Valid Accuracy: 0.15 
Epoch: 2    - Cost: 10.0     Valid Accuracy: 0.161
Epoch: 3    - Cost: 9.52     Valid Accuracy: 0.169
Epoch: 4    - Cost: 9.07     Valid Accuracy: 0.181
Epoch: 5    - Cost: 8.66     Valid Accuracy: 0.196
Epoch: 6    - Cost: 8.29     Valid Accuracy: 0.215
Epoch: 7    - Cost: 7.94     Valid Accuracy: 0.235
Epoch: 8    - Cost: 7.61     Valid Accuracy: 0.25 
Epoch: 9    - Cost: 7.3      Valid Accuracy: 0.266
Test Accuracy: 0.28060001134872437


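Lowering the learning rate requires more epochs, but can ultimately achieve better accuracy.<br>
<br>
In the upcoming TensorFlow Lab, choose your own learning rate, number of epochs, and batch size to improve the model's accuracy.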
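
As a toy illustration of the learning-rate trade-off, the sketch below runs plain gradient descent on the one-dimensional function f(w) = (w - 3)^2 and counts the steps needed to get close to the minimum. The function, the target value, and the tolerance are all assumptions chosen for illustration. The sketch only shows that a smaller learning rate needs more steps; it does not demonstrate the accuracy benefit.

```python
def steps_to_converge(learning_rate, start=0.0, target=3.0,
                      tol=1e-3, max_steps=100000):
    """Run gradient descent on f(w) = (w - target)**2 and return the number
    of steps taken until |w - target| < tol (or max_steps if never reached)."""
    w = start
    for step in range(1, max_steps + 1):
        gradient = 2.0 * (w - target)
        w -= learning_rate * gradient
        if abs(w - target) < tol:
            return step
    return max_steps

fast = steps_to_converge(0.1)
slow = steps_to_converge(0.01)
print(fast, slow)  # the smaller learning rate needs many more steps
```

On a real network, each of these steps corresponds to a batch update, so a smaller learning rate translates into more epochs.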