# 10.2 Training an MLP with TensorFlow's High-Level API
Let's train an MLP with the high-level API, TF.Learn.
The DNNClassifier class makes it easy to train a deep neural network with any number of hidden layers and a softmax output layer.
As an example, we classify the MNIST data with a DNN that has two hidden layers (300 nodes in the first, 100 in the second) and a softmax output layer with 10 nodes.
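
To get a feel for the size of this network, here is a minimal sketch that counts its trainable parameters. It assumes every layer is fully connected with one bias per neuron, and that the input has 28×28 = 784 features as in MNIST. The helper `count_parameters` is a hypothetical name used only for this illustration.

```python
# Layer sizes from the text: 784 inputs, hidden layers of 300 and 100 nodes, 10 outputs.
layer_sizes = [28 * 28, 300, 100, 10]

def count_parameters(sizes):
    """Count the weights and biases of a fully connected network (illustrative helper)."""
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per neuron
    return total

n_params = count_parameters(layer_sizes)
print(n_params)  # 266610
```

Most of the parameters sit in the first hidden layer (784 × 300 weights), which is typical when the input is much wider than the hidden layers.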

In [22]:
# Import the data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/")
X_train = mnist.train.images
X_test = mnist.test.images
y_train = mnist.train.labels.astype("int")
y_test = mnist.test.labels.astype("int")

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


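The following code prepares and trains the model.
1. First, it creates real-valued feature columns from the training set (infer_real_valued_columns_from_input).
2. Next, it creates the DNNClassifier (DNNClassifier).
3. It wraps the DNNClassifier in the Scikit-Learn compatibility helper (a wrapper that lets you call fit, predict, and so on, just as in Scikit-Learn).
4. Finally, it trains for 40,000 steps with a batch size of 50.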
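
As a rough sanity check on step 4, the sketch below converts 40,000 steps at batch size 50 into epochs. It assumes the training split holds 55,000 images, which is what `read_data_sets` leaves after holding out its default 5,000 validation images; change `n_train` if your split differs.

```python
batch_size = 50
steps = 40000
n_train = 55000  # assumption: default training split returned by read_data_sets

instances_seen = batch_size * steps   # total training instances processed
epochs = instances_seen / n_train     # passes over the training set

print(instances_seen)    # 2000000
print(round(epochs, 1))  # 36.4
```

So the high-level run sees each training image roughly 36 times.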

In [4]:
import tensorflow as tf
config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not in the textbook

# textbook code below
feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[300,100], n_classes=10, feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=40000)

INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x11f3e3f28>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': 42, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': '/var/folders/vr/kt_mcww15rx8271xkyqysvh80000gn/T/tmpsxa1fjax'}
Instructions for updating:
Please switch to tf.train.get_global_step
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /var/folders/vr/kt_mcww15rx8271xkyqysvh80000gn/T/tmpsxa1fjax/model.ckpt.
INFO:tensorflow:loss = 2.40058, step = 1
INFO:tensorflow:global_step/sec: 250.751
...

INFO:tensorflow:loss = 0.0213878, step = 7001 (0.413 sec)
INFO:tensorflow:global_step/sec: 243.249
INFO:tensorflow:loss = 0.00836719, step = 7101 (0.411 sec)
INFO:tensorflow:global_step/sec: 222.512
INFO:tensorflow:loss = 0.0488339, step = 7201 (0.450 sec)
INFO:tensorflow:global_step/sec: 226.097
INFO:tensorflow:loss = 0.00950404, step = 7301 (0.442 sec)
INFO:tensorflow:global_step/sec: 218.357
INFO:tensorflow:loss = 0.0169789, step = 7401 (0.458 sec)
INFO:tensorflow:global_step/sec: 198.796
INFO:tensorflow:loss = 0.0107572, step = 7501 (0.503 sec)
INFO:tensorflow:global_step/sec: 222.171
INFO:tensorflow:loss = 0.0105873, step = 7601 (0.450 sec)
INFO:tensorflow:global_step/sec: 179.077
INFO:tensorflow:loss = 0.00593114, step = 7701 (0.559 sec)
INFO:tensorflow:global_step/sec: 250.456
INFO:tensorflow:loss = 0.00519743, step = 7801 (0.401 sec)
INFO:tensorflow:global_step/sec: 233.669
INFO:tensorflow:loss = 0.00644975, step = 7901 (0.427 sec)
INFO:tensorflow:global_step/sec: 240.078
...

INFO:tensorflow:loss = 0.00101589, step = 15201 (0.440 sec)
INFO:tensorflow:global_step/sec: 212.589
INFO:tensorflow:loss = 0.00229391, step = 15301 (0.471 sec)
INFO:tensorflow:global_step/sec: 218.513
INFO:tensorflow:loss = 0.0028404, step = 15401 (0.457 sec)
INFO:tensorflow:global_step/sec: 214.812
INFO:tensorflow:loss = 0.00541212, step = 15501 (0.468 sec)
INFO:tensorflow:global_step/sec: 211.639
INFO:tensorflow:loss = 0.00358959, step = 15601 (0.470 sec)
INFO:tensorflow:global_step/sec: 246.172
INFO:tensorflow:loss = 0.00588915, step = 15701 (0.406 sec)
INFO:tensorflow:global_step/sec: 237.008
INFO:tensorflow:loss = 0.000818647, step = 15801 (0.423 sec)
INFO:tensorflow:global_step/sec: 242.778
INFO:tensorflow:loss = 0.000857814, step = 15901 (0.411 sec)
INFO:tensorflow:global_step/sec: 238.473
INFO:tensorflow:loss = 0.00651137, step = 16001 (0.420 sec)
INFO:tensorflow:global_step/sec: 243.219
INFO:tensorflow:loss = 0.00306222, step = 16101 (0.411 sec)
...

INFO:tensorflow:loss = 0.00187119, step = 23301 (0.432 sec)
INFO:tensorflow:global_step/sec: 233.77
INFO:tensorflow:loss = 0.000505153, step = 23401 (0.426 sec)
INFO:tensorflow:global_step/sec: 242.153
INFO:tensorflow:loss = 0.000766654, step = 23501 (0.413 sec)
INFO:tensorflow:global_step/sec: 235.154
INFO:tensorflow:loss = 0.000556651, step = 23601 (0.427 sec)
INFO:tensorflow:global_step/sec: 243.373
INFO:tensorflow:loss = 0.000105218, step = 23701 (0.409 sec)
INFO:tensorflow:global_step/sec: 240.323
INFO:tensorflow:loss = 0.000919802, step = 23801 (0.416 sec)
INFO:tensorflow:global_step/sec: 240.499
INFO:tensorflow:loss = 0.00179801, step = 23901 (0.416 sec)
INFO:tensorflow:global_step/sec: 242.982
INFO:tensorflow:loss = 0.00146623, step = 24001 (0.412 sec)
INFO:tensorflow:global_step/sec: 233.638
INFO:tensorflow:loss = 0.000645721, step = 24101 (0.428 sec)
INFO:tensorflow:global_step/sec: 215.1
INFO:tensorflow:loss = 0.00170859, step = 24201 (0.465 sec)
...

INFO:tensorflow:loss = 0.00138772, step = 31401 (0.542 sec)
INFO:tensorflow:global_step/sec: 226.378
INFO:tensorflow:loss = 0.000200015, step = 31501 (0.442 sec)
INFO:tensorflow:global_step/sec: 228.511
INFO:tensorflow:loss = 0.000326743, step = 31601 (0.438 sec)
INFO:tensorflow:global_step/sec: 230.769
INFO:tensorflow:loss = 0.000636576, step = 31701 (0.433 sec)
INFO:tensorflow:global_step/sec: 221.547
INFO:tensorflow:loss = 0.000169163, step = 31801 (0.451 sec)
INFO:tensorflow:global_step/sec: 238.845
INFO:tensorflow:loss = 0.00069206, step = 31901 (0.419 sec)
INFO:tensorflow:global_step/sec: 230.689
INFO:tensorflow:loss = 0.000129269, step = 32001 (0.434 sec)
INFO:tensorflow:global_step/sec: 224.575
INFO:tensorflow:loss = 0.000392807, step = 32101 (0.445 sec)
INFO:tensorflow:global_step/sec: 225.075
INFO:tensorflow:loss = 0.00115589, step = 32201 (0.447 sec)
INFO:tensorflow:global_step/sec: 219.951
INFO:tensorflow:loss = 0.00056562, step = 32301 (0.452 sec)
...

INFO:tensorflow:loss = 0.000150253, step = 39501 (0.416 sec)
INFO:tensorflow:global_step/sec: 246.678
INFO:tensorflow:loss = 0.000623901, step = 39601 (0.405 sec)
INFO:tensorflow:global_step/sec: 252.673
INFO:tensorflow:loss = 0.000144656, step = 39701 (0.399 sec)
INFO:tensorflow:global_step/sec: 218.737
INFO:tensorflow:loss = 0.00113882, step = 39801 (0.454 sec)
INFO:tensorflow:global_step/sec: 222.889
INFO:tensorflow:loss = 0.000799873, step = 39901 (0.452 sec)
INFO:tensorflow:Saving checkpoints for 40000 into /var/folders/vr/kt_mcww15rx8271xkyqysvh80000gn/T/tmpsxa1fjax/model.ckpt.
INFO:tensorflow:Loss for final step: 0.00044029.


SKCompat()

In [5]:
from sklearn.metrics import accuracy_score

y_pred = dnn_clf.predict(X_test)
accuracy_score(y_test, y_pred['classes'])

INFO:tensorflow:Restoring parameters from /var/folders/vr/kt_mcww15rx8271xkyqysvh80000gn/T/tmpsxa1fjax/model.ckpt-40000


0.98350000000000004

As shown above, the model reaches an accuracy of about 98% (0.9835) on the test set. Great!

In [6]:
from sklearn.metrics import log_loss

y_pred_proba = y_pred['probabilities']
log_loss(y_test, y_pred_proba)

0.071474514666464845

What DNNClassifier actually does is create the hidden layers with the ReLU activation function, use the softmax function in the output layer, and train with cross entropy as the cost function.
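
The three ingredients named above can be sketched in a few lines of plain Python. This is only an illustration for a single instance, not DNNClassifier's actual implementation; the function names and the toy logits are made up for the example.

```python
import math

def relu(z):
    """ReLU activation: negative values become 0."""
    return [max(0.0, v) for v in z]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross entropy for one instance whose true class index is `label`."""
    return -math.log(probs[label])

hidden = relu([-1.0, 0.5, 2.0])    # [0.0, 0.5, 2.0]
probs = softmax([2.0, 1.0, 0.1])   # toy logits for a 3-class problem
loss = cross_entropy(probs, 0)     # small when class 0 gets high probability
print(hidden, probs, loss)
```

Averaging this per-instance cross entropy over a dataset gives the same kind of quantity that `log_loss` reports in the cell above.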

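# 10.3 Training a DNN Using Plain TensorFlow
The previous section used the high-level API, but the low-level API gives you much more control. In this section we build the same neural network as before and train it on the same MNIST dataset, implementing Mini-batch Gradient Descent ourselves. The procedure is as follows.
1. In the construction phase, build the TensorFlow graph.
2. In the execution phase, run the graph to train the model.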

## 10.3.1 Construction Phase
First, import the ```tensorflow``` library and set a few parameters.
These are the number of inputs and outputs, and the number of nodes in each hidden layer.

In [23]:
import tensorflow as tf
import numpy as np

n_inputs = 28*28  # MNIST
n_hidden1 = 300
n_hidden2 = 100
n_outputs = 10

# The code below is not in the textbook.
# Reset any previously built graph.
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
reset_graph()

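Create placeholder nodes for the training data and the targets.
- The shape of ```X``` is ```(None, n_inputs)```. ```X``` is a 2D tensor in which each instance is a row of ```n_inputs``` features, so the second dimension is ```n_inputs```. We do not know how many instances a batch will contain, so the first dimension is ```None```.
- The shape of ```y``` is ```(None)```. ```y``` is a 1D tensor with one label per instance, so no second dimension is needed. As with ```X```, we do not know how many instances a batch will contain, so the first dimension is ```None```.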
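
The meaning of ```None``` in a declared shape can be sketched without TensorFlow. The helper `is_compatible` below is hypothetical and only mimics the idea that ```None``` matches any size; it writes the label shape as the explicit 1D tuple `(None,)` for clarity.

```python
def is_compatible(actual_shape, declared_shape):
    """Return True if a concrete shape fits a declared shape where None means 'any size'."""
    if len(actual_shape) != len(declared_shape):
        return False
    return all(d is None or a == d for a, d in zip(actual_shape, declared_shape))

n_inputs = 28 * 28
X_shape = (None, n_inputs)  # any number of instances, 784 features each
y_shape = (None,)           # any number of labels

print(is_compatible((50, 784), X_shape))     # True: a training batch of 50
print(is_compatible((5000, 784), X_shape))   # True: a larger evaluation set
print(is_compatible((50, 28, 28), X_shape))  # False: images must be flattened first
print(is_compatible((50,), y_shape))         # True: one label per instance
```

This is why the same placeholders can later be fed both 50-instance training batches and a much larger validation set.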

In [None]:
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int64, shape=(None), name="y")

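Next, build the neural network.  
- The placeholder ```X``` acts as the input layer; during the execution phase it is replaced with one training batch at a time.
- Then create the two hidden layers. They are nearly identical and differ only in the inputs they are connected to and in their number of nodes.
- Finally, create the output layer, which uses the softmax function rather than ReLU. In the code that follows, this layer returns the raw logits and the softmax is applied inside the loss function.

We define a function, neuron_layer, that creates one layer of this neural network.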

In [1]:
def neuron_layer(X, n_neurons, name, activation=None): 
    with tf.name_scope(name): #1
        n_inputs = int(X.get_shape()[1])  #2 
        stddev = 2 / np.sqrt(n_inputs) #3 
        init = tf.truncated_normal((n_inputs, n_neurons), stddev=stddev) #3
        W = tf.Variable(init, name="kernel") #3
        b = tf.Variable(tf.zeros([n_neurons]), name="bias") #4
        Z = tf.matmul(X, W) + b #5
        if activation is not None: #6
            return activation(Z) 
        else:
            return Z

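Explanation of the code above
1. ```#1```: creates a name scope for this layer from the ```name``` argument.
2. ```#2```: reads the number of inputs from the second dimension of the shape of ```X```.
3. ```#3```: creates the weight matrix ```W``` (the kernel) of shape ```(n_inputs, n_neurons)```, initialized from a truncated normal distribution with standard deviation ```2 / sqrt(n_inputs)```.
4. ```#4```: creates the bias vector ```b``` with one zero-initialized value per neuron.
5. ```#5```: computes ```Z = XW + b``` for the whole batch at once.
6. ```#6```: returns ```activation(Z)``` if an activation function was given, otherwise returns ```Z```.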
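
To see what one call to neuron_layer computes, here is a NumPy sketch of the same forward pass, outside any TensorFlow graph. The names `truncated_normal` and `neuron_layer_np` are made up for this illustration, and the resampling loop is only an approximation of the behavior of `tf.truncated_normal` (values more than two standard deviations from the mean are redrawn).

```python
import numpy as np

def truncated_normal(shape, stddev, rng):
    """Sample N(0, stddev^2), redrawing values farther than 2*stddev from the mean."""
    values = rng.normal(0.0, stddev, size=shape)
    out_of_range = np.abs(values) > 2 * stddev
    while out_of_range.any():
        values[out_of_range] = rng.normal(0.0, stddev, size=out_of_range.sum())
        out_of_range = np.abs(values) > 2 * stddev
    return values

def neuron_layer_np(X, n_neurons, rng, activation=None):
    """NumPy counterpart of neuron_layer: Z = XW + b, then an optional activation."""
    n_inputs = X.shape[1]
    stddev = 2 / np.sqrt(n_inputs)
    W = truncated_normal((n_inputs, n_neurons), stddev, rng)  # kernel
    b = np.zeros(n_neurons)                                   # bias
    Z = X @ W + b
    return activation(Z) if activation is not None else Z

rng = np.random.RandomState(42)
X_batch = rng.rand(5, 784)  # toy batch standing in for 5 MNIST images
hidden = neuron_layer_np(X_batch, 300, rng, activation=lambda z: np.maximum(z, 0))
print(hidden.shape)  # (5, 300)
```

Each row of the result holds the 300 ReLU outputs for one instance, which would then be fed to the next layer.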

In [12]:
with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    hidden2 = neuron_layer(hidden1, n_hidden2, name="hidden2", activation=tf.nn.relu)
    logits = neuron_layer(hidden2, n_outputs, name="outputs")

In [13]:
with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")

In [14]:
learning_rate = 0.01

with tf.name_scope("train"):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    training_op = optimizer.minimize(loss)

with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()
saver = tf.train.Saver()

n_epochs = 40
batch_size = 50
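
The `eval` block above calls `tf.nn.in_top_k(logits, y, 1)`, which with k=1 asks whether the highest logit belongs to the target class. Ignoring how ties are handled, that can be sketched in NumPy as an argmax comparison. The function name `accuracy_top1` and the toy logits are invented for this example.

```python
import numpy as np

def accuracy_top1(logits, labels):
    """Fraction of instances whose largest logit is at the target class index."""
    predicted = np.argmax(logits, axis=1)
    correct = predicted == labels               # boolean vector, one entry per instance
    return correct.astype(np.float32).mean()    # same cast-then-mean as in the graph

# Toy logits for 4 instances and 3 classes (made-up values).
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.3, 3.0],
                   [1.5, 1.0, 0.2],
                   [0.1, 2.2, 0.3]])
labels = np.array([0, 2, 1, 1])

acc = accuracy_top1(logits, labels)
print(acc)  # 0.75
```

The third instance is the only miss: its largest logit is at class 0 while the label is 1.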

## 10.3.2 Execution Phase

In [15]:
with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_val = accuracy.eval(feed_dict={X: mnist.validation.images,
                                            y: mnist.validation.labels})
        print(epoch, "Train accuracy:", acc_train, "Val accuracy:", acc_val)

    save_path = saver.save(sess, "./tfcheckpoint/ch10_my_model_final.ckpt")

0 Train accuracy: 0.9 Val accuracy: 0.9146
1 Train accuracy: 0.94 Val accuracy: 0.9348
2 Train accuracy: 0.92 Val accuracy: 0.9466
3 Train accuracy: 0.96 Val accuracy: 0.9508
4 Train accuracy: 0.92 Val accuracy: 0.9586
5 Train accuracy: 0.94 Val accuracy: 0.9584
6 Train accuracy: 0.98 Val accuracy: 0.9608
7 Train accuracy: 0.96 Val accuracy: 0.9636
8 Train accuracy: 0.92 Val accuracy: 0.9638
9 Train accuracy: 0.96 Val accuracy: 0.965
10 Train accuracy: 0.98 Val accuracy: 0.9686
11 Train accuracy: 0.94 Val accuracy: 0.9686
12 Train accuracy: 1.0 Val accuracy: 0.9702
13 Train accuracy: 0.94 Val accuracy: 0.9686
14 Train accuracy: 1.0 Val accuracy: 0.9716
15 Train accuracy: 1.0 Val accuracy: 0.973
16 Train accuracy: 1.0 Val accuracy: 0.9736
17 Train accuracy: 0.98 Val accuracy: 0.9736
18 Train accuracy: 1.0 Val accuracy: 0.9752
19 Train accuracy: 1.0 Val accuracy: 0.975
20 Train accuracy: 0.98 Val accuracy: 0.9748
21 Train accuracy: 1.0 Val accuracy: 0.975
22 Train accuracy: 1.0 Val accur
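
The training loop above relies on `mnist.train.next_batch` to hand out mini-batches. A rough NumPy sketch of the same idea follows; it is not the library's implementation. The generator `iterate_minibatches` is a hypothetical helper that shuffles once per epoch and drops an incomplete final batch, and the data is random toy data.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs for one epoch; drops the incomplete last batch."""
    indices = rng.permutation(len(X))
    n_batches = len(X) // batch_size
    for i in range(n_batches):
        batch_idx = indices[i * batch_size:(i + 1) * batch_size]
        yield X[batch_idx], y[batch_idx]

rng = np.random.RandomState(42)
X_toy = rng.rand(230, 4)               # toy features standing in for mnist.train.images
y_toy = rng.randint(0, 10, size=230)   # toy labels standing in for mnist.train.labels

batches = list(iterate_minibatches(X_toy, y_toy, batch_size=50, rng=rng))
print(len(batches))         # 4, because 230 // 50 == 4
print(batches[0][0].shape)  # (50, 4)
```

Note that the loop above measures the training accuracy on the last mini-batch only, which is why it moves in steps of 0.02 (one instance out of 50).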

## 10.3.3 Using the Neural Network

In [17]:
with tf.Session() as sess:
    saver.restore(sess, "./tfcheckpoint/ch10_my_model_final.ckpt") # or better, use save_path
    X_new_scaled = mnist.test.images[:20]
    Z = logits.eval(feed_dict={X: X_new_scaled})
    y_pred = np.argmax(Z, axis=1)

print("Predicted classes:", y_pred)
print("Actual classes:   ", mnist.test.labels[:20])

INFO:tensorflow:Restoring parameters from ./tfcheckpoint/ch10_my_model_final.ckpt
Predicted classes: [7 2 1 0 4 1 4 9 6 9 0 6 9 0 1 5 9 7 3 4]
Actual classes:    [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4]
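
The two printed arrays can be compared directly. The sketch below copies them from the output above and locates the disagreement; nothing else is assumed.

```python
# Values copied from the printed output above.
predicted = [7, 2, 1, 0, 4, 1, 4, 9, 6, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4]
actual    = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4]

# Collect (index, predicted, actual) for every disagreement.
mismatches = [(i, p, a) for i, (p, a) in enumerate(zip(predicted, actual)) if p != a]
n_correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = n_correct / len(actual)

print(mismatches)  # [(8, 6, 5)]
print(accuracy)    # 0.95
```

Only the ninth image (index 8) is misclassified: the network predicts 6 where the label is 5.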


In [20]:
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = b"<stripped %d bytes>"%size
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

show_graph(tf.get_default_graph())