# tf.estimator quickstart study notes

[Official TensorFlow documentation](https://www.tensorflow.org/get_started/estimator)

A simple deep learning model, applied to the iris dataset.

Imports and data parameters.

In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import urllib

import tensorflow as tf
import numpy as np

IRIS_TRAINING = "./data/iris/iris_training.csv"
IRIS_TRAINING_URL = "http://download.tensorflow.org/data/iris_training.csv"

IRIS_TEST = "./data/iris/iris_test.csv"
IRIS_TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

## Loading the data
Check whether the data files exist; if not, download them.

In [2]:
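import urllib.request  # on Python 3, urlopen lives in urllib.request

# Create the target directory first; open() fails if it is missing.
os.makedirs(os.path.dirname(IRIS_TRAINING), exist_ok=True)

if not os.path.exists(IRIS_TRAINING):
    raw = urllib.request.urlopen(IRIS_TRAINING_URL).read()
    # read() returns bytes, so the file is written in binary mode.
    with open(IRIS_TRAINING, 'wb') as f:
        f.write(raw)

if not os.path.exists(IRIS_TEST):
    raw = urllib.request.urlopen(IRIS_TEST_URL).read()
    with open(IRIS_TEST, 'wb') as f:
        f.write(raw)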

Use load_csv_with_header() to read the data.

It returns a Dataset; the features are in dataset.data and the targets in dataset.target.

In [3]:
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TRAINING,
    target_dtype=np.int,
    features_dtype=np.float32,)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TEST,
    target_dtype=np.int,
    features_dtype=np.float32,)
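
To make the returned structure concrete, here is a pure-Python sketch of roughly what the loader does. It is not the TensorFlow implementation: `load_csv_with_header_sketch` and `SAMPLE_CSV` are names made up for this note, and the header layout (sample count, feature count, then class names) is an assumption about the iris CSV files. The last column of each row is treated as the target.

```python
import collections
import csv
import io

# Stand-in for the Dataset returned by the real loader (hypothetical name).
Dataset = collections.namedtuple("Dataset", ["data", "target"])

# Illustrative rows only; the header is assumed to be
# <n_samples>,<n_features>,<class names...>.
SAMPLE_CSV = """4,4,setosa,versicolor,virginica
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
4.9,2.5,4.5,1.7,2
4.9,3.1,1.5,0.1,0
"""


def load_csv_with_header_sketch(fileobj, target_type=int, features_type=float):
    """Parse a CSV whose first row is a header and whose last column is the target."""
    reader = csv.reader(fileobj)
    header = next(reader)
    n_samples, n_features = int(header[0]), int(header[1])

    data, target = [], []
    for row in reader:
        target.append(target_type(row.pop(-1)))        # last column: label
        data.append([features_type(v) for v in row])   # remaining columns: features

    if len(data) != n_samples or any(len(r) != n_features for r in data):
        raise ValueError("CSV body does not match the counts in the header")
    return Dataset(data=data, target=target)


dataset = load_csv_with_header_sketch(io.StringIO(SAMPLE_CSV))
print(dataset.data[0])   # features of the first row
print(dataset.target)    # integer labels
```

The real function returns NumPy arrays with the requested dtypes; plain lists are used here only to keep the sketch dependency-free.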

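## Building the model

[tf.estimator](https://www.tensorflow.org/api_docs/python/tf/estimator) ships with many built-in models. As with Keras, you first define a model.

feature_columns: every column needs a name. Here all four features are simply declared as one numeric column named "x" with shape [4] (the data has 4 dimensions to begin with).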

In [4]:
feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                        hidden_units=[10, 20, 10],
                                        n_classes=3,
                                        model_dir="./model/iris_model")

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_save_summary_steps': 100, '_session_config': None, '_log_step_count_steps': 100, '_keep_checkpoint_max': 5, '_save_checkpoints_steps': None, '_tf_random_seed': 1, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': './model/iris_model'}
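
To get a feel for how big this network is, the following sketch computes the layer shapes and the number of trainable parameters implied by 4 inputs, hidden_units=[10, 20, 10] and n_classes=3. It assumes plain fully connected layers with one bias per unit; the helper names are made up for this note and are not part of tf.estimator.

```python
def dense_layer_shapes(n_inputs, hidden_units, n_classes):
    """Return (fan_in, fan_out) for each fully connected layer."""
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(shapes):
    """Weights (fan_in * fan_out) plus one bias per output unit."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


shapes = dense_layer_shapes(4, [10, 20, 10], 3)
total = count_parameters(shapes)
print(shapes)  # [(4, 10), (10, 20), (20, 10), (10, 3)]
print(total)   # 513
```

Under these assumptions the model has only 513 parameters, which is consistent with how quickly each 100 steps complete in the training log.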


Create the input function.

In [5]:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(training_set.data)},
    y=np.array(training_set.target),
    num_epochs=None,
    shuffle=True,)
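
The meaning of num_epochs=None and shuffle=True can be sketched with a plain generator. This is a simplified stand-in, not how numpy_input_fn is implemented (the real one feeds TensorFlow queues); `batch_generator` and its `seed` parameter are hypothetical. The idea is that num_epochs=None repeats the data forever, so the number of training steps is decided by the train() call rather than by the input function.

```python
import itertools
import random


def batch_generator(features, labels, batch_size, num_epochs=None,
                    shuffle=True, seed=0):
    """Yield ({"x": batch_features}, batch_labels) pairs.

    num_epochs=None cycles through the data indefinitely.
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        if shuffle:
            rng.shuffle(indices)  # new order for every pass over the data
        for start in range(0, len(indices), batch_size):
            chosen = indices[start:start + batch_size]
            yield ({"x": [features[i] for i in chosen]},
                   [labels[i] for i in chosen])


toy_features = [[float(i)] * 4 for i in range(5)]
toy_labels = [0, 1, 2, 1, 0]

# One ordered pass, as used for evaluation: the generator stops by itself.
one_pass = list(batch_generator(toy_features, toy_labels, batch_size=2,
                                num_epochs=1, shuffle=False))

# Endless shuffled stream, as used for training: the caller decides when to stop.
endless = batch_generator(toy_features, toy_labels, batch_size=2)
first_ten = list(itertools.islice(endless, 10))
```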

## Training
Use estimator.train to train the model.

In [6]:
classifier.train(input_fn=train_input_fn, steps=2000)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Restoring parameters from ./model/iris_model/model.ckpt-2000
INFO:tensorflow:Saving checkpoints for 2001 into ./model/iris_model/model.ckpt.
INFO:tensorflow:loss = 6.6917, step = 2001
INFO:tensorflow:global_step/sec: 422.132
INFO:tensorflow:loss = 10.5437, step = 2101 (0.238 sec)
INFO:tensorflow:global_step/sec: 547.079
INFO:tensorflow:loss = 4.69869, step = 2201 (0.180 sec)
INFO:tensorflow:global_step/sec: 576.934
INFO:tensorflow:loss = 1.04251, step = 2301 (0.173 sec)
INFO:tensorflow:global_step/sec: 591.278
INFO:tensorflow:loss = 5.81058, step = 2401 (0.170 sec)
INFO:tensorflow:global_step/sec: 527.359
INFO:tensorflow:loss = 7.3451, step = 2501 (0.189 sec)
INFO:tensorflow:global_step/sec: 520.522
INFO:tensorflow:loss = 2.62255, step = 2601 (0.193 sec)
INFO:tensorflow:global_step/sec: 522.13
INFO:tensorflow:loss = 5.35887, step = 2701 (0.191 sec)
INFO:tensorflow:global_step/sec: 506.544
INFO:tensorflow:loss = 8.51034, step =

<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x10fddc978>
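
The log above restores model.ckpt-2000 and continues from step 2001, which suggests the cell had already been run once: `steps` counts additional steps on top of whatever checkpoint is found in model_dir. The sketch below models that bookkeeping only; `resolve_final_step` is a hypothetical helper, and the `max_steps` branch reflects my understanding that it is a cap on the total global step.

```python
def resolve_final_step(checkpoint_step, steps=None, max_steps=None):
    """Return the global step at which a train() call would stop.

    checkpoint_step: global step restored from model_dir (0 if none exists).
    steps: number of additional steps to run.
    max_steps: cap on the total global step.
    """
    if steps is not None and max_steps is not None:
        raise ValueError("Specify only one of steps or max_steps.")
    if steps is not None:
        return checkpoint_step + steps
    if max_steps is not None:
        return max(checkpoint_step, max_steps)
    return None  # no limit: training would run until the input is exhausted


first_run = resolve_final_step(0, steps=2000)           # fresh model_dir
second_run = resolve_final_step(first_run, steps=2000)  # same cell run again
capped = resolve_final_step(first_run, max_steps=2000)  # nothing left to train
```

This matches the evaluation log further down, which restores model.ckpt-4000.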

## Evaluating the model

Evaluation also needs an input function. (In practice you can write a single one and pass it different arguments; [see input_fn](input%20function.ipynb).)

In [7]:
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(test_set.data)},
    y=np.array(test_set.target),
    num_epochs=1,
    shuffle=False)

accuracy_score = classifier.evaluate(input_fn=test_input_fn)["accuracy"]

print("\nTest Accuracy: {0:f}\n".format(accuracy_score))

INFO:tensorflow:Starting evaluation at 2017-08-22-13:54:27
INFO:tensorflow:Restoring parameters from ./model/iris_model/model.ckpt-4000
INFO:tensorflow:Finished evaluation at 2017-08-22-13:54:28
INFO:tensorflow:Saving dict for global step 4000: accuracy = 0.966667, average_loss = 0.0611264, global_step = 4000, loss = 1.83379

Test Accuracy: 0.966667
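
The numbers in the evaluation log can be cross-checked with a little arithmetic. The sketch assumes that `loss` is the loss summed over a single evaluation batch and `average_loss` is the per-example mean, so their ratio gives the number of test examples.

```python
# Values copied from the evaluation log above.
loss = 1.83379
average_loss = 0.0611264
accuracy = 0.966667

examples = round(loss / average_loss)  # implied number of test examples
correct = round(accuracy * examples)   # implied number of correct predictions
print(examples, correct)
```

Under that assumption the test set holds 30 examples, of which 29 were classified correctly.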



## Predicting with the model

Likewise, call the estimator's predict with a newly built input_fn to classify new samples.

In [8]:
new_samples = np.array(
    [[6.4, 3.2, 4.5, 1.5],
     [5.8, 3.1, 5.0, 1.7]], dtype=np.float32)
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": new_samples},
    num_epochs=1,
    shuffle=False)

predictions = list(classifier.predict(input_fn=predict_input_fn))
predicted_classes = [p["classes"] for p in predictions]

print(
    "New Samples, Class Predictions:    {}\n"
    .format(predicted_classes))

INFO:tensorflow:Restoring parameters from ./model/iris_model/model.ckpt-4000
New Samples, Class Predictions:    [array([b'1'], dtype=object), array([b'2'], dtype=object)]
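
Each predicted class comes back as a one-element array holding a byte string such as b'1'. A small sketch of turning these into readable labels follows; the label order (0 = setosa, 1 = versicolor, 2 = virginica) is an assumption about how the iris CSV encodes species, and plain lists stand in for the NumPy arrays.

```python
# Assumed label order for the iris data.
SPECIES = ["setosa", "versicolor", "virginica"]

# Stand-in for the predicted_classes printed above.
predicted_classes = [[b"1"], [b"2"]]

# Decode the byte string, convert it to an int, and look up the species name.
predicted_names = [SPECIES[int(p[0].decode("utf-8"))] for p in predicted_classes]
print(predicted_names)  # ['versicolor', 'virginica']
```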

