https://www.tensorflow.org/versions/r0.11/tutorials/input_fn/index.html

Building Input Functions with tf.contrib.learn

To start, set up your imports (including pandas and tensorflow) and set logging verbosity to INFO for more detailed log output:

In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import pandas as pd
import tensorflow as tf

tf.logging.set_verbosity(tf.logging.INFO)

Define the column names for the data set in COLUMNS. To distinguish features from the label, also define FEATURES and LABEL. Then read the three CSVs (train, test, and predict) into pandas DataFrames:

In [5]:
COLUMNS = ["crim", "zn", "indus", "nox", "rm", "age",
           "dis", "tax", "ptratio", "medv"]
FEATURES = ["crim", "zn", "indus", "nox", "rm",
            "age", "dis", "tax", "ptratio"]
LABEL = "medv"

training_set = pd.read_csv("./boston_data/boston_train.csv", skipinitialspace=True,
                           skiprows=1, names=COLUMNS)
test_set = pd.read_csv("./boston_data/boston_test.csv", skipinitialspace=True,
                       skiprows=1, names=COLUMNS)
prediction_set = pd.read_csv("./boston_data/boston_predict.csv", skipinitialspace=True,
                             skiprows=1, names=COLUMNS)
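To make the effect of skiprows=1, names=COLUMNS, and skipinitialspace=True more concrete, here is a minimal sketch that mimics the same steps with Python's standard csv module instead of pandas. It drops the file's first line, applies the COLUMNS names, and derives FEATURES as every column except LABEL. The in-memory header text and the sample row are made-up placeholders, not values from the Boston data files.

```python
import csv
import io

COLUMNS = ["crim", "zn", "indus", "nox", "rm", "age",
           "dis", "tax", "ptratio", "medv"]
LABEL = "medv"
# Every column except the label is treated as a feature.
FEATURES = [name for name in COLUMNS if name != LABEL]

# Made-up stand-in for a CSV file: one header line, then one data row.
sample_csv = io.StringIO(
    "header line that will be skipped\n"
    "0.1, 0.0, 5.0, 0.5, 6.0, 50.0, 4.0, 300, 15.0, 24.0\n"
)

next(sample_csv)  # analogous to skiprows=1: discard the file's first line
reader = csv.DictReader(sample_csv, fieldnames=COLUMNS,
                        skipinitialspace=True)  # ignore spaces after commas
rows = [{name: float(value) for name, value in row.items()}
        for row in reader]

print(rows[0][LABEL])
print(len(FEATURES))
```

Supplying your own column names while skipping the first line means the names in the file's header never have to match the names used in the code.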

Next, create a list of FeatureColumns for the input data, which formally specify the set of features to use for training. Because all features in the housing data set contain continuous values, you can create their FeatureColumns using the tf.contrib.layers.real_valued_column() function:

In [8]:
feature_cols = [tf.contrib.layers.real_valued_column(k)
                for k in FEATURES]

Now, instantiate a DNNRegressor for the neural network regression model. You'll need to provide two arguments here: hidden_units, a hyperparameter specifying the number of nodes in each hidden layer (here, two hidden layers with 10 nodes each), and feature_columns, containing the list of FeatureColumns you just defined:

In [11]:
regressor = tf.contrib.learn.DNNRegressor(
    feature_columns=feature_cols, hidden_units=[10, 10],
    enable_centered_bias=True)

INFO:tensorflow:Using config: {'keep_checkpoint_every_n_hours': 10000, 'evaluation_master': '', 'tf_random_seed': None, 'cluster_spec': None, 'num_ps_replicas': 0, 'master': '', 'tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_job_name': None, 'save_checkpoints_secs': 600, 'keep_checkpoint_max': 5, 'save_summary_steps': 100, '_is_chief': True, 'task': 0}
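As a rough sense of scale for hidden_units=[10, 10], the sketch below counts the trainable weights and biases in such a network. It assumes plain fully connected layers with one bias per unit, nine input features, and a single output; it ignores the extra centered-bias term and anything the optimizer stores. The helper name count_dense_parameters is hypothetical and not part of TensorFlow.

```python
def count_dense_parameters(num_inputs, hidden_units, num_outputs=1):
    """Count weights and biases in a stack of fully connected layers."""
    layer_sizes = [num_inputs] + list(hidden_units) + [num_outputs]
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


# Nine features feeding two hidden layers of 10 nodes and one output node.
num_params = count_dense_parameters(num_inputs=9, hidden_units=[10, 10])
print(num_params)
```

Under these assumptions the count is 100 + 110 + 11 = 221, which is small compared with the 400 training examples reported in the log output of the fit call.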


To pass input data into the regressor, create an input function that accepts a pandas DataFrame and returns the feature columns and the label values as Tensors:

In [14]:
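def input_fn(data_set):
  # Map each feature name to a constant Tensor holding that column's values.
  feature_cols = {k: tf.constant(data_set[k].values) for k in FEATURES}
  # The label column becomes a single constant Tensor.
  labels = tf.constant(data_set[LABEL].values)
  return feature_cols, labels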
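The shape of what input_fn returns can be sketched without TensorFlow. In the illustration below, plain Python lists stand in for Tensors and a list of dicts stands in for the DataFrame. The shortened FEATURES list, the sample rows, and the name input_fn_layout are all made up for the example.

```python
FEATURES = ["crim", "rm"]  # shortened feature list, for illustration only
LABEL = "medv"

# Made-up rows standing in for a pandas DataFrame.
rows = [
    {"crim": 0.1, "rm": 6.0, "medv": 24.0},
    {"crim": 0.2, "rm": 5.5, "medv": 20.0},
]


def input_fn_layout(data_rows):
    # One entry per feature name, each holding that column's values.
    feature_cols = {k: [row[k] for row in data_rows] for k in FEATURES}
    # A single sequence of label values, in the same row order.
    labels = [row[LABEL] for row in data_rows]
    return feature_cols, labels


features, labels = input_fn_layout(rows)
print(features)
print(labels)
```

The point is the pairing: a dict keyed by feature name, plus one label sequence whose entries line up row by row with the feature values.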

To train the neural network regressor, run fit with the training_set passed to the input_fn as follows:

In [15]:
regressor.fit(input_fn=lambda: input_fn(training_set), steps=5000)

INFO:tensorflow:Setting feature info to {'nox': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'age': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'crim': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'indus': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'ptratio': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'dis': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'zn': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'rm': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(400)]), is_sparse=False), 'tax': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(400)]), is_sparse=False)}
INFO:tensorflow:Setting targets info to TensorSignature(dtype=tf.float64, shape=TensorShape(

DNNRegressor(hidden_units=[10, 10], optimizer=None, dropout=None, feature_columns=[_RealValuedColumn(column_name='crim', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='zn', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='indus', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='nox', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='rm', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='age', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='dis', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='tax', dimension=1, default_value=None, dtype=tf.float32, normalizer=None), _RealValuedColumn(column_name='ptratio', dimension=1, default_value=
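The fit call wraps input_fn in a lambda because the estimator invokes the function it receives with no arguments, while input_fn itself needs a data set. The sketch below illustrates that pattern in plain Python and shows functools.partial as an equivalent alternative. Here call_input_fn is a hypothetical stand-in for the estimator, and the simplified input_fn only echoes its argument.

```python
import functools


def input_fn(data_set):
    # Simplified stand-in for the tutorial's input_fn: echoes its argument.
    return data_set


def call_input_fn(fn):
    # Hypothetical stand-in for the estimator, which calls fn with no arguments.
    return fn()


training_set = "training_set placeholder"

# Both wrappers turn a one-argument function into a zero-argument callable.
via_lambda = call_input_fn(lambda: input_fn(training_set))
via_partial = call_input_fn(functools.partial(input_fn, training_set))

print(via_lambda == via_partial)
```

One difference to keep in mind: the lambda looks up training_set when it is called, whereas functools.partial binds the value when the wrapper is created.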

# Evaluating the Model

Next, see how the trained model performs against the test data set. Run evaluate, and this time pass the test_set to the input_fn:

In [16]:
ev = regressor.evaluate(input_fn=lambda: input_fn(test_set), steps=1)

INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='age', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='crim', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='dis', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='indus', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='nox', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='ptratio', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='rm', dimension=1, defaul

Retrieve the loss from the ev results and print it to output:

In [17]:
loss_score = ev["loss"]
print("Loss: {0:f}".format(loss_score))

Loss: 23.162535
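To put the loss on the same scale as the label, you can take its square root. This sketch assumes the reported loss is the mean squared error between predicted and actual medv values; if the estimator reports a different loss, the interpretation does not hold.

```python
import math

loss_score = 23.162535  # the loss value printed above

# Square root of a mean squared error gives an error in the label's units.
rmse = math.sqrt(loss_score)
print("RMSE: {0:.2f} (thousands of dollars)".format(rmse))
```

Under that assumption, the typical prediction error on the test set is a little under five thousand dollars.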


# Making Predictions

Finally, you can use the model to predict median house values for the prediction_set, which contains feature data but no labels for six examples:



In [18]:
y = regressor.predict(input_fn=lambda: input_fn(prediction_set))
print("Predictions: {}".format(str(y)))

Instructions for updating:
The default behavior of predict() is changing. The default value for
as_iterable will change to True, and then the flag will be removed
altogether. The behavior of this flag is described below.
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='age', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='crim', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='dis', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='indus', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming feature_column _RealValuedColumn(column_name='nox', dimension=1, default_value=None, dtype=tf.float32, normalizer=None)
INFO:tensorflow:Transforming fe

Predictions: [ 35.83711624  19.06627274  22.22982788  34.91257858  14.789711
  20.82322502]


Your results should contain six house-value predictions in thousands of dollars, like those printed above.
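As a small sketch, the predictions can be converted from thousands of dollars into formatted dollar amounts. The values below are copied from the output printed in this run; your own numbers will likely differ.

```python
# Predictions copied from the output of this run (thousands of dollars).
predictions = [35.83711624, 19.06627274, 22.22982788,
               34.91257858, 14.789711, 20.82322502]

# Scale to dollars and format with thousands separators, no cents.
dollars = ["${:,.0f}".format(p * 1000) for p in predictions]
for amount in dollars:
    print(amount)
```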