In these notes we'll learn how to create input functions in tf.contrib.learn. An input function preprocesses
data and feeds it into our models.
When using tf.contrib.learn to train a neural network, it's fine to pass our feature and label data directly to the model's fit, evaluate, or predict operations. But when more feature engineering is needed, it is better to use a custom input function (input_fn) that encapsulates the logic for preprocessing and piping data into our models.

In [None]:
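def my_input_fn():
    '''Preprocess our data here, then return
       1) a mapping of feature column names to Tensors with the corresponding feature data, and
       2) a Tensor containing labels.
    '''
    # feature_cols and labels are placeholders for the values built by the preprocessing logic.
    return feature_cols, labels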

The body of the input function contains the specific logic for preprocessing our input data. It must return two values:
feature_cols: a dict of key/value pairs that map feature column names to Tensors (or SparseTensors) containing the corresponding feature data;
labels: a Tensor containing the label values.

h1. Converting feature data to tensors
If our feature/label data is stored in pandas DataFrames or NumPy arrays, we'll need to convert it to Tensors before returning it from the input function.

In [2]:
import tensorflow as tf

# For continuous data, we can create and populate a Tensor using tf.constant as shown below:
feature_column_data = [1, 2.4, 0, 9.9, 3, 120]
print(feature_column_data)
feature_tensor = tf.constant(feature_column_data)
print(feature_tensor)

[1, 2.4, 0, 9.9, 3, 120]
Tensor("Const:0", shape=(6,), dtype=float32)
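
Building on tf.constant, the following is a minimal sketch of an input function that converts NumPy arrays to Tensors. The dataset, the column names `age` and `income`, and the function name `numpy_input_fn` are made up for illustration. For a pandas DataFrame the same pattern should apply by passing `df[name].values` to tf.constant.

```python
import numpy as np
import tensorflow as tf

# Hypothetical in-memory dataset: two feature columns and one label column.
FEATURES = {
    "age": np.array([23.0, 35.0, 51.0], dtype=np.float32),
    "income": np.array([1.5, 3.2, 4.8], dtype=np.float32),
}
LABELS = np.array([0, 1, 1], dtype=np.int64)


def numpy_input_fn():
    """Return (feature_cols, labels) built from the arrays above."""
    # One Tensor per feature column, keyed by the column name.
    feature_cols = {name: tf.constant(values) for name, values in FEATURES.items()}
    # A single Tensor holding the label for each example.
    labels = tf.constant(LABELS)
    return feature_cols, labels


feature_cols, labels = numpy_input_fn()
```

The returned pair has the same shape as the skeleton at the top of these notes: a dict of Tensors plus a label Tensor.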


For sparse, categorical data (data where the majority of values are 0), we'll instead want to populate a SparseTensor, which is instantiated with three arguments:
dense_shape: the shape of the tensor. Takes a list indicating the number of elements in each dimension;
indices: the indices of the elements in our tensor that contain nonzero values. Takes a list of terms, where each term is itself a list containing the index of a nonzero element. For example, indices = [[1, 3], [2, 4]] specifies that the elements at indices [1, 3] and [2, 4] have nonzero values;
values: a one-dimensional tensor of values. Term i in values corresponds to term i in indices and specifies its value.
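
To make the relationship between the three arguments concrete, here is a small pure-Python sketch that expands such a description into a dense nested list. The helper `to_dense`, the values 7 and 9, and the shape [3, 5] are assumptions chosen for illustration. This is not how TensorFlow stores or converts sparse tensors; it only mirrors the meaning of the arguments.

```python
def to_dense(indices, values, dense_shape):
    """Expand a 2-D sparse description into a nested list of rows."""
    rows, cols = dense_shape
    # Start from an all-zero matrix of the requested shape.
    dense = [[0] * cols for _ in range(rows)]
    # Term i in values is written at the position given by term i in indices.
    for (row, col), value in zip(indices, values):
        dense[row][col] = value
    return dense


dense = to_dense(indices=[[1, 3], [2, 4]], values=[7, 9], dense_shape=[3, 5])
for row in dense:
    print(row)
```

This prints three rows of five elements each, with 7 at position [1, 3], 9 at position [2, 4], and 0 everywhere else.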

In [4]:
# The following code defines a two-dimensional SparseTensor with 3 rows and 5 columns. The element at index [0, 1] has
# a value of 6, and the element at index [2, 4] has a value of 0.5 (all other values are 0):
sparse_tensor = tf.SparseTensor(indices=[[0, 1], [2, 4]], values=[6, 0.5], dense_shape=[3, 5])
print(sparse_tensor)
s = tf.Session()
print(s.run(sparse_tensor))

SparseTensor(indices=Tensor("SparseTensor_1/indices:0", shape=(2, 2), dtype=int64), values=Tensor("SparseTensor_1/values:0", shape=(2,), dtype=float32), dense_shape=Tensor("SparseTensor_1/dense_shape:0", shape=(2,), dtype=int64))
SparseTensorValue(indices=array([[0, 1],
       [2, 4]]), values=array([ 6. ,  0.5], dtype=float32), dense_shape=array([3, 5]))


h1. Passing input_fn data to our model

To feed data to our model for training, simply pass the input function we've created to the fit operation as the value of the input_fn parameter.
Note that the input_fn parameter must receive a function object (i.e., input_fn=my_input_fn), not the return value of a function call (input_fn=my_input_fn()).
If we want to parameterize our input function, we can use a wrapper function or Python's functools.partial to construct a new function object with all parameter values fixed. Alternatively, we can wrap the function invocation in a lambda and pass that to the input_fn parameter.
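
The three options can be sketched as follows. To keep the example runnable without TensorFlow, plain lists stand in for Tensors. The `data_set` parameter, the data sets, and the column names `x` and `y` are hypothetical, and a real input function would wrap each column with tf.constant.

```python
import functools


def my_input_fn(data_set):
    """Hypothetical parameterized input function (lists stand in for Tensors)."""
    feature_cols = {"x": data_set["x"]}
    labels = data_set["y"]
    return feature_cols, labels


# Made-up data sets for illustration.
training_set = {"x": [1.0, 2.0, 3.0], "y": [0, 1, 1]}
test_set = {"x": [4.0, 5.0], "y": [1, 0]}


# Option 1: a wrapper function that fixes the argument.
def my_input_fn_training_set():
    return my_input_fn(training_set)


# Option 2: functools.partial builds a new callable with the argument bound.
partial_input_fn = functools.partial(my_input_fn, data_set=training_set)

# Option 3: a lambda defers the call until the lambda itself is invoked.
lambda_input_fn = lambda: my_input_fn(test_set)

# Each of these is a zero-argument callable, so it could be passed as
# input_fn, e.g. classifier.fit(input_fn=partial_input_fn, steps=2000),
# where `classifier` is an estimator assumed to exist elsewhere.
```

In every case the model receives a callable and decides when to invoke it, which is why passing `my_input_fn(training_set)` directly would not work.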