In [1]:
import tensorflow as tf

# Introduction to input functions

Here is a skeleton of a basic input function:

In [2]:
def my_input_fn():

    # Preprocess your data here...

    # ...then return 1) a mapping of feature columns to Tensors with
    # the corresponding feature data, and 2) a Tensor containing labels
    return feature_cols, labels

## What an input function must return

An input function must return:

* feature columns
    a dict of key/value pairs that maps each feature column name to a `Tensor` containing the corresponding feature data.
* labels
    a `Tensor` containing your label data.
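
To make the shape of that return value concrete, here is a minimal sketch. It uses plain Python lists as stand-ins for `Tensors` so that it runs without TensorFlow; in a real input function each list would be a `Tensor` (for example, one built with `tf.constant`). The name `toy_input_fn`, the column names, and the data are all made up for illustration.

```python
def toy_input_fn():
    # Hypothetical toy data: three examples, two features each, one label each.
    raw_rows = [
        {"age": 25, "height": 1.70, "label": 0},
        {"age": 32, "height": 1.82, "label": 1},
        {"age": 47, "height": 1.65, "label": 0},
    ]

    # 1) A dict mapping each feature column name to that column's values.
    #    (Plain lists here; Tensors in a real input function.)
    feature_cols = {
        "age": [row["age"] for row in raw_rows],
        "height": [row["height"] for row in raw_rows],
    }

    # 2) The labels, one per example, in the same order as the features.
    labels = [row["label"] for row in raw_rows]

    return feature_cols, labels


features, labels = toy_input_fn()
print(sorted(features))   # feature column names
print(len(labels))        # number of examples
```

Note that every feature column has the same length as the labels: entry `i` of each column and entry `i` of the labels describe the same example.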

## Converting feature data to tensor

As noted above, the **feature columns** must be **Tensors** that contain the feature data, so we need to convert our data into **Tensors**.

### Python arrays, NumPy arrays, and DataFrames

If your data is a Python array, a NumPy array, or a pandas DataFrame, you can use the following functions to build an input function that converts it to Tensors.

```
import numpy as np
import pandas as pd

# numpy input_fn.
my_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(x_data)},
    y=np.array(y_data),
    ...)

# pandas input_fn.
my_input_fn = tf.estimator.inputs.pandas_input_fn(
    x=pd.DataFrame({"x": x_data}),
    y=pd.Series(y_data),
    ...)
```

You can check this [official tutorial](https://tensorflow.google.cn/get_started/estimator#construct_a_deep_neural_network_classifier) for an example and run it to get a feel for it.

These two functions take other parameters that let you customize the input function; see the official documentation for details.

Here is a list:

* batch_size: int, size of batches to return.

* num_epochs: int, number of epochs to iterate over data. If not None, read attempts that would exceed this value will raise OutOfRangeError.

* shuffle: bool, whether to read the records in random order.

* queue_capacity: int, size of the read queue. If None, it will be set roughly to the size of x.

* num_threads: Integer, number of threads used for reading and enqueueing. In order to have predicted and repeatable order of reading and enqueueing, such as in prediction and evaluation mode, num_threads should be 1.

* target_column: str, name to give the target column y (`pandas_input_fn` only).
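
To get a feel for what `batch_size`, `num_epochs`, and `shuffle` mean, the sketch below mimics them in plain Python. It is not how TensorFlow implements these functions (the real ones read through a queue, and details such as batching at epoch boundaries may differ); `iterate_batches` and its `seed` argument are hypothetical names used only for this illustration. Where the real functions raise `OutOfRangeError` once the epochs are exhausted, this generator simply stops.

```python
import random


def iterate_batches(x, y, batch_size, num_epochs=1, shuffle=False, seed=0):
    """Yield (features, labels) batches; a plain-Python stand-in, not TensorFlow code."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    for _ in range(num_epochs):          # one full pass over the data per epoch
        if shuffle:
            rng.shuffle(indices)         # read the records in random order
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            features = {name: [column[i] for i in chunk] for name, column in x.items()}
            labels = [y[i] for i in chunk]
            yield features, labels


x_data = {"x": [1.0, 2.0, 3.0, 4.0, 5.0]}
y_data = [0, 1, 0, 1, 0]

# 5 examples, batch_size=2 -> batches of 2, 2, 1 per epoch; 2 epochs -> 6 batches.
batches = list(iterate_batches(x_data, y_data, batch_size=2, num_epochs=2))
print(len(batches))
print(batches[0])
```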

## Trick

You can use `functools.partial` to wrap your input function, so you don't need to define a separate function for each type of task (training, validation, prediction).

```
def my_input_fn(data_set):
    # Preprocess data_set here...

    # ...then return 1) a mapping of feature columns to Tensors with
    # the corresponding feature data, and 2) a Tensor containing labels
    return feature_cols, labels
```
---
```
import functools

classifier.train(
    input_fn=functools.partial(my_input_fn, data_set=training_set),
    steps=2000)
```
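
The binding that `functools.partial` performs can be tried without TensorFlow. In the sketch below, `training_set` and `test_set` are hypothetical dicts with `"features"` and `"labels"` entries, and plain lists stand in for Tensors; the point is only that each partial object can be called with no arguments and returns the data for its own data set.

```python
import functools


def my_input_fn(data_set):
    # data_set is a hypothetical dict holding already-prepared features and labels.
    feature_cols = data_set["features"]
    labels = data_set["labels"]
    return feature_cols, labels


# Made-up data sets for illustration.
training_set = {"features": {"x": [1.0, 2.0, 3.0]}, "labels": [0, 1, 0]}
test_set = {"features": {"x": [4.0, 5.0]}, "labels": [1, 1]}

# One definition of my_input_fn, one partial per task.
train_input_fn = functools.partial(my_input_fn, data_set=training_set)
eval_input_fn = functools.partial(my_input_fn, data_set=test_set)

# Each partial is callable with no arguments.
train_features, train_labels = train_input_fn()
eval_features, eval_labels = eval_input_fn()
print(train_labels)
print(eval_labels)
```

A `lambda: my_input_fn(training_set)` would work the same way; `functools.partial` just makes the bound argument explicit.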