In [9]:
import tensorflow as tf
import pandas as pd
print(tf.__version__)

2.0.0


# In-Memory Datasets:

There are two types of in-memory dataset:
<br/>
<br/>1. NumPy array.
<br/>2. Pandas DataFrame.

#### 1. NumPy Array:

The function below builds a training input function from in-memory NumPy arrays.

In [3]:
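def numpy_train_input_fn(sqft, prop_type, price):  # NumPy arrays
    # In TF 2.x the queue-based input functions live under tf.compat.v1.
    return tf.compat.v1.estimator.inputs.numpy_input_fn(
        x={"sq_footage": sqft, "type": prop_type},  # features, keyed by name
        y=price,        # labels
        batch_size=128,
        num_epochs=10,  # pass over the data 10 times
        shuffle=True,
        queue_capacity=1000
    )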

#### 2. Pandas DataFrame:

The function below builds a training input function from an in-memory pandas DataFrame.

In [4]:
def pandas_train_input_fn(df):  # A pandas DataFrame
    # In TF 2.x the queue-based input functions live under tf.compat.v1.
    return tf.compat.v1.estimator.inputs.pandas_input_fn(
        x=df,  # "sq_footage", "type" are selected automatically
               # because of the feature column definitions.
        y=df['price'],
        batch_size=128,
        num_epochs=10,
        shuffle=True,
        queue_capacity=1000
    )

# Full Code For Training Using In-Memory Dataset:

#### Create Dataframe:

In [26]:
data = {"sq_footage":[ 1000,    2000,    3000,    1000,  2000,  3000],
        "type":      ["house", "house", "house", "apt", "apt", "apt"],
        "nBeds":     [   2,       3,       4,      2,     5,     3  ],
        "price":     [ 500,     1000,    1500,    700,   1300,   1900]}
df = pd.DataFrame(data=data)
df.head()

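|   | sq_footage | type  | nBeds | price |
|---|------------|-------|-------|-------|
| 0 | 1000       | house | 2     | 500   |
| 1 | 2000       | house | 3     | 1000  |
| 2 | 3000       | house | 4     | 1500  |
| 3 | 1000       | apt   | 2     | 700   |
| 4 | 2000       | apt   | 5     | 1300  |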


#### Create Feature Columns:

In [19]:
feat_cols = [
    tf.feature_column.numeric_column(key="sq_footage"),
    tf.feature_column.categorical_column_with_vocabulary_list(key="type",
                                                              vocabulary_list=["house","apt"]),
    tf.feature_column.numeric_column(key="nBeds")
]

In [20]:
model = tf.estimator.LinearRegressor(feature_columns=feat_cols)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpyikucayx', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f179431b3d0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [21]:
def pandas_train_input_fn(df):  # A pandas DataFrame
    return tf.compat.v1.estimator.inputs.pandas_input_fn(
        x=df,  # "sq_footage", "type" are selected automatically
               # because of the feature column definitions.
        y=df['price'],
        batch_size=128,
        num_epochs=10,
        shuffle=True,
        queue_capacity=1000
    )

In [22]:
model.train(input_fn=pandas_train_input_fn(df))

INFO:tensorflow:Calling model_fn.
Instructions for updating:
Use `tf.cast` instead.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpyikucayx/model.ckpt.
INFO:tensorflow:loss = 1548333.4, step = 0
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpyikucayx/model.ckpt.
INFO:tensorflow:Loss for final step: 1548333.4.


<tensorflow_estimator.python.estimator.canned.linear.LinearRegressorV2 at 0x7f1793e42a50>

By default, training runs until the training data is exhausted, or exhausted n times if you specify n epochs in your input function (in the code above we specified `num_epochs=10`).
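The single training step in the log above follows from simple batch arithmetic. The sketch below is a minimal illustration, assuming the input function pools the rows from all epochs before cutting batches (this matches the one-step log, but it is an assumption about the queue-based implementation). `batches_until_exhausted` is a hypothetical helper, not a TensorFlow API.

```python
import math

def batches_until_exhausted(num_rows, num_epochs, batch_size):
    """Hypothetical helper: number of batches (steps) before the input runs out."""
    total_examples = num_rows * num_epochs
    return math.ceil(total_examples / batch_size)

# 6 rows x 10 epochs = 60 examples, which fit into a single batch of 128.
print(batches_until_exhausted(num_rows=6, num_epochs=10, batch_size=128))  # 1
```

With a dataset this small, each `train` call therefore advances the global step by one unless the batch size is reduced or the number of epochs is increased.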

You can also override this by passing an explicit number of steps to the `train` function. The step setting has two variants.

###### variant 1:

In [23]:
model.train(input_fn=pandas_train_input_fn(df), steps=1000)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpyikucayx/model.ckpt-1
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpyikucayx/model.ckpt.
INFO:tensorflow:loss = 666707.75, step = 1
INFO:tensorflow:Saving checkpoints for 2 into /tmp/tmpyikucayx/model.ckpt.
INFO:tensorflow:Loss for final step: 666707.75.


<tensorflow_estimator.python.estimator.canned.linear.LinearRegressorV2 at 0x7f1793e42a50>

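`steps=1000` runs up to 1,000 additional training steps starting from the last checkpoint. One step here corresponds to one batch of input data. Training still stops early if the input function runs out of data first, which is why the log above shows only one step (from step 1 to step 2).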

###### variant 2:

In [24]:
model.train(input_fn=pandas_train_input_fn(df), max_steps=1000)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpyikucayx/model.ckpt-2
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 2 into /tmp/tmpyikucayx/model.ckpt.
INFO:tensorflow:loss = 344368.6, step = 2
INFO:tensorflow:Saving checkpoints for 3 into /tmp/tmpyikucayx/model.ckpt.
INFO:tensorflow:Loss for final step: 344368.6.


<tensorflow_estimator.python.estimator.canned.linear.LinearRegressorV2 at 0x7f1793e42a50>

`max_steps=1000`, on the other hand, restarts from the latest checkpoint, reads the step count reached in the previous run, and continues until the step count reaches `max_steps`. This can potentially do nothing if the checkpoint has already reached that step count.
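The difference between the two variants can be sketched in plain Python. This is only an illustration of the semantics described above, not TensorFlow code: `steps_to_run` is a hypothetical helper, and it ignores the case where the input data runs out first.

```python
def steps_to_run(checkpoint_step, steps=None, max_steps=None):
    """Hypothetical helper: upper bound on steps for one train() call."""
    if steps is not None and max_steps is not None:
        raise ValueError("Specify only one of steps or max_steps.")
    if steps is not None:
        # steps is relative: always this many additional steps.
        return steps
    if max_steps is not None:
        # max_steps is absolute: only the remaining distance to the target.
        return max(0, max_steps - checkpoint_step)
    return None  # no limit: run until the input is exhausted

print(steps_to_run(checkpoint_step=2, steps=1000))         # 1000
print(steps_to_run(checkpoint_step=2, max_steps=1000))     # 998
print(steps_to_run(checkpoint_step=1000, max_steps=1000))  # 0
```

The last call shows the "do nothing" case: the checkpoint is already at step 1000, so there is nothing left to train.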