## Project: Predict House Prices

Objective:
We have developed an understanding of feature columns and data pipelines.

Now let's build a regression model on a real dataset, the Boston Housing Price dataset.
We will use a TensorFlow Estimator to build a linear regression model.

## Load Data and Preprocessing

### Import Modules and Download the Dataset

In [1]:
import tensorflow as tf

In [2]:
import pandas as pd

In [7]:
from tensorflow.keras.datasets import boston_housing

In [9]:
(train_x, train_y), (test_x, test_y) = boston_housing.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/boston_housing.npz


In [10]:
features = ['CRIM','ZN','INDUS','CHAS','NOX','RM','AGE',
           'DIS','RAD','TAX','PTRATIO','B','LASTAT']

### Convert to pandas DataFrames

In [11]:
df_train_x = pd.DataFrame(train_x, columns=features)

In [12]:
df_test_x = pd.DataFrame(test_x, columns=features)

In [13]:
df_train_y = pd.DataFrame(train_y, columns=['MEDV'])

In [14]:
df_test_y = pd.DataFrame(test_y, columns=['MEDV'])

In [15]:
df_train_x.head()

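|   | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LASTAT |
|---|------|----|-------|------|-----|----|-----|-----|-----|-----|---------|---|--------|
| 0 | 1.23247 | 0.0 | 8.14 | 0.0 | 0.538 | 6.142 | 91.7 | 3.9769 | 4.0 | 307.0 | 21.0 | 396.9 | 18.72 |
| 1 | 0.02177 | 82.5 | 2.03 | 0.0 | 0.415 | 7.61 | 15.7 | 6.27 | 2.0 | 348.0 | 14.7 | 395.38 | 3.11 |
| 2 | 4.89822 | 0.0 | 18.1 | 0.0 | 0.631 | 4.97 | 100.0 | 1.3325 | 24.0 | 666.0 | 20.2 | 375.52 | 3.26 |
| 3 | 0.03961 | 0.0 | 5.19 | 0.0 | 0.515 | 6.037 | 34.5 | 5.9853 | 5.0 | 224.0 | 20.2 | 396.9 | 8.01 |
| 4 | 3.69311 | 0.0 | 18.1 | 0.0 | 0.713 | 6.376 | 88.4 | 2.5671 | 24.0 | 666.0 | 20.2 | 391.43 | 14.65 |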


### Feature Normalization

In [16]:
mean = df_train_x.mean(axis=0)

In [17]:
std = df_train_x.std(axis=0)

In [22]:
df_train_x -= mean
print(df_train_x)

         CRIM         ZN      INDUS      CHAS       NOX        RM        AGE  \
0   -4.017020 -11.963215 -11.539653 -0.318396 -0.722378 -6.443306 -68.198589   
1   -4.148037  -8.492119 -12.436690 -0.318396 -1.771033 -4.375081 -70.918638   
2   -3.620325 -11.963215 -10.077378 -0.318396  0.070508 -8.094504 -67.901531   
3   -4.146107 -11.963215 -11.972756 -0.318396 -0.918468 -6.591237 -70.245784   
4   -3.750738 -11.963215 -10.077378 -0.318396  0.769611 -6.113630 -68.316696   
..        ...        ...        ...       ...       ...       ...        ...   
399 -4.126611 -11.963215 -11.720235 -0.318396 -1.489686 -7.204097 -69.261555   
400 -4.132850 -11.121737 -11.712894 -0.318396 -1.353276 -6.305236 -70.897164   
401 -4.146643 -10.490629 -11.845027 -0.318396 -1.575796 -6.599691 -70.646633   
402 -3.917817 -11.963215  -9.860093 -0.318396  2.116664 -7.053347 -67.955216   
403 -4.148836  -9.438781 -12.304557 -0.318396 -1.890392 -5.792407 -70.807689   


In [23]:
df_train_x /= std
print(df_train_x)

         CRIM        ZN     INDUS      CHAS        NOX         RM       AGE  \
0   -0.434708 -0.503339 -1.694191 -1.319839  -6.158742  -9.077794 -2.440836   
1   -0.448886 -0.357296 -1.825889 -1.319839 -15.099208  -6.163930 -2.538187   
2   -0.391779 -0.503339 -1.479507 -1.319839   0.601123 -11.404121 -2.430205   
3   -0.448677 -0.503339 -1.757776 -1.319839  -7.830536  -9.286211 -2.514106   
4   -0.405892 -0.503339 -1.479507 -1.319839   6.561434  -8.613323 -2.445063   
..        ...       ...       ...       ...        ...        ...       ...   
399 -0.446567 -0.503339 -1.720703 -1.319839 -12.700546 -10.149651 -2.478880   
400 -0.447243 -0.467935 -1.719625 -1.319839 -11.537559  -8.883272 -2.537419   
401 -0.448735 -0.441382 -1.739024 -1.319839 -13.434682  -9.298120 -2.528452   
402 -0.423972 -0.503339 -1.447606 -1.319839  18.045935  -9.937265 -2.432126   
403 -0.448973 -0.397126 -1.806490 -1.319839 -16.116822  -8.160761 -2.534216   


In [24]:
df_test_x -= mean
print(df_test_x)

         CRIM         ZN      INDUS      CHAS       NOX        RM        AGE  \
0   -2.193341 -11.963215 -10.077378 -0.318396  0.479739 -6.031915 -67.901531   
1   -4.137051 -11.963215 -11.265109 -0.318396 -0.645647 -6.765937 -68.155640   
2   -4.144445 -11.963215 -11.972756 -0.318396 -0.918468 -6.664499 -69.855672   
3   -4.012584 -11.963215  -9.860093  3.826882 -0.151159 -6.291148 -68.166377   
4   -4.142655 -11.963215 -12.075526 -0.318396 -1.481161 -6.472892 -69.447664   
..        ...        ...        ...       ...       ...       ...        ...   
97  -3.774419 -11.963215 -10.077378  3.826882  0.812239 -2.726701 -68.513542   
98  -4.141849 -11.963215 -10.851092 -0.318396 -1.583469 -6.258744 -71.265803   
99  -3.951949 -11.963215  -9.860093  3.826882 -0.151159 -4.104578 -67.965953   
100 -4.111642 -11.963215 -11.824473  3.826882 -0.986673 -5.303528 -68.313117   
101 -3.833968 -11.963215  -9.860093 -0.318396 -0.151159 -6.501070 -68.152061   


In [21]:
df_test_x /= std
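The `-=` and `/=` cells above modify the DataFrames in place, so running them a second time shifts and scales data that is already normalized. The values printed by In [22] and In [23] appear consistent with that: row 0 of `CRIM` starts at 1.23247 and ends up near -4.02 after the mean subtraction. A minimal sketch of a variant that is safe to re-run, because it always starts from untouched raw data, is shown below. The data is synthetic and the names `raw_train`, `raw_test`, and `standardize` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing features (assumption: not the real data).
rng = np.random.default_rng(0)
cols = ["A", "B", "C"]
raw_train = pd.DataFrame(rng.normal(50.0, 10.0, size=(404, 3)), columns=cols)
raw_test = pd.DataFrame(rng.normal(50.0, 10.0, size=(102, 3)), columns=cols)


def standardize(df, mean, std):
    """Return a new z-scored DataFrame; the input frame is left unchanged."""
    return (df - mean) / std


# Statistics come from the training split only and are reused for the test split.
mean = raw_train.mean(axis=0)
std = raw_train.std(axis=0)

norm_train = standardize(raw_train, mean, std)
norm_test = standardize(raw_test, mean, std)

# Re-running the cell gives the same result because raw_train is never mutated.
norm_train_again = standardize(raw_train, mean, std)

print(norm_train.mean(axis=0).round(6))
print(norm_train.std(axis=0).round(6))
```

After a single, correct pass each training column should have a mean close to 0 and a standard deviation close to 1, which is a quick check to run before building the feature columns.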

### Feature Columns

In [25]:
feature_columns_numeric = []

In [29]:
feature_columns_numeric = [tf.feature_column.numeric_column(fname, dtype=tf.float32) for fname in features]

Instructions for updating:
Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model.


### Build Data Pipeline

In [30]:
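def estimator_input_fn(df_data, df_label, epochs=10, shuffle=True,
                       batch_size=32):
    """Return a zero-argument input function, as the Estimator API expects."""
    def input_funct():
        # Each element is (dict of feature name -> value, label).
        ds = tf.data.Dataset.from_tensor_slices((dict(df_data), df_label))
        if shuffle:
            # Buffer of 100 rows: only a partial shuffle of the 404 training rows.
            ds = ds.shuffle(100)
        # Batch first, then repeat, so each pass ends with a smaller final batch.
        ds = ds.batch(batch_size).repeat(epochs)
        return ds
    return input_funct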

In [31]:
train_input_fn = estimator_input_fn(df_train_x, df_train_y)

In [34]:
val_input_fn = estimator_input_fn(df_test_x, df_test_y, epochs=1,
                                  shuffle=False)
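To see how much data the training pipeline can supply, it helps to count batches. The sketch below is plain Python arithmetic, not TensorFlow. It assumes 404 training rows (the printed index runs from 0 to 403) and the defaults `batch_size=32` and `epochs=10`. The helper names are hypothetical.

```python
import math


def batches_per_epoch(num_rows, batch_size):
    """Number of batches in one pass, counting the smaller final batch."""
    return math.ceil(num_rows / batch_size)


def total_batches(num_rows, batch_size, epochs):
    """Batches produced by ds.batch(batch_size).repeat(epochs)."""
    return batches_per_epoch(num_rows, batch_size) * epochs


train_rows = 404  # assumption: taken from the printed index range 0..403
per_epoch = batches_per_epoch(train_rows, 32)
available = total_batches(train_rows, 32, 10)
last_batch_rows = train_rows - (per_epoch - 1) * 32

print(per_epoch, available, last_batch_rows)  # prints: 13 130 20
```

With these numbers the pipeline yields 130 batches in total, so a call such as `model.train(train_input_fn, steps=100)` stops after 100 of them, a little under eight passes over the training data.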

### Build Model

In [37]:
model = tf.estimator.LinearRegressor(feature_columns=feature_columns_numeric, optimizer="RMSProp")

Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\mukes\\AppData\\Local\\Temp\\tmpzzupqfc8', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_t

In [38]:
model.train(train_input_fn, steps=100)

Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\mukes\AppData\Local\Temp\tmpzzupqfc8\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...

<tensorflow_estimator.python.estimator.canned.linear.LinearRegressorV2 at 0x1dfc320c890>
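A linear regressor learns one weight per feature plus a bias by minimizing squared error. For a problem this small the same kind of model can also be solved in closed form, which gives a reference point for judging the trained estimator. The sketch below uses NumPy least squares on synthetic, already standardized data. It is a swapped-in technique for comparison, not what `tf.estimator.LinearRegressor` does internally, and every name and value in it is an illustrative assumption.

```python
import numpy as np

# Synthetic, already standardized features with a known linear relationship.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
true_b = 22.5
y = X @ true_w + true_b + rng.normal(scale=0.1, size=200)

# Append a column of ones so the bias is fitted together with the weights.
X_bias = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(X_bias, y, rcond=None)
weights, intercept = coef[:-1], coef[-1]

# Mean squared error of the closed-form fit on the same data.
pred = X_bias @ coef
mse = float(np.mean((pred - y) ** 2))

print("weights:", weights.round(3))
print("intercept:", round(float(intercept), 3))
print("mse:", round(mse, 4))
```

If a gradient-trained linear model ends up with an error far above such a baseline on the same inputs, the usual suspects are the input scaling, the number of training steps, or the optimizer settings.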

In [39]:
result = model.evaluate(val_input_fn)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2023-09-08T18:12:48
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\mukes\AppData\Local\Temp\tmpzzupqfc8\model.ckpt-100
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 0.64075s
INFO:tensorflow:Finished evaluation at 2023-09-08-18:12:48
INFO:tensorflow:Saving dict for global step 100: average_loss = 817919.94, global_step = 100, label/mean = 23.078432, loss = 817774.4, prediction/mean = 927.4516
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 100: C:\Users\mukes\AppData\Local\Temp\tmpzzupqfc8\model.ckpt-100


In [40]:
print(result)

{'average_loss': 817919.94, 'label/mean': 23.078432, 'loss': 817774.4, 'prediction/mean': 927.4516, 'global_step': 100}
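The metrics are easier to read on the scale of the label. Assuming `average_loss` is the mean squared error, which is the default for the canned regression estimators, its square root is the root mean squared error in the units of `MEDV`. The sketch below copies the numbers from the printed dictionary and compares that RMSE with the gap between the mean prediction and the mean label.

```python
import math

# Values copied from the evaluation output above.
result = {
    "average_loss": 817919.94,
    "label/mean": 23.078432,
    "loss": 817774.4,
    "prediction/mean": 927.4516,
    "global_step": 100,
}

# Root mean squared error, assuming average_loss is the mean squared error.
rmse = math.sqrt(result["average_loss"])

# Gap between the average prediction and the average label.
offset = result["prediction/mean"] - result["label/mean"]

print(f"RMSE: {rmse:.1f}")
print(f"mean prediction - mean label: {offset:.1f}")
```

Both numbers come out at roughly 904, so almost all of the error is a constant offset between predictions and labels. That points at the inputs or the bias rather than at the fit of individual features.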


### Predict Results

In [41]:
result = model.predict(val_input_fn)

In [48]:
for pred, exp in zip(result, test_y[:32]):
    print("Predicted Value: ", pred['predictions'][0], "Expected: ", exp)

Predicted Value:  922.11847 Expected:  7.2
Predicted Value:  931.12256 Expected:  18.8
Predicted Value:  930.70215 Expected:  19.0
Predicted Value:  935.8619 Expected:  27.0
Predicted Value:  916.8761 Expected:  22.2
Predicted Value:  939.02655 Expected:  24.5
Predicted Value:  935.25134 Expected:  31.2
Predicted Value:  932.7129 Expected:  22.9
Predicted Value:  933.4481 Expected:  20.5
Predicted Value:  927.08575 Expected:  23.2
Predicted Value:  920.8363 Expected:  18.6
Predicted Value:  931.28064 Expected:  14.5
Predicted Value:  932.3645 Expected:  17.8
Predicted Value:  927.1042 Expected:  50.0
Predicted Value:  911.8659 Expected:  20.8
Predicted Value:  923.0796 Expected:  24.3
Predicted Value:  936.64526 Expected:  24.2
Predicted Value:  934.0978 Expected:  19.8
Predicted Value:  915.91815 Expected:  19.1
Predicted Value:  941.9326 Expected:  22.7
Predicted Value:  937.6095 Expected:  12.0
Predicted Value:  931.5648 Expected:  10.2
Predicted Value:  923.4543 Expected:  20.0
Pre