# Regression Exercise 

California Housing Data

This data set contains information about all the block groups in California from the 1990 Census. In this sample, a block group includes on average 1425.5 individuals living in a geographically compact area.

The task is to approximate the median house value of each block from the values of the rest of the variables.

The data set was obtained from the LIACC repository. The original page where it can be found is: http://www.liaad.up.pt/~ltorgo/Regression/DataSets.html.
 

The Features:
 
* housingMedianAge: continuous. 
* totalRooms: continuous. 
* totalBedrooms: continuous. 
* population: continuous. 
* households: continuous. 
* medianIncome: continuous. 
* medianHouseValue: continuous. 

## The Data

** Import the cal_housing_clean.csv file with pandas. Separate it into a training set (70%) and a testing set (30%). **

In [2]:
import pandas as pd

In [3]:
housing = pd.read_csv('cal_housing_clean.csv')

In [4]:
housing.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome,medianHouseValue
0,41.0,880.0,129.0,322.0,126.0,8.3252,452600.0
1,21.0,7099.0,1106.0,2401.0,1138.0,8.3014,358500.0
2,52.0,1467.0,190.0,496.0,177.0,7.2574,352100.0
3,52.0,1274.0,235.0,558.0,219.0,5.6431,341300.0
4,52.0,1627.0,280.0,565.0,259.0,3.8462,342200.0


In [5]:
housing.describe()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome,medianHouseValue
count,20640.0,20640.0,20640.0,20640.0,20640.0,20640.0,20640.0
mean,28.639486,2635.763081,537.898014,1425.476744,499.53968,3.870671,206855.816909
std,12.585558,2181.615252,421.247906,1132.462122,382.329753,1.899822,115395.615874
min,1.0,2.0,1.0,3.0,1.0,0.4999,14999.0
25%,18.0,1447.75,295.0,787.0,280.0,2.5634,119600.0
50%,29.0,2127.0,435.0,1166.0,409.0,3.5348,179700.0
75%,37.0,3148.0,647.0,1725.0,605.0,4.74325,264725.0
max,52.0,39320.0,6445.0,35682.0,6082.0,15.0001,500001.0


In [6]:
#provided do not execute!

In [7]:
#provided do not execute!

In [8]:
housing.columns

Index(['housingMedianAge', 'totalRooms', 'totalBedrooms', 'population',
       'households', 'medianIncome', 'medianHouseValue'],
      dtype='object')

In [9]:
x_data = housing.drop('medianHouseValue', axis=1)
x_data.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome
0,41.0,880.0,129.0,322.0,126.0,8.3252
1,21.0,7099.0,1106.0,2401.0,1138.0,8.3014
2,52.0,1467.0,190.0,496.0,177.0,7.2574
3,52.0,1274.0,235.0,558.0,219.0,5.6431
4,52.0,1627.0,280.0,565.0,259.0,3.8462


In [10]:
y_values = housing['medianHouseValue']
y_values.head()

0    452600.0
1    358500.0
2    352100.0
3    341300.0
4    342200.0
Name: medianHouseValue, dtype: float64

In [11]:
from sklearn.model_selection import train_test_split

In [12]:
X_train, X_test, y_train, y_test = train_test_split(
    x_data, 
    y_values, 
    test_size = 0.3,   # Required by the exercise: 30% test, 70% train
    random_state = 123
)
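As a rough illustration of what a shuffled 70/30 split does, here is a minimal standard-library sketch. The helper `simple_train_test_split` is hypothetical and is not how `train_test_split` is implemented; in particular, the same seed will not reproduce the row order sklearn returns.

```python
import random


def simple_train_test_split(rows, test_fraction, seed):
    """Shuffle the row indices, then cut them into a train part and a test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = int(round(len(rows) * test_fraction))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


# Ten toy "rows" stand in for the housing DataFrame (illustrative data only).
rows = list(range(10))
train_rows, test_rows = simple_train_test_split(rows, test_fraction=0.3, seed=123)
print(len(train_rows), len(test_rows))  # 7 3
```

Every row ends up in exactly one of the two parts, which is the property the rest of the exercise relies on.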

In [13]:
X_train.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome
12364,6.0,4863.0,920.0,3010.0,828.0,3.9508
12271,21.0,4624.0,852.0,2174.0,812.0,3.5255
19605,32.0,946.0,198.0,624.0,173.0,1.9728
10600,8.0,2032.0,349.0,862.0,340.0,6.9133
45,52.0,1656.0,420.0,718.0,382.0,2.6768


In [14]:
y_train.head()

12364    104200.0
12271    132100.0
19605     97900.0
10600    274100.0
45       182300.0
Name: medianHouseValue, dtype: float64

### Scale the Feature Data

** Use sklearn preprocessing to create a MinMaxScaler for the feature data. Fit this scaler only to the training data. Then use it to transform X_test and X_train. Then use the scaled X_test and X_train along with pd.DataFrame to re-create two dataframes of scaled data. **

In [15]:
from sklearn.preprocessing import MinMaxScaler

In [16]:
scaler = MinMaxScaler()

In [17]:
scaler

MinMaxScaler(copy=True, feature_range=(0, 1))

In [18]:
# provided do not execute!

In [19]:
X_train.columns

Index(['housingMedianAge', 'totalRooms', 'totalBedrooms', 'population',
       'households', 'medianIncome'],
      dtype='object')

In [20]:
X_train.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome
12364,6.0,4863.0,920.0,3010.0,828.0,3.9508
12271,21.0,4624.0,852.0,2174.0,812.0,3.5255
19605,32.0,946.0,198.0,624.0,173.0,1.9728
10600,8.0,2032.0,349.0,862.0,340.0,6.9133
45,52.0,1656.0,420.0,718.0,382.0,2.6768


In [21]:
X_test.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome
19121,40.0,1974.0,410.0,1039.0,398.0,3.7917
20019,9.0,3297.0,568.0,1749.0,568.0,4.0217
15104,12.0,3570.0,713.0,3321.0,666.0,4.0882
3720,27.0,3201.0,970.0,3403.0,948.0,2.2377
8938,41.0,2704.0,557.0,1047.0,478.0,4.4211


In [22]:
# Fit the scaler from In [16] on the training features only, then rebuild
# both DataFrames instead of assigning into slices of x_data (which raises
# pandas' SettingWithCopyWarning).
scaler.fit(X_train)
X_train = pd.DataFrame(
    data=scaler.transform(X_train), columns=X_train.columns, index=X_train.index
)
X_test = pd.DataFrame(
    data=scaler.transform(X_test), columns=X_test.columns, index=X_test.index
)


In [23]:
X_train.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome
12364,0.098039,0.123544,0.148011,0.184456,0.154377,0.23799
12271,0.392157,0.117465,0.137059,0.133174,0.151391,0.208659
19605,0.607843,0.02391,0.031728,0.038093,0.032108,0.101578
10600,0.137255,0.051534,0.056048,0.052693,0.063282,0.442297
45,1.0,0.04197,0.067483,0.04386,0.071122,0.150129


In [24]:
X_test.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome
19121,0.764706,0.050059,0.065872,0.06355,0.074109,0.227018
20019,0.156863,0.083711,0.091319,0.107103,0.105843,0.242879
15104,0.215686,0.090655,0.114672,0.203533,0.124137,0.247466
3720,0.509804,0.081269,0.156064,0.208563,0.176778,0.119847
8938,0.784314,0.068627,0.089547,0.064041,0.089042,0.270424
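The scaled `housingMedianAge` values shown above can be reproduced by hand with the min-max formula `(x - min) / (max - min)`. The sketch below assumes that the training split contains the overall minimum (1.0) and maximum (52.0) reported by `housing.describe()`; the helper name `min_max_scale` is made up for this illustration.

```python
def min_max_scale(value, train_min, train_max):
    """Rescale a value relative to the range seen in the training data."""
    return (value - train_min) / (train_max - train_min)


# Assumption: the training range equals the full-data range from describe().
train_min, train_max = 1.0, 52.0

train_ages = [6.0, 21.0, 32.0, 8.0, 52.0]   # housingMedianAge from X_train.head()
test_ages = [40.0, 9.0, 12.0, 27.0, 41.0]   # housingMedianAge from X_test.head()

# The test rows reuse the training min and max; nothing is refit on test data.
scaled_train = [round(min_max_scale(a, train_min, train_max), 6) for a in train_ages]
scaled_test = [round(min_max_scale(a, train_min, train_max), 6) for a in test_ages]

print(scaled_train)  # [0.098039, 0.392157, 0.607843, 0.137255, 1.0]
print(scaled_test)   # [0.764706, 0.156863, 0.215686, 0.509804, 0.784314]
```

Because the range comes from the training data alone, a test value outside that range would map outside [0, 1]; that is expected and is the price of not leaking test information into the scaler.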


In [25]:
pd.DataFrame(y_train)

Unnamed: 0,medianHouseValue
12364,104200.0
12271,132100.0
19605,97900.0
10600,274100.0
45,182300.0
2889,70100.0
1743,97200.0
10770,226000.0
15005,133700.0
13823,92200.0


In [26]:
housing_train = pd.concat([X_train, pd.DataFrame(y_train)], axis=1)
housing_test = pd.concat([X_test, pd.DataFrame(y_test)], axis=1)

In [27]:
housing_train.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome,medianHouseValue
12364,0.098039,0.123544,0.148011,0.184456,0.154377,0.23799,104200.0
12271,0.392157,0.117465,0.137059,0.133174,0.151391,0.208659,132100.0
19605,0.607843,0.02391,0.031728,0.038093,0.032108,0.101578,97900.0
10600,0.137255,0.051534,0.056048,0.052693,0.063282,0.442297,274100.0
45,1.0,0.04197,0.067483,0.04386,0.071122,0.150129,182300.0


In [28]:
housing_test.head()

Unnamed: 0,housingMedianAge,totalRooms,totalBedrooms,population,households,medianIncome,medianHouseValue
19121,0.764706,0.050059,0.065872,0.06355,0.074109,0.227018,151600.0
20019,0.156863,0.083711,0.091319,0.107103,0.105843,0.242879,99200.0
15104,0.215686,0.090655,0.114672,0.203533,0.124137,0.247466,134500.0
3720,0.509804,0.081269,0.156064,0.208563,0.176778,0.119847,231700.0
8938,0.784314,0.068627,0.089547,0.064041,0.089042,0.270424,462900.0


### Create Feature Columns

** Create the necessary tf.feature_column objects for the estimator. They should all be treated as continuous numeric_columns. **

In [29]:
housing_train.columns

Index(['housingMedianAge', 'totalRooms', 'totalBedrooms', 'population',
       'households', 'medianIncome', 'medianHouseValue'],
      dtype='object')

In [30]:
# provided do not execute!

In [31]:
import tensorflow as tf

In [32]:
medianAge = tf.feature_column.numeric_column('housingMedianAge')
totalRooms = tf.feature_column.numeric_column('totalRooms')
totalBedrooms = tf.feature_column.numeric_column('totalBedrooms')
population = tf.feature_column.numeric_column('population')
households = tf.feature_column.numeric_column('households')
medianIncome = tf.feature_column.numeric_column('medianIncome')

In [33]:
feat_cols = [
    medianAge,
    totalRooms,
    totalBedrooms,
    population,
    households,
    medianIncome,
]

In [34]:
housing_train['totalRooms'].isnull().sum()

0

In [35]:
for c in housing_train.columns:
    nNans = housing_train[c].isnull().sum()
    print('Number of nans in ' + str(c) + ': ' + str(nNans))

Number of nans in housingMedianAge: 0
Number of nans in totalRooms: 0
Number of nans in totalBedrooms: 0
Number of nans in population: 0
Number of nans in households: 0
Number of nans in medianIncome: 0
Number of nans in medianHouseValue: 0


** Create the input function for the estimator object. (play around with batch_size and num_epochs)**

In [36]:
train_input_fn = tf.estimator.inputs.pandas_input_fn(
    x = X_train, 
    y = y_train,
    batch_size = 10, 
    num_epochs = 1000, 
    shuffle = True
)
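When playing with `batch_size` and `num_epochs`, it helps to know how many batches the input function can supply before it runs out of data. The arithmetic below is a back-of-the-envelope sketch: the training row count of 14,448 is an assumption derived from 70% of the 20,640 rows, and `batches_available` is a hypothetical helper, not part of TensorFlow.

```python
import math


def batches_available(n_rows, batch_size, num_epochs):
    """Upper bound on the batches an input function can yield (last one may be partial)."""
    return math.ceil(n_rows * num_epochs / batch_size)


n_train_rows = 14448  # assumption: 70% of the 20640 rows in the data set

one_epoch = batches_available(n_train_rows, batch_size=10, num_epochs=1)
all_epochs = batches_available(n_train_rows, batch_size=10, num_epochs=1000)

print(one_epoch)   # 1445
print(all_epochs)  # 1444800
```

With these settings the input function could feed far more batches than a training call limited by `steps` will consume, so the step limit, not the data, decides when training stops.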

** Create the estimator model. Use a DNNRegressor. Play around with the hidden units! **

In [54]:
model = tf.estimator.DNNRegressor(
    feature_columns=feat_cols, 
    hidden_units=[6, 8, 10, 12]
)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_log_step_count_steps': 100, '_session_config': None, '_save_summary_steps': 100, '_tf_random_seed': 1, '_model_dir': '/var/folders/7f/kl6zxp_50ddcw8xr42yvrpfm0000gn/T/tmpu1oeibxb', '_keep_checkpoint_every_n_hours': 10000, '_keep_checkpoint_max': 5, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600}


In [57]:
model = tf.estimator.LinearRegressor(feature_columns=feat_cols)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_log_step_count_steps': 100, '_session_config': None, '_save_summary_steps': 100, '_tf_random_seed': 1, '_model_dir': '/var/folders/7f/kl6zxp_50ddcw8xr42yvrpfm0000gn/T/tmpmhxx57n3', '_keep_checkpoint_every_n_hours': 10000, '_keep_checkpoint_max': 5, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600}


In [48]:
# provided do not execute!

##### ** Train the model for ~1,000 steps. (Later come back to this and train it for more and check for improvement) **

In [56]:
model.train(input_fn=train_input_fn, steps=100000)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /var/folders/7f/kl6zxp_50ddcw8xr42yvrpfm0000gn/T/tmpku0yxon5/model.ckpt.
INFO:tensorflow:step = 1, loss = 2.33935e+11
INFO:tensorflow:global_step/sec: 424.825
INFO:tensorflow:step = 101, loss = 3.50452e+11 (0.237 sec)
INFO:tensorflow:global_step/sec: 429.34
INFO:tensorflow:step = 201, loss = 5.14036e+11 (0.233 sec)
INFO:tensorflow:global_step/sec: 433.369
INFO:tensorflow:step = 301, loss = 3.44532e+11 (0.231 sec)
INFO:tensorflow:global_step/sec: 427.099
INFO:tensorflow:step = 401, loss = 3.78319e+11 (0.234 sec)
INFO:tensorflow:global_step/sec: 464.149
INFO:tensorflow:step = 501, loss = 1.36346e+12 (0.216 sec)
INFO:tensorflow:global_step/sec: 559.9
INFO:tensorflow:step = 601, loss = 7.60552e+11 (0.179 sec)
INFO:tensorflow:global_step/sec: 587.707
INFO:tensorflow:step = 701, loss = 8.56963e+11 (0.171 sec)
INFO:tensorflow:global_step/sec: 424.353
INFO:tensorflow:step = 801, loss = 5.17374e+11 [...]

INFO:tensorflow:step = 8001, loss = 3.59338e+11 (0.285 sec)
INFO:tensorflow:global_step/sec: 386.669
INFO:tensorflow:step = 8101, loss = 6.37203e+11 (0.260 sec)
INFO:tensorflow:global_step/sec: 325.906
INFO:tensorflow:step = 8201, loss = 5.35007e+11 (0.306 sec)
INFO:tensorflow:global_step/sec: 380.656
INFO:tensorflow:step = 8301, loss = 7.11499e+11 (0.268 sec)
INFO:tensorflow:global_step/sec: 393.261
INFO:tensorflow:step = 8401, loss = 5.55589e+11 (0.251 sec)
INFO:tensorflow:global_step/sec: 367.828
INFO:tensorflow:step = 8501, loss = 7.25906e+11 (0.272 sec)
INFO:tensorflow:global_step/sec: 384.385
INFO:tensorflow:step = 8601, loss = 5.87363e+11 (0.259 sec)
INFO:tensorflow:global_step/sec: 362.553
INFO:tensorflow:step = 8701, loss = 4.93627e+11 (0.279 sec)
INFO:tensorflow:global_step/sec: 292.616
INFO:tensorflow:step = 8801, loss = 5.44666e+11 (0.341 sec)
INFO:tensorflow:global_step/sec: 386.816
INFO:tensorflow:step = 8901, loss = 5.75672e+11 (0.260 sec)
[...]

INFO:tensorflow:step = 16101, loss = 4.54707e+11 (0.296 sec)
INFO:tensorflow:global_step/sec: 318.454
INFO:tensorflow:step = 16201, loss = 1.00995e+12 (0.318 sec)
INFO:tensorflow:global_step/sec: 275.784
INFO:tensorflow:step = 16301, loss = 3.79082e+11 (0.355 sec)
INFO:tensorflow:global_step/sec: 302.724
INFO:tensorflow:step = 16401, loss = 5.26921e+11 (0.335 sec)
INFO:tensorflow:global_step/sec: 300.856
INFO:tensorflow:step = 16501, loss = 2.53347e+11 (0.335 sec)
INFO:tensorflow:global_step/sec: 275.66
INFO:tensorflow:step = 16601, loss = 7.99802e+11 (0.361 sec)
INFO:tensorflow:global_step/sec: 276.404
INFO:tensorflow:step = 16701, loss = 1.80184e+11 (0.360 sec)
INFO:tensorflow:global_step/sec: 326.63
INFO:tensorflow:step = 16801, loss = 1.022e+12 (0.304 sec)
INFO:tensorflow:global_step/sec: 367.152
INFO:tensorflow:step = 16901, loss = 4.36367e+11 (0.277 sec)
INFO:tensorflow:global_step/sec: 270.734
INFO:tensorflow:step = 17001, loss = 4.57342e+11 (0.364 sec)
[...]

INFO:tensorflow:global_step/sec: 543.033
INFO:tensorflow:step = 24201, loss = 2.19098e+11 (0.186 sec)
INFO:tensorflow:global_step/sec: 483.573
INFO:tensorflow:step = 24301, loss = 4.73116e+11 (0.208 sec)
INFO:tensorflow:global_step/sec: 351.993
INFO:tensorflow:step = 24401, loss = 3.91106e+11 (0.279 sec)
INFO:tensorflow:global_step/sec: 479.361
INFO:tensorflow:step = 24501, loss = 6.50192e+11 (0.210 sec)
INFO:tensorflow:global_step/sec: 457.483
INFO:tensorflow:step = 24601, loss = 7.37374e+11 (0.218 sec)
INFO:tensorflow:global_step/sec: 484.166
INFO:tensorflow:step = 24701, loss = 1.1483e+12 (0.209 sec)
INFO:tensorflow:global_step/sec: 428.93
INFO:tensorflow:step = 24801, loss = 9.31422e+11 (0.230 sec)
INFO:tensorflow:global_step/sec: 481.967
INFO:tensorflow:step = 24901, loss = 2.55302e+11 (0.211 sec)
INFO:tensorflow:global_step/sec: 526.901
INFO:tensorflow:step = 25001, loss = 3.36942e+11 (0.188 sec)
INFO:tensorflow:global_step/sec: 400.436
[...]

INFO:tensorflow:global_step/sec: 501.304
INFO:tensorflow:step = 32301, loss = 7.9542e+11 (0.194 sec)
INFO:tensorflow:global_step/sec: 529.725
INFO:tensorflow:step = 32401, loss = 4.04e+11 (0.187 sec)
INFO:tensorflow:global_step/sec: 395.656
INFO:tensorflow:step = 32501, loss = 5.15216e+11 (0.255 sec)
INFO:tensorflow:global_step/sec: 388.934
INFO:tensorflow:step = 32601, loss = 6.32841e+11 (0.254 sec)
INFO:tensorflow:global_step/sec: 467.041
INFO:tensorflow:step = 32701, loss = 2.6958e+11 (0.215 sec)
INFO:tensorflow:global_step/sec: 551.323
INFO:tensorflow:step = 32801, loss = 5.80974e+11 (0.182 sec)
INFO:tensorflow:global_step/sec: 494.259
INFO:tensorflow:step = 32901, loss = 3.32275e+11 (0.204 sec)
INFO:tensorflow:global_step/sec: 413.222
INFO:tensorflow:step = 33001, loss = 5.26079e+11 (0.242 sec)
INFO:tensorflow:global_step/sec: 441.848
INFO:tensorflow:step = 33101, loss = 4.70401e+11 (0.225 sec)
INFO:tensorflow:global_step/sec: 440.453
[...]

INFO:tensorflow:global_step/sec: 315.512
INFO:tensorflow:step = 40401, loss = 7.32357e+11 (0.320 sec)
INFO:tensorflow:global_step/sec: 295.808
INFO:tensorflow:step = 40501, loss = 4.93061e+11 (0.339 sec)
INFO:tensorflow:global_step/sec: 407.397
INFO:tensorflow:step = 40601, loss = 8.00715e+11 (0.244 sec)
INFO:tensorflow:global_step/sec: 245.219
INFO:tensorflow:step = 40701, loss = 7.18759e+11 (0.406 sec)
INFO:tensorflow:global_step/sec: 222.142
INFO:tensorflow:step = 40801, loss = 8.104e+11 (0.462 sec)
INFO:tensorflow:global_step/sec: 230.979
INFO:tensorflow:step = 40901, loss = 5.67238e+11 (0.424 sec)
INFO:tensorflow:global_step/sec: 270.226
INFO:tensorflow:step = 41001, loss = 2.19591e+11 (0.368 sec)
INFO:tensorflow:global_step/sec: 268.024
INFO:tensorflow:step = 41101, loss = 5.10908e+11 (0.375 sec)
INFO:tensorflow:global_step/sec: 194.273
INFO:tensorflow:step = 41201, loss = 8.21674e+11 (0.520 sec)
INFO:tensorflow:global_step/sec: 266.39
[...]

INFO:tensorflow:global_step/sec: 466.049
INFO:tensorflow:step = 48501, loss = 8.88788e+11 (0.213 sec)
INFO:tensorflow:global_step/sec: 393.173
INFO:tensorflow:step = 48601, loss = 3.11656e+11 (0.254 sec)
INFO:tensorflow:global_step/sec: 370.074
INFO:tensorflow:step = 48701, loss = 8.04502e+11 (0.272 sec)
INFO:tensorflow:global_step/sec: 308.261
INFO:tensorflow:step = 48801, loss = 4.18791e+11 (0.323 sec)
INFO:tensorflow:global_step/sec: 305.115
INFO:tensorflow:step = 48901, loss = 6.81936e+11 (0.328 sec)
INFO:tensorflow:global_step/sec: 252.114
INFO:tensorflow:step = 49001, loss = 4.6963e+11 (0.399 sec)
INFO:tensorflow:global_step/sec: 270.319
INFO:tensorflow:step = 49101, loss = 8.28394e+11 (0.372 sec)
INFO:tensorflow:global_step/sec: 260.734
INFO:tensorflow:step = 49201, loss = 5.8338e+11 (0.383 sec)
INFO:tensorflow:global_step/sec: 255.019
INFO:tensorflow:step = 49301, loss = 1.98137e+11 (0.395 sec)
INFO:tensorflow:global_step/sec: 293.322
[...]

INFO:tensorflow:global_step/sec: 380.329
INFO:tensorflow:step = 56601, loss = 3.57513e+11 (0.264 sec)
INFO:tensorflow:global_step/sec: 378.145
INFO:tensorflow:step = 56701, loss = 3.84485e+11 (0.264 sec)
INFO:tensorflow:global_step/sec: 410.019
INFO:tensorflow:step = 56801, loss = 7.37975e+11 (0.242 sec)
INFO:tensorflow:global_step/sec: 414.444
INFO:tensorflow:step = 56901, loss = 5.28515e+11 (0.240 sec)
INFO:tensorflow:global_step/sec: 349.5
INFO:tensorflow:step = 57001, loss = 5.49653e+11 (0.288 sec)
INFO:tensorflow:global_step/sec: 400.712
INFO:tensorflow:step = 57101, loss = 5.52683e+11 (0.248 sec)
INFO:tensorflow:global_step/sec: 410.911
INFO:tensorflow:step = 57201, loss = 7.61573e+11 (0.246 sec)
INFO:tensorflow:global_step/sec: 468.689
INFO:tensorflow:step = 57301, loss = 5.52474e+11 (0.211 sec)
INFO:tensorflow:global_step/sec: 500.691
INFO:tensorflow:step = 57401, loss = 5.74204e+11 (0.202 sec)
INFO:tensorflow:global_step/sec: 367.843
[...]

INFO:tensorflow:global_step/sec: 347.716
INFO:tensorflow:step = 64701, loss = 8.32568e+11 (0.290 sec)
INFO:tensorflow:global_step/sec: 390.756
INFO:tensorflow:step = 64801, loss = 7.54293e+11 (0.254 sec)
INFO:tensorflow:global_step/sec: 480.996
INFO:tensorflow:step = 64901, loss = 3.13478e+11 (0.208 sec)
INFO:tensorflow:global_step/sec: 456.554
INFO:tensorflow:step = 65001, loss = 3.60741e+11 (0.220 sec)
INFO:tensorflow:global_step/sec: 390.072
INFO:tensorflow:step = 65101, loss = 6.43499e+11 (0.257 sec)
INFO:tensorflow:global_step/sec: 449.818
INFO:tensorflow:step = 65201, loss = 4.35119e+11 (0.221 sec)
INFO:tensorflow:global_step/sec: 513.079
INFO:tensorflow:step = 65301, loss = 3.61526e+11 (0.198 sec)
INFO:tensorflow:global_step/sec: 493.659
INFO:tensorflow:step = 65401, loss = 4.41242e+11 (0.200 sec)
INFO:tensorflow:global_step/sec: 357.073
INFO:tensorflow:step = 65501, loss = 7.16889e+11 (0.280 sec)
INFO:tensorflow:global_step/sec: 388.345
[...]

INFO:tensorflow:global_step/sec: 491.328
INFO:tensorflow:step = 72801, loss = 6.63451e+11 (0.205 sec)
INFO:tensorflow:global_step/sec: 544.325
INFO:tensorflow:step = 72901, loss = 4.29578e+11 (0.185 sec)
INFO:tensorflow:global_step/sec: 371.239
INFO:tensorflow:step = 73001, loss = 3.9147e+11 (0.266 sec)
INFO:tensorflow:global_step/sec: 452.18
INFO:tensorflow:step = 73101, loss = 3.17601e+11 (0.221 sec)
INFO:tensorflow:global_step/sec: 488.32
INFO:tensorflow:step = 73201, loss = 2.21985e+11 (0.210 sec)
INFO:tensorflow:global_step/sec: 500.26
INFO:tensorflow:step = 73301, loss = 3.20152e+11 (0.200 sec)
INFO:tensorflow:global_step/sec: 518.229
INFO:tensorflow:step = 73401, loss = 3.77902e+11 (0.187 sec)
INFO:tensorflow:global_step/sec: 414.041
INFO:tensorflow:step = 73501, loss = 4.5523e+11 (0.242 sec)
INFO:tensorflow:global_step/sec: 496.386
INFO:tensorflow:step = 73601, loss = 4.8992e+11 (0.202 sec)
INFO:tensorflow:global_step/sec: 386.25
INFO:tensorflow:step = 73701, loss = 3.88446e+11 [...]

INFO:tensorflow:global_step/sec: 440.558
INFO:tensorflow:step = 80901, loss = 2.93442e+11 (0.228 sec)
INFO:tensorflow:global_step/sec: 515.796
INFO:tensorflow:step = 81001, loss = 3.75746e+11 (0.195 sec)
INFO:tensorflow:global_step/sec: 387.618
INFO:tensorflow:step = 81101, loss = 5.3915e+11 (0.257 sec)
INFO:tensorflow:global_step/sec: 423.089
INFO:tensorflow:step = 81201, loss = 8.81796e+11 (0.236 sec)
INFO:tensorflow:global_step/sec: 447.184
INFO:tensorflow:step = 81301, loss = 7.43603e+11 (0.225 sec)
INFO:tensorflow:global_step/sec: 506.778
INFO:tensorflow:step = 81401, loss = 4.62027e+11 (0.198 sec)
INFO:tensorflow:global_step/sec: 546.048
INFO:tensorflow:step = 81501, loss = 3.92092e+11 (0.181 sec)
INFO:tensorflow:global_step/sec: 413.174
INFO:tensorflow:step = 81601, loss = 5.15012e+11 (0.239 sec)
INFO:tensorflow:global_step/sec: 524.317
INFO:tensorflow:step = 81701, loss = 2.31733e+11 (0.191 sec)
INFO:tensorflow:global_step/sec: 407.894
[...]

INFO:tensorflow:global_step/sec: 363.716
INFO:tensorflow:step = 89001, loss = 4.88166e+11 (0.272 sec)
INFO:tensorflow:global_step/sec: 406.364
INFO:tensorflow:step = 89101, loss = 6.06008e+11 (0.248 sec)
INFO:tensorflow:global_step/sec: 438.57
INFO:tensorflow:step = 89201, loss = 9.14392e+11 (0.231 sec)
INFO:tensorflow:global_step/sec: 385.077
INFO:tensorflow:step = 89301, loss = 4.78069e+11 (0.257 sec)
INFO:tensorflow:global_step/sec: 427.097
INFO:tensorflow:step = 89401, loss = 4.5952e+11 (0.233 sec)
INFO:tensorflow:global_step/sec: 388.772
INFO:tensorflow:step = 89501, loss = 4.25787e+11 (0.259 sec)
INFO:tensorflow:global_step/sec: 422.751
INFO:tensorflow:step = 89601, loss = 5.60471e+11 (0.237 sec)
INFO:tensorflow:global_step/sec: 507.821
INFO:tensorflow:step = 89701, loss = 5.57867e+11 (0.196 sec)
INFO:tensorflow:global_step/sec: 429.442
INFO:tensorflow:step = 89801, loss = 5.92593e+11 (0.230 sec)
INFO:tensorflow:global_step/sec: 400.979
[...]

INFO:tensorflow:global_step/sec: 364.097
INFO:tensorflow:step = 97101, loss = 9.99934e+11 (0.275 sec)
INFO:tensorflow:global_step/sec: 421.365
INFO:tensorflow:step = 97201, loss = 3.85836e+11 (0.239 sec)
INFO:tensorflow:global_step/sec: 391.048
INFO:tensorflow:step = 97301, loss = 9.73755e+11 (0.256 sec)
INFO:tensorflow:global_step/sec: 397.104
INFO:tensorflow:step = 97401, loss = 6.02369e+11 (0.248 sec)
INFO:tensorflow:global_step/sec: 361.312
INFO:tensorflow:step = 97501, loss = 6.83424e+11 (0.280 sec)
INFO:tensorflow:global_step/sec: 484.287
INFO:tensorflow:step = 97601, loss = 5.71352e+11 (0.205 sec)
INFO:tensorflow:global_step/sec: 397.763
INFO:tensorflow:step = 97701, loss = 6.08167e+11 (0.253 sec)
INFO:tensorflow:global_step/sec: 467.755
INFO:tensorflow:step = 97801, loss = 5.26724e+11 (0.215 sec)
INFO:tensorflow:global_step/sec: 392.225
INFO:tensorflow:step = 97901, loss = 5.60549e+11 (0.253 sec)
INFO:tensorflow:global_step/sec: 413.401
[...]

<tensorflow.python.estimator.canned.linear.LinearRegressor at 0x121254e48>

In [40]:
# provided do not execute !

** Create a prediction input function and then use the .predict method off your estimator model to create a list of predictions on your test data. **

In [59]:
test_input_fn = tf.estimator.inputs.pandas_input_fn(
    x = X_test, 
    batch_size = 1, 
    num_epochs = 1, 
    shuffle = False
)

In [60]:
predictions = model.predict(input_fn=test_input_fn)
pred_values = [ p['predictions'][0] for p in predictions ]
pd.DataFrame(pred_values).head()

ValueError: Could not find trained model in model_dir: /var/folders/7f/kl6zxp_50ddcw8xr42yvrpfm0000gn/T/tmpmhxx57n3.

Note: this error occurs because the estimator was re-created in In [57] after the training call in In [56]. The new `LinearRegressor` points to a fresh model_dir (ending in `tmpmhxx57n3`) that holds no checkpoint, whereas training wrote its checkpoints to the directory ending in `tmpku0yxon5`. Re-running `model.train(...)` on the current `model` before calling `predict` avoids it. The RMSE computed below (In [53]) appears to use the `pred_values` from an earlier, successful run.

In [52]:
y_test.head()

19121    151600.0
20019     99200.0
15104    134500.0
3720     231700.0
8938     462900.0
Name: medianHouseValue, dtype: float64

In [44]:
# provided do not execute!

** Calculate the RMSE. You should be able to get around 100,000 RMSE (remember that this is in the same units as the label.) Do this manually or use [sklearn.metrics](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html) **
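For the "do this manually" route, the RMSE is the square root of the mean of the squared differences between labels and predictions. The sketch below uses the five labels from `y_test.head()` together with made-up predictions; the `y_pred` numbers are purely hypothetical and are not output from the model in this notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


y_true = [151600.0, 99200.0, 134500.0, 231700.0, 462900.0]   # from y_test.head()
y_pred = [160000.0, 110000.0, 150000.0, 200000.0, 380000.0]  # hypothetical predictions

manual_rmse = rmse(y_true, y_pred)
print(manual_rmse)
```

This computes the same quantity as `mean_squared_error(y_true, y_pred) ** 0.5`, which is the form used in the cells that follow.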

In [45]:
from sklearn.metrics import mean_squared_error

In [53]:
mean_squared_error(y_true = y_test, y_pred = pred_values)**0.5

76859.112611920995

# Great Job!