# Enable Virtual Environment For This Notebook.

### Activate Conda Environment

<b>`$ conda activate`</b>

### Install Or Upgrade necessary software for virtual environment.

<b>`$ sudo apt-get install --upgrade python3-pip`</b>

<b>`$ sudo pip3 install --upgrade virtualenv`</b>

<b>`$ sudo pip3 install --upgrade setuptools`</b>

Now we will go to the location of the directory, where we will create our virtual environment.

<b>`$ cd /media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/05.Art_And_Science_Of_Machine_Learning/WEEK_2/01.Using Neural Networks to build ML model`</b>

### Deactivate conda environment

<b>`$ conda deactivate`</b>

### Create Virtual Environment

<b>`$ virtualenv Venv`</b>

### Activate newly created virtual environment

<b>`$ source Venv/bin/activate`</b>

<b>`$ (Venv) which python`</b>

<b>`$ (Venv) pip list`</b>

<b>`$ (Venv) pip3 install jupyter`</b>

In [1]:
%%writefile requirements.txt
numpy
pandas
tensorflow==1.8.0

Overwriting requirements.txt


In [2]:
%%bash
pip3 install -r requirements.txt



In [3]:
%%bash
pip3 list

Package            Version  
------------------ ---------
absl-py            0.9.0    
astor              0.8.1    
attrs              19.3.0   
backcall           0.1.0    
bleach             1.5.0    
decorator          4.4.2    
defusedxml         0.6.0    
entrypoints        0.3      
gast               0.3.3    
grpcio             1.27.2   
html5lib           0.9999999
importlib-metadata 1.6.0    
ipykernel          5.2.0    
ipython            7.13.0   
ipython-genutils   0.2.0    
ipywidgets         7.5.1    
jedi               0.16.0   
Jinja2             2.11.1   
jsonschema         3.2.0    
jupyter            1.0.0    
jupyter-client     6.1.2    
jupyter-console    6.1.0    
jupyter-core       4.6.3    
Markdown           3.2.1    
MarkupSafe         1.1.1    
mistune            0.8.4    
nbconvert          5.6.1    
nbformat           5.0.4    
notebook           6.0.3    
numpy              1.18.2   
pandas             1.0.3    
pandocfilters      1.4.2    
parso         

In [4]:
%%bash
which python

/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/05.Art_And_Science_Of_Machine_Learning/WEEK_2/01.Using Neural Networks to build ML model/Venv/bin/python


In [5]:
%%bash
python --version

Python 3.6.9


<a>https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/05_artandscience/c_neuralnetwork.ipynb</a>

# Neural Network

**Learning Objectives:**
  * Use the `DNNRegressor` class in TensorFlow to predict median housing price

The data is based on 1990 census data from California. This data is at the city block level, so these features reflect the total number of rooms in that block, or the total number of people who live on that block, respectively.
<p>
Let's use a set of features to predict house value.

## Set Up
In this first cell, we'll load the necessary libraries.

In [6]:
import math
import shutil
import numpy as np
import pandas as pd
import tensorflow as tf

tf.logging.set_verbosity(tf.logging.INFO)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format
print(tf.__version__)

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])


1.8.0


In [11]:
%%bash
ls

california_housing_train.csv
Practice.ipynb
requirements.txt


Next, we'll load our data set.

In [23]:
df = pd.read_csv('./california_housing_train.csv', sep=",")

## Examine the data

It's a good idea to get to know your data a little bit before you work with it.

We'll print out a quick summary of a few useful statistics on each column.

This will include things like mean, standard deviation, max, min, and various quantiles.

In [24]:
df.head()

Unnamed: 0,longitude,latitude,housing_median_age,total_rooms,total_bedrooms,population,households,median_income,median_house_value
0,-114.3,34.2,15.0,5612.0,1283.0,1015.0,472.0,1.5,66900.0
1,-114.5,34.4,19.0,7650.0,1901.0,1129.0,463.0,1.8,80100.0
2,-114.6,33.7,17.0,720.0,174.0,333.0,117.0,1.7,85700.0
3,-114.6,33.6,14.0,1501.0,337.0,515.0,226.0,3.2,73400.0
4,-114.6,33.6,20.0,1454.0,326.0,624.0,262.0,1.9,65500.0


In [25]:
df.describe()

Unnamed: 0,longitude,latitude,housing_median_age,total_rooms,total_bedrooms,population,households,median_income,median_house_value
count,17000.0,17000.0,17000.0,17000.0,17000.0,17000.0,17000.0,17000.0,17000.0
mean,-119.6,35.6,28.6,2643.7,539.4,1429.6,501.2,3.9,207300.9
std,2.0,2.1,12.6,2179.9,421.5,1147.9,384.5,1.9,115983.8
min,-124.3,32.5,1.0,2.0,1.0,3.0,1.0,0.5,14999.0
25%,-121.8,33.9,18.0,1462.0,297.0,790.0,282.0,2.6,119400.0
50%,-118.5,34.2,29.0,2127.0,434.0,1167.0,409.0,3.5,180400.0
75%,-118.0,37.7,37.0,3151.2,648.2,1721.0,605.2,4.8,265000.0
max,-114.3,42.0,52.0,37937.0,6445.0,35682.0,6082.0,15.0,500001.0


This data is at the city block level, so these features reflect the total number of rooms in that block, or the total number of people who live on that block, respectively.  Let's create a different, more appropriate feature.  Because we are predicting the price of a single house, we should try to make all our features correspond to a single house as well

In [26]:
df['num_rooms'] = df['total_rooms'] / df['households']
df['num_bedrooms'] = df['total_bedrooms'] / df['households']
df['persons_per_house'] = df['population'] / df['households']
df.head()

Unnamed: 0,longitude,latitude,housing_median_age,total_rooms,total_bedrooms,population,households,median_income,median_house_value,num_rooms,num_bedrooms,persons_per_house
0,-114.3,34.2,15.0,5612.0,1283.0,1015.0,472.0,1.5,66900.0,11.9,2.7,2.2
1,-114.5,34.4,19.0,7650.0,1901.0,1129.0,463.0,1.8,80100.0,16.5,4.1,2.4
2,-114.6,33.7,17.0,720.0,174.0,333.0,117.0,1.7,85700.0,6.2,1.5,2.8
3,-114.6,33.6,14.0,1501.0,337.0,515.0,226.0,3.2,73400.0,6.6,1.5,2.3
4,-114.6,33.6,20.0,1454.0,326.0,624.0,262.0,1.9,65500.0,5.5,1.2,2.4


Now we will drop unnecessary column from our dataset.

In [27]:
df.drop(
    labels=['total_rooms','total_bedrooms','population','households'],
    axis=1,
    inplace=True
)
df.head()

Unnamed: 0,longitude,latitude,housing_median_age,median_income,median_house_value,num_rooms,num_bedrooms,persons_per_house
0,-114.3,34.2,15.0,1.5,66900.0,11.9,2.7,2.2
1,-114.5,34.4,19.0,1.8,80100.0,16.5,4.1,2.4
2,-114.6,33.7,17.0,1.7,85700.0,6.2,1.5,2.8
3,-114.6,33.6,14.0,3.2,73400.0,6.6,1.5,2.3
4,-114.6,33.6,20.0,1.9,65500.0,5.5,1.2,2.4


## Build a neural network model

In this exercise, we'll be trying to predict `median_house_value`. It will be our label (sometimes also called a target). We'll use the remaining columns as our input features.

To train our model, we'll first use the [LinearRegressor](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/LinearRegressor) interface. Then, we'll change to DNNRegressor


In [29]:
featcols = {
    colname:tf.feature_column.numeric_column(key=colname) \
    for colname in 'housing_median_age,median_income,num_rooms,num_bedrooms,persons_per_house'.split(',')
}

# Now we will bucketize our latitude and longitude so that it is not so in high-res.
# California is mostly North-South, so more lats than lons
featcols['longitude'] = tf.feature_column.bucketized_column(
    source_column=tf.feature_column.numeric_column('longitude'),
    boundaries=np.linspace(start=-124.3, stop=-114.3, num=5).tolist()
)

featcols['latitude'] = tf.feature_column.bucketized_column(
    source_column=tf.feature_column.numeric_column('latitude'),
    boundaries=np.linspace(start=32.5, stop=42, num=10).tolist()
)

In [31]:
featcols.keys()

dict_keys(['housing_median_age', 'median_income', 'num_rooms', 'num_bedrooms', 'persons_per_house', 'longitude', 'latitude'])

## Split the dataset into training and evaluation

In [36]:
msk = np.random.randn(len(df)) < 0.8
train_df = df[msk] # Put 80% data in training
eval_df = df[~msk] # Put 20% data in evaluation
print('Training Size: {}'.format(train_df.size))
print('Evaluation Size: {}'.format(eval_df.size))

Training Size: 106664
Evaluation Size: 29336


## Constant Of Our Training

In [37]:
SCALE = 100000 # 1 Lac
BATCH_SIZE = 100
OUTDIR = './house_trained'

## Create training and evaluation input function

In [41]:
train_input_fn  = tf.estimator.inputs.pandas_input_fn(
    x=train_df[list(featcols.keys())],
    y=train_df['median_house_value'] / SCALE, # Note the scalling
    batch_size=BATCH_SIZE,
    num_epochs=None,
    shuffle=True
)

eval_input_fn = tf.estimator.inputs.pandas_input_fn(
    x=eval_df[list(featcols.keys())],
    y=eval_df['median_house_value'] / SCALE, # Note the scalling
    num_epochs=1,
    batch_size=len(eval_df),
    shuffle=False
)

print('Features List = {}'.format(list(featcols.keys())))

Features List = ['housing_median_age', 'median_income', 'num_rooms', 'num_bedrooms', 'persons_per_house', 'longitude', 'latitude']


## Linear Regressor

In [50]:
def train_and_evaluate(output_dir, num_train_steps):
    """
    """
    # Creat Optimizer 
    myopt = tf.train.FtrlOptimizer(learning_rate=0.01) # Note the learning rate
    
    # Create Linear Regressor Estimator Object
    estimator = tf.estimator.LinearRegressor(
        feature_columns=featcols.values(),
        model_dir=output_dir,
        optimizer=myopt
    )
    
    # Create a custom evaluation metric
    def rmse(labels, predictions):
        """
        """
        pred_value = tf.cast(x=predictions['predictions'],dtype=tf.float64)
        return {'rmse': tf.metrics.root_mean_squared_error(labels,pred_value)}
    
    # Attach avobe custom evaluation metric with our esimator object
    estimator = tf.contrib.estimator.add_metrics(estimator,rmse)
    
    # Create training specification
    train_spec = tf.estimator.TrainSpec(
        input_fn=train_input_fn,
        max_steps=num_train_steps
    )
    
    # Create evaluation specification
    eval_spec = tf.estimator.EvalSpec(
        input_fn=eval_input_fn,
        steps=None,
        start_delay_secs=1, # Start evaluating after N seconds
        throttle_secs=10 # Evaluate every N seconds
    )
    
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

## Now run the training using Linear Regressor

In [51]:
shutil.rmtree(path=OUTDIR, ignore_errors=True) # Start fresh every time
train_and_evaluate(OUTDIR, num_train_steps = (100 * len(train_df)) / BATCH_SIZE)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': './house_trained', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fccb2a1cd68>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Using config: {'_model_dir': './house_trained', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service'

INFO:tensorflow:global_step/sec: 464.313
INFO:tensorflow:loss = 56.938095, step = 5325 (0.212 sec)
INFO:tensorflow:global_step/sec: 577.045
INFO:tensorflow:loss = 97.26297, step = 5425 (0.173 sec)
INFO:tensorflow:global_step/sec: 489.22
INFO:tensorflow:loss = 53.186085, step = 5525 (0.204 sec)
INFO:tensorflow:global_step/sec: 462.641
INFO:tensorflow:loss = 51.6343, step = 5625 (0.217 sec)
INFO:tensorflow:global_step/sec: 578.779
INFO:tensorflow:loss = 59.817154, step = 5725 (0.173 sec)
INFO:tensorflow:global_step/sec: 496.399
INFO:tensorflow:loss = 77.06701, step = 5825 (0.205 sec)
INFO:tensorflow:global_step/sec: 500.825
INFO:tensorflow:loss = 51.623436, step = 5925 (0.195 sec)
INFO:tensorflow:global_step/sec: 528.991
INFO:tensorflow:loss = 50.433384, step = 6025 (0.190 sec)
INFO:tensorflow:global_step/sec: 445.781
INFO:tensorflow:loss = 50.209816, step = 6125 (0.224 sec)
INFO:tensorflow:global_step/sec: 553.601
INFO:tensorflow:loss = 73.939026, step = 6225 (0.180 sec)
INFO:tensorflow

INFO:tensorflow:global_step/sec: 556.78
INFO:tensorflow:loss = 48.159588, step = 12427 (0.180 sec)
INFO:tensorflow:global_step/sec: 572.794
INFO:tensorflow:loss = 102.71781, step = 12527 (0.177 sec)
INFO:tensorflow:global_step/sec: 569.168
INFO:tensorflow:loss = 26.556543, step = 12627 (0.174 sec)
INFO:tensorflow:global_step/sec: 576.66
INFO:tensorflow:loss = 118.23193, step = 12727 (0.173 sec)
INFO:tensorflow:global_step/sec: 567.453
INFO:tensorflow:loss = 52.6072, step = 12827 (0.177 sec)
INFO:tensorflow:global_step/sec: 576.895
INFO:tensorflow:loss = 103.003265, step = 12927 (0.173 sec)
INFO:tensorflow:global_step/sec: 568.014
INFO:tensorflow:loss = 21.472506, step = 13027 (0.175 sec)
INFO:tensorflow:global_step/sec: 571.178
INFO:tensorflow:loss = 92.53259, step = 13127 (0.178 sec)
INFO:tensorflow:global_step/sec: 564.038
INFO:tensorflow:loss = 58.690685, step = 13227 (0.175 sec)
INFO:tensorflow:global_step/sec: 576.858
INFO:tensorflow:loss = 96.02444, step = 13327 (0.174 sec)
INFO:

## DNN Regressor

In [52]:
def train_and_evalaute_dnn(ouput_dir, num_train_steps):
    """
    """
    # Create custom optimizer
    myopt = tf.train.FtrlOptimizer(learning_rate=0.01) # Note the learning rate
    
    # Create DNN Regressor estimator object
    estimator = tf.estimator.DNNRegressor(
        hidden_units=[100, 50, 20],
        feature_columns=featcols.values(),
        model_dir=ouput_dir,
        optimizer=myopt,
        dropout=0.10
    )
    
    # Create custom evaluation metric
    def rmse(labels, predictions):
        """
        """
        pred_values = tf.cast(predictions['predictions'], tf.float64)
        return {'rmse' : tf.metrics.root_mean_squared_error(labels,pred_values)}
    
    # Attach custom estimator with the estimator object
    estimator = tf.contrib.estimator.add_metrics(estimator, rmse)
    
    # Create training specification
    train_spec = tf.estimator.TrainSpec(
        input_fn=train_input_fn,
        max_steps=num_train_steps
    )
    
    # Create evaluation specefication
    eval_spec = tf.estimator.EvalSpec(
        input_fn=eval_input_fn,
        steps=None,
        start_delay_secs=1, # Start after N seconds
        throttle_secs=10 # Evaluate every N seconds
    )
    
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

## Now run the training using DNN Regressor

In [54]:
shutil.rmtree(OUTDIR, ignore_errors=True) # Start fresh every time
tf.summary.FileWriterCache.clear() # ensure filewriter cache is clear for TensorBoard events file
train_and_evalaute_dnn(OUTDIR, num_train_steps = (100 * len(train_df)) / BATCH_SIZE)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': './house_trained', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fcc77a199b0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Using config: {'_model_dir': './house_trained', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service'

INFO:tensorflow:global_step/sec: 496.686
INFO:tensorflow:loss = 50.332787, step = 5305 (0.200 sec)
INFO:tensorflow:global_step/sec: 457.655
INFO:tensorflow:loss = 50.145046, step = 5405 (0.222 sec)
INFO:tensorflow:global_step/sec: 422.641
INFO:tensorflow:loss = 33.780228, step = 5505 (0.234 sec)
INFO:tensorflow:global_step/sec: 318.297
INFO:tensorflow:loss = 59.528854, step = 5605 (0.314 sec)
INFO:tensorflow:global_step/sec: 388.053
INFO:tensorflow:loss = 65.801315, step = 5705 (0.257 sec)
INFO:tensorflow:global_step/sec: 475.428
INFO:tensorflow:loss = 42.773476, step = 5805 (0.209 sec)
INFO:tensorflow:global_step/sec: 520.318
INFO:tensorflow:loss = 29.435116, step = 5905 (0.193 sec)
INFO:tensorflow:global_step/sec: 512.719
INFO:tensorflow:loss = 39.365223, step = 6005 (0.195 sec)
INFO:tensorflow:global_step/sec: 508.05
INFO:tensorflow:loss = 72.2889, step = 6105 (0.197 sec)
INFO:tensorflow:global_step/sec: 521.184
INFO:tensorflow:loss = 52.905075, step = 6205 (0.193 sec)
INFO:tensorfl

INFO:tensorflow:Saving dict for global step 11818: average_loss = 0.37941366, global_step = 11818, loss = 1391.3099, rmse = 0.6159656
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./house_trained/model.ckpt-11818
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 11819 into ./house_trained/model.ckpt.
INFO:tensorflow:loss = 26.42388, step = 11819
INFO:tensorflow:global_step/sec: 400.678
INFO:tensorflow:loss = 70.38367, step = 11919 (0.252 sec)
INFO:tensorflow:global_step/sec: 507.563
INFO:tensorflow:loss = 57.19625, step = 12019 (0.197 sec)
INFO:tensorflow:global_step/sec: 505.774
INFO:tensorflow:loss = 56.5259, step = 12119 (0.198 sec)
INFO:tensorflow:global_step/sec: 506.925
INFO:tensorflow:loss