## Linear Regression Model for Boston Housing Prices

Each record in the database describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970. The attributes are defined as follows (taken from the UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets/Housing):

1. CRIM: per capita crime rate by town
2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS: proportion of non-retail business acres per town
4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX: nitric oxides concentration (parts per 10 million)
6. RM: average number of rooms per dwelling
7. AGE: proportion of owner-occupied units built prior to 1940
8. DIS: weighted distances to five Boston employment centers
9. RAD: index of accessibility to radial highways
10. TAX: full-value property-tax rate per $10,000
11. PTRATIO: pupil-teacher ratio by town
12. B: $1000(B_k - 0.63)^2$, where $B_k$ is the proportion of blacks by town
13. LSTAT: % lower status of the population
14. MEDV: median value of owner-occupied homes in $1000s

We can see that the input attributes have a mixture of units.

In [0]:
import tensorflow as tf
import numpy as np

Reset the default graph (needed only when re-running cells in a Jupyter notebook)

In [0]:
tf.reset_default_graph()

### Load Data
Load the Boston housing prices data from the CSV file at the URL below.

In [0]:
import pandas as pd
url = 'https://raw.githubusercontent.com/codebuild81/Deep_Learning_Projects/master/housing.csv'



In [0]:
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']


In [0]:
#The file is whitespace-delimited and has no header row, so pass the column names explicitly
boston = pd.read_csv(url, sep=r'\s+', names=names)

In [0]:
boston.head()

In [0]:
boston.describe()

In [0]:
#Input feature columns
features_col = [x for x in list(boston.columns) if x != 'MEDV']
features_col

In [0]:
#Input feature dataset: create a Pandas data frame for the independent variables
features = boston[features_col]
features.head()

In [0]:
#Actual output: select the dependent variable (MEDV) as a Pandas Series
prices = boston['MEDV']
prices.head()

In [0]:
#Convert the Series to a NumPy column vector of shape (n_samples, 1)
prices = prices.values.reshape(-1, 1)

In [0]:

print('Input features shape: ', features.shape)
print('Actual Prices data shape: ', prices.shape)

In [0]:
features.shape[1]

### Building the Graph

Define placeholders for the input data; the actual data will be provided at run time.

In [0]:
#Input features placeholder
x = tf.placeholder(shape=[None,features.shape[1]],dtype=tf.float32, name='x-input')

#Actual Price placeholder
y_ = tf.placeholder(shape=[None,prices.shape[1]],dtype=tf.float32, name='y-input')

Define Weights and Bias

In [0]:
W = tf.Variable(tf.zeros(shape=[features.shape[1],1]), name="Weights")
b = tf.Variable(tf.zeros(shape=[1]),name="Bias")

Prediction

In [0]:
#y = xW + b
y = tf.add(tf.matmul(x,W),b,name='output')

Loss (Cost) Function

In [0]:
#Mean squared error
loss = tf.reduce_mean(tf.square(y-y_),name='Loss')
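
In equation form, the prediction and loss cells correspond to the expressions below. The symbols $N$ (number of rows fed in), $x_i$ (the $i$-th feature row) and $\hat{y}_i$ (the model output for that row) are notation introduced here for illustration; they do not appear in the notebook code.

$$
\hat{y}_i = x_i W + b, \qquad \text{Loss} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2
$$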

GradientDescent Optimizer to minimize Loss

In [0]:
learn_rate = 0.03 #Can try different rates
train_op = tf.train.GradientDescentOptimizer(learn_rate).minimize(loss)
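
To make the optimizer step less of a black box, here is a minimal NumPy sketch of what one gradient descent update on the mean squared error looks like. The toy arrays and the helper names `mse` and `gd_step` are made up for illustration; this is not how TensorFlow implements `GradientDescentOptimizer` internally, only the same update rule written by hand.

```python
import numpy as np

# Toy data (made up for illustration): 4 samples, 2 features
X = np.array([[0.1, 0.2], [0.4, 0.3], [0.5, 0.9], [0.8, 0.6]])
y_true = np.array([[1.0], [2.0], [3.0], [4.0]])

# Same initialization as the notebook: zeros for weights and bias
W = np.zeros((2, 1))
b = np.zeros(1)
learn_rate = 0.03

def mse(X, y_true, W, b):
    """Mean squared error of the linear model X @ W + b."""
    return float(np.mean((X @ W + b - y_true) ** 2))

def gd_step(X, y_true, W, b, learn_rate):
    """One gradient descent update on the MSE loss."""
    n = X.shape[0]
    err = X @ W + b - y_true               # prediction error, shape (n, 1)
    grad_W = (2.0 / n) * (X.T @ err)       # dLoss/dW
    grad_b = (2.0 / n) * err.sum(axis=0)   # dLoss/db
    return W - learn_rate * grad_W, b - learn_rate * grad_b

loss_before = mse(X, y_true, W, b)
W, b = gd_step(X, y_true, W, b, learn_rate)
loss_after = mse(X, y_true, W, b)
print(loss_before, loss_after)
```

Each call to `sess.run(train_op, ...)` performs one such update on `W` and `b`, using whatever data is passed in `feed_dict`.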

### Executing the Graph

In [0]:
#Lets start graph Execution
with tf.Session() as sess:
  
    # variables need to be initialized before we can use them
    sess.run(tf.global_variables_initializer())
    
    #how many times data need to be shown to model
    training_epochs = 1000  
    
    for epoch in range(training_epochs):
        
        #Calculate train_op and loss
        _, train_loss = sess.run([train_op,loss], #Execute train_op and loss node
                                 feed_dict={x:features, #Data for Input feature
                                            y_:prices}) #Actual price data
        
        #Print the loss after every 100 iterations
        if epoch % 100 == 0:
            print ('Training loss at step: ', epoch, ' is ', train_loss)

How do you explain the training loss?

In [0]:
#Print Max value for each Feature
for num in np.max(features, axis=0):
    print(num)
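
If the printed loss grows very large or turns into `nan`, one plausible explanation (an interpretation, not something the notebook states) is feature scale: some columns have maxima in the hundreds, so with a learning rate of 0.03 each gradient step overshoots. The NumPy sketch below reproduces the effect on made-up data with a single large-valued feature; the `train` helper and the arrays are hypothetical and only mirror the update rule used above.

```python
import numpy as np

def train(X, y, learn_rate=0.03, steps=5):
    """Plain gradient descent on MSE; returns the loss measured before each step."""
    n, d = X.shape
    W, b = np.zeros((d, 1)), np.zeros(1)
    losses = []
    for _ in range(steps):
        err = X @ W + b - y
        losses.append(float(np.mean(err ** 2)))
        W = W - learn_rate * (2.0 / n) * (X.T @ err)
        b = b - learn_rate * (2.0 / n) * err.sum(axis=0)
    return losses

# Made-up data: one feature with values in the hundreds
X_raw = np.array([[100.0], [200.0], [300.0], [400.0]])
y = np.array([[10.0], [20.0], [30.0], [40.0]])

raw_losses = train(X_raw, y)                            # loss grows at every step
scaled_losses = train(X_raw / X_raw.max(axis=0), y)     # loss shrinks at every step

print('raw:   ', raw_losses)
print('scaled:', scaled_losses)
```

Dividing by the column maximum is only one possible rescaling, chosen here because it is easy to read; the point is that the same learning rate behaves very differently once the inputs are brought to a small range.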

In [0]:
#Assign session
sess = tf.Session()

In [0]:
#Close Session
sess.close()


## Normalize the data & build Boston Housing Price Prediction

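In [0]:
#Placeholder for the raw input features
x_n = tf.placeholder(shape=[None,features.shape[1]],dtype=tf.float32, name='x-input')

In [0]:
#Normalize the data: tf.nn.l2_normalize along axis 1 scales each row to unit L2 norm
x_norm = tf.nn.l2_normalize(x_n,1)

In [0]:
#Rebuild the prediction, loss and training op so they use the normalized input
y = tf.add(tf.matmul(x_norm,W),b,name='output')
loss = tf.reduce_mean(tf.square(y-y_),name='Loss')
train_op = tf.train.GradientDescentOptimizer(learn_rate).minimize(loss)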
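
`tf.nn.l2_normalize` along axis 1 divides each row by its Euclidean (L2) norm, so every sample ends up with unit length; it does not subtract the mean or divide by the range. A NumPy sketch of the same computation, using a made-up two-row array and a hypothetical helper name `l2_normalize_rows`:

```python
import numpy as np

def l2_normalize_rows(X, epsilon=1e-12):
    """Scale each row of X to unit L2 norm: x / sqrt(max(sum(x**2), epsilon))."""
    sq_sum = np.sum(np.square(X), axis=1, keepdims=True)
    return X / np.sqrt(np.maximum(sq_sum, epsilon))

# Made-up example rows
X = np.array([[3.0, 4.0],
              [0.5, 0.0]])
X_unit = l2_normalize_rows(X)
print(X_unit)   # rows become [0.6, 0.8] and [1.0, 0.0]
```

Because the scaling is per row, the relative size of the features within a sample is preserved; columns with large values still dominate each row, only the overall magnitude is bounded.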

In [0]:
#Lets start graph Execution
#After Normalize the data
with tf.Session() as sess:
  
    # variables need to be initialized before we can use them
    sess.run(tf.global_variables_initializer())
    
    #how many times data need to be shown to model
    training_epochs = 100 
    
    for epoch in range(training_epochs):
        
        #Calculate train_op and loss
        _, train_loss = sess.run([train_op,loss], #Execute train_op and loss node
                                 feed_dict={x_n:features, #Data for Input feature
                                            y_:prices}) #Actual price data
        
        #Print the loss after every 10 iterations
        if epoch % 10 == 0:
            print ('Training loss at step: ', epoch, ' is ', train_loss)