# Project 3: Time Series Forecasting with Recurrent Neural Nets

# Part III: Time series analysis with recurrent neural nets

The remaining parts of the project build models for time series data using TensorFlow.

We will use the mortgage affordability data provided by Zillow: https://www.zillow.com/research/data/

The project repository includes the data and utility functions to load and plot it.

In [1]:
%load_ext autoreload
%autoreload 2

In [79]:
import proj3_lib.utils as utils

afford_df = utils.get_affordability_df()
afford_df.head(5)

Unnamed: 0,RegionID,RegionName,SizeRank,1979-03,1979-06,1979-09,1979-12,1980-03,1980-06,1980-09,...,2014-12,2015-03,2015-06,2015-09,2015-12,2016-03,2016-06,2016-09,2016-12,2017-03
324,394913,"New York, NY",1,0.2617,0.278316,0.284399,0.319438,0.379845,0.31897,0.337586,...,0.241961,0.240494,0.247387,0.245957,0.249606,0.241544,0.239503,0.240021,0.265376,0.269264
325,753899,"Los Angeles-Long Beach-Anaheim, CA",2,0.357694,0.388161,0.401762,0.459422,0.541508,0.464528,0.48675,...,0.383162,0.382531,0.396277,0.395686,0.402116,0.393042,0.390708,0.386895,0.430334,0.433476
326,394463,"Chicago, IL",3,0.261928,0.27679,0.276248,0.309709,0.353099,0.299344,0.295616,...,0.134913,0.134063,0.139074,0.137962,0.138784,0.134412,0.133701,0.132856,0.147025,0.149054
327,394514,"Dallas-Fort Worth, TX",4,0.301131,0.328453,0.340857,0.386321,0.452818,0.376912,0.394715,...,0.118163,0.117663,0.124935,0.128167,0.133517,0.131539,0.133219,0.135142,0.149861,0.152387
328,394974,"Philadelphia, PA",5,0.204333,0.215107,0.219247,0.24769,0.295396,0.248018,0.260259,...,0.141915,0.1391,0.142831,0.141625,0.142089,0.137387,0.135599,0.135064,0.148746,0.148559


In [80]:
utils.plot_affordability_series(afford_df)

## Getting data ready

We will build models that forecast mortgage affordability based on the previous year's affordability values. First, we will split the data into a training and testing set. We take data from 2015-2017 as the test set, and all previous data as the training set.
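The split described above can be sketched on the column labels alone. This is only an illustration: `split_columns_by_year` is a hypothetical stand-in, not the actual `utils.split_train_test`, and it assumes the quarterly columns are labelled `YYYY-MM` as in the table above.

```python
# Hypothetical stand-in for utils.split_train_test: split quarterly column
# labels ("YYYY-MM") into training and test periods by year.
def split_columns_by_year(columns, first_test_year=2015):
    train_cols = [c for c in columns if int(c[:4]) < first_test_year]
    test_cols = [c for c in columns if int(c[:4]) >= first_test_year]
    return train_cols, test_cols

# Quarterly labels from 1979-03 through 2017-03, matching the table header.
quarters = [f"{year}-{month:02d}"
            for year in range(1979, 2018)
            for month in (3, 6, 9, 12)
            if (year, month) <= (2017, 3)]

train_cols, test_cols = split_columns_by_year(quarters)
print(len(train_cols), train_cols[0], train_cols[-1])
print(len(test_cols), test_cols[0], test_cols[-1])
```

Under these assumptions the training period has 144 quarters and the test period has 9, which is a useful sanity check against the array shapes printed further down.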

In [81]:
train_df, test_df = utils.split_train_test(afford_df)
train_df.head(5)

Unnamed: 0,RegionID,RegionName,SizeRank,1979-03,1979-06,1979-09,1979-12,1980-03,1980-06,1980-09,...,2012-09,2012-12,2013-03,2013-06,2013-09,2013-12,2014-03,2014-06,2014-09,2014-12
324,394913,"New York, NY",1,0.2617,0.278316,0.284399,0.319438,0.379845,0.31897,0.337586,...,0.224406,0.217553,0.224845,0.241536,0.260265,0.261124,0.258286,0.253598,0.252561,0.241961
325,753899,"Los Angeles-Long Beach-Anaheim, CA",2,0.357694,0.388161,0.401762,0.459422,0.541508,0.464528,0.48675,...,0.304163,0.30008,0.32092,0.359241,0.399297,0.402892,0.400288,0.393393,0.395619,0.383162
326,394463,"Chicago, IL",3,0.261928,0.27679,0.276248,0.309709,0.353099,0.299344,0.295616,...,0.121624,0.117433,0.121061,0.131412,0.141545,0.142888,0.141657,0.138968,0.14014,0.134913
327,394514,"Dallas-Fort Worth, TX",4,0.301131,0.328453,0.340857,0.386321,0.452818,0.376912,0.394715,...,0.104923,0.101957,0.105754,0.114501,0.123787,0.125153,0.123385,0.121437,0.121808,0.118163
328,394974,"Philadelphia, PA",5,0.204333,0.215107,0.219247,0.24769,0.295396,0.248018,0.260259,...,0.13888,0.134923,0.139083,0.148469,0.156956,0.155812,0.153202,0.149514,0.148801,0.141915


In [82]:
test_df.head(5)

Unnamed: 0,RegionID,RegionName,SizeRank,2015-03,2015-06,2015-09,2015-12,2016-03,2016-06,2016-09,2016-12,2017-03
324,394913,"New York, NY",1,0.240494,0.247387,0.245957,0.249606,0.241544,0.239503,0.240021,0.265376,0.269264
325,753899,"Los Angeles-Long Beach-Anaheim, CA",2,0.382531,0.396277,0.395686,0.402116,0.393042,0.390708,0.386895,0.430334,0.433476
326,394463,"Chicago, IL",3,0.134063,0.139074,0.137962,0.138784,0.134412,0.133701,0.132856,0.147025,0.149054
327,394514,"Dallas-Fort Worth, TX",4,0.117663,0.124935,0.128167,0.133517,0.131539,0.133219,0.135142,0.149861,0.152387
328,394974,"Philadelphia, PA",5,0.1391,0.142831,0.141625,0.142089,0.137387,0.135599,0.135064,0.148746,0.148559


In [84]:
train_data= train_df.drop(['RegionName', 'SizeRank', 'RegionID'], axis=1).values
test_data= test_df.drop(['RegionName', 'SizeRank', 'RegionID'], axis=1).values
print(train_data.shape)
print(test_data.shape)

(75, 144)
(75, 9)


# Part III: Recurrent Neural Nets
# RNN
## Use the tf.nn.dynamic_rnn and tf.nn.rnn_cell.BasicRNNCell functions to build a basic recurrent neural net to solve the affordability prediction problem. A couple of notes:
### Set up input/target pairs as follows: for the sequence $x_0, x_1, x_2, x_3, \ldots, x_m$, use the targets $x_1, x_2, x_3, \ldots, x_m, x_{m+1}$.
### Use truncated backpropagation through time (about 8 timesteps per backprop chunk), and remember to carry the state forward between truncated backprop chunks.
### To make predictions for county j over the test set period, first pass the training sequence (time series) for that county through the RNN so that the state is used properly.
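The input/target pairing and the 8-step chunking in the notes above can be sketched in plain Python. The helper name `make_chunks` and the toy series are assumptions made for illustration; `next_value` plays the role of $x_{m+1}$, the value that follows the last training point.

```python
def make_chunks(series, next_value, chunk_steps=8):
    """Split a series into (inputs, targets) chunks of length chunk_steps.

    Targets are the inputs shifted one step ahead; next_value supplies
    x_{m+1}, the target for the final input x_m.
    """
    shifted = list(series[1:]) + [next_value]
    n_chunks = len(series) // chunk_steps  # any leftover tail is dropped
    chunks = []
    for k in range(n_chunks):
        start, stop = k * chunk_steps, (k + 1) * chunk_steps
        chunks.append((list(series[start:stop]), shifted[start:stop]))
    return chunks

# Toy series x_0..x_15, with 16 as the value that follows it.
series = list(range(16))
chunks = make_chunks(series, next_value=16)
for inputs, targets in chunks:
    print(inputs, "->", targets)
```

Each target chunk is the input chunk moved one step forward, and only the very last target comes from outside the training series.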

## Here I first reset the graph and then build it. I use chunks of 8 timesteps for truncated backprop, and I wrap the cell in tf.contrib.rnn.OutputProjectionWrapper to project the n_neurons outputs down to a single value. I train for only 50 epochs; more epochs would likely improve the results.
### I used these references for this part:
####     - https://r2rt.com/recurrent-neural-networks-in-tensorflow-i.html
####     - https://www.tensorflow.org/tutorials/recurrent#truncated-backpropagation
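To see why carrying the state between chunks matters, here is a minimal forward-pass sketch with a one-unit tanh RNN in plain Python. The weights and the toy input are arbitrary assumptions, and no training is done. It compares processing a sequence in one pass against processing it in 8-step chunks, once carrying the final state into the next chunk and once resetting it to zero. In the TensorFlow cell that follows, the initial state appears to be a constant zero tensor, so each `sess.run` call would start from a zero state unless the previous final state is fed back in.

```python
import math

def rnn_forward(xs, h0, w_x=0.5, w_h=0.9, b=0.1):
    """Run a one-unit tanh RNN over xs, returning outputs and final state."""
    h, outputs = h0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h + b)
        outputs.append(h)
    return outputs, h

xs = [0.1 * t for t in range(16)]
chunk_steps = 8

# Reference: process the whole sequence in one pass.
full_outputs, _ = rnn_forward(xs, h0=0.0)

# Chunked, carrying the final state of each chunk into the next one.
carried, state = [], 0.0
for start in range(0, len(xs), chunk_steps):
    out, state = rnn_forward(xs[start:start + chunk_steps], h0=state)
    carried.extend(out)

# Chunked, but resetting the state to zero at every chunk boundary.
reset = []
for start in range(0, len(xs), chunk_steps):
    out, _ = rnn_forward(xs[start:start + chunk_steps], h0=0.0)
    reset.extend(out)

gap_carried = max(abs(p - q) for p, q in zip(full_outputs, carried))
gap_reset = max(abs(p - q) for p, q in zip(full_outputs, reset))
print("max difference with carried state:", gap_carried)
print("max difference with reset state:  ", gap_reset)
```

With the state carried forward the chunked outputs match the single-pass outputs, while resetting the state changes the outputs from the second chunk onward.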

In [118]:
import numpy as np
import tensorflow as tf

def reset_graph():
    # Close any session left open by a previous cell, then clear the default graph.
    if 'sess' in globals() and sess:
        sess.close()
    tf.reset_default_graph()
    
reset_graph()

total_train_length=train_data.shape[1]
n_inputs=1
n_neurons=100
n_outputs=1
n_epochs=50
chunck_steps=8
num_chuncks=total_train_length//chunck_steps  # integer number of full chunks


"""
Placeholders
"""

X = tf.placeholder(tf.float32, [None, chunck_steps,n_inputs], name='input_placeholder')
y = tf.placeholder(tf.float32, [None, chunck_steps,n_outputs], name='labels_placeholder')
init_state = tf.zeros([1, n_neurons])



"""
RNN
"""

cell = tf.contrib.rnn.OutputProjectionWrapper(tf.contrib.rnn.BasicRNNCell(n_neurons,activation=tf.nn.relu),output_size=n_outputs)
final_state=init_state
rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, X, initial_state=final_state,dtype=tf.float32 )

"""
Predictions, loss, training step
"""
learning_rate=0.001
loss=tf.reduce_mean(tf.abs(rnn_outputs-y))
optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op=optimizer.minimize(loss)
init=tf.global_variables_initializer()


"""
Train the network
"""

with tf.Session() as sess:
    init.run()

    for epoch in range(n_epochs):
        MAE_set=[]
        i=0 

        for inputs in train_data:
          
            for chunck in range(int(num_chuncks)): 
                X_batch=inputs[chunck*chunck_steps:(chunck+1)*chunck_steps]
                X_batch=X_batch.reshape((-1,chunck_steps,n_inputs))
                if chunck < int(num_chuncks)-1:
                    y_batch=inputs[chunck*chunck_steps+1:(chunck+1)*chunck_steps+1]
                if chunck == int(num_chuncks)-1:
                    y_batch=np.append(inputs[chunck*chunck_steps+1:(chunck+1)*chunck_steps+1],test_data[i,0])

                y_batch=y_batch.reshape((-1,chunck_steps,n_outputs))

                sess.run(training_op, feed_dict={X:X_batch, y:y_batch})
                mae=loss.eval(feed_dict={X:X_batch, y:y_batch})
                #print(i,chunck, "\tMAE:", mae)
            X_new=test_data[i,0:8]    
            X_new=X_new.reshape((-1,chunck_steps,n_inputs))
            y_new=test_data[i,1:9]
            y_new=y_new.reshape((-1,chunck_steps,n_inputs))
            #print(y_new)
            y_pred=sess.run(rnn_outputs, feed_dict={X:X_new})
            #print(y_pred)
            MAE=loss.eval(feed_dict={X:X_new, y:y_new})
            #print("MAE on Test Data", i, "=", MAE)
            MAE_set=np.append(MAE_set,MAE)
            i+=1
        if epoch == n_epochs - 1:  # report test error after the final epoch
            print("MAE on Test Data=",MAE_set)
            print("Average of MAE=", np.mean(MAE_set))
            print("Standard Deviation of MAE=", np.std(MAE_set) )   



MAE on Test Data= [ 0.00985166  0.01507053  0.00419663  0.00711916  0.00387961  0.00359939
  0.00543834  0.00848203  0.00357559  0.00732664  0.01723113  0.00435127
  0.00923536  0.00540945  0.00607572  0.00348713  0.0151413   0.00436873
  0.00564725  0.00586718  0.00772886  0.00427587  0.00792591  0.00461137
  0.00861033  0.00527136  0.00425602  0.00396999  0.00388337  0.00610631
  0.00353847  0.0045512   0.01822483  0.00477435  0.00672561  0.00471239
  0.00554695  0.00470306  0.0037141   0.00424363  0.00562415  0.00493963
  0.00416846  0.00554335  0.00548873  0.01460411  0.00433963  0.00609928
  0.00427012  0.00416327  0.00551484  0.00517231  0.00567318  0.00473823
  0.00386649  0.00362335  0.00729644  0.00374404  0.0095823   0.00495519
  0.00558974  0.00438712  0.00487304  0.00629621  0.01529427  0.00359596
  0.00529133  0.008425    0.0045935   0.0089673   0.00354437  0.00908049
  0.01344069  0.00856618  0.00349115]
Average of MAE= 0.00642095337311
Standard Deviation of MAE= 0.003398
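
The summary figures above are the mean and standard deviation of the per-series MAE values. A minimal sketch of that aggregation follows, using made-up toy numbers rather than the project's predictions. `np.std` computes the population standard deviation by default, which corresponds to `statistics.pstdev` here.

```python
from statistics import mean, pstdev

def mean_absolute_error(predictions, targets):
    """Average absolute difference between predictions and targets."""
    return mean(abs(p - t) for p, t in zip(predictions, targets))

# Toy predictions and targets for two series (made-up numbers).
predicted = [[0.25, 0.26, 0.24], [0.40, 0.41, 0.39]]
actual = [[0.24, 0.25, 0.25], [0.38, 0.40, 0.43]]

per_series_mae = [mean_absolute_error(p, a) for p, a in zip(predicted, actual)]
print("MAE per series:", per_series_mae)
print("Average of MAE:", mean(per_series_mae))
print("Standard deviation of MAE:", pstdev(per_series_mae))
```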

## LSTM
### Do the same as above using tf.nn.rnn_cell.BasicLSTMCell to use the LSTM model. Also look at tf.nn.rnn_cell.MultiRNNCell to build a deep LSTM network.

## Here I build a basic LSTM with a single layer.
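
For intuition about what an LSTM cell adds over the basic RNN cell, here is a plain-Python sketch of one step of a single-unit LSTM using the standard gate equations. The parameter values and inputs are arbitrary assumptions, and library implementations such as BasicLSTMCell may differ in details (for example, an extra bias on the forget gate).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM; params maps gate name -> (w_x, w_h, b)."""
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much of the candidate to write
    g = gate("candidate", math.tanh)  # proposed new cell content
    o = gate("output", sigmoid)       # how much of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

gate_names = ("forget", "input", "candidate", "output")
params = {name: (0.5, 0.5, 0.0) for name in gate_names}

h, c = 0.0, 0.0
for x in [0.26, 0.28, 0.28, 0.32]:  # toy inputs
    h, c = lstm_step(x, h, c, params)
print("hidden state:", h, "cell state:", c)
```

The separate cell state `c` is what lets the network keep information across many steps: when the forget gate is near 1 and the input gate is near 0, `c` passes through almost unchanged.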

In [114]:
import numpy as np
import tensorflow as tf

def reset_graph():
    # Close any session left open by a previous cell, then clear the default graph.
    if 'sess' in globals() and sess:
        sess.close()
    tf.reset_default_graph()
    
reset_graph()

total_train_length=train_data.shape[1]
n_inputs=1
n_neurons=100
n_outputs=1
n_epochs=50
chunck_steps=8
num_chuncks=total_train_length//chunck_steps  # integer number of full chunks


"""
Placeholders
"""

X = tf.placeholder(tf.float32, [None, chunck_steps,n_inputs], name='input_placeholder')
y = tf.placeholder(tf.float32, [None, chunck_steps,n_outputs], name='labels_placeholder')



"""
RNN
"""
lstm_cell = tf.contrib.rnn.OutputProjectionWrapper(tf.contrib.rnn.BasicLSTMCell(n_neurons),output_size=n_outputs)

outputs, state = tf.nn.dynamic_rnn(cell=lstm_cell,
                                   inputs=X,
                                   dtype=tf.float32)

"""
Predictions, loss, training step
"""
learning_rate=0.001
loss=tf.reduce_mean(tf.abs(outputs-y))
optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op=optimizer.minimize(loss)
init=tf.global_variables_initializer()


"""
Train the network
"""

with tf.Session() as sess:
    init.run()

    for epoch in range(n_epochs):
        MAE_set=[]
        i=0 

        for inputs in train_data:

            for chunck in range(int(num_chuncks)): 
                X_batch=inputs[chunck*chunck_steps:(chunck+1)*chunck_steps]
                X_batch=X_batch.reshape((-1,chunck_steps,n_inputs))
                if chunck < int(num_chuncks)-1:
                    y_batch=inputs[chunck*chunck_steps+1:(chunck+1)*chunck_steps+1]
                if chunck == int(num_chuncks)-1:
                    y_batch=np.append(inputs[chunck*chunck_steps+1:(chunck+1)*chunck_steps+1],test_data[i,0])

                y_batch=y_batch.reshape((-1,chunck_steps,n_outputs))

                sess.run(training_op, feed_dict={X:X_batch, y:y_batch})
                mae=loss.eval(feed_dict={X:X_batch, y:y_batch})
                #print(i,chunck, "\tMAE:", mae)
            X_new=test_data[i,0:8]    
            X_new=X_new.reshape((-1,chunck_steps,n_inputs))
            y_new=test_data[i,1:9]
            y_new=y_new.reshape((-1,chunck_steps,n_inputs))
            #print(y_new)
            y_pred=sess.run(outputs, feed_dict={X:X_new})
            #print(y_pred)
            MAE=loss.eval(feed_dict={X:X_new, y:y_new})
            #print("MAE on Test Data", i, "=", MAE)
            MAE_set=np.append(MAE_set,MAE)
            i+=1
        if epoch == n_epochs - 1:  # report test error after the final epoch
            print("MAE on Test Data=",MAE_set)
            print("Average of MAE=", np.mean(MAE_set))
            print("Standard Deviation of MAE=", np.std(MAE_set) )
        


MAE on Test Data= [ 0.00873547  0.01573171  0.00512684  0.00667009  0.00449899  0.00290796
  0.00611302  0.00689097  0.00410824  0.00897991  0.01642117  0.00482043
  0.00828072  0.00625571  0.00724943  0.00437695  0.01295693  0.00341084
  0.00582734  0.00495953  0.00739597  0.00238627  0.0070089   0.00393627
  0.00752453  0.00453218  0.0034133   0.00326491  0.00290527  0.00834396
  0.00444787  0.00377234  0.01778782  0.00502433  0.0058052   0.00401571
  0.00346731  0.00316935  0.00314541  0.00322458  0.00412143  0.00498856
  0.00296428  0.00414536  0.00439742  0.01554278  0.00343555  0.00574521
  0.00489257  0.00332741  0.0044419   0.00555747  0.00563371  0.00359005
  0.00277062  0.0032036   0.00739498  0.00320676  0.00727938  0.00368422
  0.00486239  0.00439631  0.00579891  0.00811361  0.01579919  0.00321093
  0.00509102  0.00736043  0.00384183  0.0081352   0.00324007  0.00726874
  0.01164765  0.00707213  0.00312851]
Average of MAE= 0.00597578617434
Standard Deviation of MAE= 0.003432

## Here I build a multi-layer LSTM with 2 layers.

In [115]:
import numpy as np
import tensorflow as tf

def reset_graph():
    # Close any session left open by a previous cell, then clear the default graph.
    if 'sess' in globals() and sess:
        sess.close()
    tf.reset_default_graph()
    
reset_graph()

total_train_length=train_data.shape[1]
n_inputs=1
n_neurons_size=[200,100]
n_outputs=1
n_epochs=50

chunck_steps=8
num_chuncks=total_train_length//chunck_steps  # integer number of full chunks


"""
Placeholders
"""

X = tf.placeholder(tf.float32, [None, chunck_steps,n_inputs], name='input_placeholder')
y = tf.placeholder(tf.float32, [None, chunck_steps,n_outputs], name='labels_placeholder')



"""
RNN
"""

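# Note: every layer is wrapped in an output projection, so the first layer
# passes only a single value per time step on to the second layer.
lstm_layers = [tf.contrib.rnn.OutputProjectionWrapper(tf.contrib.rnn.BasicLSTMCell(n_neurons),output_size=n_outputs) for n_neurons in n_neurons_size]
multi_lstm_cell = tf.contrib.rnn.MultiRNNCell(lstm_layers)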
outputs, state = tf.nn.dynamic_rnn(cell=multi_lstm_cell,
                                   inputs=X,
                                   dtype=tf.float32)


"""
Predictions, loss, training step
"""
learning_rate=0.001
loss=tf.reduce_mean(tf.abs(outputs-y))
optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op=optimizer.minimize(loss)
init=tf.global_variables_initializer()


"""
Train the network
"""

with tf.Session() as sess:
    init.run()

    for epoch in range(n_epochs):
        MAE_set=[]
        i=0 

        for inputs in train_data:

            for chunck in range(int(num_chuncks)): 
                X_batch=inputs[chunck*chunck_steps:(chunck+1)*chunck_steps]
                X_batch=X_batch.reshape((-1,chunck_steps,n_inputs))
                if chunck < int(num_chuncks)-1:
                    y_batch=inputs[chunck*chunck_steps+1:(chunck+1)*chunck_steps+1]
                if chunck == int(num_chuncks)-1:
                    y_batch=np.append(inputs[chunck*chunck_steps+1:(chunck+1)*chunck_steps+1],test_data[i,0])

                y_batch=y_batch.reshape((-1,chunck_steps,n_outputs))

                sess.run(training_op, feed_dict={X:X_batch, y:y_batch})
                mae=loss.eval(feed_dict={X:X_batch, y:y_batch})
                #print(i,chunck, "\tMAE:", mae)
            X_new=test_data[i,0:8]    
            X_new=X_new.reshape((-1,chunck_steps,n_inputs))
            y_new=test_data[i,1:9]
            y_new=y_new.reshape((-1,chunck_steps,n_inputs))
            #print(y_new)
            y_pred=sess.run(outputs, feed_dict={X:X_new})
            #print(y_pred)
            MAE=loss.eval(feed_dict={X:X_new, y:y_new})
            #print("MAE on Test Data", i, "=", MAE)
            MAE_set=np.append(MAE_set,MAE)
            i+=1
        if epoch == n_epochs - 1:  # report test error after the final epoch
            print("MAE on Test Data=",MAE_set)
            print("Average of MAE=", np.mean(MAE_set))
            print("Standard Deviation of MAE=", np.std(MAE_set) )       
            
        
      
            
        


MAE on Test Data= [ 0.01016437  0.01653736  0.00398009  0.00528836  0.00459133  0.0028228
  0.00738354  0.00844439  0.00334629  0.00852444  0.01465988  0.00791115
  0.00801934  0.0056774   0.00714596  0.00460641  0.013224    0.00610596
  0.00478674  0.00469626  0.00723619  0.00391236  0.004901    0.00408586
  0.00792013  0.00491752  0.00355871  0.00374068  0.00506581  0.00738937
  0.00395161  0.00460182  0.01722602  0.00516756  0.00607558  0.00450329
  0.00350253  0.00460772  0.00363526  0.00405008  0.00467305  0.00353363
  0.00386669  0.00564894  0.00523234  0.01379528  0.00434224  0.00600349
  0.00715543  0.00371959  0.00519215  0.00413011  0.00441213  0.00542333
  0.00359068  0.0035362   0.00757689  0.00335291  0.01078339  0.00556382
  0.00351798  0.00408383  0.00529185  0.00767436  0.01408565  0.00557907
  0.00551017  0.00708768  0.00351748  0.0082474   0.0028021   0.0078242
  0.01203807  0.00763683  0.00328326]
Average of MAE= 0.00623569848637
Standard Deviation of MAE= 0.00320791

# Part IV: Discussion
## Discuss the models you have used so far. Which proved to be most effective, and why? What were challenges in using these methods? What would you wish to try if you had more time to work on this?

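### Of the models I fit in this project, the three recurrent models from this part (the basic RNN with truncated backprop, the basic LSTM, and the 2-layer LSTM) gave the best results compared with the models fit earlier. Among these three, with only 50 epochs of training, the basic LSTM had the lowest average test MAE (about 0.0060, versus 0.0064 for the basic RNN and 0.0062 for the 2-layer LSTM), although these gaps are small relative to the spread across series (a standard deviation of about 0.003).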

### The baseline model and linear regression are very straightforward, but the neural network, RNN, and LSTM models need a lot of experimentation to find a good architecture. There are many hyperparameters, such as the number of layers, the number of neurons in each layer, the number of steps in truncated backprop, and the number of training epochs, and we need to try different combinations to see which performs best.
### If I had more time, I would first search for a combination of hyperparameters that gives good results. I would also learn Keras and use it to solve this problem.
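
The comparison between the three recurrent models can be put side by side with a few lines of Python. The numbers are copied from the printed outputs in this notebook. The ratio of each gap to the standard deviation is only a rough gauge of how large the differences are, not a significance test.

```python
# Averages and standard deviations copied from the printed outputs above.
results = {
    "Basic RNN": (0.00642095337311, 0.003398),
    "Basic LSTM": (0.00597578617434, 0.003432),
    "2-layer LSTM": (0.00623569848637, 0.00320791),
}

best_name = min(results, key=lambda name: results[name][0])
best_mean = results[best_name][0]

for name, (mean_mae, std_mae) in results.items():
    gap = mean_mae - best_mean
    print(f"{name:13s} mean={mean_mae:.5f} std={std_mae:.5f} "
          f"gap_to_best={gap:.5f} gap/std={gap / std_mae:.2f}")
```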



## Part V: Bonus question: how would you use these models to make longer-horizon forecasts (e.g., two or more years out)?

### I would use an encoder-decoder network to make longer-horizon forecasts. For example, for a two-year forecast I would feed 8 steps (one truncated backprop chunk) to the encoder and have the decoder forecast the next 8 steps (2 years of quarterly data). I would train the model this way over the whole training set and then apply it to the test data to measure the errors.
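
A minimal sketch of how training pairs for such an encoder-decoder setup might be built, assuming an 8-step input window followed by an 8-step target window. The helper name `make_seq2seq_pairs`, the stride, and the toy series are illustrative assumptions. The model itself is not shown.

```python
def make_seq2seq_pairs(series, n_in=8, n_out=8, stride=1):
    """Return (encoder_input, decoder_target) windows over a series."""
    pairs = []
    last_start = len(series) - n_in - n_out
    for start in range(0, last_start + 1, stride):
        encoder_input = series[start:start + n_in]
        decoder_target = series[start + n_in:start + n_in + n_out]
        pairs.append((encoder_input, decoder_target))
    return pairs

series = list(range(20))  # toy stand-in for one region's quarterly series
pairs = make_seq2seq_pairs(series)
print(len(pairs))
print(pairs[0])
print(pairs[-1])
```

Each pair asks the model to map 8 observed quarters to the following 8 quarters, so a single forward pass yields a full two-year forecast.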