# Advanced Certification Program in Computational Data Science

##  A program by IISc and TalentSprint

### Mini Project Notebook: Urban Traffic Flow Prediction using Graph Convolution Network - LSTM

## Learning Objectives

At the end of the experiment, you will be able to:

* forecast traffic flow using a Graph Convolutional Network and an LSTM
* understand graph-structured data and implement the forecasting model

## Information

Accurate, real-time traffic forecasting plays a key role in Intelligent Traffic Systems and supports

- urban traffic planning,
- traffic management, and
- traffic control.

Traffic forecasting is challenging because traffic depends both on the topology of the urban road network (spatial dependence) and on how conditions change over time (temporal dependence). The temporal graph convolutional network (T-GCN), a neural network-based forecasting method, captures both dependencies simultaneously by combining a graph convolutional network (GCN) with a gated recurrent unit (GRU).

Specifically, the GCN learns the complex topological structure to capture spatial dependence, and the GRU learns the dynamic changes in the traffic data to capture temporal dependence. The T-GCN model is then applied to traffic forecasting on the urban road network. According to the reference below, it captures the spatio-temporal correlation in traffic data and its predictions outperform state-of-the-art baselines on real-world traffic datasets. This notebook follows the same idea but pairs the graph convolution with an LSTM instead of a GRU.

Reference: https://arxiv.org/abs/1811.05320
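
To make the spatial part more concrete, here is a minimal NumPy sketch of a single graph convolution step. It assumes the commonly used propagation rule $H = \mathrm{ReLU}(\hat{D}^{-1/2}(A + I)\hat{D}^{-1/2} X W)$; the helper names `normalize_adjacency` and `graph_conv`, the 3-sensor toy network, and the weight values are illustrative only and are not taken from the dataset or from any library.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    a_tilde = adj + np.eye(adj.shape[0])   # add self-loops
    degree = a_tilde.sum(axis=1)           # node degree including the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(adj_norm, features, weights):
    """One graph convolution step: ReLU(A_hat @ X @ W)."""
    return np.maximum(adj_norm @ features @ weights, 0.0)

# Toy road network (assumption): 3 sensors connected in a line, 0 - 1 - 2.
toy_adj = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])
toy_features = np.array([[60.0], [30.0], [50.0]])  # one speed reading per sensor
toy_weights = np.ones((1, 1))                       # fixed weight, for illustration only

toy_adj_norm = normalize_adjacency(toy_adj)
toy_hidden = graph_conv(toy_adj_norm, toy_features, toy_weights)
print(toy_hidden.shape)
```

Each output row mixes a sensor's own reading with those of its direct neighbours, which is how the layer picks up spatial dependence.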

## Dataset



Urban Traffic Prediction from Spatio-Temporal Data Using Deep Meta Learning.

This traffic dataset contains traffic information collected from loop detectors on the highways of Los Angeles County (Jagadish et al., 2014). It contains traffic speeds from 207 sensors, recorded every 5 minutes from Mar-1 to Mar-7, 2012. A single hour therefore contains 12 observations, and a single day contains 288 (12 x 24) observations. Overall, there are 2016 observations (timesteps) per sensor: speeds recorded every 5 minutes over 207 sensors for 7 days (12 x 24 x 7).

Data Source:
https://github.com/lehaifeng/T-GCN/tree/master/data
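
The observation counts quoted in the dataset description can be checked with a few lines of arithmetic. The variable names below are illustrative only.

```python
minutes_per_step = 5
steps_per_hour = 60 // minutes_per_step   # observations in one hour
steps_per_day = steps_per_hour * 24       # observations in one day
total_steps = steps_per_day * 7           # observations in the full week
num_sensors = 207

print(steps_per_hour, steps_per_day, total_steps)
print("expected speed matrix shape (sensors, timesteps):", (num_sensors, total_steps))
```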

## Problem Statement

Forecast urban traffic flow from spatio-temporal data using a combined Graph Convolution + LSTM model


## Grading = 10 Points

In [None]:
#@title Download dataset
!wget -qq https://raw.githubusercontent.com/lehaifeng/T-GCN/master/data/los_adj.csv
!wget -qq https://raw.githubusercontent.com/lehaifeng/T-GCN/master/data/los_speed.csv

### Import required packages

In [None]:
!pip -qq install stellargraph

In [None]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential, Model
from tensorflow.keras.layers import LSTM, Dense, Dropout, Input, RepeatVector, TimeDistributed
import stellargraph as sg
import networkx as nx

In [None]:
# from stellargraph.layer import GCN
# from stellargraph.mapper import FullBatchNodeGenerator, PaddedGraphGenerator

### Data loading and preparation

In [None]:
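# Sensor-to-sensor adjacency matrix (the CSV has no header row)
sensor_dist_adj = pd.read_csv("/content/los_adj.csv", header=None)
# Speed readings: transpose so that rows are sensors and columns are timesteps
speed_data = pd.read_csv("/content/los_speed.csv").T
# Convert to a plain NumPy array (np.mat is no longer available in NumPy 2.0+)
sensor_dist_adj = sensor_dist_adj.to_numpy()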

In [None]:
speed_data

In [None]:
sensor_dist_adj

In [None]:
num_nodes, time_len = speed_data.shape
print("No. of sensors:", num_nodes, "\nNo of timesteps:", time_len)

In [None]:
sensor_dist_adj.shape

In [None]:
speed_data.head()

#### Plotting the time series of 10 sensors data

In [None]:
# Plot the speed time series of the first 10 sensors
speed_data.T.iloc[:,:10].plot(figsize=(14,5))
plt.ylabel('speed')
plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))

#### Create and draw the graph from the adjacency matrix

Hint: [link](https://towardsdatascience.com/graph-coloring-with-networkx-88c45f09b8f4)
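
Before drawing, it can help to see how an adjacency matrix turns into an edge list. The sketch below assumes a symmetric matrix with positive weights and ignores the diagonal (self-loops); the helper name `adjacency_to_edges` and the toy matrix are illustrative only.

```python
import numpy as np

def adjacency_to_edges(adj):
    """List each undirected edge once as an (i, j) pair with i < j."""
    upper = np.triu(np.asarray(adj), k=1)   # keep only entries above the diagonal
    rows, cols = np.where(upper > 0)        # positive weight means the nodes are connected
    return list(zip(rows.tolist(), cols.tolist()))

# Toy weighted adjacency for 3 sensors (assumption, not from los_adj.csv).
toy_adj = np.array([[1.0, 0.5, 0.0],
                    [0.5, 1.0, 0.2],
                    [0.0, 0.2, 1.0]])
toy_edges = adjacency_to_edges(toy_adj)
print(toy_edges)  # [(0, 1), (1, 2)]
```

A list like this can be passed to `add_edges_from` on a `networkx` graph.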

In [None]:
def show_graph_with_labels(adjacency_matrix):
    # Treat every positive entry of the adjacency matrix as an undirected edge
    rows, cols = np.where(np.asarray(adjacency_matrix) > 0)
    edges = zip(rows.tolist(), cols.tolist())
    gr = nx.Graph()
    gr.add_edges_from(edges)
    nx.draw(gr, node_size=10)
    plt.show()

show_graph_with_labels(sensor_dist_adj)

### Preprocessing and train test split

In [None]:
def train_test_split(data, train_portion):
    time_len = data.shape[1]
    train_size = int(time_len * train_portion)
    train_data = np.array(data.iloc[:,:train_size])
    test_data = np.array(data.iloc[:,train_size:])
    return train_data, test_data
train_rate = 0.8
train_data, test_data = train_test_split(speed_data, train_rate)
print("Train data: ", train_data.shape)
print("Test data: ", test_data.shape)

In [None]:
def scale_data(train_data, test_data):
    max_speed = train_data.max()
    min_speed = train_data.min()
    train_scaled = (train_data - min_speed) / (max_speed - min_speed)
    test_scaled = (test_data - min_speed) / (max_speed - min_speed)
    return train_scaled, test_scaled
train_scaled, test_scaled = scale_data(train_data, test_data)

#### Prepare Time series data


The aim is to use 50 minutes of historical speed observations to predict the speed 1 hour ahead.

* Choose windows of 10 historical observations, i.e. 5 * 10 = 50 minutes (`seq_len`), for each segment as the input and use them to predict the speed 5 * 12 = 60 minutes later (the target) using the sliding window approach.

**Note:**
The parameters below control the windowing:
- `seq_len` is the size of the past window of information (10 timesteps = 50 minutes).
- `pre_len` is the prediction horizon (1 hour ahead = 12 * 5 minutes).



Steps:

* Prepare the data to be fed into an LSTM. The LSTM model learns a function that maps a **sequence of past observations as input to an output observation**, so the sequence of observations must be transformed into multiple examples from which the LSTM can learn.

* To use 50 minutes of historical speed observations to predict the speed 1 hour ahead, first reshape the timeseries data into windows of 10 historical observations for each segment as the input, with the speed 60 minutes later as the prediction label. This can be done with a sliding window approach:

    - Starting from the beginning of the timeseries, take the first 10 speed records as the 10 input features and the speed 12 timesteps ahead (60 minutes) as the speed to predict.

    - Shift the timeseries by one timestep and take the 10 observations from the current point as the input features and the speed one hour ahead as the output to predict.

    - Keep shifting by 1 timestep, picking the 10-timestep window from the current time as the input features and the speed one hour ahead of the 10th timestep as the output to predict, for the entire data.

  *Note: The above steps are done for each sensor.*

The function below returns the transformed timeseries data for the model to train on. The parameter `seq_len` is the size of the past window of information, and `pre_len` is how far into the future the model must learn to predict.

Each **training observation** is 10 historical speeds **(seq_len).**

Each **training prediction** is the speed 60 minutes later **(pre_len).**
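
The sliding window steps above can be sketched on a tiny example. This assumes the same layout as the speed data (rows are sensors, columns are timesteps); the helper name `make_windows` and the toy series are illustrative only, and the small `seq_len` and `pre_len` values are chosen just to keep the output readable.

```python
import numpy as np

def make_windows(series, seq_len, pre_len):
    """series: (num_nodes, num_steps) -> X: (samples, num_nodes, seq_len), y: (samples, num_nodes)."""
    xs, ys = [], []
    window = seq_len + pre_len
    for start in range(series.shape[1] - window + 1):
        chunk = series[:, start:start + window]
        xs.append(chunk[:, :seq_len])   # the seq_len past observations
        ys.append(chunk[:, -1])         # the value pre_len steps after the last input
    return np.array(xs), np.array(ys)

# Toy data (assumption): 2 sensors, 20 timesteps.
toy_series = np.arange(40).reshape(2, 20)
toy_x, toy_y = make_windows(toy_series, seq_len=3, pre_len=2)

print(toy_x.shape, toy_y.shape)  # (16, 2, 3) (16, 2)
print(toy_x[0])                  # first window for both sensors
print(toy_y[0])                  # target for the first window
```

With `seq_len=3` and `pre_len=2`, the first sample uses timesteps 0 to 2 as input and timestep 4 as the target, for each sensor.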

In [None]:
seq_len = 10
pre_len = 12
def sequence_data_preparation(seq_len, pre_len, train_data, test_data):
    trainX, trainY, testX, testY = [], [], [], []
    for i in range(train_data.shape[1] - int(seq_len + pre_len - 1)):
        a = train_data[:, i : i + seq_len + pre_len]
        trainX.append(a[:,:seq_len])
        trainY.append(a[:,-1])

    for i in range(test_data.shape[1] - int(seq_len + pre_len - 1)):
        b = test_data[:,i : i + seq_len + pre_len]
        testX.append(b[:,:seq_len])
        testY.append(b[:,-1])
    return np.array(trainX), np.array(trainY), np.array(testX), np.array(testY)

trainX, trainY, testX, testY = sequence_data_preparation(seq_len, pre_len, train_scaled, test_scaled)
trainX.shape, trainY.shape, testX.shape, testY.shape,

### Build and Train the LSTM model and plot the loss results

In [None]:
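from tensorflow.keras.layers import Permute

# trainX has shape (samples, num_nodes, seq_len), but an LSTM expects
# (samples, timesteps, features), so swap the last two axes first.
inp = Input(shape=(num_nodes, seq_len))
x = Permute((2, 1))(inp)                  # -> (samples, seq_len, num_nodes)
x = LSTM(200, activation='tanh')(x)       # encode the input window
x = RepeatVector(200)(x)
x = LSTM(200, activation='tanh')(x)
out = Dense(num_nodes)(x)                 # one predicted speed per sensor
model_lstm = Model(inp, out)
model_lstm.compile(optimizer='adam', loss='mse')
model_lstm.summary()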

In [None]:
hist = model_lstm.fit(x = trainX,y= trainY, epochs=10, validation_data=(testX, testY))

In [None]:
sg.utils.plot_history(hist)

### StellarGraph Graph Convolution and LSTM model ( 3 points)

In order to use the model, we need:

* An **N by N** adjacency matrix, which describes the distance relationship between the N sensors,

* An **N by T** feature matrix, which describes the ($f_1, .., f_T$) speed records over T timesteps for the N sensors.

Arguments of GCN_LSTM:
  - seq_len: No. of LSTM cells, i.e. the length of the input sequence (number of past timesteps per sample)

  - adj: unweighted/weighted adjacency matrix

  - gc_layer_sizes (list of int): Output sizes of Graph Convolution  layers in the stack.

  - lstm_layer_sizes (list of int): Output sizes of LSTM layers in the stack.

  - gc_activations (list of str or func): Activations applied to each layer's output.

  - lstm_activations (list of str or func): Activations applied to each layer's output; defaults to ``['tanh', ..., 'tanh']``.
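
Before building the model, it can be worth confirming that the inputs have matching shapes. The sketch below assumes the layout used in this notebook, where each sample is a `(num_nodes, seq_len)` window; the helper name `check_gcn_lstm_inputs` is hypothetical and is not part of StellarGraph.

```python
import numpy as np

def check_gcn_lstm_inputs(adj, windows, seq_len):
    """Check that adj is N x N and windows is (samples, N, seq_len); return N."""
    adj = np.asarray(adj)
    if adj.ndim != 2 or adj.shape[0] != adj.shape[1]:
        raise ValueError(f"adjacency must be square, got shape {adj.shape}")
    num_nodes = adj.shape[0]
    windows = np.asarray(windows)
    if windows.ndim != 3 or windows.shape[1:] != (num_nodes, seq_len):
        raise ValueError(
            f"windows must have shape (samples, {num_nodes}, {seq_len}), got {windows.shape}"
        )
    return num_nodes

# Toy inputs (assumption): 4 sensors, 5 samples, windows of 10 timesteps.
toy_adj = np.eye(4)
toy_windows = np.zeros((5, 4, 10))
toy_num_nodes = check_gcn_lstm_inputs(toy_adj, toy_windows, seq_len=10)
print(toy_num_nodes)  # 4
```

In this notebook the equivalent call would be `check_gcn_lstm_inputs(sensor_dist_adj, trainX, seq_len)`.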

In [None]:
from stellargraph.layer import GCN_LSTM
GCN_LSTM?

In [None]:
gcn_lstm = GCN_LSTM(
    seq_len=seq_len,
    adj=sensor_dist_adj,
    gc_layer_sizes=[16, 10],
    gc_activations=["relu", "relu"],
    lstm_layer_sizes=[200, 200],
    lstm_activations=["tanh", "tanh"],
)

In [None]:
x_input, x_output = gcn_lstm.in_out_tensors()
model = Model(inputs=x_input, outputs=x_output)
model.compile(optimizer="adam", loss="mae", metrics=["mse"])
model.summary()

In [None]:
history = model.fit(trainX, trainY, epochs=100, batch_size=60, validation_data=(testX, testY))

In [None]:
sg.utils.plot_history(history)

In [None]:
ythat = model.predict(trainX)
yhat = model.predict(testX)

In [None]:
trainX.shape, testX.shape

#### Rescale values
Rescale the predicted values to the original value range of the timeseries.
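
Rescaling is the inverse of the min-max scaling applied before training: if $x_s = (x - x_{min}) / (x_{max} - x_{min})$, then $x = x_s (x_{max} - x_{min}) + x_{min}$. The round trip can be sketched as follows; the helper names and the toy speeds are illustrative only.

```python
import numpy as np

def minmax_scale(values, lo, hi):
    """Map values from [lo, hi] to [0, 1]."""
    return (values - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    """Map values from [0, 1] back to [lo, hi]."""
    return scaled * (hi - lo) + lo

toy_speeds = np.array([20.0, 35.0, 50.0, 65.0])   # toy values, not from the dataset
lo, hi = toy_speeds.min(), toy_speeds.max()

toy_scaled = minmax_scale(toy_speeds, lo, hi)
toy_restored = minmax_inverse(toy_scaled, lo, hi)
print(toy_scaled)
print(toy_restored)
```

Multiplying by the maximum alone recovers the original values only when the minimum is 0.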

In [None]:
## Rescale values (invert the min-max scaling applied in scale_data)
max_speed = train_data.max()
min_speed = train_data.min()

## actual train and test values
train_rescref = np.array(trainY * (max_speed - min_speed) + min_speed)
test_rescref = np.array(testY * (max_speed - min_speed) + min_speed)
## Rescale model predicted values
train_rescpred = np.array(ythat * (max_speed - min_speed) + min_speed)
test_rescpred = np.array(yhat * (max_speed - min_speed) + min_speed)

In [None]:
test_rescref.shape

In [None]:
# Keep the batch dimension so the model receives shape (1, num_nodes, seq_len)
onetest_sample = testX[:1]
print(onetest_sample.shape)
onesample_pred = model.predict(onetest_sample)
onesample_pred.shape

### Plot the predictions and the loss for each sensor

In [None]:
## Predicted vs. true speeds over the test set for one sensor (index 5)
fig1 = plt.figure(figsize=(15, 8))
a_pred = test_rescpred[:, 5]
a_true = test_rescref[:, 5]
plt.plot(a_pred, "r-", label="prediction")
plt.plot(a_true, "b-", label="true")
plt.xlabel("time")
plt.ylabel("speed")
plt.legend(loc="best", fontsize=10)
plt.show()

Plot the error for all the sensors
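
As a cross-check, per-sensor errors can also be computed without a loop by averaging over the time axis. This is a minimal NumPy sketch that assumes arrays of shape `(timesteps, sensors)`; the helper name `per_sensor_errors` and the toy values are illustrative only.

```python
import numpy as np

def per_sensor_errors(y_true, y_pred):
    """Return (mse, mae), each with one value per sensor (column)."""
    diff = y_pred - y_true
    mse = np.mean(diff ** 2, axis=0)      # average squared error over time
    mae = np.mean(np.abs(diff), axis=0)   # average absolute error over time
    return mse, mae

# Toy data (assumption): 2 timesteps, 2 sensors.
toy_true = np.array([[60.0, 30.0],
                     [62.0, 34.0]])
toy_pred = np.array([[58.0, 30.0],
                     [66.0, 31.0]])
toy_mse, toy_mae = per_sensor_errors(toy_true, toy_pred)
print(toy_mse)  # [10.   4.5]
print(toy_mae)  # [3.  1.5]
```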

In [None]:
from sklearn.metrics import mean_squared_error, mean_absolute_error
mse_sensors, mae_sensors = [],[]
for sensor in range(test_rescpred.shape[1]):
  a_pred = test_rescpred[:, sensor]
  a_true = test_rescref[:, sensor]
  mse_sensors.append(mean_squared_error(a_true, a_pred))
  mae_sensors.append(mean_absolute_error(a_true, a_pred))

In [None]:
# mse bar plot
plt.bar(range(207),mse_sensors)
plt.xlabel("Sensors")
plt.ylabel("MSE")
plt.show()

In [None]:
# mae bar plot
plt.bar(range(207),mae_sensors)
plt.xlabel("Sensors")
plt.ylabel("MAE")
plt.show()

#### Report Analysis

  * Discuss: Why is this called a spatio-temporal problem?

  * Discuss: In what way is GCN-LSTM more useful for the traffic prediction task than LSTM?