<center>
    <h1>Verbal Explanation of Spatial Temporal GNNs for Traffic Forecasting</h1>
    <h2>Visualizing an Explained Instance of the PeMS-Bay Dataset</h2>
</center>

---

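This notebook visualizes one explained instance of the PeMS-Bay dataset: the input events selected by the explainer, the predicted target event, and the verbal explanation generated for it.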

In [25]:
import sys
import os

# Set the main path in the root folder of the project.
sys.path.append(os.path.join('..'))

In [26]:
# Settings for autoreloading.
%load_ext autoreload
%autoreload 2


In [27]:
from src.utils.seed import set_random_seed

# Set the random seed for deterministic operations.
SEED = 42
set_random_seed(SEED)
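
The body of `set_random_seed` is not shown in this notebook. As a rough sketch of what such a helper usually does, the hypothetical `seed_everything` below seeds Python's `random` module and, when it is installed, NumPy. The name and the body are assumptions; the project's helper may also seed other libraries, such as the deep learning framework.

```python
import random


def seed_everything(seed: int) -> None:
    """Hypothetical stand-in for set_random_seed (an assumption, not the project's code)."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        # NumPy is optional in this sketch.
        return
    np.random.seed(seed)


# Seeding twice with the same value reproduces the same random draw.
seed_everything(42)
first_draw = random.random()
seed_everything(42)
second_draw = random.random()
print(first_draw == second_draw)
```

Calling the helper once at the top of the notebook, as done above with `SEED = 42`, makes later random operations repeatable across runs as long as the cells run in the same order.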

# 1 Loading the Data
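The adjacency matrix, the sensor locations, and the explained test data (inputs, targets, clusters, verbal translations, and timestamps) are loaded.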

In [28]:
import os

BASE_DATA_DIR = os.path.join('..', 'data', 'pems-bay')

In [29]:
from src.data.data_extraction import get_adjacency_matrix

# Get the adjacency matrix
adj_matrix_structure = get_adjacency_matrix(
    os.path.join(BASE_DATA_DIR, 'raw', 'adj_mx_pems_bay.pkl'))

# Get the header of the adjacency matrix, the node indices and the
# matrix itself.
header, node_ids_dict, adj_matrix = adj_matrix_structure
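
`get_adjacency_matrix` is a project helper whose body is not shown here. The sketch below illustrates one way such a loader could work, assuming the pickle stores the three items unpacked above as a single tuple. The function name `load_adjacency_structure`, the toy sensor IDs, and the toy matrix are all made up for illustration.

```python
import os
import pickle
import tempfile


def load_adjacency_structure(path):
    """Hypothetical loader; assumes the pickle holds (header, node_ids_dict, adj_matrix)."""
    with open(path, 'rb') as f:
        # encoding='latin1' also lets Python 3 read pickles written by Python 2.
        header, node_ids_dict, adj_matrix = pickle.load(f, encoding='latin1')
    return header, node_ids_dict, adj_matrix


# Toy file with two made-up sensor IDs, written only to exercise the loader.
toy = (['400001', '400017'],
       {'400001': 0, '400017': 1},
       [[1.0, 0.4], [0.0, 1.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, 'toy_adj.pkl')
    with open(toy_path, 'wb') as f:
        pickle.dump(toy, f)
    toy_header, toy_node_ids, toy_adj = load_adjacency_structure(toy_path)

# Position of the second toy sensor in the matrix.
print(toy_node_ids['400017'])
```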

In [30]:
from src.data.data_extraction import get_locations_dataframe

# Get the dataframe containing the latitude and longitude of each sensor.
locations_df = get_locations_dataframe(
    os.path.join(BASE_DATA_DIR, 'raw', 'graph_sensor_locations_pems_bay.csv'),
    has_header=False)
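
The `has_header=False` argument suggests that the locations file has no header row. As a minimal sketch of reading such a file, the code below assumes each row holds a sensor ID, a latitude, and a longitude, in that order. The column order, the function name `read_locations`, and the sample rows are assumptions, not taken from the project.

```python
import csv
import io

# Made-up rows in the assumed layout: sensor ID, latitude, longitude (no header row).
toy_csv = io.StringIO(
    '400001,37.364085,-121.901149\n'
    '400017,37.253303,-121.945440\n')


def read_locations(handle):
    """Hypothetical reader for a headerless sensor locations file."""
    rows = []
    for sensor_id, latitude, longitude in csv.reader(handle):
        rows.append({
            'sensor_id': sensor_id,
            'latitude': float(latitude),
            'longitude': float(longitude)})
    return rows


locations = read_locations(toy_csv)
print(len(locations))
```

With pandas, the same layout could be read with `pandas.read_csv(path, header=None, names=[...])`, which is likely closer to what a function returning a dataframe does.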

In [31]:
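# Invert the mapping to get a dictionary from node position to sensor ID.
# The loop variables avoid shadowing the built-in `id`.
node_pos_dict = {
    position: sensor_id for sensor_id, position in node_ids_dict.items()}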
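
The inversion above is only lossless when every sensor maps to a distinct position. A small example with made-up sensor IDs shows the inversion and a check for that property:

```python
# Made-up sensor IDs mapped to their row/column position in the adjacency matrix.
toy_node_ids = {'400001': 0, '400017': 1, '400030': 2}

# Swap keys and values to look sensors up by position.
toy_node_pos = {position: sensor_id for sensor_id, position in toy_node_ids.items()}

# If two sensors shared a position, one of them would be silently dropped.
assert len(toy_node_pos) == len(toy_node_ids)

print(toy_node_pos[1])
```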

In [32]:
import os
import numpy as np

# Get the explained data.
x_test = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_test.npy'))
y_test = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_test.npy'))

test_clusters = np.load(os.path.join(BASE_DATA_DIR, 'clustered', 'x_test.npy'))
test_translations = np.load(os.path.join(BASE_DATA_DIR, 'translated', 'test.npy'))

# Get the time information of the explained data.
x_test_time = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_test_time.npy'))
y_test_time = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_test_time.npy'))

In [33]:
from src.utils.config import MPH_TO_KMH_FACTOR

# Convert the speeds from miles per hour to kilometers per hour.
# In the input, only the first feature (the speed) is rescaled.
x_test[..., 0] = x_test[..., 0] * MPH_TO_KMH_FACTOR
y_test = y_test * MPH_TO_KMH_FACTOR
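
To make the effect of the `[..., 0]` indexing concrete, here is a small sketch on a toy array. It assumes that the last axis holds the features, that feature 0 is the speed, and that `MPH_TO_KMH_FACTOR` equals 1.609344 (the number of kilometers in one mile); the project constant may be rounded differently.

```python
import numpy as np

# Assumed value of the conversion factor: kilometers per mile.
MPH_TO_KMH = 1.609344

# Toy input of shape (instances, timesteps, nodes, features).
# Feature 0 is a speed in mph; feature 1 stands in for a time encoding.
toy_x = np.array([[[[60.0, 0.5], [30.0, 0.5]]]])

# The ellipsis selects every instance, timestep, and node;
# the trailing 0 restricts the update to the speed feature.
toy_x[..., 0] = toy_x[..., 0] * MPH_TO_KMH

print(toy_x[0, 0, :, 0])  # speeds, now in km/h
print(toy_x[0, 0, :, 1])  # second feature, unchanged
```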

In [34]:
import pickle

with open(os.path.join(BASE_DATA_DIR, 'structured', 'node_locations.pkl'), 'rb') as f:
    node_info = pickle.load(f)

# 2 Visualization
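The explanation of a single test instance, selected by the index `i`, is visualized: first the input events grouped by cluster, then the predicted target event.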

In [35]:
i = 172

sample_x = x_test[i]
clusters_x = test_clusters[i]
sample_y = y_test[i]
sample_x_time = x_test_time[i]
sample_y_time = y_test_time[i]

# Cast to object so that the numeric cluster IDs can be replaced by string labels.
clusters_x = clusters_x.astype(object)

verbal_translation = test_translations[i]

In [36]:
_, n_timesteps, n_nodes, _ = y_test.shape

In [37]:
for c in np.unique(clusters_x):
    if c == -1:
        clusters_x[clusters_x == c] = ' '
    else:
        clusters_x[clusters_x == c] = f'cluster {c}'
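
The loop above relabels the array in place, which is safe because `np.unique` is evaluated once before any value is overwritten. As an alternative sketch, the hypothetical helper `label_clusters` below builds a separate label array and leaves the numeric IDs untouched. It assumes, as the loop does, that `-1` marks entries that belong to no cluster.

```python
import numpy as np


def label_clusters(cluster_ids):
    """Return string labels for an integer cluster array (hypothetical helper)."""
    # Start with the blank label used for entries outside any cluster.
    labels = np.full(cluster_ids.shape, ' ', dtype=object)
    for c in np.unique(cluster_ids):
        if c != -1:
            labels[cluster_ids == c] = f'cluster {int(c)}'
    return labels


# Toy array: -1 means "no cluster".
toy_ids = np.array([[-1, 0], [2, -1]])
toy_labels = label_clusters(toy_ids)
print(toy_labels.tolist())
```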

In [38]:
# Label every entry with a positive target speed as part of the target event.
clusters_y = (sample_y > 0).astype(np.int64).astype(object)
clusters_y[clusters_y == 0] = ' '
clusters_y[clusters_y == 1] = 'target'
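
The same labeling can be written in one step with `np.where`. The sketch below uses a toy array and assumes, as the cell above does, that entries outside the explained target event are stored as zeros.

```python
import numpy as np

# Toy target speeds: a zero means the entry is not part of the target event.
toy_y = np.array([[0.0, 83.2], [0.0, 0.0]])

# Pick 'target' where the speed is positive and the blank label elsewhere.
toy_target_labels = np.where(toy_y > 0, 'target', ' ').astype(object)

print(toy_target_labels.tolist())
```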

In [39]:
from src.explanation.clustering.analyisis import (
    get_node_values_with_clusters_and_location_dataframe)

df = get_node_values_with_clusters_and_location_dataframe(
    sample_x[..., 0:1], clusters_x, node_pos_dict, locations_df, sample_x_time)

In [40]:
icons = {
    ' ': 'cancel',
    'cluster 0': 'star',
    'cluster 1': 'circle',
    'cluster 2': 'heart',
    'cluster 3': 'play',
    'cluster 4': 'pause',
    'target': 'certified'}

In [41]:
df['icon'] = df['cluster'].apply(lambda x: icons[x])
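
The `icons` dictionary covers five clusters, so the lookup `icons[x]` raises a `KeyError` for an instance with a sixth cluster. A defensive variant could fall back to a default icon. In the sketch below, the helper `icon_for` and the choice of `'cancel'` as the fallback are assumptions.

```python
# Shortened copy of the icon mapping, for illustration only.
toy_icons = {' ': 'cancel', 'cluster 0': 'star', 'cluster 1': 'circle'}


def icon_for(label, default='cancel'):
    """Return the icon for a cluster label, or a default for unknown labels."""
    return toy_icons.get(label, default)


print(icon_for('cluster 1'))
print(icon_for('cluster 7'))  # not in the mapping, falls back to the default
```

In the notebook, the same function could be passed to `apply` in place of the lambda.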

In [42]:
print(verbal_translation)

A congestion was predicted on on I 880 at km 61 on Saturday, 21/06/2017, with an average speed of 83.19 km/h from 16:20 to 17:15. This was a result of a series of congestions and a free flow.

Firstly, a contributing congestion took place on Bayshore Freeway at km 62, with an average speed of 68.93 km/h, from 15:20 to 15:55.

Next, a contributing free flow happened, at 112.06 km/h, on, again, Bayshore Freeway at km 63 from 15:20 to 16:10.

After this, an extra contributing congestion materialized, with an average speed of 91.86 km/h, on I 880 at kms 60 and 61 occurring from 15:20 to 16:15.

Lastly, a contributing severe congestion happened on, another time, I 880 at kms 60 and 61, occurring from 15:25 to 16:05 with an average speed of 52.03 km/h.


In [46]:
from src.data.data_analysis import show_kepler_map

show_kepler_map(
    df, config_file_path='../config/kepler/pems-bay/visualization_test_x_clusters.json')

KeplerGl(config={'version': 'v1', 'config': {'visState': {'filters': [{'dataId': ['data'], 'id': 'q0lm1bpo4', …

In [47]:
from src.explanation.clustering.analyisis import (
    get_node_values_with_clusters_and_location_dataframe)

df = get_node_values_with_clusters_and_location_dataframe(
    sample_y, clusters_y, node_pos_dict, locations_df, sample_y_time)

In [48]:
df['icon'] = df['cluster'].apply(lambda x: icons[x])

In [52]:
from src.data.data_analysis import show_kepler_map

show_kepler_map(
    df, config_file_path='../config/kepler/pems-bay/visualization_test_y_clusters.json')

KeplerGl(config={'version': 'v1', 'config': {'visState': {'filters': [{'dataId': ['data'], 'id': 'iwu8nbd4c', …