<center>
    <h1>Verbal Explanation of Spatial Temporal GNNs for Traffic Forecasting</h1>
    <h2>Verbal Explanations on the PeMS-Bay Dataset</h2>
</center>

---

The verbal translation follows a template-based approach: placeholders in textual templates are replaced with the selected content to form coherent narratives.
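
As a rough illustration of placeholder substitution, the sketch below fills a single sentence template with Python's standard `string.Template`. The template text, the `fill_template` helper, and the placeholder names are hypothetical and are not taken from the project's `verbal_translation` module, whose templates are not shown in this notebook.

```python
from string import Template

# Hypothetical template; the project's real templates are not shown here.
EVENT_TEMPLATE = Template(
    'A $kind was forecasted to hit $street at km $km, '
    'with an average speed of $speed km/h.')


def fill_template(template, **content):
    """Replace every placeholder in the template with the chosen content."""
    # substitute raises KeyError if a placeholder is left without a value.
    return template.substitute(**content)


sentence = fill_template(
    EVENT_TEMPLATE, kind='congestion', street='I 880', km='60', speed='88.10')
print(sentence)
```

Using `substitute` rather than `safe_substitute` makes a missing value fail loudly instead of leaving a raw placeholder in the narrative.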

The verbal translation composes a series of paragraphs from the content extracted from the graphical explanations. The first paragraph describes the predicted event and briefly summarizes its causes. Each subsequent paragraph describes one cause in detail, where a cause corresponds to one cluster of the important subgraph. The paragraphs describing the causes are sorted by the time at which the causes occurred.
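
The paragraph ordering described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the `Cause` class, the `compose_explanation` function, and the sentence wording are all hypothetical, and causes are assumed to be sorted by their starting time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Cause:
    """Hypothetical container for one cluster of the important subgraph."""
    kind: str          # e.g. 'congestion' or 'free flow'
    street: str
    start: datetime
    end: datetime
    avg_speed: float   # km/h


def compose_explanation(event_sentence, causes):
    """Join the event paragraph with one paragraph per cause, in time order."""
    paragraphs = [event_sentence]
    # Sort the causes chronologically by their starting time (an assumption).
    for cause in sorted(causes, key=lambda c: c.start):
        paragraphs.append(
            f'A contributing {cause.kind} occurred on {cause.street} '
            f'from {cause.start:%H:%M} to {cause.end:%H:%M} '
            f'with an average speed of {cause.avg_speed:.2f} km/h.')
    return '\n\n'.join(paragraphs)


causes = [
    Cause('free flow', 'Bayshore Freeway',
          datetime(2017, 5, 25, 19, 25), datetime(2017, 5, 25, 19, 50), 114.14),
    Cause('congestion', 'I 880',
          datetime(2017, 5, 25, 19, 20), datetime(2017, 5, 25, 19, 45), 63.09),
]
explanation = compose_explanation(
    'A congestion was forecasted to hit I 880 at km 60.', causes)
print(explanation)
```

Although the free flow is listed first in the input, the congestion paragraph comes first in the result because it starts earlier.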

In [1]:
import sys
import os

# Set the main path in the root folder of the project.
sys.path.append(os.path.join('..'))

In [2]:
# Settings for autoreloading.
%load_ext autoreload
%autoreload 2

In [3]:
from src.utils.seed import set_random_seed

# Set the random seed for deterministic operations.
SEED = 42
set_random_seed(SEED)

# 1 Loading the Data
This section loads the adjacency matrix, the node locations, and the explained and clustered data together with their timestamps.

In [4]:
import os

BASE_DATA_DIR = os.path.join('..', 'data', 'pems-bay')

In [5]:
from src.data.data_extraction import get_adjacency_matrix

# Get the adjacency matrix
adj_matrix_structure = get_adjacency_matrix(
    os.path.join(BASE_DATA_DIR, 'raw', 'adj_mx_pems_bay.pkl'))

# Get the header of the adjacency matrix, the node indices and the
# matrix itself.
_, node_ids_dict, _ = adj_matrix_structure

# Invert the mapping to get the node ID at each matrix position.
node_pos_dict = {i: node_id for node_id, i in node_ids_dict.items()}

In [6]:
import pickle

# Get the node street and kilometrage dictionary.
with open(os.path.join(BASE_DATA_DIR, 'structured', 'node_locations.pkl'), 'rb') as f:
    node_info = pickle.load(f)

In [7]:
import os
import numpy as np

# Get the explained data.
x_train = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_train.npy'))[..., :1]
y_train = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_train.npy'))[..., :1]
x_val = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_val.npy'))[..., :1]
y_val = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_val.npy'))[..., :1]
x_test = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_test.npy'))[..., :1]
y_test = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_test.npy'))[..., :1]

# Get the clustered data.
x_train_clusters = np.load(os.path.join(BASE_DATA_DIR, 'clustered', 'x_train.npy'))
x_val_clusters = np.load(os.path.join(BASE_DATA_DIR, 'clustered', 'x_val.npy'))
x_test_clusters = np.load(os.path.join(BASE_DATA_DIR, 'clustered', 'x_test.npy'))


# Get the time information of the explained data.
x_train_times = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_train_time.npy'))
y_train_times = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_train_time.npy'))
x_val_times = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_val_time.npy'))
y_val_times = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_val_time.npy'))
x_test_times = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'x_test_time.npy'))
y_test_times = np.load(os.path.join(BASE_DATA_DIR, 'explained', 'y_test_time.npy'))

The speed values in the datasets are converted from mph to km/h.

In [8]:
from src.utils.config import MPH_TO_KMH_FACTOR

x_train = x_train * MPH_TO_KMH_FACTOR
y_train = y_train * MPH_TO_KMH_FACTOR

x_val = x_val * MPH_TO_KMH_FACTOR
y_val = y_val * MPH_TO_KMH_FACTOR

x_test = x_test * MPH_TO_KMH_FACTOR
y_test = y_test * MPH_TO_KMH_FACTOR
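
For reference, the conversion is an element-wise multiplication that leaves the array shape unchanged. The sketch below uses toy values and assumes the factor equals 1.609344 (the number of kilometers in one mile); the actual value of `MPH_TO_KMH_FACTOR` in `src.utils.config` is not shown in this notebook.

```python
import numpy as np

# Assumed value of the factor (1 mile = 1.609344 km); the constant
# defined in src.utils.config is not shown in this notebook.
ASSUMED_MPH_TO_KMH_FACTOR = 1.609344

# Toy speeds in mph with a trailing feature axis of size one.
speeds_mph = np.array([[60.0], [30.0]])

# Broadcasting multiplies every element by the scalar factor.
speeds_kmh = speeds_mph * ASSUMED_MPH_TO_KMH_FACTOR
print(speeds_kmh.shape)
```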

# 2 Verbal Translation
The training, validation, and test sets are translated into verbal explanations, and each result is saved to disk.

In [9]:
VERBAL_TRANSLATION_DIR = os.path.join(BASE_DATA_DIR, 'translated')

os.makedirs(VERBAL_TRANSLATION_DIR, exist_ok=True)

In [10]:
from src.verbal_explanations.verbal_translation import translate_dataset

train_translated = translate_dataset(
    x_train,
    x_train_times,
    x_train_clusters,
    y_train,
    y_train_times,
    node_pos_dict,
    node_info)
    

np.save(os.path.join(VERBAL_TRANSLATION_DIR, 'train.npy'), train_translated)

val_translated = translate_dataset(
    x_val,
    x_val_times,
    x_val_clusters,
    y_val,
    y_val_times,
    node_pos_dict,
    node_info)

np.save(os.path.join(VERBAL_TRANSLATION_DIR, 'val.npy'), val_translated)

test_translated = translate_dataset(
    x_test,
    x_test_times,
    x_test_clusters,
    y_test,
    y_test_times,
    node_pos_dict,
    node_info)

np.save(os.path.join(VERBAL_TRANSLATION_DIR, 'test.npy'), test_translated)

The following is an example of a verbal translation:

In [11]:
print(test_translated[100])

A congestion was forecasted to hit I 880 at km 60 on Wednesday, 25/05/2017, from 20:20 to 21:15, with an average speed of 88.10 km/h. This was triggered by a series of congestions and free flows.

An initial contributing congestion manifested on I 880 at kms 59 and 60 from 19:20 to 19:45 with an average speed of 63.09 km/h.

Following this, a contributing free flow manifested from 19:20 to 20:15 on, once more, I 880 at kms 59 and 60 with an average speed of 101.28 km/h.

Next, another contributing free flow materialized from 19:25 to 19:50 on Bayshore Freeway at km 63 with an average speed of 114.14 km/h.

Next, yet a new contributing free flow occurred, with an average speed of 96.74 km/h, on, yet another time, I 880 at kms 58 and 59 occurring from 19:25 to 20:15.

Eventually, an extra contributing congestion occurred, at 77.57 km/h, on, yet another time, I 880 at km 60 at 20:10.
