# Universal graph neural networks for multi-modal multi-task learning
---

### Overview
1. Import data

---

Assume that every event in the world occurs at a real or virtual point in time <b>t</b> and space <b>s</b> that is represented by a node <b>n</b><sub>t,s</sub>. Any relationship or dependency between different events can then be described by three types of adjacency, that is, temporal adjacency <b>a</b><sub>(t1,t2),s</sub>, spatial adjacency <b>a</b><sub>t,(s1,s2)</sub> and spatio-temporal adjacency <b>a</b><sub>(t1,t2),(s1,s2)</sub>. Temporal adjacency <b>a</b><sub>(t1,t2),s</sub> describes the relationship between events at the same point <b>s</b> in space but different points <b>(t<sub>1</sub>,t<sub>2</sub>)</b> in time. Spatial adjacency <b>a</b><sub>s</sub> describes the relationship between events at the same point <b>t</b> in time but different points <b>(s<sub>1</sub>,s<sub>2</sub>)</b> in space. Spatio-temporal adjacency <b>a</b><sub>t,s</sub> respectively describes the relationship between events at different points <b>(t<sub>1</sub>,t<sub>2</sub>),(s<sub>1</sub>,s<sub>2</sub>)</b> in both time and space.

Let coordinates (<b>t</b>, <b>s</b>) describe the location of nodes with arbitrary data <b>t</b> $\in \mathbb{R}^{D_{t}}$ of dimension $D_{t}$ and <b>s</b> $\in \mathbb{R}^{D_{s}}$ of dimension $D_{s}$. Further, let node data <b>n</b><sub>t,s</sub> $\in \mathbb{R}^{D_{n}}$ of dimension $D_{n}$ and adjacency types <b>a</b><sub>t</sub> $\in \mathbb{R}^{D_{at}}$ of dimension $D_{at}$, <b>a</b><sub>s</sub> $\in \mathbb{R}^{D_{as}}$ of dimension $D_{as}$, and <b>a</b><sub>t,s</sub> $\in \mathbb{R}^{D_{ast}}$ of dimension $D_{ast}$ to be described by arbitrary data as well. Then, every consistent set of events can be described by a dynamic graph that evolves over time and space as we measure and add new data to it. We call this our **universal graph representation of data**.

Given our universal graph representation of data, our goal is to design a neural network prediction model that is able to solve multiple prediction tasks using arbitrary data types for <b>t</b>, <b>s</b>, <b>n</b><sub>t,s</sub>, <b>a</b><sub>t</sub>, <b>a</b><sub>s</sub> and <b>a</b><sub>t,s</sub> in each task. In order to achieve this, we must be able to transform our universal graph representation of data into a computation graph that can be trained using traditional deep learning algorithms like backpropagation and stochastic gradient descent. Graph neural networks (GNNs) can solve a wide range of prediction tasks as either node-, edge- or graph-level predictions every time that data can be structured as a graph. We will therefore use a **GNN architecture as the backbone** of our prediction model, whose parameters we share among different tasks. 




![UniversalDataGraph](../figures/UniversalDataGraph.png)



In [1]:
import torch
import pandas as pd
import hyper_arsam

HYPER = hyper_arsam.HyperParameter()

### 1. Import data

#### 1.1 Import Uber Movement data

In [2]:
###
# Import training validation and testing data ###
###

# choose exemplar files
filename_train = 'training_data_1.csv'
filename_val = 'validation_data_1.csv'
filename_test = 'testing_data_1.csv'

# create path to chosen exemplar files
path_to_train = HYPER.PATH_TO_UBERMOVEMENT_TRAIN + filename_train
path_to_val = HYPER.PATH_TO_UBERMOVEMENT_VAL + filename_val
path_to_test = HYPER.PATH_TO_UBERMOVEMENT_VAL + filename_test

# import data
df_uber_train = pd.read_csv(path_to_train)
df_uber_val = pd.read_csv(path_to_val)
df_uber_test = pd.read_csv(path_to_val)


###
# Import city name to ID mapping ###
###

# set filename of city name to id mapping
filename = '0_city_to_id_mapping.csv'

# set path to file
path_to_file = HYPER.PATH_TO_UBERMOVEMENT_ADD + filename

# read the file as csv
df_city_to_id = pd.read_csv(path_to_file, index_col=0)


###
# 
###

# list of cities
list_of_cities = df_city_to_id.index.to_list()

# declare empty dictionary to save data in
dict_df_city_geographics = {}

# iterate over all city names
for city_name in list_of_cities:
    
    # set path to data
    path_to_data = HYPER.PATH_TO_UBERMOVEMENT_ADD + city_name + '.csv'
    
    # import data as csv files
    csv_city_data = pd.read_csv(path_to_data, index_col=0)
    
    # save geographic data of iterated city in dictionary
    dict_df_city_geographics[city_name] = csv_city_data
    
    
###
# Display data ###
###

# display city to ID mapping
display(df_city_to_id)
    
# display train, val and test
display(df_uber_train)
display(df_uber_val)
display(df_uber_test)

# display exemplarly geographic city zone data for last iterated city
print('Exemplar geographic city zone data for city of', city_name)
display(csv_city_data)

Unnamed: 0,city_id
Guadalajara,0
Stockholm,1
San Francisco,2
Perth,3
Auckland,4
Boston,5
Brussels,6
London,7
Miami,8
Leeds,9


Unnamed: 0,city_id,source_id,destination_id,year,quarter_of_year,daytype,hour_of_day,mean_travel_time,standard_deviation_travel_time,geometric_mean_travel_time,geometric_standard_deviation_travel_time
0,2,1728,537,2018,3,1,13,1748.21,322.02,1720.56,1.19
1,2,1299,677,2018,3,0,15,884.85,257.80,851.56,1.31
2,0,697,101,2016,1,0,10,570.62,153.02,553.41,1.27
3,7,282,497,2016,1,0,14,3380.71,582.98,3333.37,1.18
4,7,977,543,2018,3,1,9,1104.18,255.40,1083.02,1.20
...,...,...,...,...,...,...,...,...,...,...,...
19999995,2,1575,1181,2019,4,1,10,2454.90,745.91,2363.81,1.30
19999996,4,34,254,2016,1,0,12,515.63,421.52,356.26,2.41
19999997,7,763,319,2019,4,1,20,1599.58,272.16,1581.01,1.16
19999998,7,478,358,2016,1,1,12,2118.55,373.31,2090.51,1.17


Unnamed: 0,city_id,source_id,destination_id,year,quarter_of_year,daytype,hour_of_day,mean_travel_time,standard_deviation_travel_time,geometric_mean_travel_time,geometric_standard_deviation_travel_time
0,5,659,656,2019,4,0,23,525.15,400.57,452.94,1.65
1,5,1118,884,2019,4,1,16,1211.42,431.92,1143.44,1.40
2,2,957,1235,2019,4,0,2,1446.33,296.95,1417.96,1.22
3,2,104,472,2020,1,1,12,81.65,109.97,54.30,2.18
4,2,1528,2275,2020,1,1,1,1603.29,356.83,1570.88,1.22
...,...,...,...,...,...,...,...,...,...,...,...
19999995,0,1263,294,2020,1,0,5,424.13,326.62,353.64,1.72
19999996,2,437,1333,2018,3,1,12,538.16,185.58,510.09,1.38
19999997,0,1116,1443,2019,4,1,21,768.87,242.86,726.41,1.44
19999998,0,842,1225,2019,4,1,21,1390.07,451.47,1331.47,1.32


Unnamed: 0,city_id,source_id,destination_id,year,quarter_of_year,daytype,hour_of_day,mean_travel_time,standard_deviation_travel_time,geometric_mean_travel_time,geometric_standard_deviation_travel_time
0,5,659,656,2019,4,0,23,525.15,400.57,452.94,1.65
1,5,1118,884,2019,4,1,16,1211.42,431.92,1143.44,1.40
2,2,957,1235,2019,4,0,2,1446.33,296.95,1417.96,1.22
3,2,104,472,2020,1,1,12,81.65,109.97,54.30,2.18
4,2,1528,2275,2020,1,1,1,1603.29,356.83,1570.88,1.22
...,...,...,...,...,...,...,...,...,...,...,...
19999995,0,1263,294,2020,1,0,5,424.13,326.62,353.64,1.72
19999996,2,437,1333,2018,3,1,12,538.16,185.58,510.09,1.38
19999997,0,1116,1443,2019,4,1,21,768.87,242.86,726.41,1.44
19999998,0,842,1225,2019,4,1,21,1390.07,451.47,1331.47,1.32


Geographic data for city of Leeds


Unnamed: 0,x_cord_1,x_cord_2,x_cord_3,x_cord_4,x_cord_5,x_cord_6,x_cord_7,x_cord_8,x_cord_9,x_cord_10,...,z_cord_290,z_cord_291,z_cord_292,z_cord_293,z_cord_294,z_cord_295,z_cord_296,z_cord_297,z_cord_298,z_cord_299
0,0.590381,0.590303,0.590438,0.590119,0.590361,0.590260,0.590540,0.590568,0.590443,0.590445,...,0.805492,0.805164,0.805160,0.805520,0.807008,0.807081,0.807136,0.807085,0.806968,0.806991
1,0.590422,0.590334,0.590476,0.590174,0.590398,0.590270,0.590553,0.590668,0.590517,0.590486,...,0.805455,0.805144,0.805106,0.805489,0.806959,0.807065,0.807066,0.807061,0.806949,0.806964
2,0.590388,0.590329,0.590481,0.590176,0.590388,0.590270,0.590527,0.590666,0.590575,0.590561,...,0.805442,0.805163,0.805228,0.805491,0.806961,0.807057,0.807043,0.807062,0.806911,0.806971
3,0.590398,0.590332,0.590461,0.590194,0.590422,0.590347,0.590565,0.590631,0.590580,0.590611,...,0.805395,0.805160,0.805291,0.805479,0.806926,0.807018,0.807029,0.807061,0.806907,0.806970
4,0.590361,0.590305,0.590506,0.590229,0.590440,0.590355,0.590651,0.590669,0.590626,0.590645,...,0.805364,0.805264,0.805308,0.805391,0.806940,0.806989,0.807005,0.807040,0.806861,0.806971
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
65,,,,,,,,,,,...,,,,,,,,,,
66,,,,,,,,,,,...,,,,,,,,,,
67,,,,,,,,,,,...,,,,,,,,,,
68,,,,,,,,,,,...,,,,,,,,,,


#### 1.2 Import ClimART data