
Feature/gcn lstm #1085

Merged
merged 49 commits from feature/gcn-lstm into develop on Apr 21, 2020

Conversation

habiba-h
Contributor

GCN-LSTM is a stack of graph convolution and LSTM layers used for time-series prediction on spatio-temporal data.
We use a 2-layer graph convolution network to leverage the graph structure. The output of the graph convolution network is fed into an LSTM-based sequence-to-sequence model that is jointly trained on the GCN output and the historical speeds of a segment.

This work is inspired by the paper: T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction.

The authors have made their implementation available in their GitHub repo.

There are a few differences between the architecture proposed in the paper and the implementation with regard to the graph convolution component, such as:

  • lehaifeng/T-GCN#18

  • lehaifeng/T-GCN#14

Architecture:

  • We use the 2-layer graph convolution proposed by Kipf and Welling (ICLR2017).

  • The output of the GCN feeds into an LSTM layer that combines the GCN output with the time-series history to learn a combined spatio-temporal end-to-end model.

  • A dense layer and a dropout layer are further added, as they help improve model performance (see the sketch after this list). Note that these final two layers are not part of the architecture proposed in the T-GCN paper.
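To make the description above concrete, here is a minimal sketch of such a stack in Keras. The layer sizes, the random stand-in adjacency matrix, and the plain Lambda layers are illustrative assumptions only, not the PR's actual GCN_LSTM implementation (a real GCN layer would use a normalised adjacency matrix and trainable weights).

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical sizes and a random stand-in adjacency matrix, for illustration only.
n_nodes, seq_len = 20, 10
adj = np.random.rand(n_nodes, n_nodes).astype("float32")

inputs = layers.Input(shape=(seq_len, n_nodes))  # [batch, time, nodes]

# Two GCN-style steps: mix information across neighbouring nodes at every timestep.
x = layers.Lambda(lambda t: tf.einsum("btn,nm->btm", t, adj))(inputs)
x = layers.Lambda(lambda t: tf.einsum("btn,nm->btm", t, adj))(x)

# LSTM over the time dimension, then dropout and a dense prediction head.
x = layers.LSTM(64)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(n_nodes)(x)  # next-step value for every node

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="mae")
model.summary()
```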

Reviewer Checklist

  • The code for gcn-lstm class is available.
  • The gcn-lstm class implements the architecture explained above.
  • Demo notebook runs and clearly demonstrates how to use GCN-LSTM for spatio-temporal data.

@habiba-h habiba-h added this to the Sprint 26 (20 Mar) milestone Mar 17, 2020
@habiba-h habiba-h self-assigned this Mar 17, 2020
@review-notebook-app

Check out this pull request on  ReviewNB

You'll be able to see Jupyter notebook diff and discuss changes. Powered by ReviewNB.


import numpy as np
import pandas as pd
import pickle as pkl

Consider possible security implications associated with pickle module.
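As a side note on that warning, here is a minimal sketch of one common alternative for numeric arrays (the file name is made up): NumPy's .npz format loaded with allow_pickle=False refuses to unpickle arbitrary objects, unlike pickle.load, which can execute attacker-controlled code.

```python
import numpy as np

# Save and reload a numeric array without going through pickle.
adj = np.random.rand(5, 5)
np.savez("adj.npz", adj=adj)

with np.load("adj.npz", allow_pickle=False) as data:
    loaded = data["adj"]

assert np.allclose(adj, loaded)
```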

@codeclimate

codeclimate bot commented Mar 17, 2020

Code Climate has analyzed commit 92a2c65 and detected 1 issue on this pull request.

Here's the issue category breakdown:

Category Count
Security 1

View more on Code Climate.

@huonw huonw (Member) left a comment

I've got a few questions about the approach and data used here, as well as a couple of minor things about the Graph_Convolution_LSTM class.

import pickle as pkl

def load_sz_data(dataset):
sz_adj = pd.read_csv(r'data/sz_adj.csv',header=None)
Member

What's the sz data?

Contributor Author

This is an alternative dataset of taxi trajectories. I had it as an alternative but I'm not using it; I will remove it.

return sz_tf, adj

def load_los_data(dataset):
los_adj = pd.read_csv(r'data/los_adj.csv',header=None)
Member

What's the los data? It would be best if we could add it to stellargraph/datasets/datasets.py instead of having an extra script for the demo. If it ends up being too awkward to add it there, feel free to open an issue for follow up work.

Contributor Author

Yes, it can easily be added to stellargraph/datasets/datasets.py.
There are a number of other methods that I think of as data-related, such as formatting the data into time-series sequences to be fed into a forecasting model. Having them all in one utility file means anyone interested in just doing spatio-temporal work can look them up in one place, instead of hunting for them in a centralized utility file. This was just for convenience; I can merge it into stellargraph/datasets/datasets.py.

from .preprocessing_layer import GraphPreProcessingLayer


class GraphConvolution(Layer):
Member

As far as I can tell, this differs from the existing gcn.GraphConvolution layer in four ways:

  1. the adjacency matrix is a constant
  2. it doesn't squeeze out the batch dimension
  3. it does some transposes in the matrix multiplications
  4. it doesn't have special handling for final_layer=True

Is this accurate? Could you expand on why 1 and 3 in particular are needed?

Contributor Author

As far as I can tell, this differs from the existing gcn.GraphConvolution layer in four ways:

  1. the adjacency matrix is a constant

Yes, it is. There are two pragmatic reasons for that, but neither is binding.

  1. With temporal data the structure is fixed while the features change, so from a performance perspective it is better to just load the adjacency matrix at initialization. However, we can make it more general for dynamic graphs, and also more aligned with the already existing graph_convolution layer.
  2. The graph convolution happens for each timestep of each observation. For example, for a sequence of 10 timesteps, the GCN is performed for each one of them with the same static graph but different features per timestep, so passing the adjacency matrix in each build seems like a big overhead.
     I am sure there is a more optimal way of doing this; this was just meant to produce a working version.

  2. it doesn't squeeze out the batch dimension

Yes, and this is by design. Keras requires the batch dimension whereas GCN doesn't, so even though a unit batch dimension is passed it gets squeezed inside the layer. However, in gcn_lstm the data is passed as a 3D tensor [batch, sequence, nodes], so we don't need to squeeze the batch dimension.
Maybe we can add the "no squeezing" behaviour as an option to the existing graph_convolution layer, but I didn't want to update it as a first pass until I had the actual dimensions I need for gcn_lstm figured out. This can be merged in the more polished version.

  3. it does some transposes in the matrix multiplications

Yes, due to the dimensions. Typically you have nodes x features dimensions; here we have batch x features x nodes. Since the graph convolution happens on the feature dimension, I transpose it before convolving on the feature dimension and then flip it back (see the sketch at the end of this comment).

  4. it doesn't have special handling for final_layer=True

Since here graph_convolution is stacked with the RNN layer, i.e. it feeds into the RNN, it is never a final_layer, so it's never true. I can pass it as False when I define the architecture, so this can easily be handled with the graph_convolution layer we already have. I am just keeping it False without any special handling. I'll add a comment to explain this in the code.

Is this accurate? Could you expand on why 1 and 3 in particular are needed?

See above. Any thoughts on these design choices are really appreciated!
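To illustrate points 1 and 3 concretely, here is a minimal sketch of a GCN-style layer that captures a fixed adjacency matrix at construction time and uses the transpose-convolve-transpose pattern on [batch, features, nodes] input. The class name, weight handling, and shapes are simplified assumptions, not the PR's actual layer.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

class FixedAdjacencyGraphConvolutionSketch(layers.Layer):
    """Illustrative only: a GCN-style layer with a fixed adjacency matrix."""

    def __init__(self, units, adj, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Point 1: the adjacency matrix is stored as a constant at construction
        # time instead of being passed in with every batch.
        self.adj = tf.constant(adj, dtype=tf.float32)

    def build(self, input_shape):
        n_features = int(input_shape[1])  # input is [batch, features, nodes]
        self.kernel = self.add_weight(
            name="kernel", shape=(n_features, self.units), initializer="glorot_uniform"
        )
        super().build(input_shape)

    def call(self, inputs):
        # Point 3: transpose [batch, features, nodes] -> [batch, nodes, features]
        # so the neighbourhood aggregation (A @ X) and the feature projection
        # (X @ W) can be applied, then transpose back.
        x = tf.transpose(inputs, perm=[0, 2, 1])         # [batch, nodes, features]
        x = tf.einsum("ij,bjf->bif", self.adj, x)        # aggregate over neighbours
        x = tf.einsum("bif,fu->biu", x, self.kernel)     # project features to `units`
        return tf.transpose(x, perm=[0, 2, 1])           # back to [batch, units, nodes]

# Tiny smoke test with a 4-node graph and 6 input features.
adj = np.eye(4, dtype="float32")
layer = FixedAdjacencyGraphConvolutionSketch(units=8, adj=adj)
out = layer(tf.random.normal([2, 6, 4]))
print(out.shape)  # (2, 8, 4)
```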

@huonw huonw (Member) Mar 25, 2020

Yes, it is. There are two pragmatic reasons for that but neither are binding.

Makes sense; seems fine to start with it as written for now 👍

However, in the gcn_lstm, the data is passed as a 3D tensor [batch, sequence, nodes]

And batch > 1 here?

Yes, due to the dimensions. Typically you have nodes x features dimensions; here we have batch x features x nodes. Since the graph convolution happens on the feature dimension, I transpose it before convolving on the feature dimension and then flip it back.

Ah, right, our other GCN layers have input shape [batch, nodes, features], makes sense!

Since here graph_convolution is stacked with the RNN layer, i.e. it feeds into the RNN, it is never a final_layer, so it's never true. I can pass it as False when I define the architecture, so this can easily be handled with the graph_convolution layer we already have. I am just keeping it False without any special handling. I'll add a comment to explain this in the code.

The other GCN layers are usually stacked into another dense layer to compute predictions (or similar), so they're not the "final layer" of the (whole) model either. The final_layer handling is so that the full-batch layer only yields predictions for the nodes that were passed to the .flow method, rather than all nodes.

Given that, this code currently seems to just yield predictions for all nodes. I think this means it's not possible to train on only a subset of labelled nodes? (For instance, do a train/test split across the nodes, rather than the times?) I guess this isn't too important for these sort of datasets, because it's time that is interesting and so it's assumed we'll always have information for every node?

@@ -36,3 +36,4 @@
from .rgcn import *
from .watch_your_step import *
from .knowledge_graph import *
from .gcn_lstm import *
Member

Both gcn and gcn_lstm now include a class called GraphConvolution, which means these * imports will shadow each other under the stellargraph.layer.GraphConvolution name. I think in particular the gcn_lstm one will "win", so existing code importing stellargraph.layer.GraphConvolution will switch from using the GCN one to using the GCN-LSTM one. They're not interchangeable, so that code will break.

One way to fix this would be to rename the gcn_lstm GraphConvolution to something like FixedAdjacencyGraphConvolution or even just LSTMGraphConvolution so that it no longer overlaps with the gcn one.

Contributor Author

I didn't think about it causing conflicts. I'll fix the name.
However, why would the gcn_lstm GraphConvolution win in the import?

Member

It's last, and so overwrites the earlier one. I guess one can think of it like:

GraphConvolution = gcn.GraphConvolution

...

GraphConvolution = gcn_lstm.GraphConvolution

return output


class Graph_Convolution_LSTM(Model):
Member

Our other "models" aren't actually formal models, meaning they don't inherit from tf.keras.Model. I think it would be good to stick to that.

Also, Python style is to not have underscores in class names, e.g.

Suggested change
class Graph_Convolution_LSTM(Model):
class GraphConvolutionLSTM(Model):

but potentially it's best to break the "style" slightly and call it:

Suggested change
class Graph_Convolution_LSTM(Model):
class GCN_LSTM(Model):

self.dense = Dense(self.outputs, activation=self.activations[2])
self.dropout = Dropout(self.dropout)

def call(self, inputs):
Member

Echoing the comment about the name, our "model" classes usually have two methods:

def __call__(self, inputs): # like this `call`, but note the double-underscore, to make it work like `some_model(inputs)`
    ...

def build(self): # builds appropriate input tensors and then applies the model
    inputs = ...
    outputs = self(inputs)
    return inputs, outputs
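For context, a minimal hypothetical sketch of how a class following that __call__/build pattern plugs into Keras; the class and method bodies are illustrative assumptions, not the library's actual API.

```python
from tensorflow.keras import layers, Model

class TinyModelSketch:
    """Illustrative only: the __call__/build pattern described above."""

    def __init__(self, seq_len, n_nodes, units=16):
        self.seq_len, self.n_nodes, self.units = seq_len, n_nodes, units

    def __call__(self, inputs):
        # Apply the model's layers to already-built input tensors.
        x = layers.LSTM(self.units)(inputs)
        return layers.Dense(self.n_nodes)(x)

    def build(self):
        # Build appropriate input tensors and then apply the model.
        inputs = layers.Input(shape=(self.seq_len, self.n_nodes))
        outputs = self(inputs)
        return inputs, outputs

sketch = TinyModelSketch(seq_len=10, n_nodes=4)
x_inp, x_out = sketch.build()
model = Model(inputs=x_inp, outputs=x_out)
model.compile(optimizer="adam", loss="mae")
```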

kieranricardo previously approved these changes Mar 20, 2020

@kieranricardo kieranricardo (Contributor) left a comment

thanks for implementing this! aside from @huonw's feedback I have a comment about sequence_data_preparation and integrating this with the library.

Also, re. storing the adjacency matrix in the GraphConvolution layer: I guess this makes GCN-LSTM non-inductive? I'm not familiar with spatio-temporal modelling, but is this important? Are there cases where you'd want to train on one graph and test on another? Or even cases where the adjacency matrix changes with time?

test_scaled = (test_data - min_speed)/ (max_speed - min_speed)
return train_scaled, test_scaled

def sequence_data_preparation(seq_len, pre_len, train_data, test_data):
Contributor

I think this seems like something that should be in stellargraph/mappers rather than a demo script. It looks like sequence_data_preparation is essential for using GCN-LSTM, so we should include it in the library.

Also, I think this could be refactored into a stellargraph generator and either a keras Sequence or a tf Dataset. If I'm understanding this right, say you have N nodes, T time steps, and sequence length L: this function takes in an N x T array and returns an N x T x L array. This could get quite big! With a Sequence or Dataset object you could generate batches of N x batch_times x L on the fly to avoid having to construct the N x T x L array. What do you think?
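For illustration only (not code from the PR), here is a minimal keras Sequence that yields sliding windows on the fly instead of materialising the full windowed array; the class name, argument names, and shapes are hypothetical.

```python
import numpy as np
from tensorflow.keras.utils import Sequence

class SlidingWindowSequence(Sequence):
    """Illustrative only: generate (history, target) windows on the fly."""

    def __init__(self, series, seq_len, pre_len, batch_size=32):
        # series: array of shape [T, N] (time steps x nodes)
        self.series = series
        self.seq_len, self.pre_len, self.batch_size = seq_len, pre_len, batch_size
        self.n_windows = len(series) - seq_len - pre_len + 1

    def __len__(self):
        return int(np.ceil(self.n_windows / self.batch_size))

    def __getitem__(self, idx):
        starts = range(idx * self.batch_size,
                       min((idx + 1) * self.batch_size, self.n_windows))
        # History windows and the value `pre_len` steps after each window.
        x = np.stack([self.series[s : s + self.seq_len] for s in starts])
        y = np.stack([self.series[s + self.seq_len + self.pre_len - 1] for s in starts])
        return x, y  # x: [batch, seq_len, N], y: [batch, N]

# Example: 1000 time steps for 4 nodes, 10-step history, predicting 3 steps ahead.
gen = SlidingWindowSequence(np.random.rand(1000, 4), seq_len=10, pre_len=3)
xb, yb = gen[0]
print(xb.shape, yb.shape)  # (32, 10, 4) (32, 4)
```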

@kieranricardo kieranricardo dismissed their stale review March 20, 2020 02:17

didn't mean to approve here, sorry

@huonw huonw (Member) left a comment

The plots in the notebooks look really good 👍


@huonw huonw mentioned this pull request Apr 21, 2020
@@ -849,3 +850,55 @@ def load(self):
)

return StellarGraph(nodes=nodes, edges=edges, edge_weight_column="time"), edges


@experimental(reason="the data isn't downloaded automatically", issues=[9999])
Member

The 9999 in my suggestion here and in test-demo-notebooks.sh was a placeholder that needs to be replaced by a real issue number. At the moment, this is creating a link like https://github.com/stellargraph/stellargraph/issues/9999, which doesn't currently exist. Could you open an issue about this and then replace the 9999 here and in test-demo-notebooks.sh?

Contributor Author

Ah. Ok! I'll fix this.

Member

Great! Reminder:

Suggested change
@experimental(reason="the data isn't downloaded automatically", issues=[9999])
@experimental(reason="the data isn't downloaded automatically", issues=[1303])

kieranricardo previously approved these changes Apr 21, 2020

@kieranricardo kieranricardo (Contributor) left a comment

nice! just a few docstring edits needed I think

@@ -212,41 +195,41 @@ def call(self, features):
class GraphConvolutionLSTM:

"""
A stack of 2 Graph Convolutional layers followed by an LSTM, Dropout and, Dense layer.
A stack of N1 Graph Convolutional layers followed by N2 LSTM, Dropout and, Dense layer.
Contributor

I think this might read a bit clearer:

Suggested change
A stack of N1 Graph Convolutional layers followed by N2 LSTM, Dropout and, Dense layer.
A stack of N1 Graph Convolutional layers followed by N2 LSTM layers, a Dropout layer, and a Dense layer.

2. 1 LSTM layer
The StellarGraph implementation is built as a stack of the following set of layers:
1. User specified no. of Graph Convolutional layers
2. User specified no. of LSTM layers
3. 1 Dense layer
4. 1 Dropout layer
The last two layers consistently showed better performance and regularization experimentally.
Contributor

I think the docstring has changed a fair bit now; do you get better performance compared to the implementation in the paper?

Contributor Author

Actually it's very hard to compare with the paper. The paper reports results that differ from what I get when I run their code, and their implementation is quite different from what they propose in the paper :-).
But this implementation does better than a simple LSTM or naive baseline.

seq_len: No. of LSTM cells
adj: unweighted/weighted adjacency matrix of [no.of nodes by no. of nodes dimension
gc_layers: No. of Graph Convolution layers in the stack. The output of each layer is equal to sequence length.
lstm_layer_size (list of int): Output sizes of LSTM layers in the stack.
Contributor

small typo

Suggested change
lstm_layer_size (list of int): Output sizes of LSTM layers in the stack.
lstm_layer_sizes (list of int): Output sizes of LSTM layers in the stack.

@@ -47,6 +38,7 @@ class FixedAdjacencyGraphConvolution(Layer):

Args:
units (int): dimensionality of output feature vectors
A (N x N): weighted/unweighted adjacency matrix
activation (str or func): nonlinear activation applied to layer's output to obtain output features
use_bias (bool): toggles an optional bias
final_layer (bool): If False the layer returns output for all nodes,
Contributor

I don't think there's a final_layer arg here

Suggested change
final_layer (bool): If False the layer returns output for all nodes,

@kieranricardo kieranricardo dismissed their stale review April 21, 2020 06:54

ah hit the wrong button again!

@kieranricardo kieranricardo (Contributor) left a comment

LGTM :)

Comment on lines 41 to 45
# FIXME #849: CI does not have neo4j
# FIXME #907: socialcomputing.asu.edu is down
# FIXME #1303: METR_LA dataset can't be downloaded automatically
# FIXME #818: datasets can't be downloaded
# FIXME #819: out-of-memory
Member

This has added a whole pile more lines than it needs to: the #849 and #907 FIXMEs have been resurrected and the #818 and #819 ones have been duplicated.

Suggested change
# FIXME #849: CI does not have neo4j
# FIXME #907: socialcomputing.asu.edu is down
# FIXME #1303: METR_LA dataset can't be downloaded automatically
# FIXME #818: datasets can't be downloaded
# FIXME #819: out-of-memory
# FIXME #1303: METR_LA dataset can't be downloaded automatically


@habiba-h habiba-h merged commit ee993bb into develop Apr 21, 2020
@habiba-h habiba-h deleted the feature/gcn-lstm branch April 21, 2020 09:27