
Why is the adjacency matrix passed into graph convolutional layers each time? And self-loops #364

Closed
amjass12 opened this issue May 5, 2022 · 10 comments


amjass12 commented May 5, 2022

Hi!

I am building a graph convolutional network that will be used in conjunction with a merged layer for a reinforcement learning task.

I have a technical question about the convolutional layer itself that is slightly confusing to me: why is the adjacency matrix passed into each conv layer, and not ONLY the first one? My code is as follows:


import networkx as nx
import numpy as np
import tensorflow as tf
from spektral.layers import GCNConv, GlobalAvgPool
from spektral.utils import normalized_adjacency
from spektral.utils.sparse import sp_matrix_to_sp_tensor

adj = nx.to_numpy_array(graph)

# Node features: just the degree of each node, scaled by the number of nodes
node_features = []
node_degree = nx.degree(graph)
for i in dict(node_degree).values():
    node_features.append(i / len(graph))

node_features_final = np.array(node_features).reshape(-1, 1)

adj_normalised = normalized_adjacency(adj)
adj_normalised = sp_matrix_to_sp_tensor(adj_normalised)
node_feature_shape = 1

nodefeature_input = tf.keras.layers.Input(shape=(node_feature_shape,), name='node_features_input')
adjacency_input = tf.keras.layers.Input(shape=(None,), name='adjacency_input', sparse=True)

# The adjacency input is passed to every GCNConv layer, not only the first one
conv_layer_one = GCNConv(64, activation='relu')([nodefeature_input, adjacency_input])
conv_layer_one = tf.keras.layers.Dropout(0.2)(conv_layer_one)
conv_layer_two = GCNConv(32, activation='relu')([conv_layer_one, adjacency_input])
conv_layer_pool = GlobalAvgPool()(conv_layer_two)
dense_layer_graph = tf.keras.layers.Dense(128, activation='relu')(conv_layer_pool)

# action_vector is the dimensionality of the action input, defined elsewhere
input_action_vector = tf.keras.layers.Input(shape=(action_vector,), name='action_vec_input')
action_vector_dense = tf.keras.layers.Dense(128, activation='relu', name='action_layer_dense')(input_action_vector)

merged_layer = tf.keras.layers.Concatenate()([dense_layer_graph, action_vector_dense])
# output_layer... etc
model = tf.keras.Model([nodefeature_input, adjacency_input], [output_layer])

And my second question is about normalized_adjacency: it does not add self-loops. Should self-loops be added before or after normalising the matrix?

thank you!


danielegrattarola commented May 5, 2022

Hi,

  1. The reason for passing the adjacency matrix as input every time is that GNN layers only change the features, not the adjacency matrix; so it makes sense that the output of a layer is only whatever has changed. Another way of looking at it is that the adjacency matrix only describes the underlying structure of your data, while the data itself is whatever is stored on the nodes.
  2. normalized_adjacency is not the correct way of normalizing the adjacency matrix for GCN. You can use the built-in class method GCNConv.preprocess (every layer has this method) or, if you want to do it manually, you can use spektral.utils.convolution.gcn_filter; see the sketch after this list.
  3. In any case, self-loops must be added before normalizing the matrix.
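
A minimal sketch of the two routes mentioned in point 2 (the toy matrix A here is illustrative, not from this thread; GCNConv.preprocess is Spektral's built-in preprocessing and should match gcn_filter):

import numpy as np
from spektral.layers import GCNConv
from spektral.utils.convolution import gcn_filter

# Toy adjacency matrix; in practice this comes from e.g. nx.to_numpy_array(graph)
A = np.array([[0., 1.],
              [1., 0.]])

# Option 1: let the layer preprocess for you (adds self-loops, then normalizes)
A_hat = GCNConv.preprocess(A)

# Option 2: compute the same GCN filter manually: D^-1/2 (A + I) D^-1/2
A_hat_manual = gcn_filter(A)

assert np.allclose(A_hat, A_hat_manual)  # both produce the same matrix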

Cheers


amjass12 commented May 5, 2022

Hi @danielegrattarola ,

thank you for such a quick response!

1. Thank you for clarifying, this makes sense.
2 and 3. Thank you - I have amended as follows:
adj = nx.adjacency_matrix(graph)
#identity
I = np.matrix(np.eye(adj.shape[0]))
adj_with_loops = adj + I
adj_preprocessed = GCNConv.preprocess(adj_with_loops)

And then the code is identical to the above, but the adjacency input layer now includes the preprocessed adjacency (self-loops and GCNConv.preprocess). I am guessing this is now correct, and GCNConv.preprocess only needs to be called once on the matrix? Also, is there a computational speedup from working with a sparse matrix instead of the entire adjacency matrix? In other words, is nx.adjacency_matrix (which produces the sparse adjacency matrix) more efficient than nx.to_numpy_array (which produces the dense adjacency matrix)?

One last question I have: I thought the normalisation was relative to the node degree of any given node. I see that the first element in the adjacency matrix has a value of 0.5 (or 0.6 with self-loops), but this node has the lowest degree (1)... is there a justification for this way of normalising?

thanks again!


danielegrattarola commented May 5, 2022

If you use GCNConv.preprocess then you must not add the self-loops manually; the method does it for you.
Sorry if my answer was confusing.

GCNConv.preprocess only needs to be called once on the matrix?

Yes

is there a computational speedup from working with a sparse matrix instead of the entire adjacency matrix?

Yes, the difference is that with sparse matrices the computation costs O(n_edges) while with dense matrices it costs O(n_nodes^2)
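
To put rough numbers on this, using the 1405-node graph that comes up later in this thread (233,906 stored elements):

n_nodes = 1405      # nodes in the graph reported below
n_edges = 233906    # stored elements in its sparse adjacency matrix

dense_entries = n_nodes ** 2  # a dense multiply touches every entry: 1,974,025
sparse_entries = n_edges      # a sparse multiply touches only stored entries

print(sparse_entries / dense_entries)  # ~0.12, i.e. roughly 12% of the dense work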

is there a justification for this way of normalising?

I suggest you fix the computation first, and then dig into the function to see how it is computed.


amjass12 commented May 5, 2022

No problem, and thank you so much! So just:

adj = nx.adjacency_matrix(graph)
adj_preprocessed = GCNConv.preprocess(adj)

Correct? So now this would be ready for input into the conv layers?

I have re-run this and looked at the normalised adjacency matrix - I still get higher values for nodes with a lower degree!

danielegrattarola commented May 5, 2022

Yes, that is correct.

What you are observing is expected: if a node has a low degree, its neighbours are relatively more important.
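
As a toy illustration (not from this thread) of why this happens: with the symmetric GCN normalisation, entry (i, j) becomes 1 / sqrt(d_i * d_j) after self-loops are added, so entries that touch low-degree nodes come out larger:

import numpy as np
from spektral.utils.convolution import gcn_filter

# Path graph 0-1-2: node 1 has degree 2, nodes 0 and 2 have degree 1
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])

A_hat = gcn_filter(A)  # adds self-loops, then D^-1/2 (A + I) D^-1/2
print(np.round(A_hat, 3))
# [[0.5   0.408 0.   ]
#  [0.408 0.333 0.408]
#  [0.    0.408 0.5  ]]
# The degree-1 nodes get 0.5 on the diagonal, the degree-2 node gets 0.333:
# the fewer neighbours a node has, the more weight each one carries.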


amjass12 commented May 5, 2022

Thank you so much for all of your help this afternoon - I did a little more reading on the normalisation and now also understand why lower degrees can have more weighting!

Finally - just while you are here - the nx.adjacency_matrix() call produces a SciPy sparse matrix:

<1405x1405 sparse matrix of type '<class 'numpy.int64'>'
	with 233906 stored elements in Compressed Sparse Row format>

Can I confirm that 1. this format does not interfere with any of the processing in GCNConv.preprocess, and that all preprocessing continues as intended; and 2. the preprocessed sparse matrix (adj_preprocessed) is also acceptable as input to the GCN once the model has been constructed? Thank you! Happy to close once answered :)

danielegrattarola commented May 5, 2022

Sure, no problem!

  1. The sparse format is what GCNConv.preprocess expects, so you can be sure.
  2. The sparse format is acceptable to GCN, but in order to give it as input to the model you will need to convert it to a SparseTensor (which, judging from your code, you are already doing).


amjass12 commented May 5, 2022

Perfect, thank you very much!! I appreciate all your time :)

amjass12 closed this as completed May 5, 2022
danielegrattarola commented May 5, 2022

You're welcome!!


amjass12 commented May 5, 2022

Sorry @danielegrattarola - yes, you are right:

it is:

adj = nx.adjacency_matrix(damage_graph)
adj_preprocessed = GCNConv.preprocess(adj)
adj_preprocessed = sp_matrix_to_sp_tensor(adj_preprocessed)

Without the last line it was giving me a datatype error, so sp_matrix_to_sp_tensor is necessary after the GCNConv.preprocess line!
