
predicting on a batch of sparse tensors #365

Closed
amjass12 opened this issue May 8, 2022 · 8 comments

@amjass12 commented May 8, 2022

Hi @danielegrattarola,

I posted the other day about the adjacency matrix input - sorry for a second post, but I am now trying to predict on a batch of sparse tensors, with no success. I won't post all of the code; however, I have tried feeding in the adjacency matrix by creating a dataset and then using the BatchLoader, with no success, as well as the following:

Dummy network:

import tensorflow as tf
from tensorflow.keras.models import Model
from spektral.layers import GCNConv, GlobalAvgPool

nodefeature_input = tf.keras.layers.Input(shape=(node_feature_shape,), name='node_features_input')
adjacency_input = tf.keras.layers.Input(shape=(None,), name='adjacency_input', sparse=True)

conv_layer_one = GCNConv(64, activation='relu')([nodefeature_input, adjacency_input])
conv_layer_one = tf.keras.layers.Dropout(0.2)(conv_layer_one)
conv_layer_two = GCNConv(32, activation='relu')([conv_layer_one, adjacency_input])
conv_layer_pool = GlobalAvgPool()(conv_layer_two)
dense_layer_graph = tf.keras.layers.Dense(128, activation='relu')(conv_layer_pool)

dummy_gnn = Model(inputs=[nodefeature_input, adjacency_input], outputs=[dense_layer_graph])

If I create a dummy batch of data:

import numpy as np
import networkx as nx
from spektral.utils import sp_matrix_to_sp_tensor

adj_matrix = nx.adjacency_matrix(nx_graph)
x = []
for i in range(10):
    x.append(adj_matrix)  # it's the same graph for the whole batch, but this is just to try with a batch of data

practice_batch = [GCNConv.preprocess(i) for i in x]
practice_batch = [sp_matrix_to_sp_tensor(i) for i in practice_batch]

# node_features has shape (176, 1); y stacks it 10 times
y = np.zeros((10, 176, 1))
for i in range(10):
    y[i] = node_features

If I predict on one sample (as discussed the other day), this works without issue:

dummy_gnn([y[0], practice_batch[0]])

However, if I now predict on a batch of sparse tensors, it fails:

sp_batch = []
for i in practice_batch:
    sp_batch.append(tf.sparse.expand_dims(i, 0))  # add a leading batch axis to each sparse adjacency
sp_batch = tf.sparse.concat(sp_inputs=sp_batch, axis=0)  # sparse tensor of shape (10, 176, 176)
dummy_gnn([y.reshape(1, 10, 176, 1), sp_batch])

This produces the error: AssertionError: Expected a of rank 2 or 3, got 4

If I create a dummy network with only an input for the node features (y), the batch is accepted without issue, so the problem seems to be with the batch of sparse tensors (the adjacency matrix input). I have tried many different ways of feeding these tensors in; any help is appreciated! Thank you again :)

@danielegrattarola (Owner)

Hi,

Batch mode does not support sparse tensors (due to TensorFlow shenanigans), so you need to cast your adjacency matrices to np.arrays.
However, the error you're seeing is not even related to that: it's due to the 4-dimensional y tensor; node features in batch mode should have shape [batch_size, n_nodes, n_features].
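
For illustration, a minimal sketch of batch-mode inputs with dense arrays (preprocessed_adj_list and node_feature_list are placeholder names for your preprocessed adjacency matrices and node feature arrays):

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from spektral.layers import GCNConv, GlobalAvgPool

N, F = 176, 1  # nodes per graph, features per node

x_in = tf.keras.layers.Input(shape=(N, F))  # node features: [batch_size, n_nodes, n_features]
a_in = tf.keras.layers.Input(shape=(N, N))  # dense adjacency: [batch_size, n_nodes, n_nodes]
out = GCNConv(32, activation='relu')([x_in, a_in])
out = GlobalAvgPool()(out)
batch_model = Model(inputs=[x_in, a_in], outputs=out)

# Cast each sparse matrix to a dense np.array and stack along a new leading axis
a_batch = np.stack([a.toarray() for a in preprocessed_adj_list])  # [batch_size, N, N]
x_batch = np.stack(node_feature_list)                             # [batch_size, N, F]
preds = batch_model([x_batch, a_batch])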

Cheers

@amjass12 (Author) commented May 8, 2022

Hi @danielegrattarola,

Thank you for this! I did not know sparse tensors would cause a problem when feeding in batches! It's slightly problematic, as I would need to store a buffer of non-sparse adjacency matrices, ha! I'll have to reduce the buffer size..

With regards to the y tensor - if I feed it in as [batch_size, n_nodes, n_features] (which is the original y shape), I get an error: InvalidArgumentError: Matrix size-incompatible: In[0]: [10,176], In[1]: [1,64] [Op:MatMul] - so it seems I need to expand the dimension. When I feed it into a network with only the node features as inputs, it works when the dimension is [1, batch_size, n_nodes, n_features]...

Input shapes cause me headaches in TensorFlow - even when they are seemingly correct..

@danielegrattarola (Owner)

Have you considered using disjoint mode instead of batch mode? Disjoint mode uses sparse adjacency matrices.

I don't know what's going on with your model/data, but node features in batch mode are expected to be 3-dimensional, so even if it doesn't crash with a 4-dimensional tensor you can be sure that the resulting computation will not be as expected.
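
For illustration, a rough sketch of how a disjoint batch can be assembled by hand (adj_list and feat_list are placeholder names): the graphs are merged into one big graph with a block-diagonal sparse adjacency matrix, plus an index vector mapping each node to its graph:

import numpy as np
import scipy.sparse as sp
from spektral.utils import sp_matrix_to_sp_tensor

# adj_list: list of (N_i, N_i) scipy sparse matrices; feat_list: list of (N_i, F) arrays
a = sp_matrix_to_sp_tensor(sp.block_diag(adj_list))  # block-diagonal sparse adjacency
x = np.vstack(feat_list)                             # stacked node features: (sum N_i, F)
i = np.repeat(np.arange(len(adj_list)),
              [m.shape[0] for m in adj_list])        # graph index for each node

# Convolutions take [x, a]; global pooling layers also take the index vector:
# out = GCNConv(32)([x, a]); pooled = GlobalAvgPool()([out, i])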

@amjass12 (Author) commented May 9, 2022

Hi @danielegrattarola,

I made a silly mistake in the input for the node features! Before, my input was:

nodefeature_input = tf.keras.layers.Input(shape=(node_feature_shape,), name='node_features_input')

but the correct shape, as you pointed out, is (batch, node_size, feature_size):

nodefeature_input = tf.keras.layers.Input(shape=(num_nodes, node_feature_shape,), name='node_features_input')

Now, with an input of shape (batch, nodes, features), it works without issue, and the batch of adjacency matrices is accepted without issue too! It's just that I also need the model to predict on batches of size one (single graphs), but that is straightforward, as I just need to expand the dimensions to (1, num_nodes, feature_shape) - so this now works without issue.
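
(For a single graph, that expansion is just a leading axis of 1; a one-line sketch, where batch_gnn, x_single, and a_single are placeholder names for the fixed model and one graph's dense inputs:)

single_pred = batch_gnn([x_single[None, ...], a_single[None, ...]])  # shapes (1, num_nodes, features) and (1, num_nodes, num_nodes)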

It would be preferable to feed in sparse tensors for the graphs, as I need to feed in many at a time; however, I also need to predict on one sample, which is why I was trying to avoid the loaders. Can the disjoint loader be fed in as a normal input with adjacency matrices only?

For example, would this work?

nodefeature_input = tf.keras.layers.Input(shape=(number_of_nodes, node_feature_shape,), name='node_features_input')
adjacency_input = tf.keras.layers.Input(shape=(None,), name='adjacency_input', sparse=False)  # False, as I need to predict on individual samples as well as on a batch, for off-policy learning

conv_layer_one = GCNConv(64, activation='relu')([nodefeature_input, disjoint_loader.load()])  # <-- pass the loader here?
conv_layer_one = tf.keras.layers.Dropout(0.2)(conv_layer_one)
conv_layer_two = GCNConv(32, activation='relu')([conv_layer_one, disjoint_loader.load()])  # <-- and here?

thanks!

@danielegrattarola (Owner)

No, the disjoint loader creates batches of inputs with a generator; you cannot pass it to the model like that.
But if you're OK with implementing your own training loop, then there is no need to use a loader.
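
A minimal sketch of such a custom loop with tf.GradientTape, assuming you assemble each (x, a, y) batch yourself (model, my_batches, and num_epochs are placeholder names):

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
loss_fn = tf.keras.losses.MeanSquaredError()

for epoch in range(num_epochs):
    for x_batch, a_batch, y_batch in my_batches:  # batches you build yourself
        with tf.GradientTape() as tape:
            preds = model([x_batch, a_batch], training=True)
            loss = loss_fn(y_batch, preds)
        grads = tape.gradient(loss, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))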

@amjass12 (Author) commented May 10, 2022

Perfect, thank you so much for clarifying - by "own training loop" do you mean the GradientTape method? If so, yes, this is a possibility.

And with the disjoint loader, am I able to provide just the adjacency matrix and node features, without y labels, as I won't be predicting on a specific set of y labels?

thank you!

@danielegrattarola (Owner)

Yes, that's what I meant. If you can take care of creating the batches, there's no need to use a loader at all.

Cheers

@amjass12 (Author)

Thank you so much, again, for your time in answering all of my questions :) The only part that is still problematic is that I need to feed in the sparse adjacency matrices, due to the size of the batch itself!

I will see if I can think of a workaround. Thank you again!
