Hi,
Thank you for sharing your code. It is a very exciting paper! I have a few concerns about some details in your code; please correct me if I am mistaken anywhere. :-)
1. The parameters in BatchNormalization are not trainable, as discussed in this issue. In that case, I personally think it would be more accurate to call it standardization rather than batch normalization. I am also wondering whether it can be viewed the way CNNs and image processing handle it, where 1D batch normalization is applied across every image in a batch and every location within an image; the different nodes of a graph would then play the role of the different locations in an image where convolutional kernels are applied.
2. Which dimension should BatchNormalization be applied to? Following (1), if we regard the nodes of a graph as equivalent to the different locations in an image, it might be better to apply batch normalization along the last dimension, i.e. the feature channels of the nodes, rather than along the node dimension (see the first sketch after this list).
3. The direction of the softmax when computing the assignment matrix. I am curious whether it is better to take the softmax across the node dimension or across the new-cluster dimension. I agree that, to build an assignment matrix, each original node's contributions to the new clusters should sum to 1, which is achieved by applying the softmax to the last dimension. But I am still wondering whether it would also make sense to apply the softmax along the node dimension, so that each column of the assignment matrix represents a contribution distribution over the original nodes (see the second sketch below).
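To make the dimension question in point 2 concrete, here is a minimal PyTorch sketch contrasting the two choices. The tensor shapes and variable names are my own assumptions for illustration, not the repository's actual code:

```python
import torch
import torch.nn as nn

# Hypothetical sizes, assuming node features of shape [batch, num_nodes, feat_dim].
batch, num_nodes, feat_dim = 20, 100, 64
x = torch.randn(batch, num_nodes, feat_dim)

# Option A: normalize over the node dimension. BatchNorm1d treats dim 1 as
# channels, so each node position gets its own statistics, computed over
# the batch and feature dimensions.
bn_nodes = nn.BatchNorm1d(num_nodes)
out_a = bn_nodes(x)

# Option B: the CNN analogy from point 1. Treat nodes like spatial locations
# and normalize the feature channels, sharing statistics across all nodes.
bn_feats = nn.BatchNorm1d(feat_dim)
out_b = bn_feats(x.transpose(1, 2)).transpose(1, 2)  # [B, F, N] -> BN -> [B, N, F]
```

Note that with PyTorch's default `affine=True` the gamma/beta parameters are learnable, while `affine=False` reduces BatchNorm to the pure standardization described in point 1.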
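Similarly for point 3, a sketch of the two softmax directions over hypothetical assignment scores (again, the shapes and names here are my assumptions):

```python
import torch
import torch.nn.functional as F

# Assuming raw assignment scores of shape [batch, num_nodes, num_clusters].
batch, num_nodes, num_clusters = 20, 100, 10
scores = torch.randn(batch, num_nodes, num_clusters)

# Softmax over the last (cluster) dimension: each row of S sums to 1, so
# every original node distributes its contribution across the new clusters.
s_row = F.softmax(scores, dim=-1)

# Softmax over the node dimension: each column of S sums to 1, so every
# cluster becomes a distribution over the original nodes, closer to an
# attention-style weighting of nodes per cluster.
s_col = F.softmax(scores, dim=1)

# Either way the pooled features would be X' = S^T X, e.g.:
# x_pooled = torch.bmm(s_row.transpose(1, 2), x)  # [batch, num_clusters, feat_dim]
```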