batch normalization #20

Open

szhang0112 opened this issue Jun 10, 2019 · 2 comments

@szhang0112

I am a little bit confused by the batch normalization implementation here: every time batch norm is applied via self.bn(x), a new bn layer is created, i.e., bn_module = nn.BatchNorm1d(x.size()[1]).cuda(). Won't this fail to train the bn parameters, since the layer is re-created on every call?
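For concreteness, a minimal sketch of the pattern being described (the module name and tensor shapes are invented for illustration): because the bn layer is constructed inside forward, it is never registered on the model, so the optimizer never sees its weight/bias and its running statistics are discarded after each call.

```python
import torch
import torch.nn as nn

class ApplyBN(nn.Module):
    # Pattern described above: a fresh BatchNorm1d is built on every call.
    def forward(self, x):
        # x: (batch_size, max_num_nodes, hidden_dim)
        bn_module = nn.BatchNorm1d(x.size(1))  # .cuda() omitted in this sketch
        return bn_module(x)

model = ApplyBN()
print(list(model.parameters()))  # [] -- no bn weight/bias are registered, so an
                                 # optimizer built from model.parameters()
                                 # can never update them
```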

@shirley-wu

I also feel confused about this... It seems that since x.size(1) (that is, the maximum graph size in a batch) varies from batch to batch, the size of the BatchNorm layer cannot be fixed. Still, I think this usage of BatchNorm is problematic. Hoping to get more explanation about this!
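To make the size issue concrete, a small illustration with invented shapes: with input of shape (batch, max_nodes, hidden), a BatchNorm1d built from x.size(1) normalizes over the node axis, so its num_features changes whenever the largest graph in the batch changes.

```python
import torch
import torch.nn as nn

x_a = torch.randn(8, 12, 64)  # batch whose largest graph has 12 nodes
x_b = torch.randn(8, 17, 64)  # batch whose largest graph has 17 nodes

# The layer size is tied to the batch's maximum graph size, so a single
# fixed-size BatchNorm1d over the node axis cannot be shared across batches.
print(nn.BatchNorm1d(x_a.size(1)))  # BatchNorm1d(12, ...)
print(nn.BatchNorm1d(x_b.size(1)))  # BatchNorm1d(17, ...)
```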

@RexYing
Owner

RexYing commented Jul 24, 2019

Hi,

Thanks for pointing this out. I think batch norm is quite confusing for GNNs, and what you said makes sense. I pushed a new version with the bn layer's trainable parameters registered. It is confusing not only because of the varying graph size, but also because I'm not sure whether 1d or 2d batch norm makes more sense; and since there is no alignment between the nodes of different graphs, we cannot normalize along the node axis.
Performance-wise I don't see a difference. I'm still working on improving batch norm for GNNs in general.

Thanks again for raising this issue!

Rex
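
For reference, a minimal sketch of one way to register bn so that its parameters are trained: normalize over the feature dimension (whose size is fixed) rather than the node dimension. The layer below is illustrative and not necessarily the exact change pushed to the repository.

```python
import torch
import torch.nn as nn

class GNNLayerWithBN(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        # Registered in __init__, so weight/bias appear in model.parameters()
        # and running statistics persist across batches.
        self.bn = nn.BatchNorm1d(out_dim)

    def forward(self, x, adj):
        # x: (batch, max_nodes, in_dim), adj: (batch, max_nodes, max_nodes)
        h = torch.bmm(adj, self.linear(x))  # simple dense message passing
        b, n, d = h.shape
        # Flatten nodes so BN normalizes each feature channel over all nodes
        # in the batch; the feature dimension does not depend on graph size.
        h = self.bn(h.reshape(b * n, d)).reshape(b, n, d)
        return torch.relu(h)

layer = GNNLayerWithBN(32, 64)
x = torch.randn(4, 10, 32)
adj = torch.randint(0, 2, (4, 10, 10)).float()
print(layer(x, adj).shape)  # torch.Size([4, 10, 64])
```

Normalizing over the feature dimension also sidesteps the alignment issue mentioned above, since each feature channel is normalized over all nodes in the batch regardless of which graph they belong to.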
