Tessellate performance #2438
-
TL;DR: Tessellate backprop is very slow. My LBANN StyleGAN implementation is noticeably slower than the PyTorch implementation. I just profiled it with nvprof (for 1000 epochs) and found that the largest share of time is spent on backprop for the tessellate function.
At several points in my code I need to multiply a 1D set of activations (dim = 512) by the weights of a 2D conv (e.g. 512x512x3x3), so I tessellate the 1D array to the same shape as the weights and then apply elementwise operations. Is there a better way to do this that bypasses this large performance overhead? Here is an example of what I mean:

```python
# apply weight demodulation, based on styles * weights.
styles_demod_reshaped = lbann.Reshape(styles, dims=[in_channels, 1, 1])
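# Tessellate repeats the [in_channels, 1, 1] styles tensor until it
# matches the flattened conv weight shape, so it can be multiplied
# with the weights elementwise.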
styles_shape_weights = lbann.Tessellate(
    styles_demod_reshaped,
    dims=[in_channels * out_channels, kernel_size, kernel_size],
)
w = lbann.Multiply(w, styles_shape_weights)
```

I also create the biases like this, so they need to be tessellated before being added onto the activations:

```python
b = lbann.WeightsLayer(
    weights=lbann.Weights(
        initializer=lbann.ConstantInitializer(value=0.0),
        name=name + "bias",
    ),
    dims=[out_channels, 1, 1],
    name=name + "biaslayer",
)
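# Tile the [out_channels, 1, 1] bias across the spatial dims to match
# the activation shape before the elementwise Add.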
b = lbann.Tessellate(
    b,
    dims=[
        self.out_channels,
        self.resolution,
        self.resolution,
    ],
)
x = lbann.Add(x, b)
```
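For reference, here is a minimal NumPy sketch of what these two Tessellate patterns compute (the channel and kernel sizes are taken from the example above; the resolution value is just a placeholder). In plain NumPy the same results come from broadcasting, without materializing a weight-sized copy of styles:

```python
import numpy as np

# Placeholder sizes matching the example above; resolution is assumed.
in_channels, out_channels, kernel_size, resolution = 512, 512, 3, 64

styles = np.random.rand(in_channels).astype(np.float32)
w = np.random.rand(
    out_channels * in_channels, kernel_size, kernel_size
).astype(np.float32)

# Reshape + Tessellate + Multiply: tile the [in_channels, 1, 1] styles
# tensor up to the flattened weight shape, then multiply elementwise.
styles_tiled = np.tile(
    styles.reshape(in_channels, 1, 1),
    (out_channels, kernel_size, kernel_size),
)
w_demod = w * styles_tiled

# Broadcast equivalent: no weight-sized intermediate for styles.
w4 = w.reshape(out_channels, in_channels, kernel_size, kernel_size)
w_demod_bcast = (w4 * styles.reshape(1, in_channels, 1, 1)).reshape(w.shape)
assert np.allclose(w_demod, w_demod_bcast)

# Same story for the bias: Tessellate + Add vs. a broadcast add.
b = np.zeros((out_channels, 1, 1), dtype=np.float32)
x = np.random.rand(out_channels, resolution, resolution).astype(np.float32)
assert np.allclose(x + np.tile(b, (1, resolution, resolution)), x + b)
```

Thanks,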
-
Hi @jvwilliams23, I just uploaded PR #2460 to try to address the issue. Please let us know if that performs better.