In your code, `x = self.conv(x.mul(self.scale))`, the input `x` is multiplied by `scale`, where `scale = sqrt(2 / fan_in)` comes from the He initializer. I am a bit confused by the multiplication. The paper states that `w_i_hat = w / scale`, which for a convolution can be achieved with `out = conv(x / scale)`.
My question is: why is `x` multiplied by the scale instead of divided by it? Please help.
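
To make the question concrete, here is a minimal sketch of the equivalence I am relying on: for a convolution without bias, scaling the input gives the same output as scaling the weights by the same factor. The `conv1d` helper, the toy values, and `fan_in = 3` are illustrative assumptions written in plain Python, not code from this repository.

```python
import math

def conv1d(x, w):
    """Valid (no padding) 1D cross-correlation without bias (hypothetical helper)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]

# Assumed toy layer: one input channel, kernel size 3, so fan_in = 3.
fan_in = 3
scale = math.sqrt(2 / fan_in)

x = [1.0, -2.0, 0.5, 3.0, -1.5]
w = [0.3, -0.7, 1.1]

# Scaling the input, as in x.mul(self.scale) ...
scaled_input = conv1d([v * scale for v in x], w)
# ... matches scaling the weights by the same factor, because conv is linear.
scaled_weight = conv1d(x, [v * scale for v in w])
# Dividing the weights by the scale is a different operation.
divided_weight = conv1d(x, [v / scale for v in w])

same = all(math.isclose(a, b) for a, b in zip(scaled_input, scaled_weight))
differs = any(not math.isclose(a, b) for a, b in zip(scaled_input, divided_weight))
```

So `conv(x * scale)` behaves like a convolution with weights `w * scale`, not `w / scale`. The equivalence only holds when the bias is added outside the scaled convolution.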