In the loss_function part of the VAE example, I noticed that
```python
KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
# Normalise by same number of elements as in reconstruction
KLD /= args.batch_size * 784
```
But the dimensionality of the latent variables (`mu`, `logvar`) is 20, not 784. The code should therefore either keep the `torch.sum` and normalize by `args.batch_size * 20`, or simply use `torch.mean`; otherwise the BCE and KLD losses are not scaled consistently against each other. Changing the normalization from 784 to 20 does increase the test error at the end of training, but that is simply because the smaller denominator increases the magnitude of the KLD term.
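For concreteness, here is a minimal sketch of the proposed fix, assuming a latent dimensionality of 20 and the example's 28×28 = 784-pixel MNIST inputs. The explicit `latent_dim` parameter, the function signature, and the `reduction='sum'` keyword (the current-API spelling of `size_average=False`) are my own additions for illustration, not the example's exact code:

```python
import torch
import torch.nn.functional as F

def loss_function(recon_x, x, mu, logvar, batch_size, latent_dim=20):
    # Reconstruction term: BCE summed over all pixels, then normalized
    # by the number of reconstructed elements (batch_size * 784).
    BCE = F.binary_cross_entropy(recon_x, x.view(-1, 784), reduction='sum')
    BCE /= batch_size * 784

    # KL divergence between q(z|x) = N(mu, exp(logvar)) and the prior N(0, I):
    # -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Normalise by the number of latent elements (batch_size * 20),
    # not by the number of pixels, so both terms are per-element averages.
    KLD /= batch_size * latent_dim

    return BCE + KLD
```

Equivalently, taking `torch.mean` over the KLD elements gives the same per-element scaling without the explicit division.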