
Vanilla-VAE and usage of kld_weight #11

Closed
ksachdeva opened this issue Sep 22, 2020 · 2 comments
Comments

@ksachdeva

Hi @AntixK

Many thanks for this great effort.

Based on my understanding so far, the original VAE paper does not weight the KL-divergence loss. Later, beta-VAE and many other papers made the case for weighting the KL divergence (and essentially treating it as a hyper-parameter).

In your implementations, I see that you consistently use kld_weight = kwargs['M_N'] = batch_size / num_of_images.
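
For concreteness, here is a rough sketch of the pattern I am referring to (my own paraphrase with illustrative names, not your exact code):

```python
import torch
import torch.nn.functional as F

def vae_loss(recons, inputs, mu, log_var, kld_weight):
    # Reconstruction term, averaged over batch and pixels.
    recons_loss = F.mse_loss(recons, inputs)
    # KL(N(mu, sigma^2) || N(0, I)), summed over latent dims, averaged over the batch.
    kld_loss = torch.mean(-0.5 * torch.sum(1 + log_var - mu ** 2 - log_var.exp(), dim=1), dim=0)
    # kld_weight is what I mean by kwargs['M_N'] = batch_size / num_of_images.
    return recons_loss + kld_weight * kld_loss
```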

Is it the norm to set the weight of the KL-divergence loss to the ratio of the batch size to the number of training images?

Since no weighting was done in the original VAE paper, is it okay to use it in vanilla_vae.py?

Regards
Kapil

@AntixK (Owner) commented Sep 27, 2020

It is just a bias-correction term accounting for the minibatch. When small batch sizes are used, it can lead to a large variance in the KLD value. But it should work without that kld_weight term too.
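
For example, a toy sketch (made-up per-sample KLD numbers, nothing from the repo) showing how the minibatch size affects the spread of the mean KLD:

```python
import torch

torch.manual_seed(0)
# Stand-in per-image KLD values for a dataset of 60k images (hypothetical numbers).
per_sample_kld = torch.rand(60000) * 10.0

for batch_size in (8, 64, 512):
    # Sample many random minibatches and measure the spread of the minibatch-mean KLD.
    idx = torch.randint(0, per_sample_kld.numel(), (1000, batch_size))
    batch_means = per_sample_kld[idx].mean(dim=1)
    print(f"batch_size={batch_size:4d}  std of mean KLD = {batch_means.std().item():.3f}")
```

The minibatch mean stays unbiased, but its spread shrinks as the batch grows, which is why a small batch gives a noisier KLD term.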

@simonhessner

Related question: #40
