
Clarification on the KL Divergence term in the Generator loss for the SVHN->MNIST model #42

Closed · hsm207 opened this issue on Jan 14, 2018 · 1 comment

@hsm207 (Contributor) commented on Jan 14, 2018:

I have a question about the _compute_kl function in the COCOGANDAContextTrainer class. These are the relevant parts of the code:

  def _compute_kl(self, mu, sd):
    mu_2 = torch.pow(mu, 2)
    sd_2 = torch.pow(sd, 2)
    encoding_loss = (mu_2 + sd_2 - torch.log(sd_2)).sum() / mu_2.size(0)
    return encoding_loss
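
If I read the code right, with N = mu.size(0) the batch size and the sum running over all samples n and latent dimensions j, this computes (my own restatement, not from the repo):

    encoding_loss = (1/N) * sum_{n,j} ( mu_{n,j}^2 + sd_{n,j}^2 - log(sd_{n,j}^2) )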

This function is used in gen_update:

    for i, lt in enumerate(lt_codes):
      encoding_loss += 2 * self._compute_kl(*lt)
    total_loss = hyperparameters['gan_w'] * ad_loss + \
                 hyperparameters['kl_normalized_direct_w'] * encoding_loss + \
                 hyperparameters['ll_normalized_direct_w'] * (ll_loss_a + ll_loss_b)

My question is: how did you derive this formula for the KL divergence term?

I thought it was based on the Auto-Encoding Variational Bayes paper (Kingma & Welling, 2013), which includes the term -D_KL(q(z|x) || p(z)) in its variational lower bound, and whose Appendix B gives the closed-form solution when both distributions are Gaussian:

    -D_KL(q(z) || p(z)) = 1/2 * sum_j ( 1 + log(sigma_j^2) - mu_j^2 - sigma_j^2 )

I note the following differences between the code and the paper:

  1. The KL divergence term is multiplied by 2 instead of 1/2. I guess this does not matter much since it just rescales the loss.

  2. There is no -1 term in the encoding_loss. Did you choose not to include it because it does not change the optimum point anyway? (I try to verify both differences numerically in the sketch below.)
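
Here is a minimal numerical check of both points (my own sketch, not code from this repo; the helper names kl_as_in_code and kl_as_in_paper are mine, and I assume mu and sd hold per-sample means and standard deviations). It verifies that, with the extra factor of 2 from gen_update, the code's term equals 4 * D_KL plus the constant 2 * latent_dim:

    import torch

    torch.manual_seed(0)
    mu = torch.randn(8, 16)       # a batch of 8 samples, latent dim 16
    sd = torch.rand(8, 16) + 0.1  # positive standard deviations

    def kl_as_in_code(mu, sd):
        # same computation as _compute_kl above
        mu_2 = torch.pow(mu, 2)
        sd_2 = torch.pow(sd, 2)
        return (mu_2 + sd_2 - torch.log(sd_2)).sum() / mu_2.size(0)

    def kl_as_in_paper(mu, sd):
        # D_KL(N(mu, sd^2) || N(0, 1)) from Appendix B, averaged over the batch
        sd_2 = torch.pow(sd, 2)
        return 0.5 * (torch.pow(mu, 2) + sd_2 - torch.log(sd_2) - 1).sum() / mu.size(0)

    latent_dim = mu.size(1)
    lhs = 2 * kl_as_in_code(mu, sd)    # the term added in gen_update
    rhs = 4 * kl_as_in_paper(mu, sd) + 2 * latent_dim
    print(torch.allclose(lhs, rhs))    # prints True

If that is right, the two versions differ only by a positive scale and an additive constant, so they have the same minimizers and the same gradients up to a factor of 4.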

@mingyuliutw (Owner) commented:

Yes, those were my considerations.
