the final loss #4

Closed
kakusikun opened this issue Nov 23, 2018 · 1 comment
kakusikun commented Nov 23, 2018

In the paper, the final loss function is presented in equation (12): the expected log-likelihood estimated through SGVB, together with the KL divergence.

It seems that the SBP layer only takes the KL divergence into account. Why don't we need to deal with the expected log-likelihood term?

Is the log-likelihood already included in the objective function?

necludov (Owner) commented

Thanks for your questions!

Each SBP layer adds its KL divergence to the final loss, since the KL divergence depends on the specific values of the parameters of the approximate posterior distribution. The final loss (the negative ELBO) is evaluated after the forward pass through the entire network, in the sgvlb function, so it does include the cross-entropy term (log-loss). Note that there is also an L2 loss there, but this is legacy code and it is turned off in the scripts for SBP model training.
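For intuition, here is a minimal sketch of how such a loss can be assembled. This is not the repository's actual sgvlb implementation; the TF1-style classification setup, one-hot labels, the list of per-layer KL scalars, and the dataset size N are all assumptions made for illustration.

```python
import tensorflow as tf

def sgvlb_sketch(logits, labels, kl_terms, num_train_examples):
    """Hedged sketch of a stochastic gradient variational lower bound (negative ELBO).

    logits:             network outputs for a mini-batch (one stochastic forward pass)
    labels:             one-hot targets for the same mini-batch
    kl_terms:           list of per-layer KL(q || p) scalars collected from the SBP layers
    num_train_examples: N, rescales the mini-batch data term to the full dataset
    """
    # Data term: expected log-likelihood, estimated by SGVB with a single noise
    # sample per forward pass; for classification this is the cross-entropy (log-loss).
    data_term = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

    # KL term: sum of the KL divergences contributed by every SBP layer.
    kl_term = tf.add_n(kl_terms)

    # Negative ELBO = N * cross-entropy + KL; minimizing it maximizes the ELBO.
    return num_train_examples * data_term + kl_term
```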

Informally speaking, you can think of the ELBO as an objective with two parts: a data term, which is the log-likelihood, and a KL term, which acts as a kind of regularizer. The final loss is the negative ELBO:
negative ELBO = data term (log-loss) + KL term
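Written out more explicitly (a sketch of the standard SGVB objective rather than a quote of equation (12); the notation below is assumed, not taken from the paper):

```latex
% Negative ELBO minimized during training, estimated by SGVB
% with one sample of the multiplicative noise per forward pass.
\mathcal{L}(\phi) \;=\;
  \underbrace{-\sum_{i=1}^{N} \mathbb{E}_{q_\phi(\theta)}\big[\log p(y_i \mid x_i, \theta)\big]}_{\text{data term (log-loss)}}
  \;+\;
  \underbrace{\mathrm{KL}\big(q_\phi(\theta)\,\|\,p(\theta)\big)}_{\text{KL term (regularizer)}}
```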
