## Evidence lower bound (ELBO)
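In variational Bayesian methods, the evidence lower bound (often abbreviated ELBO) is a lower bound on the log-likelihood of some observed data. It is useful because it can be evaluated and optimized even when the log-likelihood itself is intractable.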

### Definition
$$
\begin{aligned}
\text{ELBO} \coloneqq E_{z \sim q_{\phi}} \left [ \text{log} \frac{p_{\theta}(x, z)}{q_{\phi} (z)} \right ]
\end{aligned}
$$
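where $p_{\theta}(x, z)$ is the joint distribution of the observed data $x$ and the latent variable $z$, and $q_{\phi}(z)$ is the variational distribution used to approximate the posterior $p_{\theta}(z|x)$. $\theta$ and $\phi$ are the parameters of the model and of the variational distribution, respectively.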

The ELBO is a lower bound on the log evidence, which the rest of this section simply calls the evidence. It is the log of the marginal likelihood of $x$, evaluated at a fixed $\theta$.
$$
\begin{aligned}
\text{evidence} \coloneqq \text{log }p_{\theta}(x)
\end{aligned}
$$
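
As a rough illustration of the two definitions, the sketch below evaluates both quantities exactly for a toy model with a discrete latent variable, where the integral over $z$ becomes a finite sum. The three-state latent variable and all probability values are made-up assumptions, not part of any particular model.

```python
import math

# Hypothetical toy model: z takes 3 values, x is one fixed observation.
prior = [0.5, 0.3, 0.2]        # p(z), assumed values
likelihood = [0.1, 0.4, 0.7]   # p(x | z) for the observed x, assumed values
q = [0.2, 0.3, 0.5]            # variational distribution q(z), assumed values

# Joint p(x, z) = p(x | z) p(z) for each value of z.
joint = [pz * px for pz, px in zip(prior, likelihood)]


def elbo(q, joint):
    """E_{z ~ q}[log(p(x, z) / q(z))] for a discrete latent variable."""
    return sum(qz * math.log(pxz / qz) for qz, pxz in zip(q, joint) if qz > 0)


# Evidence as defined above: log p(x) = log sum_z p(x, z).
log_evidence = math.log(sum(joint))
elbo_value = elbo(q, joint)

print(f"evidence = {log_evidence:.4f}")
print(f"ELBO     = {elbo_value:.4f}")
```

For continuous $z$ the sum is usually not available in closed form, and the expectation is typically estimated from samples of $q_{\phi}$ instead.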

### Properties
1. The evidence is always greater than or equal to the ELBO. We refer to this inequality as the ELBO inequality.
   $$
   \begin{aligned}
   \text{log }p_{\theta}(x) &= \text{log} \int p_{\theta} (x|z) p_{\theta}(z) dz \\
   &= \text{log}\int p_{\theta}(x, z) dz \\
   &= \text{log}\int p_{\theta}(x, z) \frac{q_{\phi}(z)}{q_{\phi}(z)} dz \\
   &= \text{log}\int q_{\phi}(z) \frac{p_{\theta}(x, z)}{q_{\phi}(z)} dz \\
   &= \text{log} E_{z \sim q_{\phi}} \left [ \frac{p_{\theta}(x, z)}{q_{\phi}(z)} \right ] \\
   &\geq E_{z \sim q_{\phi}} \left [\text{log} \frac{p_{\theta}(x, z)}{q_{\phi}(z)} \right ] \quad \because \text{Jensen's inequality, since log is a concave function.} \\
   \\
   \therefore \text{evidence} &\geq \text{ELBO}
   \end{aligned}
   $$

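2. The KL divergence $D_{\text{KL}}(q_{\phi}(z)||p_{\theta}(z|x))$ between the variational distribution and the posterior equals $\text{evidence} - \text{ELBO}$. The bound is therefore tight exactly when $q_{\phi}(z)$ equals $p_{\theta}(z|x)$.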
   $$
   \begin{aligned}
   D_{\text{KL}}(q_{\phi}(z)||p_{\theta}(z|x)) &= \int q_{\phi}(z) \text{log} \frac{q_{\phi}(z)}{p_{\theta}(z|x)} dz \\
   &= E_{z \sim q_{\phi}} \left [ \text{log} \frac{q_{\phi}(z)}{p_{\theta}(z|x)} \right ] \\
   &= E_{z \sim q_{\phi}}[\text{log }q_{\phi}(z)] - E_{z \sim q_{\phi}}[\text{log } p_{\theta}(z|x)] \\
   &= E_{z \sim q_{\phi}}[\text{log }q_{\phi}(z)] - E_{z \sim q_{\phi}} \left [ \text{log} \left ( p_{\theta}(z|x) \frac{p_{\theta}(x)}{p_{\theta}(x)} \right ) \right ] \\ 
   &= E_{z \sim q_{\phi}}[\text{log}q_{\phi}(z)] - E_{z \sim q_{\phi}} \left [ \text{log} \frac{p_{\theta}(z, x)}{p_{\theta}(x)} \right ] \\ 
   &= E_{z \sim q_{\phi}}[\text{log}q_{\phi}(z)] - E_{z \sim q_{\phi}} [ \text{log } p_{\theta}(z, x) ] + E_{z \sim q_{\phi}}[\text{log } p_{\theta}(x)] \\ 
   &= \text{log } p_{\theta}(x) - E_{z \sim q_{\phi}} \left [ \text{log} \frac{p_{\theta}(x, z)}{q_{\phi}(z)} \right ] \quad \because p_{\theta}(x) \text{ does not depend on } z \\
   &= \text{evidence} - \text{ELBO}
   \end{aligned}
   $$
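
Both properties can be checked numerically. The sketch below is a minimal sanity check on a toy model with a discrete latent variable, so that every quantity is an exact finite sum. The model, its probability values, and the helper names `elbo` and `kl` are illustrative assumptions.

```python
import math

# Hypothetical toy model: z takes 3 values, x is one fixed observation.
prior = [0.5, 0.3, 0.2]        # p(z), assumed values
likelihood = [0.1, 0.4, 0.7]   # p(x | z) for the observed x, assumed values

joint = [pz * px for pz, px in zip(prior, likelihood)]  # p(x, z)
marginal = sum(joint)                                   # p(x)
evidence = math.log(marginal)                           # log p(x)
posterior = [pxz / marginal for pxz in joint]           # p(z | x)


def elbo(q):
    """E_{z ~ q}[log(p(x, z) / q(z))]."""
    return sum(qz * math.log(pxz / qz) for qz, pxz in zip(q, joint) if qz > 0)


def kl(q, p):
    """D_KL(q || p) for discrete distributions on the same support."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)


q = [0.2, 0.3, 0.5]  # an arbitrary variational distribution, assumed values
gap = evidence - elbo(q)

print(f"evidence - ELBO = {gap:.6f}")
print(f"KL(q || p(z|x)) = {kl(q, posterior):.6f}")
# Setting q to the exact posterior removes the gap (up to rounding).
print(f"gap at q = posterior: {evidence - elbo(posterior):.2e}")
```

The first two printed values should agree up to floating-point error, and the gap is non-negative for any valid choice of `q`, which is property 1 restated through property 2.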