# Latent Dirichlet Allocation using TensorFlow (Batch VB)


### Latent Dirichlet Allocation model:

 - $\theta_{d=1,...,M} \sim \mathrm{Dir}_K(\alpha)$
 - $\beta_{k=1,...,K} \sim \mathrm{Dir}_V(\eta)$
 - $z_{d=1,...,M,i=1,...,N_d} \sim \mathrm{Multinomial}_{ \ K}(\theta_d)$
 - $w_{d=1,...,M,i=1,...,N_d} \sim \mathrm{Multinomial}_{ \ V}(\beta_{z_{di}})$



where

 - $\eta$ is the parameter of the Dirichlet prior on the per-topic word distribution
 - $\alpha$ is the parameter of the Dirichlet prior on the per-document topic distributions
 - $\theta_d$ is the topic distribution for document d
 - $\beta_k$ is the word distribution for topic k
 - $z_{di}$ is the topic for the i-th word in document d
 - $w_{di}$ is the i-th word in document d
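
The generative process above can be simulated directly. The following is a minimal NumPy sketch, not part of the original notebook; the function name `sample_lda_corpus`, the fixed document length `doc_len`, the symmetric priors, and the default hyperparameter values are all illustrative assumptions.

```python
import numpy as np

def sample_lda_corpus(M=5, K=3, V=20, doc_len=50, alpha=0.5, eta=0.5, seed=0):
    """Draw a toy corpus from the LDA generative model (symmetric priors assumed)."""
    rng = np.random.default_rng(seed)
    beta = rng.dirichlet(np.full(V, eta), size=K)     # beta_k ~ Dir_V(eta), shape (K, V)
    theta = rng.dirichlet(np.full(K, alpha), size=M)  # theta_d ~ Dir_K(alpha), shape (M, K)
    z, w = [], []
    for d in range(M):
        # z_di ~ Multinomial_K(theta_d): one topic per token
        z_d = rng.choice(K, size=doc_len, p=theta[d])
        # w_di ~ Multinomial_V(beta_{z_di}): one word per token, given its topic
        w_d = np.array([rng.choice(V, p=beta[k]) for k in z_d])
        z.append(z_d)
        w.append(w_d)
    return theta, beta, z, w

theta, beta, z, w = sample_lda_corpus()
```

Each row of `theta` and `beta` is a probability vector, and `w[d]` holds the word indices of document d.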
 
In plate notation:

![LDA plate notation](LDA_plate_notation.png)

### Posterior distributions:

We assume a fully factorized (mean-field) variational distribution:


$$q(z, \beta, \theta) = \prod_d \prod_i q(z_{di}) \times \prod_d q(\theta_d) \times \prod_k q(\beta_k)$$

where

- $q(z_{di}) = p(z_{di}|\phi) = \mathrm{Multinomial}_{ \ K}(\phi_{dw_{di}}) $
- $q(\theta_d) = p(\theta_d|\gamma) = \mathrm{Dir}_{K}(\gamma_{d}) $
- $q(\beta_k) = p(\beta_k|\lambda) = \mathrm{Dir}_{V}(\lambda_{k}) $
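
To make the shapes concrete: because $q(z_{di})$ depends on a token only through its document d and its word $w_{di}$, $\phi$ can be stored as an array of shape (M, V, K), alongside $\gamma$ of shape (M, K) and $\lambda$ of shape (K, V). Below is a minimal NumPy sketch of one possible initialization. The function name `init_variational_params` and the Gamma(100, 1/100) starting values are assumptions made for illustration, not taken from this notebook.

```python
import numpy as np

def init_variational_params(M, K, V, seed=0):
    """Return (phi, gamma, lam) with the shapes implied by the factorization."""
    rng = np.random.default_rng(seed)
    # gamma_d parameterizes q(theta_d) = Dir_K(gamma_d); entries must be positive
    gamma = rng.gamma(100.0, 0.01, size=(M, K))
    # lambda_k parameterizes q(beta_k) = Dir_V(lambda_k); entries must be positive
    lam = rng.gamma(100.0, 0.01, size=(K, V))
    # phi[d, v] parameterizes q(z_di) for tokens with w_di = v; start uniform over topics
    phi = np.full((M, V, K), 1.0 / K)
    return phi, gamma, lam

phi, gamma, lam = init_variational_params(M=5, K=3, V=20)
```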

### ELBO:

From the definition of the **E**vidence **L**ower **Bo**und:

$$ \mathcal{L}(w, \phi, \gamma, \lambda) := \mathbb{E}_{z, \theta, \beta}\left[\log p(w, z, \theta, \beta| \alpha, \eta) \right] - \mathbb{E}_{z, \theta, \beta}\left[\log q(z, \theta, \beta) \right]$$
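
The name reflects a standard identity that holds for any choice of $q$ (stated here for context, using the same symbols as above):

$$ \log p(w| \alpha, \eta) = \mathcal{L}(w, \phi, \gamma, \lambda) + \mathrm{KL}\left(q(z, \theta, \beta) \,\|\, p(z, \theta, \beta| w, \alpha, \eta)\right) \geq \mathcal{L}(w, \phi, \gamma, \lambda) $$

Because the KL divergence is non-negative, maximizing $\mathcal{L}$ over $(\phi, \gamma, \lambda)$ both tightens the bound on the log evidence and moves $q$ closer to the true posterior.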

Using the factorization of the joint distribution implied by the LDA model, we can rewrite the first term as:

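$$
\begin{aligned}
\mathbb{E}_{z, \theta, \beta}\left[\log p(w, z, \theta, \beta| \alpha, \eta) \right] = {} & \sum_d \sum_i \mathbb{E}_{z, \beta}\left[\log p(w_{di}| z_{di}, \beta) \right] + \sum_d \sum_i \mathbb{E}_{z, \theta}\left[\log p(z_{di}| \theta_d) \right] \\
& + \sum_d \mathbb{E}_{\theta}\left[\log p(\theta_d| \alpha) \right] + \sum_k \mathbb{E}_{\beta}\left[\log p(\beta_k| \eta) \right]
\end{aligned}
$$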
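
Evaluating this expectation under $q$ requires $\mathbb{E}_q[\log \theta_{dk}]$ and $\mathbb{E}_q[\log \beta_{kv}]$ for each topic k and word v. For a Dirichlet distribution these have a closed form, $\mathbb{E}_q[\log \theta_{dk}] = \psi(\gamma_{dk}) - \psi\left(\sum_j \gamma_{dj}\right)$, where $\psi$ is the digamma function, and likewise for $\beta_k$ with $\lambda_k$. Here is a minimal sketch using NumPy and SciPy rather than TensorFlow; the helper name `dirichlet_expectation` is an assumption, not a function defined in this notebook.

```python
import numpy as np
from scipy.special import digamma

def dirichlet_expectation(params):
    """E[log x] for x ~ Dir(params), applied along the last axis."""
    params = np.asarray(params, dtype=float)
    return digamma(params) - digamma(params.sum(axis=-1, keepdims=True))

# Two example rows of variational parameters gamma_d (values chosen for illustration)
gamma = np.array([[1.0, 1.0], [2.0, 3.0]])
e_log_theta = dirichlet_expectation(gamma)
```

For the first row, $\psi(1) - \psi(2) = -1$, so both entries equal $-1$.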

In [2]:
import tensorflow as tf