# The Semi-Supervised VAE

The semi-supervised setting represents an interesting intermediate case where some of the data is labeled and some is not. It is also of great practical importance, since we often have very little labeled data and much more unlabeled data. We’d clearly like to leverage labeled data to improve our models of the unlabeled data.

The semi-supervised setting is also well suited to generative models, where missing data can be accounted for quite naturally—at least conceptually. As we will see, in restricting our attention to semi-supervised generative models, there will be no shortage of different model variants and possible inference strategies. Although we’ll only be able to explore a few of these variants in detail, hopefully you will come away from the tutorial with a greater appreciation for the abstractions and modularity offered by probabilistic programming.

So let’s go about building a generative model. We have a dataset $D$ with $N$ datapoints,

$$ D = \{ (x_i, y_i) \}_{i=1}^{N} $$

where the $\{x_i\}$ are always observed and the labels $\{y_i\}$ are only observed for some subset of the data. Since we want to be able to model complex variations in the data, we’re going to make this a latent variable model with a local latent variable $z_i$ private to each pair $(x_i, y_i)$. Even with this set of choices, a number of model variants are possible: we’re going to focus on the model variant depicted in Figure 1.

<img src="ss_vae_m2.png"  width="180" height="200">
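
One way to read the figure as a joint density is sketched below. It assumes that the label and the latent variable are independent a priori and that both feed into the likelihood of the image; the prior symbols $p(y_i)$ and $p(z_i)$ are notation introduced here for illustration, not taken from the text.

$$ p(x_i, y_i, z_i) = p(y_i) \, p(z_i) \, p(x_i \mid y_i, z_i) $$

Under the same assumption, a datapoint whose label is missing contributes through the marginal, in which the unobserved $y_i$ is summed out along with $z_i$:

$$ p(x_i) = \sum_{y_i} \int p(y_i) \, p(z_i) \, p(x_i \mid y_i, z_i) \, dz_i $$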

For convenience, and since we’re going to model MNIST in our experiments below, let’s suppose the $\{x_i\}$ are images and the $\{y_i\}$ are digit labels. In this model setup, the latent random variable $z_i$ and the (partially observed) digit label $y_i$ jointly generate the observed image $x_i$. The $z_i$ represents everything but the digit label, possibly handwriting style or position. Let’s sidestep asking when we expect this particular factorization of $(x_i, y_i, z_i)$ to be appropriate, since the answer to that question will depend in large part on the dataset in question (among other things). Let’s instead highlight some of the ways that inference in this model will be challenging as well as some of the solutions that we’ll be exploring in the rest of the tutorial.
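
Before turning to inference, the generative story described above can be made concrete with a small sketch in plain Python. Everything specific here is an assumption chosen for illustration: the standard normal prior on $z_i$, the uniform prior over digits, the Bernoulli pixel likelihood, the tiny dimensions, and the hypothetical `make_decoder` and `generate` helpers. It is not the tutorial's actual model code, and a real decoder would be a trained neural network rather than a fixed random linear map.

```python
import math
import random

NUM_CLASSES = 10  # digit labels 0-9
Z_DIM = 4         # hypothetical latent size
X_DIM = 16        # hypothetical pixel count (a real MNIST image has 784)


def make_decoder(rng):
    """Build a stand-in decoder: a fixed random linear map followed by a sigmoid."""
    in_dim = Z_DIM + NUM_CLASSES
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(X_DIM)]

    def decoder(z, y):
        # Concatenate the latent code with a one-hot encoding of the label.
        one_hot = [1.0 if k == y else 0.0 for k in range(NUM_CLASSES)]
        inputs = z + one_hot
        logits = [sum(w * v for w, v in zip(row, inputs)) for row in weights]
        # Map each logit to a per-pixel "on" probability.
        return [1.0 / (1.0 + math.exp(-logit)) for logit in logits]

    return decoder


def generate(decoder, rng, y=None):
    """Sample (x, y, z); y is drawn from the prior only when it is not supplied."""
    z = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]  # assumed N(0, I) prior
    if y is None:
        y = rng.randrange(NUM_CLASSES)               # assumed uniform label prior
    probs = decoder(z, y)
    x = [1 if rng.random() < p else 0 for p in probs]  # assumed Bernoulli pixels
    return x, y, z


rng = random.Random(0)
decoder = make_decoder(rng)

x_lab, y_lab, _ = generate(decoder, rng, y=3)  # label fixed, as if observed
x_unl, y_unl, _ = generate(decoder, rng)       # label drawn from the prior
```

The only difference between the two calls is whether the label is supplied or sampled; the latent code $z_i$ is drawn from its prior in both cases, which mirrors the partially observed role of $y_i$ in the model.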

# The Challenge of Inference

