
Better explanation of the decoder output
krasserm committed Jul 24, 2018
1 parent c5fb836 commit 6962f02
Showing 1 changed file with 1 addition and 1 deletion.
variational_autoencoder.ipynb (2 changes: 1 addition & 1 deletion)
@@ -171,7 +171,7 @@
"\n",
"![vae-3](images/vae/auto-encoder-3.png)\n",
"\n",
"We haven't defined the functional form of the probabilistic decoder $p(\\mathbf{x} \\lvert \\mathbf{t};\\mathbf\\theta)$ yet. If we train the variational auto-encoder with grey-scale [MNIST images](https://en.wikipedia.org/wiki/MNIST_database), for example, it makes sense to use a multivariate Bernoulli distribution that computes for each pixel the probability of being white. These probabilities are then simply mapped to values from 0-255 to generate grey-scale images. In the output layer of the decoder network there is one node per pixel with a sigmoid activation function. Hence, we compute the binary cross-entropy between the input image and the decoder output to estimate the expected reconstruction error.\n",
"We haven't defined the functional form of the probabilistic decoder $p(\\mathbf{x} \\lvert \\mathbf{t};\\mathbf\\theta)$ yet. If we train the variational auto-encoder with grey-scale [MNIST images](https://en.wikipedia.org/wiki/MNIST_database), for example, it makes sense to use a multivariate Bernoulli distribution. In this case, the output of the decoder network is the single parameter of this distribution. It defines for each pixel the probability of being white. These probabilities are then simply mapped to values from 0-255 to generate grey-scale images. In the output layer of the decoder network there is one node per pixel with a sigmoid activation function. Hence, we compute the binary cross-entropy between the input image and the decoder output to estimate the expected reconstruction error.\n",
"\n",
"Stochastic variational inference algorithms implemented as variational auto-encoders scale to very large datasets as they can be trained based on mini-batches. Furthermore, they can also be used for data other than image data. For example, Gómez-Bombarelli et. al.<sup>[5]</sup> use a sequential representation of chemical compounds together with an RNN-based auto-encoder to infer a continuous latent space of chemical compounds that can be used e.g. for generating new chemical compounds with properties that are desirable for drug discovery. I'll cover that in another blog post together with an implementation example of a variational auto-encoder."
]
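The explanation added in this commit (per-pixel Bernoulli parameters produced by a sigmoid output layer, with binary cross-entropy as the estimate of the expected reconstruction error) can be illustrated with a short code sketch. The snippet below is not part of the commit or the notebook; it assumes a Keras-style decoder, and names such as `latent_dim`, `decoder`, and `reconstruction_error` are illustrative placeholders.

```python
# Minimal sketch (assumption: a Keras-based VAE, similar in spirit to the notebook).
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.losses import binary_crossentropy

latent_dim = 2          # assumed size of the latent code t
original_dim = 28 * 28  # flattened grey-scale MNIST image

# Decoder: one sigmoid node per pixel, i.e. the per-pixel Bernoulli
# parameter P(pixel is white | t).
t_in = layers.Input(shape=(latent_dim,))
h = layers.Dense(256, activation='relu')(t_in)
p = layers.Dense(original_dim, activation='sigmoid')(h)
decoder = models.Model(t_in, p)

# Expected reconstruction error, estimated as the binary cross-entropy
# between the input image (scaled to [0, 1]) and the decoder output.
def reconstruction_error(x_true, x_decoded):
    return binary_crossentropy(x_true, x_decoded) * original_dim

# Generating an image: map the Bernoulli parameters to grey-scale values 0-255.
t_sample = np.zeros((1, latent_dim), dtype='float32')
grey_image = (decoder.predict(t_sample) * 255).astype('uint8').reshape(28, 28)
```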
