In this lecture, we will derive the core principles behind Variational Autoencoders (VAEs).

The goal is to learn a generative model consisting of: i) a low-dimensional latent space, and ii) a decoder that maps from the latent space to the data space.

{/* To achieve this, an autoencoder will also learn a corresponding encoder that maps from the data space to the latent space. */}

### Autoencoders

**Idea** The core idea of vanilla autoencoders is to *jointly* learn an *encoder* $e_\varphi$ and a *decoder* $d_\theta$. The encoder maps a data point $x \in \mathbb{R}^d$ to a low-dimensional *latent representation* $e_\varphi(x)$, and the decoder maps this latent representation back to the data space $\mathbb{R}^d$, producing the reconstruction $d_\theta(e_\varphi(x))$.

The parameters of the encoder and the decoder are learned by minimizing the error between a data point $x$ and its reconstruction $d_\theta(e_\varphi(x))$:

$$
\mathcal{L}(\varphi, \theta) = \sum_{x \in \text{data}} \| x - d_\theta(e_\varphi(x)) \|^2
$$

{/* (or in typst) <T block v='cal(L)(phi, theta) = sum_(x in "data") || x - d_(theta)(e_(phi)(x)) ||^2' /> */}
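
To make this concrete, here is a minimal sketch of a vanilla autoencoder trained with exactly this reconstruction loss. It is not part of the lecture material: the choice of PyTorch, the MLP architectures, the dimensions `d = 784` and `k = 2`, and the Adam optimizer are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions): data space R^d, latent space R^k.
d, k = 784, 2

encoder = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, k))  # e_phi
decoder = nn.Sequential(nn.Linear(k, 128), nn.ReLU(), nn.Linear(128, d))  # d_theta

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

def training_step(x):
    """One gradient step on a batch x of shape (batch_size, d)."""
    z = encoder(x)                               # latent representation e_phi(x)
    x_hat = decoder(z)                           # reconstruction d_theta(e_phi(x))
    loss = ((x - x_hat) ** 2).sum(dim=1).mean()  # squared reconstruction error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Summing (or averaging) this loss over mini-batches corresponds to the objective $\mathcal{L}(\varphi, \theta)$ above.
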

#### A Probabilistic Interpretation of Autoencoders
As in most cases where a squared error is minimized, we can interpret the decoder as a Gaussian likelihood model:
$$
p(x|z, \theta) = \underbrace{\mathcal{N}(x; d_\theta(z), I)}_{\substack{\text{density of a Gaussian variable with mean } d_\theta(z) \\ \text{ and identity covariance, evaluated at point } x } } \enspace.
$$
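
To see the connection, note that the negative log-density of this Gaussian model is, up to an additive constant and a factor of $\tfrac{1}{2}$, the squared reconstruction error:

$$
-\log p(x|z, \theta) = \frac{1}{2} \| x - d_\theta(z) \|^2 + \frac{d}{2} \log(2\pi) \enspace,
$$

so minimizing the reconstruction error with $z = e_\varphi(x)$ amounts to maximizing this likelihood.
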
In Variational Autoencoders (VAEs), the position $z_i$ in the latent space (for a data point $x_i$) is treated as a random variable.
Indeed, there is technically some uncertainty about the exact position of $z_i$ that best explains $x_i$, especially given that we consider all points jointly.
In a VAE, one thus manipulates, for each point $x_i$, a distribution over its latent variable $z_i$.

We will now decompose the construction of the VAE, starting with formulations that involve only the decoder.
The encoder will be introduced later as a trick (i.e., amortization).

#### MAP Estimation of the Latent Variables
We are interested in estimating both the decoder parameters $\theta$ and the latent variables $z_i$ for each data point $x_i$.
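
As a preview, here is a minimal sketch of what such a joint estimation can look like: the decoder parameters $\theta$ and one free latent vector $z_i$ per data point are optimized together by gradient descent on the negative log-posterior. The standard normal prior $p(z_i) = \mathcal{N}(z_i; 0, I)$, the PyTorch setup, and the placeholder data are illustrative assumptions, not the lecture's reference implementation.

```python
import torch
import torch.nn as nn

# Illustrative setup (assumptions): n data points in R^d, latent space R^k.
n, d, k = 1000, 784, 2
X = torch.randn(n, d)  # placeholder data; replace with the actual data set

decoder = nn.Sequential(nn.Linear(k, 128), nn.ReLU(), nn.Linear(128, d))  # d_theta
Z = nn.Parameter(torch.zeros(n, k))  # one free latent variable z_i per data point x_i

optimizer = torch.optim.Adam(list(decoder.parameters()) + [Z], lr=1e-3)

for step in range(2000):
    X_hat = decoder(Z)
    # Negative log-posterior, up to additive constants:
    #   Gaussian likelihood p(x_i | z_i, theta)  ->  0.5 * ||x_i - d_theta(z_i)||^2
    #   assumed standard normal prior p(z_i)     ->  0.5 * ||z_i||^2
    nll = 0.5 * ((X - X_hat) ** 2).sum()
    neg_log_prior = 0.5 * (Z ** 2).sum()
    loss = nll + neg_log_prior
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
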
//However, this would lead to overfitting, as we could always increase the likelihood by increasing the capacity of the decoder and setting $z_i$ to arbitrary values.