# **Chapter 17**
# **Representation Learning and Generative Learning Using Autoencoders and GANs**

**Introduction to Autoencoders and GANs**

This chapter introduces autoencoders and generative adversarial networks (GANs) as two major approaches for unsupervised representation learning and data generation. Autoencoders learn compact latent representations by reconstructing their inputs, while GANs generate highly realistic synthetic data through adversarial training between two neural networks.

Autoencoders are useful for dimensionality reduction, feature extraction, unsupervised pretraining, and basic generative tasks. However, the images they generate tend to be blurry. GANs, on the other hand, are capable of producing extremely realistic images and are widely used in modern generative applications.

**Efficient Data Representations**

Efficient representations help models store information using patterns rather than raw memorization. Just as humans remember patterns more easily than random data, autoencoders are forced to learn meaningful structures when constraints are applied to their architecture.

An autoencoder consists of:

Encoder: compresses the input into a latent representation

Decoder: reconstructs the input from that representation

If the latent representation has lower dimensionality than the input, the autoencoder is called undercomplete, which prevents trivial copying and forces feature learning.

**Performing PCA with an Undercomplete Linear Autoencoder**

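When an autoencoder uses only linear activations and is trained with a mean squared error (MSE) cost function, it ends up performing Principal Component Analysis (PCA): its codings span the same subspace as the top principal components, although the learned axes need not be orthogonal or match the principal components one by one.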

In [2]:
from tensorflow import keras

encoder = keras.models.Sequential([
    keras.layers.Dense(2, input_shape=[3])
])

decoder = keras.models.Sequential([
    keras.layers.Dense(3, input_shape=[2])
])

autoencoder = keras.models.Sequential([encoder, decoder])

autoencoder.compile(
    loss="mse",
    optimizer=keras.optimizers.SGD(learning_rate=0.1)
)

autoencoder.summary()
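
As a rough check of the PCA claim, the NumPy sketch below trains a bias-free linear 3 → 2 → 3 autoencoder with plain gradient descent and compares its reconstruction error with a rank-2 PCA computed through SVD. The toy dataset, the learning rate, the step count, and every variable name are assumptions made for illustration; this is not the Keras model above, only the same idea in miniature.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy data: 200 points lying close to a 2D plane inside 3D space.
basis = np.linalg.qr(rng.normal(size=(3, 3)))[0]          # random orthonormal basis
latent = rng.normal(size=(200, 3)) * [1.0, 0.5, 0.05]     # little variance on the 3rd axis
X = latent @ basis.T
X = X - X.mean(axis=0)                                    # PCA assumes centered data

# Reference: PCA via SVD, keeping the top 2 principal components.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
X_pca = X @ Vt[:2].T @ Vt[:2]
pca_mse = np.mean((X - X_pca) ** 2)

# Linear autoencoder without biases, trained by gradient descent on the MSE.
W_enc = rng.normal(scale=0.1, size=(3, 2))
W_dec = rng.normal(scale=0.1, size=(2, 3))
learning_rate = 0.1
for _ in range(5000):
    codings = X @ W_enc
    error = codings @ W_dec - X
    grad_out = 2 * error / X.size              # d(MSE) / d(reconstruction)
    grad_dec = codings.T @ grad_out
    grad_enc = X.T @ (grad_out @ W_dec.T)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

ae_mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(f"PCA MSE: {pca_mse:.5f}, autoencoder MSE: {ae_mse:.5f}")
```

The autoencoder's error can never drop below the PCA error, because the rank-2 PCA reconstruction is the best possible rank-2 linear reconstruction. After training, the two values should be close.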


**Stacked Autoencoders**

Stacked autoencoders use multiple hidden layers to learn more complex representations. Their architecture is typically symmetrical around the coding layer.

In [5]:
from tensorflow import keras

stacked_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(30, activation="selu"),
])

stacked_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=[30]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])

stacked_ae = keras.models.Sequential([stacked_encoder, stacked_decoder])

stacked_ae.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.SGD(learning_rate=1.5)
)

stacked_ae.summary()


**Visualizing Reconstructions**

Reconstruction quality is evaluated by comparing original images with reconstructed outputs.

In [6]:
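import matplotlib.pyplot as plt

def plot_image(image):
    # Draw one grayscale image without axis ticks.
    plt.imshow(image, cmap="binary")
    plt.axis("off")

def show_reconstructions(model, n_images=5):
    # X_valid is assumed to be the validation image array defined earlier in the notebook.
    reconstructions = model.predict(X_valid[:n_images])
    fig = plt.figure(figsize=(n_images * 1.5, 3))
    for image_index in range(n_images):
        # Top row: original images.
        plt.subplot(2, n_images, 1 + image_index)
        plot_image(X_valid[image_index])
        # Bottom row: the corresponding reconstructions.
        plt.subplot(2, n_images, 1 + n_images + image_index)
        plot_image(reconstructions[image_index])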
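
Visual inspection can be complemented with a number. The sketch below computes one mean squared error per image; the helper name `reconstruction_errors` and the dummy arrays are hypothetical and only stand in for a slice of the validation images and the matching model predictions.

```python
import numpy as np

def reconstruction_errors(originals, reconstructions):
    """Return one mean squared error per image (lower means a closer match)."""
    originals = np.asarray(originals, dtype=np.float64)
    reconstructions = np.asarray(reconstructions, dtype=np.float64)
    if originals.shape != reconstructions.shape:
        raise ValueError("originals and reconstructions must have the same shape")
    squared_diff = (originals - reconstructions) ** 2
    # Average over every axis except the first one (the image index).
    return squared_diff.reshape(len(originals), -1).mean(axis=1)

# Hypothetical stand-ins for three validation images and their reconstructions.
images = np.zeros((3, 28, 28))
recons = np.zeros((3, 28, 28))
recons[1] += 0.5                      # every pixel of image 1 is off by 0.5
errors = reconstruction_errors(images, recons)
print(errors.tolist())                # [0.0, 0.25, 0.0]
```

Sorting images by this error is one simple way to find the inputs the autoencoder reconstructs worst.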


**Unsupervised Pretraining**

Autoencoders can be trained on large unlabeled datasets and reused as feature extractors for supervised tasks with limited labeled data. The encoder layers are transferred to a classifier, often with frozen weights.

**Tying Weights**

Weight tying reduces parameters by making decoder weights the transpose of encoder weights.

In [7]:
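import tensorflow as tf
from tensorflow import keras

class DenseTranspose(keras.layers.Layer):
    def __init__(self, dense, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.dense = dense  # the encoder Dense layer whose kernel is reused
        self.activation = keras.activations.get(activation)

    def build(self, batch_input_shape):
        # Only the bias is new. Its size is the tied layer's input dimension,
        # read here from the kernel shape [n_inputs, n_units], so the tied
        # Dense layer must already be built.
        self.biases = self.add_weight(name="bias", initializer="zeros",
                                      shape=[self.dense.kernel.shape[0]])
        super().build(batch_input_shape)

    def call(self, inputs):
        # Multiply by the transpose of the encoder kernel.
        z = tf.matmul(inputs, self.dense.kernel, transpose_b=True)
        return self.activation(z + self.biases)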
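
To put a number on the savings, the sketch below counts parameters for the 784 → 100 → 30 layout used by the stacked autoencoder earlier in this chapter, with and without tied weights, and then shows the transpose trick on a tiny NumPy kernel. The helper `dense_params` and the small random arrays are hypothetical names introduced for this illustration.

```python
import numpy as np

def dense_params(n_inputs, n_units):
    """Parameters of a dense layer: one kernel plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [28 * 28, 100, 30]     # input, hidden, coding

encoder_params = sum(dense_params(a, b)
                     for a, b in zip(layer_sizes, layer_sizes[1:]))
# An untied decoder mirrors the encoder with its own kernels: 30 -> 100 -> 784.
reversed_sizes = layer_sizes[::-1]
untied_decoder_params = sum(dense_params(a, b)
                            for a, b in zip(reversed_sizes, reversed_sizes[1:]))
# A tied decoder reuses the encoder kernels and only adds new bias vectors.
tied_decoder_params = layer_sizes[1] + layer_sizes[0]

print(encoder_params + untied_decoder_params)   # 163814
print(encoder_params + tied_decoder_params)     # 82414

# The transpose trick itself, on a hypothetical 4 -> 2 kernel.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))
x = rng.normal(size=(5, 4))
codings = x @ W            # encoder: kernel W
recon = codings @ W.T      # tied decoder: same kernel, transposed
print(recon.shape)         # (5, 4)
```

With tying, the model has roughly half as many parameters, which tends to speed up training and reduce the risk of overfitting.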


**Convolutional Autoencoders**

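For image data, convolutional autoencoders replace dense layers with convolutional and pooling layers. The encoder is a regular CNN that shrinks the spatial dimensions (height and width) while increasing the depth (number of feature maps); the decoder must do the reverse, for example with transposed convolutional layers.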

In [8]:
conv_encoder = keras.models.Sequential([
    keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),
    keras.layers.Conv2D(16, kernel_size=3, padding="same", activation="selu"),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(32, kernel_size=3, padding="same", activation="selu"),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(64, kernel_size=3, padding="same", activation="selu"),
    keras.layers.MaxPool2D(pool_size=2)
])
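
The shape arithmetic of this encoder can be traced by hand. The sketch below does so in plain Python, assuming stride-1 convolutions with `padding="same"` (which keep height and width) and max pooling with its default `"valid"` padding (which floor-divides them). The function name `conv_encoder_shapes` is hypothetical.

```python
def conv_encoder_shapes(height=28, width=28, filters=(16, 32, 64), pool_size=2):
    """Track (height, width, channels) through Conv2D("same") + MaxPool2D blocks."""
    shapes = [(height, width, 1)]
    for n_filters in filters:
        # The convolution only changes the depth; the pooling layer then
        # floor-divides height and width by pool_size.
        height, width = height // pool_size, width // pool_size
        shapes.append((height, width, n_filters))
    return shapes

shapes = conv_encoder_shapes()
print(shapes)   # [(28, 28, 1), (14, 14, 16), (7, 7, 32), (3, 3, 64)]

final_height, final_width, final_depth = shapes[-1]
print(final_height * final_width * final_depth)   # 576
```

The codings are therefore a 3 × 3 × 64 volume (576 values), smaller than the 784 input pixels.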




**Denoising Autoencoders**

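Noise is added to the inputs during training, and the autoencoder is trained to recover the original, noise-free inputs. The noise can be Gaussian noise added to the inputs, or randomly switched-off inputs, as with dropout.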

In [9]:
# Applied to the inputs, Dropout zeroes about half of them at random (during training only).
keras.layers.Dropout(0.5)


<Dropout name=dropout, built=True>
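
The two kinds of corruption can be sketched in NumPy as follows. The function names, the noise levels, and the random stand-in images are assumptions for illustration; the rescaling by `1 / (1 - rate)` mirrors what a dropout layer does during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(X, stddev=0.2):
    """Add zero-mean Gaussian noise to every input value."""
    return X + rng.normal(scale=stddev, size=X.shape)

def apply_dropout_noise(X, rate=0.5):
    """Zero each input with probability `rate` and rescale the survivors."""
    keep_mask = rng.random(X.shape) >= rate
    return X * keep_mask / (1.0 - rate)

# Hypothetical batch of four 28x28 images with pixel values in [0, 1).
X_clean = rng.random((4, 28, 28))
X_gauss = add_gaussian_noise(X_clean)
X_dropped = apply_dropout_noise(X_clean)

# A denoising autoencoder is trained on (noisy input, clean target) pairs.
fraction_zeroed = np.mean(X_dropped == 0)
print(f"fraction of inputs zeroed: {fraction_zeroed:.2f}")
```

In both cases the training target stays `X_clean`, so the network cannot succeed by copying its input.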

**Sparse Autoencoders**

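A sparsity term added to the cost function pushes the autoencoder to keep only a small number of coding-layer neurons significantly active at a time, so each neuron tends to end up representing a useful feature.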

In [10]:
# Adds an L1 penalty on this layer's inputs (typically the codings) to the training loss.
keras.layers.ActivityRegularization(l1=1e-3)


<ActivityRegularization name=activity_regularization, built=True>
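
The idea behind the L1 activity penalty can be sketched in a few lines of NumPy. This is an illustration of the penalty term only, with a hypothetical helper name and made-up codings; the exact normalization a framework applies (for example, averaging over the batch) may differ.

```python
import numpy as np

def l1_activity_penalty(activations, l1=1e-3):
    """Sparsity penalty: l1 times the sum of absolute activations."""
    return l1 * np.sum(np.abs(activations))

dense_codings = np.full((1, 30), 0.5)     # all 30 coding units are active
sparse_codings = np.zeros((1, 30))
sparse_codings[0, :3] = 0.5               # only 3 of the 30 units are active

dense_penalty = l1_activity_penalty(dense_codings)
sparse_penalty = l1_activity_penalty(sparse_codings)
print(dense_penalty > sparse_penalty)     # True
```

Because the penalty grows with every nonzero activation, minimizing it alongside the reconstruction loss favors codings in which most units stay at or near zero.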

**Variational Autoencoders (VAEs)**

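VAEs are probabilistic, generative autoencoders. Instead of producing a coding directly, the encoder outputs a mean and a log variance for each coding dimension, and the actual coding is sampled from the corresponding Gaussian distribution. The resulting latent space is continuous, so new instances can be generated by decoding randomly sampled codings.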

In [11]:
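import tensorflow as tf
from tensorflow import keras

class Sampling(keras.layers.Layer):
    def call(self, inputs):
        mean, log_var = inputs
        # Reparameterization trick: sigma = exp(log_var / 2), noise ~ N(0, 1).
        return tf.random.normal(tf.shape(log_var)) * tf.exp(log_var / 2) + mean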
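
The same sampling step, together with the latent loss that usually accompanies it, can be written in NumPy. The sketch below assumes a standard normal prior, for which the KL divergence has the closed form used in `latent_loss`; the function names and test values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_codings(mean, log_var):
    """Reparameterization trick: codings = mean + sigma * epsilon, epsilon ~ N(0, 1)."""
    epsilon = rng.normal(size=np.shape(log_var))
    return mean + np.exp(log_var / 2) * epsilon

def latent_loss(mean, log_var):
    """KL divergence from N(mean, sigma^2) to N(0, 1), summed over coding dimensions."""
    return -0.5 * np.sum(1 + log_var - np.exp(log_var) - np.square(mean), axis=-1)

# Hypothetical encoder outputs: mean 0 and sigma 2 (log variance = log 4).
mean = np.zeros((10000, 2))
log_var = np.full((10000, 2), np.log(4.0))
codings = sample_codings(mean, log_var)
print(f"empirical std of the codings: {codings.std():.2f}")

# The loss is zero when the coding distribution already matches the prior
# and positive once the mean moves away from zero.
loss_at_prior = latent_loss(np.zeros((1, 2)), np.zeros((1, 2)))
loss_shifted = latent_loss(np.ones((1, 2)), np.zeros((1, 2)))
```

Predicting the log variance rather than the standard deviation keeps the output unconstrained, since any real number maps to a valid positive variance.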


**Generative Adversarial Networks (GANs)**

GANs consist of:

Generator: creates fake data

Discriminator: classifies real vs fake

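Each training iteration has two phases. First, the discriminator is trained on a batch containing both real and generated images. Then the generator is trained through the combined model while the discriminator's weights are frozen, so that it learns to produce images the discriminator classifies as real.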

In [13]:
from tensorflow import keras

codings_size = 30

generator = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=(codings_size,)),
    keras.layers.Dense(150, activation="selu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])

generator.summary()


In [14]:
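discriminator = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(150, activation="selu"),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(1, activation="sigmoid")
])

# Chain both networks into the combined model used by the training loop.
gan = keras.models.Sequential([generator, discriminator])
discriminator.compile(loss="binary_crossentropy", optimizer="rmsprop")
discriminator.trainable = False  # freeze the discriminator for the generator phase
gan.compile(loss="binary_crossentropy", optimizer="rmsprop")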


**GAN Training Loop**

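Because each iteration alternates between two training phases, the regular fit() method cannot be used, so GANs require a custom training loop.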

In [15]:
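import tensorflow as tf

def train_gan(gan, dataset, batch_size, codings_size, n_epochs=50):
    generator, discriminator = gan.layers
    for epoch in range(n_epochs):
        # Every batch is assumed to hold exactly batch_size float32 images
        # (e.g. a tf.data pipeline batched with drop_remainder=True).
        for X_batch in dataset:
            # Phase 1: train the discriminator on fake (label 0) and real (label 1) images.
            noise = tf.random.normal(shape=[batch_size, codings_size])
            generated_images = generator(noise)
            X_fake_and_real = tf.concat([generated_images, X_batch], axis=0)
            y1 = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)
            discriminator.trainable = True
            discriminator.train_on_batch(X_fake_and_real, y1)

            # Phase 2: train the generator through the frozen discriminator,
            # labeling its fakes as real (1) so it learns to fool the discriminator.
            noise = tf.random.normal(shape=[batch_size, codings_size])
            y2 = tf.constant([[1.]] * batch_size)
            discriminator.trainable = False
            gan.train_on_batch(noise, y2)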
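
To see what the two label tensors do, the NumPy sketch below evaluates the binary cross-entropy for each phase on made-up discriminator outputs. The probabilities and the helper `binary_crossentropy` are hypothetical; only the label layout follows `y1` and `y2` from the loop above.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

batch_size = 4
# Hypothetical discriminator outputs: the probability that an image is real.
p_fake = np.array([[0.1], [0.2], [0.3], [0.4]])   # on generated images
p_real = np.array([[0.9], [0.8], [0.7], [0.6]])   # on real images

# Phase 1: fakes are labeled 0 and reals 1, like y1.
y1 = np.array([[0.]] * batch_size + [[1.]] * batch_size)
d_loss = binary_crossentropy(y1, np.concatenate([p_fake, p_real]))

# Phase 2: the generator's fakes are labeled 1, like y2.
y2 = np.array([[1.]] * batch_size)
g_loss = binary_crossentropy(y2, p_fake)

print(f"discriminator loss: {d_loss:.3f}, generator loss: {g_loss:.3f}")
```

With a discriminator that separates the two classes this well, the generator loss is large: the same predictions that count as correct in phase 1 count as errors in phase 2, which is what drives the generator to improve.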


**Training Difficulties**

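GAN training is difficult for several reasons:

Mode collapse: the generator's outputs gradually become less diverse, for example collapsing onto a single class that fools the discriminator well

Oscillating parameters: the generator and the discriminator constantly push against each other, so their parameters may oscillate and never settle

Sensitivity to hyperparameters: small changes can make training diverge

Techniques like experience replay and minibatch discrimination help mitigate these issues. Experience replay stores generated images in a buffer and trains the discriminator on real images plus fake images drawn from that buffer, so it does not overfit the generator's latest outputs. Minibatch discrimination measures how similar the images in a batch are and gives this statistic to the discriminator, which can then reject batches of fakes that lack diversity.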
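
A replay buffer for experience replay can be sketched with the standard library alone. The class name `ReplayBuffer`, its capacity, and the string placeholders standing in for generated images are all hypothetical; a real training loop would store image tensors and mix the sampled fakes with real images when training the discriminator.

```python
import random
from collections import deque

class ReplayBuffer:
    """Keeps the most recent generated samples and returns a random mix of them."""

    def __init__(self, capacity=1000, seed=None):
        self.samples = deque(maxlen=capacity)   # the oldest samples are dropped first
        self.rng = random.Random(seed)

    def add(self, generated_batch):
        self.samples.extend(generated_batch)

    def sample(self, n):
        """Draw up to n stored samples without replacement."""
        return self.rng.sample(list(self.samples), min(n, len(self.samples)))

buffer = ReplayBuffer(capacity=6, seed=42)
buffer.add(["fake_0", "fake_1", "fake_2", "fake_3"])   # from an earlier iteration
buffer.add(["fake_4", "fake_5", "fake_6", "fake_7"])   # from the current iteration
replayed = buffer.sample(4)
print(len(buffer.samples), len(replayed))              # 6 4
```

Because the buffer mixes older and newer fakes, the discriminator keeps seeing what the generator produced a few iterations ago, not only its latest outputs.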

**Deep Convolutional GANs (DCGANs)**

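DCGANs build the generator and the discriminator from convolutional layers and follow a few architectural guidelines that tend to stabilize training: replace pooling layers with strided convolutions (in the discriminator) and transposed convolutions (in the generator), use batch normalization in both networks except in the generator's output layer and the discriminator's input layer, remove fully connected hidden layers in deeper architectures, use ReLU in the generator except for a tanh output layer, and use leaky ReLU in the discriminator. These guidelines do not always work, so some experimentation with hyperparameters is still needed.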
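
DCGAN-style networks typically change resolution with strided and transposed convolutions instead of pooling. The sketch below traces the spatial sizes for a hypothetical 28 × 28 setup, assuming `padding="same"` everywhere, stride 2 in each layer, and a generator that starts from a 7 × 7 feature map; none of these choices come from the text above.

```python
import math

def conv_transpose_same(size, stride):
    """Output size of a transposed convolution with padding="same"."""
    return size * stride

def conv_same(size, stride):
    """Output size of a strided convolution with padding="same"."""
    return math.ceil(size / stride)

# Hypothetical generator: start from a 7x7 feature map and upsample twice.
generator_sizes = [7]
for stride in (2, 2):
    generator_sizes.append(conv_transpose_same(generator_sizes[-1], stride))

# Hypothetical discriminator: downsample the 28x28 image with strided convolutions.
discriminator_sizes = [28]
for stride in (2, 2):
    discriminator_sizes.append(conv_same(discriminator_sizes[-1], stride))

print(generator_sizes)       # [7, 14, 28]
print(discriminator_sizes)   # [28, 14, 7]
```

The two networks mirror each other: the generator doubles the resolution at each step, while the discriminator halves it.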