In [None]:
# === Environment Setup ===
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Image
import tensorflow as tf
from tensorflow.keras import layers, models

# --- Configuration ---
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams.update({'font.size': 14, 'figure.figsize': (10, 6), 'figure.dpi': 150})
np.set_printoptions(suppress=True, linewidth=120, precision=4)

# --- Utility Functions ---
def note(msg): display(Markdown(f"<div class='alert alert-info'>📝 {msg}</div>"))
def sec(title): print(f"\n{'=' * 80}\n| {title.upper()} |\n{'=' * 80}")

note("Environment initialized for Autoencoders.")

# Chapter 7.11: Autoencoders

---

### Table of Contents

1.  [**Introduction: Data Compression and Feature Learning**](#intro)
2.  [**The Autoencoder Architecture**](#architecture)
3.  [**Code Lab: Building a Simple Autoencoder for Denoising**](#code-lab)
4.  [**Variational Autoencoders (VAEs)**](#vaes)
5.  [**Summary**](#summary)

<a id='intro'></a>
## 1. Introduction: Data Compression and Feature Learning

An **autoencoder** is a neural network trained without labels, used primarily for dimensionality reduction and feature learning. Its goal is to learn a compressed representation (an **encoding**) of a dataset.

The network is trained to reconstruct its own input. The task looks trivial, but the input has to pass through a compressed representation on the way to the output. To decompress that representation back to something close to the original, the network must keep the most important features of the data and discard the rest.

<a id='architecture'></a>
## 2. The Autoencoder Architecture

An autoencoder consists of two main parts:
- **The Encoder:** This part of the network compresses the input into a lower-dimensional latent space. This compressed representation is the "encoding."
- **The Decoder:** This part of the network reconstructs the input data from the compressed encoding.

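The network is trained by minimizing the **reconstruction loss**, a measure of the difference between the original input and the reconstructed output. Common choices are mean squared error and, for inputs scaled to [0, 1], per-pixel binary cross-entropy (the loss used in the code lab in Section 3).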

![Autoencoder Architecture](../images/png/autoencoder_architecture.png)
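
The encoder, decoder, and reconstruction loss described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the model used later in this chapter: the weights `w_enc` and `w_dec` are hypothetical and untrained, the layers are purely linear, and the 8-to-3 dimensions are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 5 samples with 8 features each (a stand-in for flattened inputs).
x = rng.random((5, 8))

# Hypothetical, untrained weights: 8 -> 3 (encoder) and 3 -> 8 (decoder).
w_enc = rng.normal(scale=0.1, size=(8, 3))
w_dec = rng.normal(scale=0.1, size=(3, 8))

def encode(inputs):
    """Compress inputs into the 3-dimensional latent space."""
    return inputs @ w_enc

def decode(codes):
    """Map latent codes back to the 8-dimensional input space."""
    return codes @ w_dec

def reconstruction_mse(inputs, outputs):
    """Mean squared error between the inputs and their reconstructions."""
    return float(np.mean((inputs - outputs) ** 2))

z = encode(x)          # the "encoding"
x_hat = decode(z)      # the reconstruction
loss = reconstruction_mse(x, x_hat)
print(z.shape, x_hat.shape, loss)
```

Training would adjust `w_enc` and `w_dec` to drive `loss` down. With random weights, as here, the reconstruction is poor and the loss is well above zero.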

<a id='code-lab'></a>
## 3. Code Lab: Building a Simple Autoencoder for Denoising

A powerful application of autoencoders is **denoising**. We train the autoencoder on noisy images as input and clean images as the target. Because the target differs from the input, the model cannot succeed by copying; it has to learn the underlying structure of the data and discard the noise.
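
To make the denoising setup concrete, the sketch below builds (noisy input, clean target) pairs on synthetic images and measures how far the noisy input is from the target. The helper names `add_gaussian_noise` and `mse` and the synthetic square images are assumptions made for illustration; they are not part of the code lab.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor, rng):
    """Add zero-mean Gaussian noise, then clip back to the [0, 1] pixel range."""
    noise = rng.standard_normal(images.shape)
    return np.clip(images + noise_factor * noise, 0.0, 1.0)

def mse(reference, estimate):
    """Mean squared error between two image batches of equal shape."""
    return float(np.mean((reference - estimate) ** 2))

rng = np.random.default_rng(0)

# Synthetic "clean" batch: four 28x28 images with a bright square in the middle.
clean = np.zeros((4, 28, 28, 1))
clean[:, 10:18, 10:18, :] = 1.0

heavy = add_gaussian_noise(clean, 0.5, rng)   # same noise level as the code lab
light = add_gaussian_noise(clean, 0.1, rng)   # milder corruption for comparison

print("MSE at noise_factor=0.5:", mse(clean, heavy))
print("MSE at noise_factor=0.1:", mse(clean, light))
```

A trained denoiser is useful to the extent that its output is closer to `clean` than the noisy input was, so the MSE of the noisy input serves as a baseline to beat.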

In [None]:
sec("Building and Training a Denoising Autoencoder")

# Load MNIST data
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))

# Add Gaussian noise (seeded for reproducibility), then clip back to the valid [0, 1] pixel range
noise_factor = 0.5
rng = np.random.default_rng(42)
x_train_noisy = x_train + noise_factor * rng.standard_normal(size=x_train.shape, dtype=np.float32)
x_test_noisy = x_test + noise_factor * rng.standard_normal(size=x_test.shape, dtype=np.float32)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)

# Define the autoencoder architecture
input_img = tf.keras.Input(shape=(28, 28, 1))

# Encoder
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)

# Decoder
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = models.Model(input_img, decoded)
# Per-pixel binary cross-entropy works here because targets and sigmoid outputs both lie in [0, 1]
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

note("Training autoencoder for 5 epochs as a demonstration.")
autoencoder.fit(x_train_noisy, x_train, epochs=5, batch_size=128, shuffle=True, validation_data=(x_test_noisy, x_test))

note("Autoencoder training complete.")
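
To see why the decoder output lines up with the 28x28 input, it can help to trace the spatial size through each layer. The sketch below does this in plain Python, assuming the usual behavior of the layers above: `'same'`-padded convolutions keep the spatial size, a 2x2 max-pool with `'same'` padding halves it (rounding up), and 2x2 upsampling doubles it. The function name `trace_shapes` is hypothetical.

```python
import math

def trace_shapes(size=28):
    """Return (layer name, spatial size) pairs for the encoder/decoder stack above."""
    shapes = [("input", size)]
    # Encoder: each conv keeps the size; each 2x2 pool halves it, rounding up.
    for i in (1, 2):
        shapes.append((f"enc_conv{i}", size))
        size = math.ceil(size / 2)
        shapes.append((f"enc_pool{i}", size))
    # Decoder: each conv keeps the size; each 2x2 upsampling doubles it.
    for i in (1, 2):
        shapes.append((f"dec_conv{i}", size))
        size *= 2
        shapes.append((f"dec_up{i}", size))
    shapes.append(("output_conv", size))
    return shapes

for name, size in trace_shapes(28):
    print(f"{name:12s} {size} x {size}")
```

For a 28x28 input the encoding is 7x7 spatially, and the output returns to 28x28. With 32 channels, the 7x7 encoding holds 7 * 7 * 32 = 1568 values, more than the 784 input pixels, so the compression here is spatial rather than in raw value count. A size such as 30 would not round-trip (30, 15, 8, 16, 32), which is one reason to check shapes before training.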

<a id='vaes'></a>
## 4. Variational Autoencoders (VAEs)

**Variational Autoencoders (VAEs)** are a more advanced, generative type of autoencoder. Instead of mapping each input to a single point in the latent space, the encoder of a VAE outputs the parameters of a **probability distribution** over the latent space (typically the mean and variance of a Gaussian). Because the latent space is modeled as a distribution, we can sample points from it and decode them to generate new, synthetic data that resembles the original training data. VAEs are widely used in generative modeling and are discussed in more detail in the **Chapter on Generative Models**.

![VAE Architecture](../images/png/VAE_architecture.png)
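
The idea of encoding to a distribution can be sketched in NumPy. The example assumes the common formulation in which the encoder outputs a mean `mu` and a log-variance `log_var` for a diagonal Gaussian, a latent sample is drawn as `mu + sigma * eps`, and a KL-divergence term measures how far that Gaussian is from a standard normal prior. The function names and the 2-dimensional latent space are illustrative assumptions, and no encoder or decoder network is included.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) per sample, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for a batch of 4 inputs and a 2-D latent space.
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))

z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print(z.shape, kl)
```

When `mu` is zero and `log_var` is zero, the encoded distribution already equals the prior and the KL term is zero. Shifting `mu` away from zero increases it. In a full VAE this term would be added to the reconstruction loss during training.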

<a id='summary'></a>
## 5. Summary

Autoencoders are a versatile tool for unsupervised learning. They provide a powerful way to learn compressed representations of data, which can be used for dimensionality reduction, feature learning, and denoising. Their generative extension, the VAE, is a cornerstone of modern generative modeling.