# Bab 17: Autoencoders and GANs (Autoencoder dan GAN)

### 1. Pendahuluan

Bab 17 membahas dua keluarga arsitektur *Neural Network* yang sangat menarik dan kuat dalam kategori *unsupervised learning* atau *generative models*: **Autoencoders** dan **Generative Adversarial Networks (GANs)**. Kedua jenis model ini berfokus pada pembelajaran representasi data dan/atau generasi data baru yang realistis.

### 2. Autoencoders (Autoencoder)

Autoencoder adalah Jaringan Saraf Tiruan yang dilatih untuk menghasilkan output yang hampir identik dengan inputnya. Ini mungkin terdengar tidak berguna, tetapi Autoencoder tidak hanya sekadar menyalin input ke output; mereka dilatih untuk menyalin input ke output di bawah beberapa kendala, yang memaksa mereka untuk mempelajari representasi data yang efisien.

#### a. Arsitektur Autoencoder (Autoencoder Architecture)
Sebuah Autoencoder umumnya terdiri dari dua bagian:
* **Encoder:** Bagian jaringan yang mengkompres input menjadi representasi berdimensi rendah, sering disebut **pengkodean (coding)** atau **ruang laten (latent space)**. Ini adalah representasi *bottleneck* dari input.
* **Decoder:** Bagian jaringan yang menerima pengkodean (output dari encoder) dan mengembangkannya kembali ke dimensi input asli, mencoba merekonstruksi input.

Fungsi *loss* yang digunakan adalah metrik perbedaan antara input asli dan output yang direkonstruksi (misalnya, MSE untuk data numerik, *binary cross-entropy* untuk gambar piksel).

#### b. Undercomplete Autoencoders (Autoencoder Kurang Lengkap)
Jenis Autoencoder yang paling sederhana di mana dimensi *coding* (lapisan *bottleneck*) lebih kecil dari dimensi input. Ini memaksa Autoencoder untuk mempelajari representasi data yang paling penting, karena ia harus mengkompres informasi.

* **Tujuan:** Reduksi dimensi non-linier.
* **Masalah:** Jika *coding* terlalu kecil atau model tidak cukup kuat, rekonstruksi akan buruk. Jika terlalu besar, model mungkin belajar fungsi identitas dan tidak mempelajari fitur yang berguna.

#### c. Tumpukan Autoencoder (Stacked Autoencoders)
Autoencoder dapat ditumpuk dengan melatih beberapa Autoencoder secara berurutan. Setiap Autoencoder dilatih pada output *coding* dari Autoencoder sebelumnya. Ini adalah cara untuk melatih jaringan yang dalam (DNN) secara bertahap.

#### d. Autoencoder Konvolusional (Convolutional Autoencoders)
Untuk gambar, Autoencoder konvolusional lebih disukai. Encoder biasanya terdiri dari lapisan `Conv2D` dan `MaxPooling2D` (mirip dengan CNN klasik), dan Decoder menggunakan `Conv2DTranspose` (juga dikenal sebagai *deconvolution* atau *transposed convolution*) dan `UpSampling2D` untuk memperbesar gambar kembali ke ukuran aslinya.

#### e. Autoencoder Denoising (Denoising Autoencoders)
Ini adalah Autoencoder yang dilatih untuk merekonstruksi input "bersih" dari input "bising" (noisy). Selama pelatihan, *noise* ditambahkan ke input, tetapi fungsi *loss* dihitung berdasarkan perbandingan output dengan input *tanpa noise*. Ini memaksa Autoencoder untuk mempelajari fitur-fitur penting yang membedakan sinyal dari *noise*.

#### f. Autoencoder Sparse (Sparse Autoencoders)
Autoencoder yang memaksakan sejumlah kecil neuron di lapisan *coding* (atau lapisan tersembunyi lainnya) untuk aktif pada saat tertentu. Ini dapat dicapai dengan menambahkan istilah regularisasi ke fungsi *loss* yang menghukum aktivasi rata-rata yang tinggi (misalnya, KL divergence terhadap distribusi Bernoulli yang jarang). Ini mendorong Autoencoder untuk mempelajari representasi yang lebih informatif dan terpisah.

#### g. Autoencoder Variasional (Variational Autoencoders - VAEs)
VAEs adalah jenis Autoencoder *generatif* yang sangat populer.
* **Perbedaan dari AE Klasik:** VAEs tidak hanya mengkodekan input menjadi satu vektor tunggal dalam ruang laten, tetapi menjadi *distribusi* probabilitas (mean dan *standard deviation*) dalam ruang laten.
* **Sampling:** Selama rekonstruksi, vektor dari distribusi ini diambil sampelnya (misalnya, dengan `reparameterization trick` untuk memungkinkan *backpropagation*).
* **Fungsi Loss:** Fungsi *loss* VAE memiliki dua bagian:
    1.  **Reconstruction Loss:** Mengukur seberapa baik Autoencoder merekonstruksi input (misalnya, MSE).
    2.  **Regularization Loss (KL Divergence):** Mengukur seberapa dekat distribusi yang dipelajari di ruang laten dengan distribusi prior yang sederhana (misalnya, distribusi Gaussian standar). Ini memaksa ruang laten menjadi terstruktur dan memungkinkan generasi sampel baru yang berarti.
* **Generasi:** Setelah dilatih, VAEs dapat menghasilkan *instance* baru yang realistis dengan mengambil sampel dari distribusi prior di ruang laten dan memberikannya ke *decoder*.

### 3. Generative Adversarial Networks (GANs)

GANs adalah arsitektur *generatif* yang sangat inovatif yang pertama kali diperkenalkan oleh Ian Goodfellow pada tahun 2014. Mereka terdiri dari dua Jaringan Saraf Tiruan yang bersaing dalam permainan *zero-sum*:

* **Generator:** Jaringan yang mencoba menghasilkan data baru yang realistis (misalnya, gambar) yang mirip dengan *training data* asli. Inputnya adalah vektor *noise* acak (seringkali dari distribusi Gaussian).
* **Discriminator:** Jaringan yang mencoba membedakan antara data "nyata" (dari *training set* asli) dan data "palsu" (dihasilkan oleh Generator). Ini adalah pengklasifikasi biner.

#### a. Pelatihan GAN (Training a GAN)
Pelatihan GAN adalah proses iteratif dua langkah:
1.  **Langkah Diskriminator:** Discriminator dilatih untuk mengklasifikasikan *instance* nyata sebagai nyata, dan *instance* palsu (yang dihasilkan oleh Generator) sebagai palsu.
2.  **Langkah Generator:** Generator dilatih untuk menghasilkan data yang lebih baik, tujuannya adalah "menipu" Discriminator agar berpikir bahwa data yang dihasilkan itu nyata. Selama langkah ini, bobot Discriminator dibekukan.

Game ini berlanjut hingga Generator menghasilkan data yang sangat realistis sehingga Discriminator tidak dapat lagi membedakan antara data nyata dan palsu (probabilitas 0.5).

#### b. Tantangan Pelatihan GAN (Challenges in Training GANs)
GANs terkenal sulit dilatih:
* **Mode Collapse:** Generator mungkin hanya menghasilkan beberapa jenis *instance* yang sangat meyakinkan, mengabaikan keragaman dataset.
* **Vanishing Gradients (Discriminator terlalu kuat):** Jika Discriminator menjadi terlalu kuat, gradien yang diterima Generator bisa menjadi sangat kecil, menghentikan pembelajarannya.
* **Ketidakstabilan Pelatihan:** Pelatihan bisa tidak stabil dan tidak konvergen.

#### c. Deep Convolutional GANs (DCGANs)
DCGANs adalah GANs yang menggunakan lapisan konvolusional (tanpa *pooling* dan lapisan *fully connected* di Generator dan Discriminator) untuk menghasilkan gambar. Mereka juga menggunakan *Batch Normalization* secara ekstensif. DCGANs jauh lebih stabil dan mampu menghasilkan gambar yang lebih realistis.

#### d. Conditional GANs (CGANs)
CGANs memungkinkan kontrol atas data yang dihasilkan. Anda memberikan input tambahan (misalnya, label kelas, atau gambar lain) ke Generator dan Discriminator, sehingga Generator dapat menghasilkan data yang spesifik berdasarkan kondisi yang diberikan.

#### e. Wasserstein GANs (WGANs)
WGANs memperkenalkan *loss function* baru (Wasserstein distance atau Earth Mover's distance) yang lebih stabil daripada *binary cross-entropy* tradisional. Ini membantu mengatasi masalah *vanishing gradients* dan *mode collapse* pada GANs.

### 4. Kesimpulan

Bab 17 memperkenalkan dua kelas model *Deep Learning* yang penting dan menarik dalam ranah *unsupervised learning* dan *generative models*. Autoencoders dibahas sebagai alat untuk reduksi dimensi non-linier dan pembelajaran representasi (termasuk *denoising*, *sparse*, dan *variational* autoencoders). GANs diperkenalkan sebagai model *generatif* revolusioner yang terdiri dari dua jaringan yang bersaing, mampu menghasilkan data baru yang sangat realistis, meskipun pelatihannya menantang. Pemahaman di bab ini adalah fondasi untuk bidang *generative AI* yang berkembang pesat.

## 1. Setup

In [9]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import os
import pandas as pd

### Loading Fashion MNIST (as used in previous chapters, for image data)


In [10]:
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_train_full = X_train_full / 255.0
X_test = X_test / 255.0
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

## 2. Autoencoders

### Undercomplete Autoencoder

In [11]:
# Encoder layers
encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(30, activation="relu") # Coding layer (bottleneck)
])

# Decoder layers
decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"), # Output layer reconstructs image
    keras.layers.Reshape([28, 28]) # Reshape output to match input image shape
])

# Full Autoencoder model
autoencoder = keras.models.Sequential([encoder, decoder])

  super().__init__(**kwargs)


In [12]:
autoencoder.compile(loss="binary_crossentropy", optimizer=keras.optimizers.SGD(learning_rate=1.0))
# For images, binary_crossentropy is common for pixel values 0-1 (treated as probabilities)
# For data already normalized to 0-1, you might use MSE as well if you assume Gaussian noise.
# The book uses binary_crossentropy for pixel intensities.

In [13]:
# Training the Autoencoder
autoencoder.fit(X_train, X_train, epochs=10,
                validation_data=(X_valid, X_valid))
# Note: X_train is passed as both input and target for autoencoders

Epoch 1/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.4356 - val_loss: 0.3206
Epoch 2/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 6ms/step - loss: 0.3192 - val_loss: 0.3071
Epoch 3/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 6ms/step - loss: 0.3093 - val_loss: 0.3027
Epoch 4/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.3034 - val_loss: 0.2973
Epoch 5/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.3000 - val_loss: 0.2947
Epoch 6/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 6ms/step - loss: 0.2965 - val_loss: 0.2919
Epoch 7/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m19s[0m 5ms/step - loss: 0.2950 - val_loss: 0.2891
Epoch 8/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.2927 - val_loss: 0.2877
Epoch 9/10
[1m1

<keras.src.callbacks.history.History at 0x7d4d59feaed0>

### Stacked Autoencoders

In [15]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

# First Autoencoder (similar to undercomplete AE above)
stacked_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(30, activation="relu") # Coding layer
])
stacked_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28]) # Add Reshape layer here
])
stacked_autoencoder = keras.models.Sequential([stacked_encoder, stacked_decoder])
stacked_autoencoder.compile(loss="binary_crossentropy", optimizer=keras.optimizers.SGD(learning_rate=1.0))

# Train the first autoencoder
history1 = stacked_autoencoder.fit(X_train, X_train, epochs=10,
                                  validation_data=(X_valid, X_valid))

# Extract the encoder part to train the next autoencoder
X_train_code = stacked_encoder.predict(X_train)
X_valid_code = stacked_encoder.predict(X_valid)

# Second Autoencoder (trained on the codes from the first)
hidden_neurons_2 = 10 # Smaller bottleneck
autoencoder_2 = keras.models.Sequential([
    keras.layers.Dense(hidden_neurons_2, activation="relu", input_shape=[stacked_encoder.layers[-1].units]),
    keras.layers.Dense(stacked_encoder.layers[-1].units, activation="relu")
])
autoencoder_2_decoder = keras.models.Sequential([
    keras.layers.Dense(stacked_encoder.layers[-1].units, activation="relu"),
    keras.layers.Dense(28 * 28, activation="sigmoid") # This part might connect back differently
])

# For practical stacked autoencoders, you often train each layer greedily.
# The common approach in Keras is to build the full stacked autoencoder, then
# use transfer learning or freeze layers.

# Full Stacked Autoencoder (end-to-end)
stacked_autoencoder_full = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu"), # First encoder layer
    keras.layers.Dense(30, activation="relu"), # Second encoder layer (coding)
    keras.layers.Dense(100, activation="relu"), # First decoder layer
    keras.layers.Dense(28 * 28, activation="sigmoid"), # Output layer
    keras.layers.Reshape([28, 28]) # Add Reshape layer here for the full autoencoder
])
stacked_autoencoder_full.compile(loss="binary_crossentropy", optimizer="adam")
# history_full_stacked = stacked_autoencoder_full.fit(X_train, X_train, epochs=10,
#                                                     validation_data=(X_valid, X_valid))

  super().__init__(**kwargs)


Epoch 1/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 6ms/step - loss: 0.4328 - val_loss: 0.3162
Epoch 2/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.3172 - val_loss: 0.3087
Epoch 3/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 8ms/step - loss: 0.3071 - val_loss: 0.2993
Epoch 4/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.3023 - val_loss: 0.2973
Epoch 5/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m20s[0m 6ms/step - loss: 0.2987 - val_loss: 0.2932
Epoch 6/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 6ms/step - loss: 0.2961 - val_loss: 0.2938
Epoch 7/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 5ms/step - loss: 0.2940 - val_loss: 0.2939
Epoch 8/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 6ms/step - loss: 0.2912 - val_loss: 0.2887
Epoch 9/10
[1m1

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(**kwargs)


### Convolutional Autoencoders

In [16]:
# Encoder
conv_encoder = keras.models.Sequential([
    keras.layers.Conv2D(16, kernel_size=3, padding="same", activation="relu", input_shape=[28, 28, 1]),
    keras.layers.MaxPooling2D(pool_size=2), # Output 14x14x16
    keras.layers.Conv2D(32, kernel_size=3, padding="same", activation="relu"),
    keras.layers.MaxPooling2D(pool_size=2), # Output 7x7x32
    keras.layers.Conv2D(64, kernel_size=3, padding="same", activation="relu"), # Coding layer output: 7x7x64
])

# Decoder
conv_decoder = keras.models.Sequential([
    keras.layers.Conv2DTranspose(32, kernel_size=3, padding="same", activation="relu",
                                 input_shape=[7, 7, 64]), # Matches encoder output shape
    keras.layers.UpSampling2D(size=2), # Output 14x14x32
    keras.layers.Conv2DTranspose(16, kernel_size=3, padding="same", activation="relu"),
    keras.layers.UpSampling2D(size=2), # Output 28x28x16
    keras.layers.Conv2D(1, kernel_size=3, padding="same", activation="sigmoid") # Reconstructs grayscale image
])

# Full Convolutional Autoencoder
conv_autoencoder = keras.models.Sequential([conv_encoder, conv_decoder])
conv_autoencoder.compile(loss="binary_crossentropy", optimizer="adam")

# Need to reshape X_train for Conv2D input (add channel dimension)
X_train_reshaped = X_train[..., np.newaxis]
X_valid_reshaped = X_valid[..., np.newaxis]

# Training the Conv Autoencoder
conv_autoencoder.fit(X_train_reshaped, X_train_reshaped, epochs=10,
                     validation_data=(X_valid_reshaped, X_valid_reshaped))

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(


Epoch 1/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m106s[0m 60ms/step - loss: 0.3070 - val_loss: 0.2620
Epoch 2/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m140s[0m 59ms/step - loss: 0.2625 - val_loss: 0.2553
Epoch 3/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m100s[0m 58ms/step - loss: 0.2577 - val_loss: 0.2529
Epoch 4/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m146s[0m 60ms/step - loss: 0.2556 - val_loss: 0.2515
Epoch 5/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m145s[0m 62ms/step - loss: 0.2541 - val_loss: 0.2518
Epoch 6/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m102s[0m 59ms/step - loss: 0.2534 - val_loss: 0.2500
Epoch 7/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m147s[0m 62ms/step - loss: 0.2522 - val_loss: 0.2494
Epoch 8/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m136s[0m 59ms/step - loss: 0.2524 - val_loss: 0.2493


<keras.src.callbacks.history.History at 0x7d4d59fd8150>

### Denoising Autoencoders

In [17]:
# Add noise to the input data for training
noise_factor = 0.2
X_train_noisy = X_train + noise_factor * np.random.randn(*X_train.shape)
X_valid_noisy = X_valid + noise_factor * np.random.randn(*X_valid.shape)
X_test_noisy = X_test + noise_factor * np.random.randn(*X_test.shape)

# Clip values to stay within [0, 1] range
X_train_noisy = np.clip(X_train_noisy, 0., 1.)
X_valid_noisy = np.clip(X_valid_noisy, 0., 1.)
X_test_noisy = np.clip(X_test_noisy, 0., 1.)

# The autoencoder architecture can be the same as standard AEs (e.g., conv_autoencoder)
# The key is training with noisy input, but clean target.
denoising_autoencoder = conv_autoencoder # Using the previously defined conv_autoencoder structure

# Compile and train
denoising_autoencoder.compile(loss="binary_crossentropy", optimizer="adam")
denoising_autoencoder.fit(X_train_noisy[..., np.newaxis], X_train[..., np.newaxis], epochs=10,
                          validation_data=(X_valid_noisy[..., np.newaxis], X_valid[..., np.newaxis]))

Epoch 1/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m114s[0m 63ms/step - loss: 0.2707 - val_loss: 0.2622
Epoch 2/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m143s[0m 63ms/step - loss: 0.2647 - val_loss: 0.2610
Epoch 3/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m143s[0m 64ms/step - loss: 0.2633 - val_loss: 0.2605
Epoch 4/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m141s[0m 63ms/step - loss: 0.2633 - val_loss: 0.2602
Epoch 5/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m107s[0m 62ms/step - loss: 0.2629 - val_loss: 0.2603
Epoch 6/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m141s[0m 62ms/step - loss: 0.2631 - val_loss: 0.2597
Epoch 7/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m142s[0m 61ms/step - loss: 0.2625 - val_loss: 0.2595
Epoch 8/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m140s[0m 61ms/step - loss: 0.2621 - val_loss: 0.2595


<keras.src.callbacks.history.History at 0x7d4d5b638390>

### Sparse Autoencoders

In [19]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

# Add activity regularizer to Dense layers to encourage sparsity
# alpha is the sparsity parameter
# Example:
# Keras does not have a direct "KL Divergence" regularizer in `keras.regularizers` for sparsity.
# You'd typically implement it as a custom loss or a custom layer.
# A simpler approximation is L1 regularization on activations, though it's not truly sparse AE.

model_sparse = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu", activity_regularizer=keras.regularizers.l1(1e-3)),
    keras.layers.Dense(30, activation="relu", activity_regularizer=keras.regularizers.l1(1e-3)),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28]) # Add Reshape layer here
])
model_sparse.compile(loss="binary_crossentropy", optimizer="adam")
model_sparse.fit(X_train, X_train, epochs=10, validation_data=(X_valid, X_valid))

  super().__init__(**kwargs)


Epoch 1/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 8ms/step - loss: 0.5388 - val_loss: 0.4898
Epoch 2/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m20s[0m 7ms/step - loss: 0.4911 - val_loss: 0.4898
Epoch 3/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m21s[0m 7ms/step - loss: 0.4899 - val_loss: 0.4897
Epoch 4/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m12s[0m 7ms/step - loss: 0.4905 - val_loss: 0.4897
Epoch 5/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m21s[0m 7ms/step - loss: 0.4907 - val_loss: 0.4899
Epoch 6/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m13s[0m 7ms/step - loss: 0.4906 - val_loss: 0.4897
Epoch 7/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m14s[0m 8ms/step - loss: 0.4905 - val_loss: 0.4898
Epoch 8/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m20s[0m 8ms/step - loss: 0.4910 - val_loss: 0.4897
Epoch 9/10
[1m1

<keras.src.callbacks.history.History at 0x7d4d583ce550>

### Variational Autoencoders (VAEs)

In [27]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

# Encoder (generates mean and log_variance of the latent distribution)
codings_size = 10 # Dimension of latent space

class Sampling(keras.layers.Layer):
    def call(self, inputs):
        mean, log_var = inputs
        epsilon = tf.random.normal(tf.shape(log_var)) # Sample from N(0, 1)
        return mean + tf.exp(0.5 * log_var) * epsilon # Reparameterization trick

# Define the encoder using the Functional API to get mean, log_var, and sampled z
inputs_vae_func = keras.layers.Input(shape=[28, 28])
flatten_input_func = keras.layers.Flatten()(inputs_vae_func)
dense_150_func = keras.layers.Dense(150, activation="relu")(flatten_input_func)
dense_100_func = keras.layers.Dense(100, activation="relu")(dense_150_func)
z_mean_log_var_func = keras.layers.Dense(2 * codings_size)(dense_100_func)
z_mean_func = keras.layers.Dense(codings_size, name='z_mean')(z_mean_log_var_func)
z_log_var_func = keras.layers.Dense(codings_size, name='z_log_var')(z_mean_log_var_func)
z_func = Sampling()([z_mean_func, z_log_var_func])

# Create the functional encoder model that outputs mean, log_var, and z
variational_encoder_func = keras.models.Model(
    inputs=inputs_vae_func,
    outputs=[z_mean_func, z_log_var_func, z_func],
    name="vae_encoder"
)

# Define the decoder
decoder_vae = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu", input_shape=[codings_size]),
    keras.layers.Dense(150, activation="relu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28]) # Reshape to image dimensions
], name="vae_decoder")

# Define the VAE model by subclassing keras.Model
class VariationalAutoencoder(keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder

    def call(self, inputs):
        # Get mean, log_var, and z from the encoder
        z_mean, z_log_var, z = self.encoder(inputs)

        # Reconstruct the image using the decoder
        reconstruction = self.decoder(z)

        # Calculate reconstruction loss (binary crossentropy)
        reconstruction_loss = tf.reduce_mean(
            keras.losses.binary_crossentropy(tf.reshape(inputs, [-1, 28 * 28]), tf.reshape(reconstruction, [-1, 28 * 28]))
        )

        # Calculate KL divergence loss
        kl_loss = -0.5 * tf.reduce_sum(
            1 + z_log_var - tf.exp(z_log_var) - tf.square(z_mean),
            axis=-1
        )
        kl_loss = tf.reduce_mean(kl_loss)

        # Add the total VAE loss to the model's losses
        total_vae_loss = reconstruction_loss + kl_loss
        self.add_loss(total_vae_loss)

        return reconstruction # Return the reconstruction

# Instantiate the custom VAE model
vae = VariationalAutoencoder(variational_encoder_func, decoder_vae)

# Compile the VAE model - loss is added internally, so set loss to None
optimizer_vae = keras.optimizers.Adam(learning_rate=0.001)
vae.compile(optimizer=optimizer_vae, loss=None)

# Train the VAE
history_vae = vae.fit(X_train, epochs=10, batch_size=32,
                      validation_data=(X_valid,)) # Pass validation data as a tuple

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m21s[0m 11ms/step - loss: 0.5282 - val_loss: 0.4910
Epoch 2/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 10ms/step - loss: 0.4911 - val_loss: 0.4902
Epoch 3/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 10ms/step - loss: 0.4909 - val_loss: 0.4901
Epoch 4/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m22s[0m 11ms/step - loss: 0.4907 - val_loss: 0.4902
Epoch 5/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 10ms/step - loss: 0.4907 - val_loss: 0.4898
Epoch 6/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m19s[0m 11ms/step - loss: 0.4913 - val_loss: 0.4899
Epoch 7/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m18s[0m 10ms/step - loss: 0.4904 - val_loss: 0.4901
Epoch 8/10
[1m1719/1719[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m22s[0m 11ms/step - loss: 0.4902 - val_loss: 0.4899
Epoch 9/

## 3. Generative Adversarial Networks (GANs)

### Simple GAN (Discriminator and Generator)

In [28]:
# Discriminator (a binary classifier)
discriminator = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(150, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid") # Output 0 (fake) or 1 (real)
])

# Generator (takes random noise and generates an image)
codings_size_gan = 100 # Size of the random noise vector
generator = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu", input_shape=[codings_size_gan]),
    keras.layers.Dense(150, activation="relu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"), # Output image pixels
    keras.layers.Reshape([28, 28])
])

# GAN (Generator + Discriminator)
gan = keras.models.Sequential([generator, discriminator])

  super().__init__(**kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


In [29]:
# Compile the GAN
# Freeze discriminator when training the generator
discriminator.compile(loss="binary_crossentropy", optimizer="adam")
discriminator.trainable = False # Important for GAN training loop
gan.compile(loss="binary_crossentropy", optimizer="adam")

In [33]:
# Custom training loop for GANs (as described in the book)
# This is a conceptual implementation of the training loop for clarity.

def train_gan(gan, dataset, codings_size, n_epochs, batch_size):
    generator, discriminator = gan.layers
    for epoch in range(n_epochs):
        print(f"Epoch {epoch+1}/{n_epochs}")
        for X_batch in dataset: # Assuming dataset yields only X (images)
            # Get the actual batch size for the current batch
            current_batch_size = tf.shape(X_batch)[0]

            # Phase 1: Train the discriminator
            noise = tf.random.normal(shape=[current_batch_size, codings_size])
            generated_images = generator(noise)
            # Cast X_batch to the same data type as generated_images (tf.float32)
            X_batch = tf.cast(X_batch, tf.float32)
            X_fake_and_real = tf.concat([generated_images, X_batch], axis=0)
            # Create y_discriminator with the actual size of the combined batch using tf.zeros and tf.ones
            y_fake = tf.zeros((current_batch_size, 1), dtype=tf.float32)
            y_real = tf.ones((current_batch_size, 1), dtype=tf.float32)
            y_discriminator = tf.concat([y_fake, y_real], axis=0) # 0 for fake, 1 for real

            discriminator.trainable = True
            discriminator.train_on_batch(X_fake_and_real, y_discriminator)

            # Phase 2: Train the generator (freeze discriminator)
            # Use the same actual batch size for generator noise
            noise = tf.random.normal(shape=[current_batch_size, codings_size])
            # Create y_generator with the actual batch size using tf.ones
            y_generator = tf.ones((current_batch_size, 1), dtype=tf.float32) # Generator tries to fool discriminator (predict real)
            discriminator.trainable = False # Freeze discriminator
            gan.train_on_batch(noise, y_generator) # Train the GAN (which trains the generator)

# # Example dataset (Fashion MNIST X_train)
batch_size_gan = 32
train_dataset_gan = tf.data.Dataset.from_tensor_slices(X_train).shuffle(1000).batch(batch_size_gan)
train_gan(gan, train_dataset_gan, codings_size_gan, n_epochs=10, batch_size=batch_size_gan)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


### Deep Convolutional GANs (DCGANs)

In [34]:
# DCGAN Generator
dcgan_generator = keras.models.Sequential([
    keras.layers.Dense(7 * 7 * 128, input_shape=[codings_size_gan]),
    keras.layers.Reshape([7, 7, 128]),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2DTranspose(64, kernel_size=5, strides=2, padding="same", activation="relu"), # 14x14x64
    keras.layers.BatchNormalization(),
    keras.layers.Conv2DTranspose(32, kernel_size=5, strides=2, padding="same", activation="relu"), # 28x28x32
    keras.layers.BatchNormalization(),
    keras.layers.Conv2DTranspose(1, kernel_size=5, strides=1, padding="same", activation="sigmoid") # 28x28x1
])

# DCGAN Discriminator
dcgan_discriminator = keras.models.Sequential([
    keras.layers.Conv2D(32, kernel_size=5, strides=2, padding="same", activation=keras.layers.LeakyReLU(0.2), input_shape=[28, 28, 1]), # 14x14x32
    keras.layers.Dropout(0.4),
    keras.layers.Conv2D(64, kernel_size=5, strides=2, padding="same", activation=keras.layers.LeakyReLU(0.2)), # 7x7x64
    keras.layers.Dropout(0.4),
    keras.layers.Flatten(),
    keras.layers.Dense(1, activation="sigmoid")
])

# Full DCGAN
dcgan = keras.models.Sequential([dcgan_generator, dcgan_discriminator])

# Compile and train (similar to simple GAN, but with Conv2D inputs/outputs)
dcgan_discriminator.compile(loss="binary_crossentropy", optimizer="adam")
dcgan_discriminator.trainable = False
dcgan.compile(loss="binary_crossentropy", optimizer="adam")

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
