# GAN Model Architecture Design: Monet-Style Image Generation

In this notebook, we'll design a Generative Adversarial Network (GAN) architecture for generating Monet-style images. We'll focus on CycleGAN, which is designed for unpaired image-to-image translation: it learns a mapping between photographs and Monet paintings without requiring matched photo/painting pairs.

## 1. Setup and Imports

First, let's import the necessary libraries and set up our environment.

In [3]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import os
import time
import glob
import random
from PIL import Image

# Set plot style - using a style compatible with newer matplotlib versions
plt.style.use('default')

# Set random seeds for reproducibility
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

# Check if GPU is available
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("TensorFlow version:", tf.__version__)

Num GPUs Available:  0
TensorFlow version: 2.16.2


## 2. Data Loading and Preprocessing

Let's set up our data loading and preprocessing pipeline. We'll need to load both Monet paintings and photographs, and prepare them for training.

In [4]:
# Define paths to the dataset
# Check if we're in Kaggle environment
IN_KAGGLE = os.path.exists('/kaggle/input')

if IN_KAGGLE:
    # Kaggle paths
    BASE_DIR = '/kaggle/input/gan-getting-started'
else:
    # Local paths - adjust these based on your data location
    BASE_DIR = '../data'

# Define all four directories in both environments so the existence
# checks below never raise a NameError on Kaggle
MONET_JPG_DIR = os.path.join(BASE_DIR, 'monet_jpg')
PHOTO_JPG_DIR = os.path.join(BASE_DIR, 'photo_jpg')
MONET_TFREC_DIR = os.path.join(BASE_DIR, 'monet_tfrec')
PHOTO_TFREC_DIR = os.path.join(BASE_DIR, 'photo_tfrec')

# Check if the paths exist
print(f"Monet JPG directory exists: {os.path.exists(MONET_JPG_DIR)}")
print(f"Photo JPG directory exists: {os.path.exists(PHOTO_JPG_DIR)}")
print(f"Monet TFRecord directory exists: {os.path.exists(MONET_TFREC_DIR)}")
print(f"Photo TFRecord directory exists: {os.path.exists(PHOTO_TFREC_DIR)}")

Monet JPG directory exists: True
Photo JPG directory exists: True
Monet TFRecord directory exists: True
Photo TFRecord directory exists: True


In [5]:
# Function for loading from JPG files directly
def load_jpg_dataset(dir_path, shuffle=True, batch_size=1):
    """Load a dataset from JPG files."""
    # Sort for a deterministic file order (os.listdir returns entries in arbitrary order)
    image_paths = sorted(os.path.join(dir_path, fname) for fname in os.listdir(dir_path) if fname.endswith('.jpg'))
    
    def load_and_preprocess_image(path):
        img = tf.io.read_file(path)
        img = tf.image.decode_jpeg(img, channels=3)
        img = tf.image.resize(img, [256, 256])
        img = tf.cast(img, tf.float32)
        img = (img / 127.5) - 1  # Normalize to [-1, 1]
        return img
    
    dataset = tf.data.Dataset.from_tensor_slices(image_paths)

    # Shuffle the path strings before decoding, so the shuffle buffer holds
    # file names rather than fully decoded 256x256x3 float32 images
    if shuffle:
        dataset = dataset.shuffle(buffer_size=len(image_paths))

    dataset = dataset.map(load_and_preprocess_image, num_parallel_calls=tf.data.AUTOTUNE)
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    
    return dataset, len(image_paths)
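
The `[-1, 1]` scaling above matches the range of the generator's `tanh` output, so generated images have to be mapped back to `[0, 255]` before they can be displayed or saved. The following is a minimal pure-Python sketch of that round trip on single pixel values; the helper names `normalize` and `denormalize` are illustrative and are not part of the notebook.

```python
def normalize(pixel):
    """Map an 8-bit pixel value in [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0


def denormalize(value):
    """Map a value in [-1, 1] back to an integer pixel in [0, 255].

    Values outside [-1, 1] are clipped first, since a model output
    could drift slightly out of range after further processing.
    """
    clipped = max(-1.0, min(1.0, value))
    return int(round((clipped + 1.0) * 127.5))


# Show the round trip for a few representative pixel values
for pixel in (0, 64, 128, 255):
    scaled = normalize(pixel)
    print(pixel, "->", round(scaled, 4), "->", denormalize(scaled))
```

On image tensors the same arithmetic would be applied element-wise, for example with `tf.clip_by_value` followed by a cast to `tf.uint8`.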

In [6]:
# Function for loading from TFRecord files
def decode_image(image):
    """Decode image from TFRecord format."""
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.cast(image, tf.float32)
    image = (image / 127.5) - 1  # Normalize to [-1, 1]
    image = tf.reshape(image, [256, 256, 3])
    return image

def read_tfrecord(example):
    """Read TFRecord example."""
    tfrecord_format = {
        'image': tf.io.FixedLenFeature([], tf.string)
    }
    example = tf.io.parse_single_example(example, tfrecord_format)
    image = decode_image(example['image'])
    return image

def load_tfrecord_dataset(filenames, shuffle=True, batch_size=1):
    """Load a dataset from TFRecord files."""
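    dataset = tf.data.TFRecordDataset(filenames)

    # Shuffle the serialized records before decoding, so the buffer holds
    # compressed JPEG bytes rather than decoded float32 images
    if shuffle:
        dataset = dataset.shuffle(buffer_size=10000)

    dataset = dataset.map(read_tfrecord, num_parallel_calls=tf.data.AUTOTUNE)
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)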
    
    return dataset

In [7]:
# Load datasets from all available sources
datasets = {}
counts = {}

# Try loading JPG datasets
try:
    if os.path.exists(MONET_JPG_DIR):
        datasets['monet_jpg'], counts['monet_jpg'] = load_jpg_dataset(MONET_JPG_DIR, batch_size=1)
        print(f"Loaded {counts['monet_jpg']} Monet paintings from JPG files")
    
    if os.path.exists(PHOTO_JPG_DIR):
        datasets['photo_jpg'], counts['photo_jpg'] = load_jpg_dataset(PHOTO_JPG_DIR, batch_size=1)
        print(f"Loaded {counts['photo_jpg']} photographs from JPG files")
except Exception as e:
    print(f"Error loading JPG datasets: {e}")

# Try loading TFRecord datasets
try:
    if os.path.exists(MONET_TFREC_DIR):
        monet_tfrecords = tf.io.gfile.glob(os.path.join(MONET_TFREC_DIR, '*.tfrec'))
        if monet_tfrecords:
            datasets['monet_tfrec'] = load_tfrecord_dataset(monet_tfrecords, batch_size=1)
            print(f"Loaded Monet paintings from TFRecord files")
    
    if os.path.exists(PHOTO_TFREC_DIR):
        photo_tfrecords = tf.io.gfile.glob(os.path.join(PHOTO_TFREC_DIR, '*.tfrec'))
        if photo_tfrecords:
            datasets['photo_tfrec'] = load_tfrecord_dataset(photo_tfrecords, batch_size=1)
            print(f"Loaded photographs from TFRecord files")
except Exception as e:
    print(f"Error loading TFRecord datasets: {e}")

# Choose which datasets to use for training
# Prefer TFRecord datasets if available, otherwise use JPG datasets
monet_dataset = datasets.get('monet_tfrec', datasets.get('monet_jpg'))
photo_dataset = datasets.get('photo_tfrec', datasets.get('photo_jpg'))

if monet_dataset is not None and photo_dataset is not None:
    print("Datasets loaded successfully and ready for training")
else:
    print("Error: Could not load required datasets")

Loaded 300 Monet paintings from JPG files
Loaded 7038 photographs from JPG files
Loaded Monet paintings from TFRecord files
Loaded photographs from TFRecord files
Datasets loaded successfully and ready for training


## 3. CycleGAN Architecture

Now, let's implement the CycleGAN architecture. CycleGAN consists of two generators and two discriminators:

1. **Generator G**: Transforms photos to Monet-style paintings
2. **Generator F**: Transforms Monet paintings to photos (inverse mapping)
3. **Discriminator X**: Distinguishes real photos from generated photos
4. **Discriminator Y**: Distinguishes real Monet paintings from generated Monet paintings
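
To make the roles of the four networks concrete, here is a small sketch of the data flow in one CycleGAN forward pass. It assumes the usual CycleGAN wiring; the stand-in functions `toy_g`, `toy_f`, `toy_d_x`, and `toy_d_y` are hypothetical and only record which model was applied, so no TensorFlow is needed to run it.

```python
# Stand-in "models": each returns a string recording what was applied,
# which makes the wiring between domains visible.
def toy_g(image):    # Generator G: photo -> Monet
    return f"G({image})"


def toy_f(image):    # Generator F: Monet -> photo
    return f"F({image})"


def toy_d_x(image):  # Discriminator X: judges photos
    return f"D_X({image})"


def toy_d_y(image):  # Discriminator Y: judges Monet paintings
    return f"D_Y({image})"


def forward_pass(real_photo, real_monet):
    """Trace the quantities a typical CycleGAN step computes (assumed layout)."""
    fake_monet = toy_g(real_photo)
    fake_photo = toy_f(real_monet)
    return {
        "fake_monet": fake_monet,
        "cycled_photo": toy_f(fake_monet),    # photo -> Monet -> photo
        "fake_photo": fake_photo,
        "cycled_monet": toy_g(fake_photo),    # Monet -> photo -> Monet
        "same_monet": toy_g(real_monet),      # identity mapping input
        "same_photo": toy_f(real_photo),
        "disc_fake_monet": toy_d_y(fake_monet),
        "disc_fake_photo": toy_d_x(fake_photo),
    }


trace = forward_pass("x", "y")
for name, value in trace.items():
    print(f"{name:16s} = {value}")
```

Each discriminator only ever sees images from its own domain: Discriminator Y scores real and generated Monet paintings, and Discriminator X scores real and generated photos.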

In [8]:
# Define the generator building blocks
def downsample(filters, size, apply_batchnorm=True):
    """Downsampling block for the generator."""
    initializer = tf.random_normal_initializer(0., 0.02)
    
    result = tf.keras.Sequential()
    result.add(tf.keras.layers.Conv2D(filters, size, strides=2, padding='same',
                                      kernel_initializer=initializer, use_bias=False))
    
    if apply_batchnorm:
        result.add(tf.keras.layers.BatchNormalization())
    
    result.add(tf.keras.layers.LeakyReLU())
    
    return result

def upsample(filters, size, apply_dropout=False):
    """Upsampling block for the generator."""
    initializer = tf.random_normal_initializer(0., 0.02)
    
    result = tf.keras.Sequential()
    result.add(tf.keras.layers.Conv2DTranspose(filters, size, strides=2, padding='same',
                                              kernel_initializer=initializer, use_bias=False))
    
    result.add(tf.keras.layers.BatchNormalization())
    
    if apply_dropout:
        result.add(tf.keras.layers.Dropout(0.5))
    
    result.add(tf.keras.layers.ReLU())
    
    return result

In [9]:
def build_generator():
    """Build the generator: a U-Net-style encoder-decoder with skip connections."""
    inputs = tf.keras.layers.Input(shape=[256, 256, 3])
    
    # Downsampling
    down_stack = [
        downsample(64, 4, apply_batchnorm=False),  # (128, 128, 64)
        downsample(128, 4),  # (64, 64, 128)
        downsample(256, 4),  # (32, 32, 256)
        downsample(512, 4),  # (16, 16, 512)
    ]
    
    # Upsampling (shape comments give each block's output before the skip concatenation)
    up_stack = [
        upsample(256, 4, apply_dropout=True),  # (32, 32, 256)
        upsample(128, 4),  # (64, 64, 128)
        upsample(64, 4),  # (128, 128, 64)
    ]
    
    initializer = tf.random_normal_initializer(0., 0.02)
    last = tf.keras.layers.Conv2DTranspose(3, 4, strides=2, padding='same',
                                          kernel_initializer=initializer,
                                          activation='tanh')  # (256, 256, 3)
    
    x = inputs
    
    # Downsampling through the model
    skips = []
    for down in down_stack:
        x = down(x)
        skips.append(x)
    
    # Upsampling and establishing the skip connections
    skips = reversed(skips[:-1])
    for up, skip in zip(up_stack, skips):
        x = up(x)
        x = tf.keras.layers.Concatenate()([x, skip])
    
    x = last(x)
    
    return tf.keras.Model(inputs=inputs, outputs=x)
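
The shape bookkeeping in the generator can be checked without building the model. The sketch below assumes the standard Keras size rules for `padding='same'` (a strided `Conv2D` produces `ceil(size / stride)`, a strided `Conv2DTranspose` produces `size * stride`) and traces spatial size and channel count through the stacks defined above. The function names are illustrative.

```python
import math


def conv_same_out(size, stride):
    """Spatial size after a Conv2D with padding='same'."""
    return math.ceil(size / stride)


def deconv_same_out(size, stride):
    """Spatial size after a Conv2DTranspose with padding='same'."""
    return size * stride


def trace_generator(size=256, down_filters=(64, 128, 256, 512),
                    up_filters=(256, 128, 64)):
    """Return (stage, spatial_size, channels) for each stage of the generator."""
    shapes, skips = [], []
    for filters in down_filters:
        size = conv_same_out(size, 2)
        skips.append((size, filters))
        shapes.append(("down", size, filters))
    # The bottleneck output is not used as a skip, hence skips[:-1]
    for filters, (skip_size, skip_channels) in zip(up_filters, reversed(skips[:-1])):
        size = deconv_same_out(size, 2)
        assert size == skip_size, "skip and upsampled tensors must align"
        # Concatenation stacks the skip's channels on top of the block's output
        shapes.append(("up+concat", size, filters + skip_channels))
    size = deconv_same_out(size, 2)
    shapes.append(("output", size, 3))
    return shapes


for stage, size, channels in trace_generator():
    print(f"{stage:10s} ({size}, {size}, {channels})")
```

The trace shows that each concatenation doubles the channel count seen by the next layer (for example, 256 upsampled channels plus 256 skip channels at 32x32), while the final transposed convolution restores the 256x256x3 image size.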

In [10]:
def build_discriminator():
    """Build the discriminator model (PatchGAN)."""
    initializer = tf.random_normal_initializer(0., 0.02)
    
    inp = tf.keras.layers.Input(shape=[256, 256, 3], name='input_image')
    
    # Downsampling
    x = downsample(64, 4, apply_batchnorm=False)(inp)  # (128, 128, 64)
    x = downsample(128, 4)(x)  # (64, 64, 128)
    x = downsample(256, 4)(x)  # (32, 32, 256)
    
    # Final layer
    x = tf.keras.layers.Conv2D(512, 4, strides=1, padding='same',
                              kernel_initializer=initializer, use_bias=False)(x)  # (32, 32, 512)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    x = tf.keras.layers.Conv2D(1, 4, strides=1, padding='same',
                              kernel_initializer=initializer)(x)  # (32, 32, 1)
    
    return tf.keras.Model(inputs=inp, outputs=x)
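
A PatchGAN discriminator scores overlapping image patches rather than the whole image, so each cell of its output grid depends on a limited region of the input. As a rough sketch, the theoretical receptive field and output grid size can be derived from the kernel sizes and strides used above; the helper names are illustrative, and the calculation ignores border effects from padding.

```python
import math


def receptive_field(layers):
    """Theoretical receptive field of stacked convolutions.

    layers: list of (kernel_size, stride) tuples, in input-to-output order.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input strides
        jump *= stride             # distance in input pixels between adjacent outputs
    return rf


def output_grid(size, layers):
    """Spatial size of the output, assuming padding='same' in every layer."""
    for _, stride in layers:
        size = math.ceil(size / stride)
    return size


# (kernel, stride) for the five convolutions in build_discriminator
disc_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

print("Receptive field:", receptive_field(disc_layers))    # 70
print("Output grid size:", output_grid(256, disc_layers))  # 32
```

Under these assumptions, every one of the 32x32 output logits judges roughly a 70x70 pixel patch of the input image.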

In [11]:
# Create the generator and discriminator models
generator_g = build_generator()  # Photo to Monet
generator_f = build_generator()  # Monet to Photo

discriminator_x = build_discriminator()  # Photo discriminator
discriminator_y = build_discriminator()  # Monet discriminator

# Print model summaries
print("Generator Model Summary:")
generator_g.summary()

print("\nDiscriminator Model Summary:")
discriminator_x.summary()

Generator Model Summary:



Discriminator Model Summary:


## 4. Loss Functions

CycleGAN uses several loss functions:

1. **Adversarial Loss**: Encourages the generator to produce images that look real to the discriminator
2. **Cycle Consistency Loss**: Penalizes the difference between an image and its reconstruction after translating it to the other domain and back
3. **Identity Loss**: Encourages the generator to preserve colors and content when the input image is already from the target domain
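
As an illustration of how these three terms might be combined for one generator, here is a pure-Python sketch on toy four-pixel "images". The summation of the terms and the weights `lambda_cycle` and `lambda_identity` are assumptions based on common CycleGAN practice, not something this notebook fixes; the training notebook may combine them differently.

```python
def l1(a, b):
    """Mean absolute error between two equally sized sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def generator_objective(adv_loss, real_x, cycled_x, real_y, cycled_y, same_y,
                        lambda_cycle=10.0, lambda_identity=5.0):
    """Toy total loss for the photo -> Monet generator (assumed formulation)."""
    # Cycle consistency is summed over both translation directions
    total_cycle = lambda_cycle * (l1(real_x, cycled_x) + l1(real_y, cycled_y))
    # Identity term: feeding a real Monet to the generator should change little
    identity = lambda_identity * 0.5 * l1(real_y, same_y)
    return adv_loss + total_cycle + identity


real_x, cycled_x = [0.2, -0.4, 0.9, 0.0], [0.1, -0.5, 0.7, 0.0]
real_y, cycled_y = [0.5, 0.5, -0.5, -0.5], [0.5, 0.3, -0.5, -0.7]
same_y = [0.5, 0.4, -0.5, -0.6]

total = generator_objective(0.7, real_x, cycled_x, real_y, cycled_y, same_y)
print(f"total generator loss: {total:.3f}")
```

With these weights the cycle term dominates the total, which reflects the intent that reconstructions stay close to the input while the adversarial term pushes toward the target style.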

In [12]:
# Define loss functions
# Shared per-element binary cross-entropy on raw logits. The string 'none' is
# used instead of tf.keras.losses.Reduction.NONE, which may not be available
# under Keras 3 (bundled with TensorFlow 2.16+).
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True, reduction='none')

def discriminator_loss(real, generated):
    """Discriminator loss: real patches should score 1, generated patches 0."""
    real_loss = bce(tf.ones_like(real), real)
    generated_loss = bce(tf.zeros_like(generated), generated)
    
    total_loss = real_loss + generated_loss
    return tf.reduce_mean(total_loss) * 0.5

def generator_loss(generated):
    """Generator adversarial loss: generated patches should be scored as real."""
    return tf.reduce_mean(bce(tf.ones_like(generated), generated))

def calc_cycle_loss(real_image, cycled_image, LAMBDA=10):
    """Cycle consistency loss function."""
    loss = tf.reduce_mean(tf.abs(real_image - cycled_image))
    return LAMBDA * loss

def identity_loss(real_image, same_image, LAMBDA=5):
    """Identity loss function. Note the effective weight is 0.5 * LAMBDA."""
    loss = tf.reduce_mean(tf.abs(real_image - same_image))
    return LAMBDA * 0.5 * loss
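
To build intuition for the adversarial losses above, the sketch below re-implements binary cross-entropy on logits in pure Python, using the standard numerically stable form `max(z, 0) - z * y + log(1 + exp(-|z|))`. The `toy_*` functions are illustrative stand-ins that operate on flat lists of logits rather than on the discriminator's 32x32 output maps.

```python
import math


def bce_with_logits(label, logit):
    """Numerically stable binary cross-entropy for a single raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean(values):
    return sum(values) / len(values)


def toy_discriminator_loss(real_logits, fake_logits):
    """Half the sum of the 'real is 1' and 'fake is 0' cross-entropies."""
    real = mean([bce_with_logits(1.0, z) for z in real_logits])
    fake = mean([bce_with_logits(0.0, z) for z in fake_logits])
    return 0.5 * (real + fake)


def toy_generator_loss(fake_logits):
    """The generator wants its outputs to be scored as real (label 1)."""
    return mean([bce_with_logits(1.0, z) for z in fake_logits])


# An undecided discriminator (logit 0 everywhere) gives ln(2) for both losses
print(toy_discriminator_loss([0.0], [0.0]), math.log(2))
# A confident, correct discriminator has a loss near 0 ...
print(toy_discriminator_loss([10.0], [-10.0]))
# ... while the generator's loss on those same fakes is large
print(toy_generator_loss([-10.0]))
```

A discriminator loss hovering near `ln(2) ≈ 0.693` is therefore a useful reference point during training: it corresponds to a discriminator that cannot tell real from generated patches.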

## 5. Save Models

Now that we've defined our model architecture, let's save the models (still untrained, with randomly initialized weights) so they can be loaded in the training notebook.

In [13]:
# Create models directory if it doesn't exist
models_dir = '../models'
os.makedirs(models_dir, exist_ok=True)

# Save the models
generator_g.save(os.path.join(models_dir, 'generator_g.keras'))
generator_f.save(os.path.join(models_dir, 'generator_f.keras'))
discriminator_x.save(os.path.join(models_dir, 'discriminator_x.keras'))
discriminator_y.save(os.path.join(models_dir, 'discriminator_y.keras'))

print("Models saved successfully to the 'models' directory.")

Models saved successfully to the 'models' directory.


## 6. Conclusion

In this notebook, we've designed the CycleGAN architecture for generating Monet-style images from photographs. We've defined:

1. **Data Loading and Preprocessing**: We've set up pipelines to load and preprocess the Monet paintings and photographs datasets.
2. **Generator and Discriminator Models**: We've implemented the generator and discriminator architectures using TensorFlow and Keras.
3. **Loss Functions**: We've defined the adversarial, cycle consistency, and identity loss functions that are essential for training CycleGAN.

In the next notebook (03b_Model_Training.ipynb), we'll load these models and implement the training process to generate Monet-style images.