# I’m Something of a Painter Myself – Monet GAN
#### Author: James Coffey   
#### Date: 2025‑07‑30
#### Challenge URL: [I’m Something of a Painter Myself](https://www.kaggle.com/competitions/gan-getting-started)

# Discussion – Monet CycleGAN (Public MiFID 75.00706)

## Key Implementation Decisions

### Data pipeline

* **TFRecord → `tf.data`** — I read the Monet & photo shards straight from
  Kaggle's GCS bucket and prefetch them for the TPU.  Parsing converts JPEG bytes to
  `float32` in the `[-1, 1]` range.
* **Light augmentations** — random flip, brightness/contrast/hue jitter, and the
  canonical 286 → 256 random crop (matching the original CycleGAN tutorial).
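
The scaling and crop steps above can be sketched in plain NumPy. This is a minimal illustration, not the notebook's `make_dataset` (which runs inside `tf.data`); the names `to_unit_range` and `random_crop_flip` are hypothetical, the input is assumed to be already resized to 286 × 286, and the color jitter is left out.

```python
import numpy as np


def to_unit_range(img_uint8: np.ndarray) -> np.ndarray:
    """Map uint8 pixels in [0, 255] to float32 values in [-1, 1]."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0


def random_crop_flip(img: np.ndarray, rng: np.random.Generator, crop: int = 256) -> np.ndarray:
    """Cut a random crop x crop window, then mirror it left-right half the time."""
    h, w, _ = img.shape
    top = int(rng.integers(0, h - crop + 1))
    left = int(rng.integers(0, w - crop + 1))
    out = img[top:top + crop, left:left + crop, :]
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    return out


rng = np.random.default_rng(0)
# Stand-in for a decoded JPEG that has already been resized to 286 x 286.
decoded = rng.integers(0, 256, size=(286, 286, 3), dtype=np.uint8)
augmented = random_crop_flip(to_unit_range(decoded), rng)
print(augmented.shape, augmented.dtype)
```

Scaling by 127.5 and subtracting 1 is the inverse of the `img * 127.5 + 127.5` conversion used later when writing images to disk.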

### Model architecture

* **Encoder–decoder generators with skip connections** (8 down‑sampling + 7
  up‑sampling blocks) built entirely from Conv/ConvT layers.  Instance norm
  (Keras `GroupNormalization(groups=-1)`) is used instead of batch norm.
* **70 × 70 PatchGAN discriminators** — three stride‑2 convs plus stride‑1
  layers yield a `(30, 30, 1)` logit map, enforcing local texture realism.
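
The `(30, 30, 1)` output and the 70 × 70 patch size can be checked with a little shape arithmetic. The sketch below assumes 4 × 4 kernels, `"same"` padding on the stride‑2 convs, and one pixel of zero padding before each stride‑1 `"valid"` conv; those details are assumptions, since `build_discriminator` itself is not shown here.

```python
def conv_out(size: int, kernel: int, stride: int, padding: str) -> int:
    """Output length of a convolution along one spatial axis."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid" padding


size = 256
for _ in range(3):  # three stride-2 convs: 256 -> 128 -> 64 -> 32
    size = conv_out(size, 4, 2, "same")
size = conv_out(size + 2, 4, 1, "valid")  # pad 32 -> 34, conv -> 31
size = conv_out(size + 2, 4, 1, "valid")  # pad 31 -> 33, conv -> 30

# Receptive field of one output logit, walking back from the last layer.
rf = 1
for kernel, stride in reversed([(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]):
    rf = rf * stride + (kernel - stride)

print(size, rf)  # 30 70
```

Each of the 30 × 30 logits therefore judges one 70 × 70 patch of the input.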

### Optimization & training

* **Separate Adam optimizers** for each generator and discriminator
  (learning rate `2e-4`, β₁ = 0.5).
* **BCE‑with‑logits GAN loss** (not LS‑GAN) and standard cycle + identity
  penalties (`λ_cycle = 10`, `λ_id = 0.5 λ_cycle`).
* **Mixed precision disabled** for the leaderboard run; it’s wired up but
  commented out to avoid numerical jitter observed in early experiments.
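
To make the loss terms above concrete, here is a minimal NumPy sketch. It is not the code in `art_creation_gan`: the function names are hypothetical stand‑ins, the 0.5 factor on the discriminator loss is an assumption, and L1 distance is assumed for the cycle and identity penalties.

```python
import numpy as np

LAMBDA_CYCLE = 10.0
LAMBDA_ID = 0.5 * LAMBDA_CYCLE


def bce_with_logits(logits, target: float) -> float:
    """Mean binary cross-entropy from raw logits, in a numerically stable form."""
    x = np.asarray(logits, dtype=np.float64)
    return float(np.mean(np.maximum(x, 0) - x * target + np.log1p(np.exp(-np.abs(x)))))


def generator_adv_loss(fake_logits) -> float:
    # The generator wants its fakes scored as real (target 1).
    return bce_with_logits(fake_logits, 1.0)


def discriminator_adv_loss(real_logits, fake_logits) -> float:
    # Real patches should score 1 and generated patches 0.
    return 0.5 * (bce_with_logits(real_logits, 1.0) + bce_with_logits(fake_logits, 0.0))


def cycle_penalty(real, cycled) -> float:
    # L1 distance between an image and its round-trip reconstruction.
    return LAMBDA_CYCLE * float(np.mean(np.abs(real - cycled)))


def identity_penalty(real, same) -> float:
    # L1 distance between an image and the generator's output on its own domain.
    return LAMBDA_ID * float(np.mean(np.abs(real - same)))


undecided = np.zeros((1, 30, 30, 1))  # logits of 0 mean "50 % real"
print(generator_adv_loss(undecided))  # ln 2, about 0.693
```

With `λ_cycle = 10`, a mean absolute reconstruction error of 0.1 contributes 1.0 to the generator loss, and the same error on the identity term contributes 0.5.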

### Submission workflow

Generated images are first saved to a local `/kaggle/working/images` folder and
then zipped via `shutil.make_archive` to meet Kaggle’s single‑file output rule.
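
The archive step can be tried in isolation. This sketch uses a temporary directory and placeholder bytes instead of real JPEGs (both are stand‑ins) to show that `shutil.make_archive` appends the `.zip` extension itself and stores the files at the archive root.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    image_dir = Path(tmp) / "images"
    image_dir.mkdir()
    for i in range(3):
        # Placeholder bytes stand in for real JPEG data.
        (image_dir / f"{i}.jpg").write_bytes(b"placeholder")

    # base_name carries no extension; the return value is the full archive path.
    archive_path = shutil.make_archive(str(Path(tmp) / "images"), "zip", root_dir=image_dir)
    with zipfile.ZipFile(archive_path) as zf:
        names = sorted(zf.namelist())

print(Path(archive_path).name, names)
```

Listing the archive before submitting is a cheap way to confirm that the image count and file names are what was intended.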

## Observations about the Data

1. **Domain imbalance** — only 300 Monet paintings versus 7 028 photos.  I
   relied on cycle‑consistency plus heavier color jitter on the Monet side to
   combat over‑fitting.
2. **Palette skew** — Monet HSV histograms lean heavily toward greens and soft
   blues; even a ±5 % hue jitter noticeably broadens color coverage.
3. **Clean dataset** — no corrupt JPEGs or duplicate hashes, so no additional
   filtering was necessary.
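
A duplicate check of this kind can be sketched by hashing the raw file bytes. The helper name `find_duplicates` and the choice of SHA‑256 are assumptions for illustration; byte‑level hashing only catches exact copies, not re‑encoded or resized ones.

```python
import hashlib
from collections import defaultdict


def find_duplicates(named_blobs):
    """Group names by the SHA-256 digest of their bytes; return groups of two or more."""
    groups = defaultdict(list)
    for name, blob in named_blobs:
        groups[hashlib.sha256(blob).hexdigest()].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Tiny stand-in for (file name, file bytes) pairs read from disk.
sample = [("a.jpg", b"\xff\xd8one"), ("b.jpg", b"\xff\xd8two"), ("c.jpg", b"\xff\xd8one")]
print(find_duplicates(sample))  # [['a.jpg', 'c.jpg']]
```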

## Model Insights

* **Identity‑loss weight matters** — using `0.5 λ_cycle` kept skies and light
  areas from color‑shifting, without slowing convergence.
* **InstanceNorm beats BatchNorm** for tiny TPU batches; BatchNorm variants
  produce visible color flicker and higher MiFID.
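
One plausible mechanism behind the InstanceNorm result is that training‑mode batch statistics depend on whichever other images share the batch. The NumPy sketch below illustrates that dependence on random data; it omits the learnable scale and offset and is not a reproduction of the experiments above.

```python
import numpy as np


def instance_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each sample and channel over its own spatial positions (NHWC)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def batch_norm_train(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Training-mode batch norm: per-channel statistics shared across the batch."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
img = rng.normal(size=(1, 8, 8, 3))
bright_mate = rng.normal(loc=2.0, size=(1, 8, 8, 3))
dark_mate = rng.normal(loc=-2.0, size=(1, 8, 8, 3))

# Normalize the same image alongside two different batch-mates.
in_a = instance_norm(np.concatenate([img, bright_mate]))[0]
in_b = instance_norm(np.concatenate([img, dark_mate]))[0]
bn_a = batch_norm_train(np.concatenate([img, bright_mate]))[0]
bn_b = batch_norm_train(np.concatenate([img, dark_mate]))[0]

print(np.allclose(in_a, in_b))  # True: instance norm ignores batch-mates
print(np.allclose(bn_a, bn_b))  # False: batch norm output shifts with them
```

With only a few images per replica, that shift changes from step to step.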

## Results

* **Public leaderboard**: **75.00706 MiFID** (lower = better) at the time of
  writing.
* **Runtime**: 24 epochs completed in ~2 h 40 m on a single TPU v3‑8.
* **Visual quality**: foliage, water, and skies adopt a convincing Monet
  brush‑stroke texture (consistent with the Monet training set).


## Future Work

1. **Self‑attention layers** in the deepest encoder blocks to capture larger
   context.
2. **CUT / FastCUT** to halve training time while maintaining FID/MiFID.
3. **Re‑enable mixed precision** now that the graph is naming‑clean; this should
   shave ~20 % off runtime.
4. **Multi‑style pre‑training** (Monet + Van Gogh + Cézanne) followed by
   fine‑tuning to Monet‑only for better generalization.

With a straightforward encoder–decoder CycleGAN, disciplined hyper‑parameters,
and TPU‑optimized data loading, I achieved a MiFID of ≈75 without exotic tricks.
The remaining gap to state‑of‑the‑art now feels like an engineering iteration
rather than a research leap.

# Imports & Seeds

In [None]:
import random
import shutil
from pathlib import Path

import numpy as np
import tensorflow as tf
from tensorflow import keras
from kaggle_datasets import KaggleDatasets
import matplotlib.pyplot as plt
from art_creation_gan import AUTO, SEED


# Make randomness repeatable across hosts and replicas
np.random.seed(SEED)
random.seed(SEED)
tf.random.set_seed(SEED)  # TensorFlow keeps its own global seed


# TPU Detection

In [None]:
from art_creation_gan import detect_tpu

strategy = detect_tpu()

# TFRecord Data Pipeline

In [None]:
from art_creation_gan import make_dataset

GCS_PATH = KaggleDatasets().get_gcs_path()
MONET_TFREC = tf.io.gfile.glob(f"{GCS_PATH}/monet_tfrec/*.tfrec")
PHOTO_TFREC = tf.io.gfile.glob(f"{GCS_PATH}/photo_tfrec/*.tfrec")
print(f"Monet TFRecords: {len(MONET_TFREC)}  |  Photo TFRecords: {len(PHOTO_TFREC)}")

# Build datasets
monet_ds = make_dataset(MONET_TFREC, augment=True)
photo_ds = make_dataset(PHOTO_TFREC, augment=False)

# Quick sanity‑check visual
sample_monet = next(iter(monet_ds.take(1)))[0]
sample_photo = next(iter(photo_ds.take(1)))[0]
plt.figure(figsize=(6, 3))
plt.subplot(1, 2, 1)
plt.imshow((sample_photo + 1) / 2)
plt.title("Photo")
plt.axis(False)
plt.subplot(1, 2, 2)
plt.imshow((sample_monet + 1) / 2)
plt.title("Monet")
plt.axis(False)
plt.show()

# Architecture – downsample & upsample blocks

In [None]:
# See /src/art_creation_gan/blocks.py for downsample() and upsample()

# Generator & Discriminator builders

In [None]:
from art_creation_gan import build_generator, build_discriminator

# Instantiate models under distribution scope

In [None]:
with strategy.scope():
    monet_generator = build_generator()  # Photo → Monet
    photo_generator = build_generator()  # Monet → Photo

    monet_discriminator = build_discriminator()
    photo_discriminator = build_discriminator()

## Demo a single forward pass for sanity

In [None]:
_to_monet = monet_generator(sample_photo[None, ...])
plt.subplot(1, 2, 1)
plt.title("Original Photo")
plt.imshow(sample_photo * 0.5 + 0.5)
plt.axis(False)
plt.subplot(1, 2, 2)
plt.title("Monet-esque Photo")
plt.imshow(_to_monet[0] * 0.5 + 0.5)
plt.axis(False)
plt.show()

# CycleGAN model

In [None]:
from art_creation_gan import CycleGan

# Loss functions

In [None]:
from art_creation_gan import discriminator_loss, generator_loss, calc_cycle_loss, identity_loss

# Optimizers

In [None]:
with strategy.scope():
    monet_gen_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)
    photo_gen_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)
    monet_disc_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)
    photo_disc_opt = keras.optimizers.Adam(2e-4, beta_1=0.5)

    cycle_gan_model = CycleGan(
        monet_generator, photo_generator, monet_discriminator, photo_discriminator
    )
    cycle_gan_model.compile(
        m_gen_optimizer=monet_gen_opt,
        p_gen_optimizer=photo_gen_opt,
        m_disc_optimizer=monet_disc_opt,
        p_disc_optimizer=photo_disc_opt,
        gen_loss_fn=generator_loss,
        disc_loss_fn=discriminator_loss,
        cycle_loss_fn=calc_cycle_loss,
        identity_loss_fn=identity_loss,
    )

# Training loop

In [None]:
train_ds = tf.data.Dataset.zip((monet_ds, photo_ds)).prefetch(AUTO)
_ = cycle_gan_model(next(iter(train_ds.take(1))), training=False)  # Build variables
cycle_gan_model.fit(train_ds, epochs=25)

# Visualize a few results

In [None]:
fig, ax = plt.subplots(5, 2, figsize=(12, 12))
for i, img in enumerate(photo_ds.take(5)):
    pred = monet_generator(img, training=False)[0].numpy()
    pred = (pred * 127.5 + 127.5).astype(np.uint8)
    inp = (img[0] * 127.5 + 127.5).numpy().astype(np.uint8)
    ax[i, 0].imshow(inp)
    ax[i, 0].set_title("Input Photo")
    ax[i, 0].axis("off")
    ax[i, 1].imshow(pred)
    ax[i, 1].set_title("Monet Output")
    ax[i, 1].axis("off")
plt.show()

# Generate submission zip

In [None]:
SUB_DIR = Path("/kaggle/working/images")
SUB_DIR.mkdir(parents=True, exist_ok=True)  # also create missing parent folders

idx = 0
for batch in photo_ds:
    fake_batch = monet_generator(batch, training=False).numpy()
    for img in fake_batch:
        arr = ((img * 127.5) + 127.5).astype(np.uint8)
        keras.utils.save_img(SUB_DIR / f"{idx}.jpg", arr, scale=False)
        idx += 1

print(f"✅ wrote {idx} images")
shutil.make_archive("/kaggle/working/images", "zip", SUB_DIR)