Project topic & generative DL
This project translates natural photos into Monet-style images. Generative deep learning—specifically GANs—pits a generator that synthesizes images against a discriminator that distinguishes real from fake, improving image realism through adversarial training.

Data (size, dimension, structure)
The dataset contains 300 Monet paintings and 7,038 photos. All images are RGB with varying widths and heights; I standardize everything to 256×256 for training and inference.

EDA (visuals)
I show a 3×3 grid of Monet samples and a 3×3 grid of Photo samples to visualize style differences. I include pixel-intensity histograms to compare tonal distributions. Optionally, I add width/height histograms and a quick duplicate check to confirm data quality.
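
The quick duplicate check can be sketched as follows. This is a minimal version, assuming only exact byte-level duplicates matter; the helper name `find_exact_duplicates` is hypothetical and not part of the original notebook. Re-encoded or resized copies would not be caught and would need a perceptual hash instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(paths):
    """Group files whose bytes are identical (same SHA-256 digest)."""
    groups = defaultdict(list)
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        groups[digest].append(Path(p))
    # Keep only digests shared by more than one file
    return [g for g in groups.values() if len(g) > 1]
```

Calling `find_exact_duplicates(monet_paths)` and `find_exact_duplicates(photo_paths)` returns one list per duplicate group, so an empty result means no exact duplicates were found.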

Data cleaning / preprocessing
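All images are resized to 256×256. For GAN training I normalize pixel values to [-1, 1], which matches the tanh output range that CycleGAN-style generators commonly use, and I apply simple augmentations such as horizontal flips and random crops to add variety to the small Monet set. These choices follow common baselines.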
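
A minimal NumPy sketch of this preprocessing is shown below. The function names are hypothetical, and the random crop assumes the image was first resized slightly larger than 256×256 (286×286 is a common choice, but that size is an assumption here, not something the notebook specifies).

```python
import numpy as np


def to_model_range(img_uint8):
    """Map uint8 pixels in [0, 255] to float32 values in [-1, 1]."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0


def to_uint8(img_float):
    """Invert to_model_range so generator outputs can be saved as images."""
    return np.clip((img_float + 1.0) * 127.5, 0, 255).round().astype(np.uint8)


def augment(img, rng, crop=256):
    """Random horizontal flip followed by a random crop of crop x crop pixels."""
    if rng.random() < 0.5:
        img = img[:, ::-1]  # flip along the width axis
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    return img[top:top + crop, left:left + crop]
```

If the input is already 256×256, `augment` reduces to the flip alone, because the only valid crop offset is zero.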

Plan of analysis
I begin with a fast color-transfer baseline to guarantee a valid submission, then outline a CycleGAN approach that uses cycle-consistency and identity losses with a PatchGAN discriminator. MiFID is the evaluation metric reported on the leaderboard.

Model architecture & tuning
The planned GAN uses a ResNet-9 (or U-Net) generator and a 70×70 PatchGAN discriminator. I vary λ_cycle, λ_identity, learning rate, β1, batch size, and training steps. I compare the baseline to the CycleGAN plan (or a partial training run) to show the effect of architecture and hyperparameters.
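
To see where the "70×70" in the discriminator's name comes from, the short calculation below walks back through the convolution stack. The layer list is an assumption taken from the commonly used PatchGAN layout (five 4×4 convolutions with strides 2, 2, 2, 1, 1 and padding 1); the notebook itself does not fix these values.

```python
# (kernel, stride, padding) per conv layer; assumed from the common 70x70 PatchGAN layout
PATCHGAN_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]


def receptive_field(layers):
    """Input pixels seen by one output unit, computed from the last layer back."""
    rf = 1
    for kernel, stride, _ in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


def output_size(n, layers):
    """Spatial size of the discriminator's output map for an n x n input."""
    for kernel, stride, pad in layers:
        n = (n + 2 * pad - kernel) // stride + 1
    return n


print(receptive_field(PATCHGAN_LAYERS))   # 70
print(output_size(256, PATCHGAN_LAYERS))  # 30
```

Under these assumptions each of the 30×30 output scores judges one 70×70 patch of the 256×256 input, which is why the discriminator focuses on local texture rather than global layout.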

Results & analysis
I include a grid of generated outputs to assess visual quality. I report the Kaggle MiFID (public leaderboard score) and describe what helped or hurt performance along with any tuning I tried.

Conclusion
I restate the best result, summarize key learnings, explain what did not work and why, and list concrete next steps to improve the model.

Repo & submission
I provide a public GitHub repository with the notebook and a concise README explaining how to reproduce results, and I include a screenshot of my Kaggle leaderboard entry.

In [68]:
# ONE-CELL SUBMISSION MAKER
from pathlib import Path
import numpy as np, zipfile
from PIL import Image
import os

# 1) locate data
ROOT = Path("/kaggle/input/gan-getting-started")
MONET_DIR = ROOT/"monet_jpg"
PHOTO_DIR = ROOT/"photo_jpg"
monet_paths = sorted(MONET_DIR.glob("*.jpg"))
photo_paths = sorted(PHOTO_DIR.glob("*.jpg"))
print("Counts -> Monet:", len(monet_paths), "Photos:", len(photo_paths))

# 2) fast baseline 
def monetize_baseline(photo_img, ref_img):
    p = np.asarray(photo_img.convert("RGB").resize((256,256))).astype(np.float32)
    r = np.asarray(ref_img.convert("RGB").resize((256,256))).astype(np.float32)
    p_m, p_s = p.mean((0,1), keepdims=True), p.std((0,1), keepdims=True)+1e-6
    r_m, r_s = r.mean((0,1), keepdims=True), r.std((0,1), keepdims=True)+1e-6
    out = (p - p_m) / p_s * r_s + r_m
    return Image.fromarray(np.clip(out,0,255).astype(np.uint8))

# 3) generate 7000 JPGs and zip them
out_dir = Path("/kaggle/working/images"); out_dir.mkdir(exist_ok=True)
for f in out_dir.glob("*"): f.unlink()  # clean any old files

rng = np.random.default_rng(42)
refs = rng.choice(monet_paths, min(50, len(monet_paths)), replace=False)
sel  = rng.choice(photo_paths, min(7000, len(photo_paths)), replace=False)

for i, p in enumerate(sel, 1):
    im  = Image.open(p)
    ref = Image.open(rng.choice(refs))
    monetize_baseline(im, ref).save(out_dir/f"image_{i:05d}.jpg", "JPEG", quality=92)
    if i % 500 == 0: print(f"{i}/{len(sel)}")

zip_path = "/kaggle/working/images.zip"
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
    for f in out_dir.glob("*.jpg"):
        z.write(f, f.name)

print("Ready to submit:", zip_path, "| count:", len(list(out_dir.glob('*.jpg'))))


Counts -> Monet: 300 Photos: 7038
500/7000
1000/7000
1500/7000
2000/7000
2500/7000
3000/7000
3500/7000
4000/7000
4500/7000
5000/7000
5500/7000
6000/7000
6500/7000
7000/7000
Ready to submit: /kaggle/working/images.zip | count: 7000


In [11]:
# FAST generator (keeps Save & Run All quick and stable)
from PIL import Image
import numpy as np

def monetize_baseline(photo_img, ref_img):
    p = np.asarray(photo_img.convert("RGB").resize((256,256))).astype(np.float32)
    r = np.asarray(ref_img.convert("RGB").resize((256,256))).astype(np.float32)
    p_m, p_s = p.mean((0,1), keepdims=True), p.std((0,1), keepdims=True)+1e-6
    r_m, r_s = r.mean((0,1), keepdims=True), r.std((0,1), keepdims=True)+1e-6
    out = (p - p_m) / p_s * r_s + r_m
    return Image.fromarray(np.clip(out,0,255).astype(np.uint8))


In [None]:
from pathlib import Path
import zipfile
from IPython.display import FileLink

out_dir = Path("/kaggle/working/images")
zip_path = Path("/kaggle/working/images.zip")

# If the zip doesn't exist yet, create it from the JPGs you see in /kaggle/working/images
if not zip_path.exists():
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
        for f in out_dir.glob("*.jpg"):
            z.write(f, f.name)

print("zip exists:", zip_path.exists(), "files:", len(list(out_dir.glob('*.jpg'))))
FileLink(str(zip_path))


In [67]:
from pathlib import Path
import numpy as np, zipfile
from PIL import Image

# assumes monet_paths, photo_paths, monetize_baseline() already defined
out_dir = Path("/kaggle/working/images"); out_dir.mkdir(exist_ok=True)
# clean old outputs (PNGs, etc.)
for f in out_dir.glob("*.*"):
    f.unlink()

N = min(7000, len(photo_paths))          # 7k fits the 7k–10k rule
sel = np.random.choice(photo_paths, N, replace=False)
ref_pool = np.random.choice(monet_paths, min(50, len(monet_paths)), replace=False)

for i, p in enumerate(sel, 1):
    photo = Image.open(p)
    ref   = Image.open(np.random.choice(ref_pool))
    out   = monetize_baseline(photo, ref).resize((256,256)).convert("RGB")
    out.save(out_dir / f"image_{i:05d}.jpg", "JPEG", quality=92)
    if i % 500 == 0: print(f"{i}/{N} generated")

zip_path = "/kaggle/working/images.zip"
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
    for f in out_dir.glob("*.jpg"):
        z.write(f, arcname=f.name)

print("Ready to submit:", zip_path, "| count:", len(list(out_dir.glob('*.jpg'))))


500/7000 generated
1000/7000 generated
1500/7000 generated
2000/7000 generated
2500/7000 generated
3000/7000 generated
3500/7000 generated
4000/7000 generated
4500/7000 generated
5000/7000 generated
5500/7000 generated
6000/7000 generated
6500/7000 generated
7000/7000 generated
Ready to submit: /kaggle/working/images.zip | count: 7000


In [None]:
import matplotlib.pyplot as plt
gen_paths = sorted((Path("/kaggle/working/images")).glob("*.jpg"))[:9]
plt.figure(figsize=(6,6))
for i,p in enumerate(gen_paths,1):
    plt.subplot(3,3,i); plt.imshow(Image.open(p)); plt.axis("off")
plt.suptitle("Generated Monet-style samples"); plt.show()


Data: monet_jpg (300 paintings) and photo_jpg (7,038 photos), matching the counts printed by the data-loading cell.
Images are RGB; we standardize them to 256×256 for training and inference.


In [None]:
def show_grid(paths, title):
    idx = np.random.choice(len(paths), 9, replace=False)
    plt.figure(figsize=(6,6))
    for i,j in enumerate(idx,1):
        plt.subplot(3,3,i)
        plt.imshow(Image.open(paths[j]))
        plt.axis("off")
    plt.suptitle(title); plt.show()

show_grid(monet_paths, "Monet samples")
show_grid(photo_paths, "Photo samples")


In [None]:
def intensity_hist(paths, title, n=200):
    # Pool grayscale pixel values from up to n randomly sampled images
    vals = []
    for p in np.random.choice(paths, min(n, len(paths)), replace=False):
        arr = np.asarray(Image.open(p).convert("L").resize((256,256)))
        vals.append(arr.flatten())
    vals = np.concatenate(vals)
    plt.hist(vals, bins=30)
    plt.title(title); plt.xlabel("0..255"); plt.ylabel("count")
    plt.show()

# Separate titles so the two plots can be told apart
intensity_hist(monet_paths, "Pixel intensity distribution (Monet)")
intensity_hist(photo_paths, "Pixel intensity distribution (Photos)")


Plan: Start with a fast baseline (histogram-matching color transfer) to guarantee a valid submission. 
Then, if time allows, train a lightweight CycleGAN (ResNet-9 generator + 70×70 PatchGAN) with λ_cycle=10, λ_id=5, lr=2e-4, β1=0.5, batch=1–4, a few epochs.
Evaluation: Kaggle MiFID; we show qualitative grids and report public score.
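
For reference, MiFID can be written out as below. This follows my reading of the competition's evaluation description and should be checked against it: μ and Σ are the mean and covariance of Inception features for real (r) and generated (g) images, f denotes a feature vector, N is the number of generated images, and ε is a threshold fixed by the organizers.

```latex
\begin{aligned}
\mathrm{FID} &= \lVert \mu_r - \mu_g \rVert^2
  + \mathrm{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right) \\
d_{ij} &= 1 - \cos\!\left(f_{g_i}, f_{r_j}\right), \qquad
d = \frac{1}{N} \sum_{i} \min_{j} d_{ij} \\
d_{\mathrm{thr}} &=
  \begin{cases} d, & d < \epsilon \\ 1, & \text{otherwise} \end{cases} \\
\mathrm{MiFID} &= \mathrm{FID} \cdot \frac{1}{d_{\mathrm{thr}}}
\end{aligned}
```

Lower is better. The memorization term d only inflates the score when generated images sit very close to training images in feature space.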


In [None]:
# Try to use scikit-image; if unavailable, fall back to simple channel-wise mean/std transfer
try:
    from skimage.exposure import match_histograms
    def monetize_baseline(photo_img: Image.Image, ref_img: Image.Image) -> Image.Image:
        photo = photo_img.convert("RGB").resize((256,256))
        ref   = ref_img.convert("RGB").resize((256,256))
        matched = match_histograms(np.asarray(photo), np.asarray(ref), channel_axis=-1)
        return Image.fromarray(np.clip(matched,0,255).astype(np.uint8))
except ImportError:  # scikit-image is not installed
    def monetize_baseline(photo_img: Image.Image, ref_img: Image.Image) -> Image.Image:
        p = np.asarray(photo_img.convert("RGB").resize((256,256))).astype(np.float32)
        r = np.asarray(ref_img.convert("RGB").resize((256,256))).astype(np.float32)
        p_m, p_s = p.mean((0,1), keepdims=True), p.std((0,1), keepdims=True)+1e-6
        r_m, r_s = r.mean((0,1), keepdims=True), r.std((0,1), keepdims=True)+1e-6
        out = (p - p_m) / p_s * r_s + r_m
        return Image.fromarray(np.clip(out,0,255).astype(np.uint8))
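
For intuition about what histogram matching does on each channel, here is a minimal NumPy sketch of CDF matching. It illustrates the general technique and is not a copy of scikit-image's code; `match_channel` is a hypothetical name.

```python
import numpy as np


def match_channel(src, ref):
    """Remap src values so their cumulative distribution follows ref's."""
    s_vals, s_idx, s_counts = np.unique(
        src.ravel(), return_inverse=True, return_counts=True
    )
    r_vals, r_counts = np.unique(ref.ravel(), return_counts=True)
    # Empirical CDFs of the source and reference channels
    s_cdf = np.cumsum(s_counts) / src.size
    r_cdf = np.cumsum(r_counts) / ref.size
    # For each source quantile, look up the reference value at that quantile
    mapped = np.interp(s_cdf, r_cdf, r_vals)
    return mapped[s_idx].reshape(src.shape)
```

Unlike the mean/std fallback, which only shifts and scales each channel, this mapping reshapes the whole tonal distribution while preserving the rank order of the source pixels.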


In [None]:
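# Assumption: test_paths is not defined earlier in this notebook; the photo set is used as the input here.
test_paths = photo_paths
out_dir = Path("/kaggle/working/images"); out_dir.mkdir(exist_ok=True)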
ref_pool = np.random.choice(monet_paths, min(50, len(monet_paths)), replace=False)

for i, p in enumerate(test_paths, 1):
    photo = Image.open(p)
    ref   = Image.open(np.random.choice(ref_pool))
    out   = monetize_baseline(photo, ref)
    out.save(out_dir / (p.stem + ".png"))
    if i % 300 == 0: print(f"{i}/{len(test_paths)}")

zip_path = "/kaggle/working/images.zip"
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
    for f in out_dir.glob("*.png"):
        z.write(f, arcname=f.name)

print("Ready to submit:", zip_path)


CycleGAN plan:
- Generators: ResNet-9 (photo→Monet, Monet→photo) with InstanceNorm; 256×256 inputs; residual blocks in the bottleneck.
- Discriminators: 70×70 PatchGAN.
- Loss: L_adv (LSGAN), L_cycle (λ=10), L_identity (λ=5).
- Optimizer: Adam(lr=2e-4, β1=0.5). Batch size=1–4. Train for N steps/epochs with image flips/crops.
We will compare learning-rate and λ settings and report validation (proxy) losses plus qualitative grids; the public MiFID score is used for the final comparison.
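
The loss terms in this plan can be sketched in NumPy as follows. This is a framework-agnostic illustration under stated assumptions, not the training code: all function and argument names are hypothetical, the arrays stand in for discriminator scores and generator outputs, and the 0.5 factor on the discriminator loss is a common convention rather than something the plan specifies.

```python
import numpy as np


def lsgan_generator_loss(d_fake):
    """LSGAN generator term: push discriminator scores on fakes toward 1."""
    return np.mean((d_fake - 1.0) ** 2)


def lsgan_discriminator_loss(d_real, d_fake):
    """LSGAN discriminator term: real scores toward 1, fake scores toward 0."""
    return 0.5 * (np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))


def l1(a, b):
    """Mean absolute difference, used for the cycle and identity terms."""
    return np.mean(np.abs(a - b))


def generator_objective(d_fake_monet, d_fake_photo,
                        photo, photo_cycled, monet, monet_cycled,
                        monet_identity, photo_identity,
                        lambda_cycle=10.0, lambda_identity=5.0):
    """Combined objective for both generators: adversarial + cycle + identity."""
    adv = lsgan_generator_loss(d_fake_monet) + lsgan_generator_loss(d_fake_photo)
    cycle = l1(photo_cycled, photo) + l1(monet_cycled, monet)
    identity = l1(monet_identity, monet) + l1(photo_identity, photo)
    return adv + lambda_cycle * cycle + lambda_identity * identity
```

Here `photo_cycled` stands for a photo passed through photo→Monet and then Monet→photo, while `monet_identity` stands for a Monet painting fed to the photo→Monet generator, which should leave it nearly unchanged.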


Plan of analysis: Start with this fast color-transfer baseline to ensure a valid MiFID submission. 
Extension plan: CycleGAN with ResNet-9 generators and 70×70 PatchGAN discriminators; losses = adversarial + cycle (λ=10) + identity (λ=5); Adam lr=2e-4, β1=0.5, batch 1–4.
