Monet-Style Painting Generation using CycleGAN (https://www.kaggle.com/code/rohanxaviergupta/cycleganmonet | https://rohanxavier33.github.io/monet_gan/)
This project explores artistic style transfer using CycleGAN, specifically translating real-world photos into paintings in the style of Claude Monet. The model operates in an unsupervised setting, meaning it does not require paired data between the source and target domains. The submission achieved a MiFID score of 50.77 on Kaggle, placing it in the top ten for the competition.
The objective is to generate realistic Monet-style paintings from a set of real-world photos. The primary challenge is that this is an unsupervised image-to-image translation task, making CycleGAN an ideal choice due to its capability to learn domain transformations without paired examples.
The dataset used in this project is from the Kaggle GAN Getting Started Competition:
- Citation: Amy Jang, Ana Sofia Uzsoy, and Phil Culliton. I'm Something of a Painter Myself, Kaggle, 2020.
Dataset Details:
- 300 Monet-style paintings (256 x 256 pixels)
- 7,000 real-world photos (256 x 256 pixels)
EDA involved:
- Visual inspection of images from each domain.
- Checking for and removing any corrupted or inconsistent images.
- Normalization and preprocessing of images to prepare them for model training (a loading and normalization sketch follows this list).
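As a rough illustration of this preprocessing step, the sketch below loads an image, checks that it decodes cleanly, and rescales pixels to [-1, 1]. The [-1, 1] range, the PIL/NumPy tooling, and the helper names (`load_and_normalize`, `scan_for_corrupt_images`) are assumptions for illustration, not the notebook's exact code.

```python
# Minimal preprocessing sketch (assumptions: JPEG inputs, 256x256 target size,
# pixels rescaled to [-1, 1] to match a tanh generator output).
from pathlib import Path

import numpy as np
from PIL import Image


def load_and_normalize(path, size=256):
    """Load one image, force RGB at the target size, and scale pixels to [-1, 1]."""
    img = Image.open(path).convert("RGB").resize((size, size))
    arr = np.asarray(img, dtype=np.float32)
    return arr / 127.5 - 1.0  # [0, 255] -> [-1, 1]


def scan_for_corrupt_images(folder):
    """EDA helper: report files that fail to decode or have an unexpected shape."""
    bad = []
    for p in Path(folder).glob("*.jpg"):
        try:
            if load_and_normalize(p).shape != (256, 256, 3):
                bad.append(p)
        except OSError:  # PIL raises OSError subclasses for unreadable files
            bad.append(p)
    return bad
```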
The project utilizes CycleGAN due to its effectiveness in unsupervised style transfer tasks. The CycleGAN consists of two key components, a generator and a discriminator (an architecture sketch follows these bullets):
Generator:
- Encoder-decoder (U-Net) structure with skip connections for retaining essential spatial information.
- Downsampling followed by upsampling layers, ensuring preservation and accurate reconstruction of input image details.
Discriminator:
- Designed to distinguish real from generated images based on local patches (a PatchGAN-style design).
- Captures detailed texture and stylistic nuances crucial to Monet paintings.
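The sketch below illustrates these two components: a U-Net-style generator built from stride-2 downsampling blocks and transposed-convolution upsampling blocks joined by skip connections, and a patch-based discriminator. It is a minimal sketch written in PyTorch as an assumption; the notebook may use a different framework, depth, or layer configuration.

```python
# Minimal CycleGAN component sketch (assumption: PyTorch; depth and filter counts
# are illustrative, not the notebook's exact values).
import torch
import torch.nn as nn


def down_block(in_ch, out_ch, norm=True):
    # Stride-2 conv halves the spatial resolution (encoder / downsampling path).
    layers = [nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1, bias=not norm)]
    if norm:
        layers.append(nn.InstanceNorm2d(out_ch))
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return nn.Sequential(*layers)


def up_block(in_ch, out_ch):
    # Transposed conv doubles the spatial resolution (decoder / upsampling path).
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1, bias=False),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class UNetGenerator(nn.Module):
    """Encoder-decoder generator with skip connections (256x256 -> 256x256)."""

    def __init__(self):
        super().__init__()
        self.d1 = down_block(3, 64, norm=False)   # 256 -> 128
        self.d2 = down_block(64, 128)             # 128 -> 64
        self.d3 = down_block(128, 256)            # 64  -> 32
        self.d4 = down_block(256, 512)            # 32  -> 16
        self.u1 = up_block(512, 256)              # 16  -> 32
        self.u2 = up_block(256 + 256, 128)        # 32  -> 64  (skip concat)
        self.u3 = up_block(128 + 128, 64)         # 64  -> 128 (skip concat)
        self.out = nn.Sequential(
            nn.ConvTranspose2d(64 + 64, 3, 4, stride=2, padding=1),  # 128 -> 256
            nn.Tanh(),  # outputs in [-1, 1], matching the input normalization
        )

    def forward(self, x):
        s1 = self.d1(x)
        s2 = self.d2(s1)
        s3 = self.d3(s2)
        b = self.d4(s3)
        u = self.u1(b)
        u = self.u2(torch.cat([u, s3], dim=1))
        u = self.u3(torch.cat([u, s2], dim=1))
        return self.out(torch.cat([u, s1], dim=1))


class PatchDiscriminator(nn.Module):
    """Outputs a grid of real/fake scores, one score per local image patch."""

    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            down_block(3, 64, norm=False),
            down_block(64, 128),
            down_block(128, 256),
            nn.Conv2d(256, 1, 4, padding=1),  # no sigmoid: pair with a logits-based GAN loss
        )

    def forward(self, x):
        return self.model(x)
```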
Training:
- Loss functions used:
  - Cycle consistency loss
  - Adversarial loss
  - Identity loss (to stabilize and improve training)
- Optimizer: Adam with a learning rate scheduler.
- Training details: the model was trained for multiple epochs until the losses converged (a loss and optimizer sketch follows this list).
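A hedged sketch of how these losses and the optimizer might be set up is shown below. The least-squares adversarial loss, the L1 cycle and identity terms, the loss weights (10 for cycle, 5 for identity), the 2e-4 learning rate, and the step-decay scheduler are common CycleGAN defaults assumed for illustration, not values confirmed from the notebook.

```python
# Loss and optimizer sketch (assumptions noted above; framework assumed to be PyTorch).
import torch
import torch.nn as nn

mse = nn.MSELoss()  # least-squares (LSGAN-style) adversarial loss, assumed
l1 = nn.L1Loss()    # L1 distance for the cycle and identity terms


def adversarial_loss(disc_fake_out):
    # The generator is rewarded when the discriminator scores its fakes as "real" (ones).
    return mse(disc_fake_out, torch.ones_like(disc_fake_out))


def cycle_consistency_loss(real, reconstructed, lam=10.0):
    # photo -> Monet -> photo should reproduce the original photo (and vice versa).
    return lam * l1(reconstructed, real)


def identity_loss(real, same, lam=5.0):
    # Passing a Monet painting through the photo->Monet generator should change it little.
    return lam * l1(same, real)


# Stand-in module in place of the U-Net generator from the architecture sketch above.
generator = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())

# Adam optimizer plus a simple step-decay learning rate schedule (scheduler choice assumed).
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
sched_g = torch.optim.lr_scheduler.StepLR(opt_g, step_size=10, gamma=0.5)
```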
Generated images visually resemble Monet's style, effectively capturing characteristic color palettes and brushstroke textures.
- Qualitative evaluation through visual inspection showed effective stylistic conversion.
- Examples and visualizations are included to demonstrate successful translations.
- The model successfully captured the distinctive style elements of Monet's paintings.
- Generated images maintained high visual quality while accurately preserving the content of the source photos.
Challenges:
- Managing training stability required careful tuning of hyperparameters and loss weights.
- Occasional artifacts in generated images were reduced by refining image normalization and adjusting the architecture.
Included:
- Sample images: side-by-side comparisons of original photos and their Monet-style translations.
- Loss curves visualizing training progress and model stability.
- Histograms of color distributions before and after transformation (a plotting sketch follows this list).
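The snippet below sketches how two of these visuals could be produced with matplotlib. The function names and the assumption that images are NumPy arrays scaled to [-1, 1] are illustrative, not taken from the notebook.

```python
# Visualization sketch (assumptions: matplotlib, images as float arrays in [-1, 1]).
import matplotlib.pyplot as plt
import numpy as np


def show_side_by_side(photo, monet_style):
    """Original photo next to its Monet-style translation."""
    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    for ax, img, title in zip(axes, [photo, monet_style], ["Photo", "Monet-style"]):
        ax.imshow((img + 1.0) / 2.0)  # [-1, 1] -> [0, 1] for display
        ax.set_title(title)
        ax.axis("off")
    plt.show()


def plot_color_histograms(photo, monet_style, bins=50):
    """Per-channel color distributions before and after translation."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 3), sharey=True)
    for ax, img, title in zip(axes, [photo, monet_style], ["Photo", "Monet-style"]):
        for c, color in enumerate(["red", "green", "blue"]):
            ax.hist(img[..., c].ravel(), bins=bins, color=color, alpha=0.4)
        ax.set_title(title)
    plt.show()
```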
The model effectively demonstrated CycleGAN's capability to handle unpaired image translation tasks, generating visually compelling Monet-style images. The critical analysis focused on identifying which adjustments, such as hyperparameter tuning, normalization, and data augmentation, contributed positively to the final quality.
Key takeaways:
- Demonstrated CycleGAN's suitability for artistic image translation tasks.
- Learned the significance of proper preprocessing and the effect of cycle consistency loss on training stability.
- Gained insight into the importance of qualitative evaluation methods in GAN-based style transfer tasks.
Future work:
- Experiment with deeper architectures and advanced normalization methods.
- Apply data augmentation strategies for improved generalization.
Visual examples of generated Monet-style images compared to their original photos are provided within the notebook, highlighting the effectiveness of the CycleGAN architecture in this artistic translation task.