Skip to content

rohanxavier33/monet_gan

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project Overview

This project explores artistic style transfer using CycleGAN, specifically translating real-world photos into paintings resembling the style of Claude Monet. The model operates in an unsupervised setting, meaning it does not require paired data between source and target domains. This project has a 50.77 MiFID score on Kaggle which is in the top ten for the competition.

Problem Statement

The objective is to generate realistic Monet-style paintings from a set of real-world photos. The primary challenge is that this is an unsupervised image-to-image translation task, making CycleGAN an ideal choice due to its capability to learn domain transformations without paired examples.

Dataset

The dataset used in this project is from the Kaggle GAN Getting Started Competition:

Dataset Details:

  • 300 Monet-style paintings (256 x 256 pixels)

  • 7,000 real-world photos (256 x 256 pixels)

Exploratory Data Analysis (EDA)

EDA involved:

  • Visual inspection of images from each domain.

  • Verification and cleaning for any corrupted or inconsistent images.

  • Normalization and preprocessing of images to prepare for model training.

Model Architecture

The project utilizes CycleGAN due to its effectiveness in unsupervised style transfer tasks. The CycleGAN consists of two key components:

Generators (ResNet-based U-Net)

  • Encoder-Decoder (U-Net) structure with skip connections for retaining essential spatial information.

  • Downsampling followed by upsampling layers, ensuring preservation and accurate reconstruction of input image details.

Discriminators (PatchGAN)

  • Designed to distinguish real from generated images based on local patches.

  • Captures detailed texture and stylistic nuances crucial to Monet paintings.

Training Approach

  • Loss functions used:

    • Cycle Consistency Loss

    • Adversarial Loss

    • Identity Loss (to stabilize and improve training)

  • Optimizer: Adam with a learning rate scheduler.

  • Training details: Model trained for multiple epochs until convergence of losses.

Results & Evaluation

Generated images visually resemble Monet's style, effectively capturing characteristic color palettes and brushstroke textures.

  • Qualitative evaluation through visual inspection showed effective stylistic conversion.

  • Examples and visualizations included to demonstrate successful translations.

Analysis of Results

Successes:

  • The model successfully captured the distinctive style elements of Monet's paintings.

  • Generated images maintained high visual quality and preserved photo content accurately.

Challenges and Troubleshooting:

  • Managing training stability required careful tuning of hyperparameters and losses.

  • Occasionally encountered artifacts in generated images, improved by refining image normalization and architecture adjustments.

Visualizations

Included:

  • Sample images: side-by-side comparisons of original photos and their Monet-style translations.

  • Loss curves visualizing training progress and model stability.

  • Histograms for color distributions pre- and post-transformation.

Results and Analysis

The model effectively demonstrated CycleGAN's capability to handle unpaired image translation tasks, generating visually compelling Monet-style images. The critical analysis included recognizing what adjustments, such as tuning hyperparameters, normalization, and data augmentation, contributed positively to the final quality.

Key Learnings

  • Demonstrated CycleGAN's suitability for artistic image translation tasks.

  • Learned the significance of proper preprocessing and the effect of cycle consistency loss in training stability.

  • Gained insights into the importance of qualitative evaluation methods in GAN-based style transfer tasks.

Future Improvements

  • Experiment with deeper architectures and advanced normalization methods.

  • Apply data augmentation strategies for improved generalization.

Results Showcase

Visual examples of generated Monet-style images compared to their original photos are provided within the notebook, highlighting the effectiveness of the CycleGAN architecture in this artistic translation task.

About

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published