<a href="https://colab.research.google.com/github/bkkaggle/pytorch-CycleGAN-and-pix2pix/blob/master/pix2pix.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install

In [None]:
!git clone https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix

In [9]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "3"
# os.chdir('pytorch-CycleGAN-and-pix2pix/')

In [4]:
# !pip install -r requirements.txt

# Datasets

Download one of the official datasets with:

-   `bash ./datasets/download_pix2pix_dataset.sh [cityscapes, night2day, edges2handbags, edges2shoes, facades, maps]`

Or use your own dataset by creating the appropriate folders and adding in the images. Follow the instructions [here](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/master/docs/datasets.md#pix2pix-datasets).
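
Before training on a custom dataset, it can help to confirm the folder layout. The sketch below assumes the aligned layout described in the linked instructions: one subfolder per phase (`train`, `test`) under the dataroot, each holding images with A and B side by side. The helper name `check_pix2pix_layout` and the extension list are illustrative and not part of the repository.

```python
from pathlib import Path

# Extensions treated as images here; an assumption for this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def check_pix2pix_layout(dataroot, phases=("train", "test")):
    """Return {phase: image_count}; a missing phase folder counts as 0."""
    root = Path(dataroot)
    counts = {}
    for phase in phases:
        phase_dir = root / phase
        if phase_dir.is_dir():
            counts[phase] = sum(
                1 for p in phase_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
        else:
            counts[phase] = 0
    return counts

# A phase with 0 images usually means the folder is missing or misnamed.
print(check_pix2pix_layout("./datasets/facades"))
```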

In [5]:
!bash ./datasets/download_pix2pix_dataset.sh facades

Specified [facades]
for details.

--2025-08-03 19:33:27--  http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/facades.tar.gz
Resolving efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)... 128.32.244.190
Connecting to efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)|128.32.244.190|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30168306 (29M) [application/x-gzip]
Saving to: ‘./datasets/facades.tar.gz’


2025-08-03 19:34:13 (663 KB/s) - ‘./datasets/facades.tar.gz’ saved [30168306/30168306]

facades/
facades/test/
facades/test/27.jpg
facades/test/5.jpg
facades/test/72.jpg
facades/test/1.jpg
facades/test/10.jpg
facades/test/100.jpg
facades/test/101.jpg
facades/test/102.jpg
facades/test/103.jpg
facades/test/104.jpg
facades/test/105.jpg
facades/test/106.jpg
facades/test/11.jpg
facades/test/12.jpg
facades/test/13.jpg
facades/test/14.jpg
facades/test/15.jpg
facades/test/16.jpg
facades/test/17.jpg
facades/test/18.jpg
facades/test/19.jpg
facades/test/2.

# Pretrained models

Download one of the official pretrained models with:

-   `bash ./scripts/download_pix2pix_model.sh [edges2shoes, sat2map, map2sat, facades_label2photo, and day2night]`

Or add your own pretrained model to `./checkpoints/{NAME}_pretrained/latest_net_G.pth`
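
A quick way to check that a pretrained generator sits where the path pattern above expects it, matching the `.pth` file the download script saves. The helper name `pretrained_generator_path` is hypothetical and only wraps that pattern.

```python
from pathlib import Path

def pretrained_generator_path(name, checkpoints_dir="./checkpoints"):
    """Build the path where a pretrained generator is expected to live."""
    return Path(checkpoints_dir) / f"{name}_pretrained" / "latest_net_G.pth"

path = pretrained_generator_path("facades_label2photo")
print(path, "exists" if path.exists() else "is missing")
```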

In [6]:
!bash ./scripts/download_pix2pix_model.sh facades_label2photo

Note: available models are edges2shoes, sat2map, map2sat, facades_label2photo, and day2night
Specified [facades_label2photo]
for details.

--2025-08-03 19:34:13--  http://efrosgans.eecs.berkeley.edu/pix2pix/models-pytorch/facades_label2photo.pth
Resolving efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)... 128.32.244.190
Connecting to efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)|128.32.244.190|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 217704720 (208M)
Saving to: ‘./checkpoints/facades_label2photo_pretrained/latest_net_G.pth’


2025-08-03 19:38:48 (777 KB/s) - ‘./checkpoints/facades_label2photo_pretrained/latest_net_G.pth’ saved [217704720/217704720]



# Training

-   `python train.py --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --direction BtoA`

Change `--dataroot` and `--name` to your own dataset's path and model's name. Use `--gpu_ids 0,1,..` to train on multiple GPUs and `--batch_size` to change the batch size. Add `--direction BtoA` to train a model that transforms from domain B to domain A.
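
If you launch training from Python rather than a shell cell, the options above can be assembled programmatically. This is a minimal sketch: `build_train_command` is a hypothetical helper, and its defaults mirror the values printed in the options dump (`direction: AtoB`, `gpu_ids: 0`, `batch_size: 1`).

```python
import shlex

def build_train_command(dataroot, name, direction="AtoB", gpu_ids=(0,), batch_size=1):
    """Assemble the train.py argument list from the options discussed above."""
    return [
        "python", "train.py",
        "--dataroot", dataroot,
        "--name", name,
        "--model", "pix2pix",
        "--direction", direction,
        "--gpu_ids", ",".join(str(g) for g in gpu_ids),
        "--batch_size", str(batch_size),
    ]

cmd = build_train_command("./datasets/facades", "facades_pix2pix",
                          direction="BtoA", gpu_ids=(0, 1), batch_size=4)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.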

In [11]:
!python train.py --dataroot ./datasets/personal_gallery --name facades_pix2pix --model pix2pix --direction BtoA

----------------- Options ---------------
               batch_size: 1                             
                    beta1: 0.5                           
          checkpoints_dir: ./checkpoints                 
           continue_train: False                         
                crop_size: 256                           
                 dataroot: ./datasets/personal_gallery   	[default: None]
             dataset_mode: aligned                       
                direction: BtoA                          	[default: AtoB]
             display_freq: 400                           
          display_winsize: 256                           
                    epoch: latest                        
              epoch_count: 1                             
                 gan_mode: vanilla                       
                  gpu_ids: 0                             
                init_gain: 0.02                          
                init_type: normal                       

# Testing

-   `python test.py --dataroot ./datasets/facades --direction BtoA --model pix2pix --name facades_pix2pix`

Change the `--dataroot`, `--name`, and `--direction` to be consistent with your trained model's configuration and how you want to transform images.

> from https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix:
> Note that we specified --direction BtoA as Facades dataset's A to B direction is photos to labels.

> If you would like to apply a pre-trained model to a collection of input images (rather than image pairs), please use --model test option. See ./scripts/test_single.sh for how to apply a model to Facade label maps (stored in the directory facades/testB).

> See a list of currently available models at ./scripts/download_pix2pix_model.sh
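
To locate the output of a test run, the sketch below assumes results are written to `./results/{name}/{phase}_{epoch}/images`, the pattern used by the visualization cells later in this notebook (`test_latest` with the default `epoch: latest`). The helper `results_image_dir` is hypothetical.

```python
from pathlib import Path

def results_image_dir(name, phase="test", epoch="latest", results_dir="./results"):
    """Folder where a test run for experiment `name` is assumed to save images."""
    return Path(results_dir) / name / f"{phase}_{epoch}" / "images"

image_dir = results_image_dir("facades_pix2pix")
print(image_dir, "found" if image_dir.is_dir() else "not found yet")
```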

In [12]:
!ls checkpoints/

facades_label2photo_pretrained	facades_pix2pix


In [16]:
!python test.py --dataroot ./datasets/personal_gallery --direction BtoA --model pix2pix --name facades_pix2pix --use_wandb

----------------- Options ---------------
             aspect_ratio: 1.0                           
               batch_size: 1                             
          checkpoints_dir: ./checkpoints                 
                crop_size: 256                           
                 dataroot: ./datasets/personal_gallery   	[default: None]
             dataset_mode: aligned                       
                direction: BtoA                          	[default: AtoB]
          display_winsize: 256                           
                    epoch: latest                        
                     eval: False                         
                  gpu_ids: 0                             
                init_gain: 0.02                          
                init_type: normal                        
                 input_nc: 3                             
                  isTrain: False                         	[default: None]
                load_iter: 0            

# Visualize

# Understanding pix2pix: Image-to-Image Translation

## What is pix2pix?

**pix2pix** is a Generative Adversarial Network (GAN) designed for **image-to-image translation**. It learns to map from one image domain to another using paired training data.

## How it Works:

1. **Generator Network**: Takes an input image and generates a corresponding output image
2. **Discriminator Network**: Tries to distinguish between real and fake image pairs
3. **Adversarial Training**: Generator tries to fool the discriminator, discriminator tries to catch fakes
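
The three roles above can be illustrated with toy numbers. This is a sketch, not the repository's implementation: it uses plain binary cross-entropy (corresponding to `gan_mode: vanilla` in the training options) on single probabilities instead of tensors, and the L1 reconstruction term and its weight `lambda_l1=100.0` are assumptions taken from the pix2pix paper rather than from this notebook.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one probability in (0, 1) and a 0/1 target."""
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))

def l1(pred, truth):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

def discriminator_loss(d_real, d_fake):
    """D wants real pairs scored 1 and generated pairs scored 0."""
    return 0.5 * (bce(d_real, 1) + bce(d_fake, 0))

def generator_loss(d_fake, fake_pixels, real_pixels, lambda_l1=100.0):
    """G wants its pairs scored 1 and its pixels close to the ground truth."""
    return bce(d_fake, 1) + lambda_l1 * l1(fake_pixels, real_pixels)

# Toy numbers: D is fairly sure the real pair is real and the fake pair is fake.
d_real, d_fake = 0.9, 0.2
fake_pixels, real_pixels = [0.5, 0.4, 0.9], [0.5, 0.5, 1.0]
print(f"D loss: {discriminator_loss(d_real, d_fake):.4f}")
print(f"G loss: {generator_loss(d_fake, fake_pixels, real_pixels):.4f}")
```

As the generator improves, `d_fake` rises toward 1 and the pixel error shrinks, so the generator loss falls while the discriminator's task gets harder.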

## The Three Images Explained:

When you run pix2pix testing, you get three images for each sample:

### 🎯 **real_A**: Input Image (Source Domain)
- This is the **original input** image you want to transform
- In facades dataset: architectural labels/sketches
- In your case: the source image you're transforming

### 🏆 **real_B**: Ground Truth Target (Target Domain) 
- This is the **real target** image (what the output should look like)
- In facades dataset: actual photographs of buildings
- Used for comparison to see how well the model performed

### 🤖 **fake_B**: Generated Output (Model's Prediction)
- This is what the **pix2pix model generated** from real_A
- The model's attempt to transform real_A into the target domain
- Compare this with real_B to evaluate quality
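
Since every sample produces three files, grouping them by sample ID makes it easy to spot incomplete triples. The sketch assumes the `<id>_<kind>.png` naming seen in the results folder (for example `100_fake_B.png`); `group_result_images` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import Path

KINDS = ("real_A", "real_B", "fake_B")

def group_result_images(filenames):
    """Map sample ID -> {kind: filename} for files named '<id>_<kind>.png'."""
    groups = defaultdict(dict)
    for name in filenames:
        stem = Path(name).stem  # e.g. '100_fake_B'
        for kind in KINDS:
            if stem.endswith("_" + kind):
                # Strip the suffix so IDs containing underscores stay intact.
                sample_id = stem[: -len(kind) - 1]
                groups[sample_id][kind] = name
                break
    return dict(groups)

files = ["100_real_A.png", "100_real_B.png", "100_fake_B.png", "my_photo_fake_B.png"]
for sample_id, kinds in group_result_images(files).items():
    missing = [k for k in KINDS if k not in kinds]
    print(sample_id, "complete" if not missing else f"missing {missing}")
```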

## Example Workflow:
```
real_A (label) → [pix2pix model] → fake_B (generated photo)
                                        ↓
                                   Compare with real_B (actual photo)
```

## Evaluation:
- **Good model**: fake_B should look very similar to real_B
- **Poor model**: fake_B will look unrealistic or different from real_B
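
"Similar" can also be given a rough number. The sketch below computes the mean absolute pixel difference between two images, assuming both are arrays of the same shape with values in [0, 1], as `plt.imread` returns for PNG files. It is only a crude proxy: a low pixel error does not guarantee a realistic-looking image, so it complements visual inspection rather than replacing it. The function name `mean_abs_error` is illustrative.

```python
import numpy as np

def mean_abs_error(fake_b, real_b):
    """Mean absolute pixel difference; 0.0 means the images are identical."""
    fake_b = np.asarray(fake_b, dtype=np.float64)
    real_b = np.asarray(real_b, dtype=np.float64)
    if fake_b.shape != real_b.shape:
        raise ValueError(f"shape mismatch: {fake_b.shape} vs {real_b.shape}")
    return float(np.mean(np.abs(fake_b - real_b)))

# Synthetic stand-ins for two plt.imread(...) results (floats in [0, 1]).
real = np.zeros((4, 4, 3))
fake = np.full((4, 4, 3), 0.25)
print(mean_abs_error(fake, real))
```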

Let's visualize all three to see the comparison!

In [None]:
import matplotlib.pyplot as plt

img = plt.imread('./results/facades_label2photo_pretrained/test_latest/images/100_fake_B.png')
plt.imshow(img)

In [None]:
img = plt.imread('./results/facades_label2photo_pretrained/test_latest/images/100_real_A.png')
plt.imshow(img)

In [None]:
img = plt.imread('./results/facades_label2photo_pretrained/test_latest/images/100_real_B.png')
plt.imshow(img)

In [None]:
# Complete pix2pix Visualization: Compare Input, Target, and Generated Images
import matplotlib.pyplot as plt
import os
import glob

def visualize_pix2pix_results(results_dir, sample_id="100"):
    """
    Visualize pix2pix results showing real_A, real_B, and fake_B side by side.
    
    Args:
        results_dir: Path to results directory
        sample_id: ID of the sample to visualize (default: "100")
    """
    
    # Define image paths
    real_A_path = f"{results_dir}/{sample_id}_real_A.png"
    real_B_path = f"{results_dir}/{sample_id}_real_B.png"
    fake_B_path = f"{results_dir}/{sample_id}_fake_B.png"
    
    # Collect paths, titles, and descriptions for the three panels
    paths = [real_A_path, real_B_path, fake_B_path]
    labels = ["Input (real_A)", "Ground Truth (real_B)", "Generated (fake_B)"]
    descriptions = [
        "Original input image\n(what we want to transform)",
        "Target ground truth\n(what it should look like)", 
        "Model's generated output\n(what pix2pix produced)"
    ]
    
    # Create subplot
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    fig.suptitle(f'pix2pix Results Comparison - Sample {sample_id}', fontsize=16, fontweight='bold')
    
    for i, (path, label, desc) in enumerate(zip(paths, labels, descriptions)):
        if os.path.exists(path):
            img = plt.imread(path)
            axes[i].imshow(img)
            axes[i].set_title(f'{label}\n{desc}', fontsize=10, pad=10)
            axes[i].axis('off')
        else:
            axes[i].text(0.5, 0.5, f'Image not found:\n{path}', 
                        ha='center', va='center', transform=axes[i].transAxes,
                        fontsize=10, color='red')
            axes[i].set_title(label, fontsize=12)
            axes[i].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    # Print analysis
    print("🔍 ANALYSIS:")
    print("• Compare the middle (Ground Truth) with the right (Generated)")
    print("• Good model: Generated image should closely match Ground Truth")
    print("• Look for: realistic textures, correct colors, proper structure")
    print("• Common issues: blurriness, color shifts, missing details")

# Find available results directories
results_base = "./results"
if os.path.exists(results_base):
    result_dirs = [d for d in os.listdir(results_base) if os.path.isdir(os.path.join(results_base, d))]
    print(f"📁 Available result directories: {result_dirs}")
    
    # Use the first directory that contains test results
    for result_dir in result_dirs:
        full_path = os.path.join(results_base, result_dir)
        
        # Look for test results
        test_paths = glob.glob(f"{full_path}/test*/images/")
        if test_paths:
            images_dir = test_paths[0]
            print(f"\n🎯 Using results from: {images_dir}")
            
            # Find available sample IDs
            sample_files = glob.glob(f"{images_dir}*_real_A.png")
            if sample_files:
                # Extract sample IDs by stripping the '_real_A.png' suffix,
                # so IDs that contain underscores are kept intact
                sample_ids = [os.path.basename(f)[:-len('_real_A.png')] for f in sample_files]
                shown = f"{sample_ids[:5]}..." if len(sample_ids) > 5 else f"{sample_ids}"
                print(f"📊 Available samples: {shown}")
                
                # Visualize the first available sample
                sample_id = sample_ids[0]
                print(f"\n🖼️  Visualizing sample: {sample_id}")
                visualize_pix2pix_results(images_dir, sample_id)
                break
            else:
                print(f"❌ No image files found in {images_dir}")
else:
    print("❌ Results directory not found. Please run the test command first!")

# Your Specific Use Case: Facades Dataset

## What You're Seeing:

In the **facades_label2photo** model you're using:

- **real_A (Input)**: 🏗️ **Architectural label maps** - simplified, colored semantic segmentation of building facades
  - Different colors represent different building elements (walls, windows, doors, etc.)
  - Think of it as a "coloring book" version of a building

- **real_B (Ground Truth)**: 📸 **Real photographs** - actual photos of building facades
  - These are the real-world images the model should learn to generate
  - High detail with textures, lighting, shadows, materials

- **fake_B (Generated)**: 🎨 **AI-generated photos** - what pix2pix creates from the labels
  - The model tries to add realistic textures, lighting, and details
  - Should transform the simple labels into photorealistic building images

## The Magic: 
The model learns to translate **architectural label maps** → **realistic building photos**!

## Applications:
- **Architecture visualization**: Turn building plans into realistic renderings
- **Game development**: Generate realistic textures from simple maps  
- **Urban planning**: Visualize proposed buildings from blueprints
- **Art creation**: Transform sketches into photorealistic images

## Quality Indicators:
- ✅ **Good results**: Sharp textures, realistic lighting, proper building materials
- ❌ **Poor results**: Blurry details, unrealistic colors, missing architectural elements