# **Cycle-GAN Consistency**

Hallucinations are one of the main problems in generative models, and avoiding them is an immediate necessity for a fair integration of AI models into critical environments. In this notebook we explore a metric, or explanation, that lets us know whether the results we are seeing are actually trustworthy or a made-up scenario created by the model.

Author:  
@Jwpr-dpr  
23-04-2025

## **Introduction: Trusting Super-Resolution GANs through Cycle-Consistency**
In recent years, Generative Adversarial Networks (GANs) have become the driving force behind stunning advances in image generation and enhancement. Among these, Super-Resolution GANs (SRGANs) have drawn attention for their ability to take low-resolution (LR) images and transform them into high-resolution (HR) versions that are visually convincing, even photo-realistic.

However, this impressive performance comes at a cost: how do we trust what the model is generating? In real-world applications like medical imaging, satellite analysis, or surveillance, a hallucinated detail could mislead decisions. We might get images that look good, but introduce artifacts that never existed in the original scene.

* The Problem of Trust in SRGANs  
SRGANs prioritize perceptual quality, often using adversarial and perceptual losses. This encourages visually pleasing results, but not necessarily faithful ones. We might get sharper images, but with invented textures and no guarantee that key structures remain intact.

So, how can we assess if an SRGAN is not just making things up?

* Introducing Cycle-Consistency  
The idea of cycle-consistency emerged from CycleGANs, a type of GAN designed for unpaired image translation (like turning a horse into a zebra, and back again). The key concept is:

If you transform an image to another domain and then back again, you should end up close to where you started. In the context of image super-resolution we can’t reverse the resolution naturally, but we can simulate it: upsample with the SRGAN, then downsample again. If the downsampled image resembles the original LR input, the model has preserved the structure visible at low resolution, which is evidence against hallucination.
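The round trip described above can be sketched in a few lines. This is a minimal illustration, not the notebook's evaluation code: plain bicubic upsampling stands in for the SR model (an assumption made only so the example runs without trained weights), and the toy LR image is a smooth horizontal ramp.

```python
import torch
import torch.nn.functional as F

# Toy LR image: a smooth horizontal ramp, shape (N, C, H, W) = (1, 3, 16, 16).
lr = torch.linspace(0, 1, 16).repeat(1, 3, 16, 1)

# Stand-in for the SR model (assumption): plain bicubic x4 upsampling.
sr = F.interpolate(lr, scale_factor=4, mode="bicubic", align_corners=False)

# Simulated inverse: bicubic downsampling back to the LR grid.
rec = F.interpolate(sr, size=lr.shape[-2:], mode="bicubic", align_corners=False)

# Mean absolute difference between the reconstruction and the original input.
cycle_error = F.l1_loss(rec, lr).item()
print(sr.shape, rec.shape, cycle_error)
```

Because the stand-in model only interpolates and invents nothing, the reconstruction should land very close to the input, which is the behaviour we would like a trustworthy SR model to share.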

## **Theoretical Framework: Key Concepts Behind CycleGAN and Cycle-Consistency**
To understand how cycle-consistency can be used to evaluate models like SRGANs, we first need to revisit a few core ideas from GANs and CycleGANs. This section outlines the most relevant concepts and mathematical formulations.

1. Generative Adversarial Networks (GANs)  
A GAN consists of two neural networks:

* Generator 𝐺 learns to generate realistic data from random or structured input.
* Discriminator 𝐷 learns to distinguish between real and fake (generated) data.

They are trained in a minimax game, where one tries to fool the other.
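
For reference, the minimax game described above is usually written as the value function from the original GAN formulation. The symbols $p_{data}$ (data distribution) and $p_z$ (distribution of the generator's input) are notation introduced here and are not defined elsewhere in this notebook:

$$ \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right] $$

The discriminator pushes $V$ up by classifying real and generated samples correctly, while the generator pushes it down by making $D(G(z))$ close to 1.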

2. SRGAN: Super-Resolution with Perceptual loss
SRGAN enhances low-res images using perceptual loss instead of just pixel-wise loss. The objective combines:

* Adversarial loss (realism)
* Content loss (perceptual similarity)
* Pixel-wise loss (optional)
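
One common way to combine the terms listed above is a weighted sum. The weights $\lambda_{adv}$ and $\lambda_{pix}$ are illustrative symbols, not values taken from this notebook:

$$ \mathcal{L}_{G} = \mathcal{L}_{content} + \lambda_{adv}\,\mathcal{L}_{adv} + \lambda_{pix}\,\mathcal{L}_{pix} $$

None of these terms compares the SR output directly against the LR input, which is the gap a cycle-consistency check tries to cover.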

3. CycleGAN and Cycle-Consistency

The idea behind cycle-consistency is the same as working with functions and their inverses. Say we have a function $f$ that takes inputs (images) from a lower-dimensional space $X$ to a higher-dimensional space $Y$.

$$f:X \rightarrow Y$$

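If this function has an inverse, there is a function that takes images from the higher-dimensional space $Y$ back to the lower-dimensional space $X$. Super-resolution is not invertible in the strict sense, because many HR images correspond to the same LR image, so in practice $g$ is a downsampling operator that acts as an approximate inverse of $f$ on the images the model produces: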

$$ f^{-1} = g:Y \rightarrow X $$

If we compute $g(f(x))$, we should recover our original $x$, since the two functions cancel each other. Cycle-consistency checks how closely this holds. If we downsample our enhanced image, we should get an image that is nearly identical to the original input (some contrast or texture may vary, but not the general structure). That gives us evidence that image integrity is preserved after enhancement and that no hallucinations visible at low resolution were introduced into our SR images.

This is going to be calculated as a loss, through this expression:

$$ \mathcal{L}_{cyc} = \mathbb{E}_{x \sim X}\left[\lVert g(f(x)) - x \rVert_1\right] + \mathbb{E}_{y \sim Y}\left[\lVert f(g(y)) - y \rVert_1\right] $$
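
As a sketch of how this specializes to super-resolution, take the SR model as the forward map $f$ and replace the learned inverse with a fixed downsampling operator, written here as $d_s$ for scale factor $s$ (a symbol introduced only for this illustration). At evaluation time only LR inputs are available, so only the first term is computed:

$$ \mathcal{L}_{cyc}^{SR} = \mathbb{E}_{x \sim X}\left[\lVert d_s(f(x)) - x \rVert_1\right] $$

The second term would require HR ground truth, so it is left out of the evaluation implemented in the next section. In code the norm is averaged over pixels, which only rescales the value.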

## **Implementation**

In this section we cover the required tools and libraries, the auxiliary functions, and the core evaluation function.

In [1]:
import torch
import torch.nn.functional as F
from torchvision.transforms.functional import InterpolationMode, resize
from torchvision.utils import make_grid
import matplotlib.pyplot as plt


In [2]:
def downsample(img_tensor, scale_factor=4, mode='bicubic'):
    """
    Downsamples a high-resolution image back to low-resolution using interpolation.

    `mode` is an interpolation name such as 'bicubic', 'bilinear' or 'nearest'.
    """
    h, w = img_tensor.shape[-2:]
    new_h, new_w = h // scale_factor, w // scale_factor
    # resize() expects an InterpolationMode enum rather than a plain string
    interpolation = InterpolationMode(mode)
    return resize(img_tensor, [new_h, new_w], interpolation=interpolation)

def denormalize(tensor, mean=0.5, std=0.5):
    return tensor * std + mean

def normalize(tensor, mean=0.5, std=0.5):
    return (tensor - mean) / std



Now we implement the loss:

In [None]:
def cycle_consistency_loss(sr_model, lr_img, downsample_fn, scale_factor=4):
    """
    Computes the L1 cycle-consistency loss between the original LR image
    and the one obtained after SR -> Downsampling.
    """
    with torch.no_grad():
        sr_img = sr_model(lr_img)                    # LR -> HR
        rec_img = downsample_fn(sr_img, scale_factor) # HR -> LR (reverse)
    
    loss = F.l1_loss(rec_img, lr_img)
    return loss.item()
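
As a quick sanity check of the function above, the sketch below compares two toy models on the same input. Both `BicubicSR` and `HallucinatingSR` are hypothetical stand-ins invented for this example (no trained SRGAN is involved), and `bicubic_down` is an assumed `downsample_fn` built on `F.interpolate`. The loss function is repeated so the cell runs on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BicubicSR(nn.Module):
    """Toy 'faithful' model: bicubic x4 upsampling, no learned parameters."""
    def forward(self, x):
        return F.interpolate(x, scale_factor=4, mode="bicubic", align_corners=False)

class HallucinatingSR(BicubicSR):
    """Toy 'unfaithful' model: paints a bright square that is absent from the input."""
    def forward(self, x):
        sr = super().forward(x)
        sr[..., 8:40, 8:40] += 0.5
        return sr

def bicubic_down(img, scale_factor=4):
    # Assumed downsample_fn: bicubic resize back to the LR grid.
    h, w = img.shape[-2:]
    return F.interpolate(img, size=(h // scale_factor, w // scale_factor),
                         mode="bicubic", align_corners=False)

def cycle_consistency_loss(sr_model, lr_img, downsample_fn, scale_factor=4):
    # Same logic as the notebook function, repeated so this cell is self-contained.
    with torch.no_grad():
        rec_img = downsample_fn(sr_model(lr_img), scale_factor)
    return F.l1_loss(rec_img, lr_img).item()

# Batch of two smooth ramp images, shape (2, 3, 16, 16).
lr = torch.linspace(0, 1, 16).repeat(2, 3, 16, 1)

faithful = cycle_consistency_loss(BicubicSR(), lr, bicubic_down)
unfaithful = cycle_consistency_loss(HallucinatingSR(), lr, bicubic_down)
print(f"faithful: {faithful:.4f}  hallucinating: {unfaithful:.4f}")
```

The model that paints extra content should score clearly worse. This toy only shows that the metric reacts to invented content that survives downsampling; very fine detail that averages out when downsampled would leave the score almost unchanged.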


All of the above functions can go into our utils.py module, where we store the auxiliary functions for appropriate modularization. Finally, the evaluate_cycle_consistency function below can be integrated into our evaluate function, so we can see how this metric evolves epoch by epoch.

In [None]:
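def evaluate_cycle_consistency(sr_model, dataloader, device, downsample_fn, scale_factor=4):
    """
    Averages the cycle-consistency loss over a dataloader.
    Assumes each batch is a tensor of LR images with shape (N, C, H, W).
    """
    sr_model.eval()  # inference behaviour for layers such as dropout and batch norm
    total_loss = 0.0
    num_samples = 0

    for batch in dataloader:
        lr_imgs = batch.to(device)
        loss = cycle_consistency_loss(sr_model, lr_imgs, downsample_fn, scale_factor)
        # weight each batch mean by its size so a smaller last batch is handled correctly
        total_loss += loss * lr_imgs.size(0)
        num_samples += lr_imgs.size(0)

    avg_loss = total_loss / num_samples
    return avg_loss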
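
The sketch below shows one way the evaluation loop could be driven end to end. Everything here is a stand-in chosen for illustration: `LROnlyDataset` is a hypothetical dataset of random tensors, `nn.Upsample` plays the role of the SR model, and `mean_cycle_error` is a compact restatement of the loop above so the cell runs on its own. Note that the dataset yields bare tensors, since the loop calls `batch.to(device)` directly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset

class LROnlyDataset(Dataset):
    """Hypothetical dataset that yields bare LR tensors (no labels, no HR pair)."""
    def __init__(self, n=10, size=16):
        gen = torch.Generator().manual_seed(0)
        self.images = torch.rand(n, 3, size, size, generator=gen)

    def __len__(self):
        return self.images.size(0)

    def __getitem__(self, idx):
        return self.images[idx]

def toy_downsample(img, scale_factor=4):
    # Stand-in downsample_fn: bicubic resize back to the LR grid.
    h, w = img.shape[-2:]
    return F.interpolate(img, size=(h // scale_factor, w // scale_factor),
                         mode="bicubic", align_corners=False)

def mean_cycle_error(sr_model, dataloader, device, downsample_fn, scale_factor=4):
    # Compact restatement of the evaluation loop, so this cell runs alone.
    sr_model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for batch in dataloader:
            lr = batch.to(device)
            rec = downsample_fn(sr_model(lr), scale_factor)
            total += F.l1_loss(rec, lr).item() * lr.size(0)
            count += lr.size(0)
    return total / count

device = torch.device("cpu")
toy_sr = nn.Upsample(scale_factor=4, mode="bicubic", align_corners=False)  # stand-in SR model
dataset = LROnlyDataset()
loader = DataLoader(dataset, batch_size=4)  # batches of 4, 4 and 2 images

avg = mean_cycle_error(toy_sr, loader, device, toy_downsample)

# Reference value: the same error computed over the whole dataset at once.
with torch.no_grad():
    reference = F.l1_loss(toy_downsample(toy_sr(dataset.images)), dataset.images).item()
print(f"batched: {avg:.6f}  full: {reference:.6f}")
```

The batched average should match the full-dataset value even though the last batch is smaller, which is why each batch mean is weighted by its size. If your dataloader yields `(lr, hr)` pairs instead, the batch would need to be unpacked before the call.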
