# Fast Gradient Sign Method (FGSM) in AI Image Compression
FGSM is a *single-step*, computationally efficient adversarial attack.  
1. **Mathematical Formulation**
The core idea is to push the input in the direction that **maximizes the loss function**. The update rule, written out for an MSE loss in the second equality (with $N$ the number of elements in $x$), is:  
$$x_{adv} = x+\epsilon\, \text{sign} (\nabla_x J(\theta, x, y)) = x+ \epsilon\, \text{sign} \left(\nabla_x \tfrac{1}{N}\|x- y\|_F^2\right) $$
where 
- $x$: The original, clean input image (a tensor of pixel values).  
- $y=\hat{x}$: The reconstruction (decompressed version) of image $x$, i.e. the `x_hat` entry of the model output $f(x)$, where $f$ is the compression model.  
- $J(\theta, x, y)$: The loss function of the model (e.g., the MSE loss).

- $\theta$: The parameters (weights) of the model. Crucially, these are held constant: we are not training the model, we are attacking it.

- $\nabla_x J(\cdots)$: The gradient of the loss function with respect to the input image $x$. It points in the direction in input space along which the loss increases fastest.

- $\text{sign}(\cdots)$: This function maps each component of the gradient to +1, -1, or 0. Only the direction of steepest ascent matters here, not its magnitude. With the sign, every pixel with a nonzero gradient moves by exactly $+\epsilon$ or $-\epsilon$, which maximizes the first-order increase of the loss under the $L_\infty$ constraint $\|x_{adv}-x\|_\infty \le \epsilon$.

- $\epsilon$ (epsilon): A small scalar that acts as the attack's "budget": the maximum amount any single pixel is allowed to change. With pixel values on a $[0,1]$ scale, a small budget such as $8/255$ (the value used in the code below) keeps the perturbation hard to see.
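
A short first-order argument shows why the sign is the natural choice under an $L_\infty$ budget. This is only a sketch: it assumes the loss is differentiable at $x$ and that $\epsilon$ is small enough for the linear approximation to be reasonable. The symbols $\delta$ (the perturbation) and $g$ (the input gradient) are notation introduced here.

$$
\begin{aligned}
J(\theta, x+\delta, y) &\approx J(\theta, x, y) + g^\top \delta, \qquad g = \nabla_x J(\theta, x, y),\\
\max_{\|\delta\|_\infty \le \epsilon} g^\top \delta &= \max_{|\delta_i| \le \epsilon} \sum_i g_i \delta_i = \epsilon \sum_i |g_i| = \epsilon \|g\|_1,\\
\delta^\star &= \epsilon\, \text{sign}(g).
\end{aligned}
$$

Each term $g_i \delta_i$ is maximized independently by pushing $\delta_i$ to the boundary $\pm\epsilon$ with the same sign as $g_i$, which gives the FGSM update $x_{adv} = x + \delta^\star$.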

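The update rule can be tried on a small example without downloading any pretrained weights. The sketch below is not the attack on `cheng2020-anchor`: `ToyCodec` is a hypothetical, randomly initialized stand-in for a learned codec, and `fgsm_step` is an illustrative helper name. It only assumes that the model returns a dict with an `"x_hat"` entry, as the CompressAI models used in this notebook do.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyCodec(nn.Module):
    """Hypothetical stand-in for a learned codec: downsample, then upsample."""

    def __init__(self):
        super().__init__()
        self.encode = nn.Conv2d(3, 4, kernel_size=3, stride=2, padding=1)
        self.decode = nn.ConvTranspose2d(4, 3, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        # Mimic the dict output format used in this notebook ("x_hat" key)
        return {"x_hat": self.decode(self.encode(x))}


def fgsm_step(model, x, epsilon):
    """One FGSM step that ascends the reconstruction MSE of `model` at `x`."""
    x = x.clone().detach().requires_grad_(True)   # leaf tensor, caller's x untouched
    loss = F.mse_loss(model(x)["x_hat"], x)       # J(theta, x, y) with y = x_hat
    (grad,) = torch.autograd.grad(loss, x)        # gradient w.r.t. the input only
    x_adv = x + epsilon * grad.sign()             # x + eps * sign(grad_x J)
    return torch.clamp(x_adv, 0.0, 1.0).detach()  # keep a valid image


torch.manual_seed(0)
toy_model = ToyCodec().eval()
for p in toy_model.parameters():
    p.requires_grad_(False)                       # weights are held constant

x_clean = torch.rand(1, 3, 16, 16) * 0.6 + 0.2    # random "image" in [0.2, 0.8]
eps = 2 / 255
x_adv = fgsm_step(toy_model, x_clean, eps)

with torch.no_grad():
    mse_clean = F.mse_loss(toy_model(x_clean)["x_hat"], x_clean).item()
    mse_adv = F.mse_loss(toy_model(x_adv)["x_hat"], x_adv).item()
max_change = (x_adv - x_clean).abs().max().item()

print(f"MSE clean: {mse_clean:.6f}, MSE adversarial: {mse_adv:.6f}")
print(f"Largest pixel change: {max_change:.6f} (budget {eps:.6f})")
```

Because the toy model is linear, the MSE is a convex quadratic in the input, so the single step is guaranteed to raise it. A real codec is nonlinear, and there the increase is only a first-order expectation.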
In [9]:
# Environment setup and imports
import os
from pathlib import Path
import math
import numpy as np
from PIL import Image
import torch
import torch.nn.functional as F
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import torch.nn as nn 
# compressai
try:
    # Registry of pretrained AI image compression models, indexed by name
    from compressai.zoo import models
except ImportError as e:
    raise SystemExit("compressai is required. Install via: pip install compressai") from e

# device
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", DEVICE)

# Paths
NB_DIR = Path(__file__).parent if "__file__" in globals() else Path.cwd()
PROJECT_ROOT = NB_DIR.parent
DATA_DIR = (PROJECT_ROOT / "kodim").resolve()
OUTPUT_DIR = (PROJECT_ROOT / "outputs_fgsm").resolve()
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
print("Data dir:", DATA_DIR)
print("Output dir:", OUTPUT_DIR)


Using device: cpu
Data dir: /mnt/d/github/Adversarial_Attack_Image_compression/kodim
Output dir: /mnt/d/github/Adversarial_Attack_Image_compression/outputs_fgsm


## Load the AI image compression model: CompressAI models
- Model repository: https://github.com/InterDigitalInc/CompressAI
- Model name: `cheng2020-anchor` with quality 6. Original paper:
```bibtex
@inproceedings{cheng2020image,
    title={Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules},
    author={Cheng, Zhengxue and Sun, Heming and Takeuchi, Masaru and Katto, Jiro},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2020}
}
```
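
CompressAI's own evaluation script pads inputs so that height and width are multiples of 64 before calling a model, since the networks downsample by strides of 2 several times. The Kodak images used here (768×512 or 512×768) already satisfy this, but other images may not. Below is a minimal sketch of such padding, assuming a multiple of 64 is sufficient for `cheng2020-anchor`. `pad_to_multiple` and `crop_to` are hypothetical helper names, not CompressAI functions.

```python
import torch
import torch.nn.functional as F


def pad_to_multiple(x, multiple=64):
    """Zero-pad a (B, C, H, W) tensor so that H and W are multiples of `multiple`."""
    h, w = x.shape[-2:]
    new_h = (h + multiple - 1) // multiple * multiple
    new_w = (w + multiple - 1) // multiple * multiple
    left = (new_w - w) // 2
    right = new_w - w - left
    top = (new_h - h) // 2
    bottom = new_h - h - top
    # F.pad takes the padding of the last dimension first: (left, right, top, bottom)
    x_padded = F.pad(x, (left, right, top, bottom), mode="constant", value=0)
    return x_padded, (left, right, top, bottom)


def crop_to(x_padded, padding):
    """Undo `pad_to_multiple` using the padding tuple it returned."""
    left, right, top, bottom = padding
    h, w = x_padded.shape[-2:]
    return x_padded[..., top:h - bottom, left:w - right]


x = torch.rand(1, 3, 100, 130)
x_padded, padding = pad_to_multiple(x)
print(x_padded.shape)            # torch.Size([1, 3, 128, 192])
restored = crop_to(x_padded, padding)
```

If padding is used together with an attack, the reconstruction would be cropped with `crop_to` before computing the loss, so that the padded border does not contribute to the distortion.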

In [13]:

model_name = "cheng2020-anchor"

# Computational device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Dynamically retrieve the model class
quality = 6
model_class = models[model_name]

# Clear GPU memory
torch.cuda.empty_cache()
# Set compression-decompression quality for AI image compression 
model = model_class(quality=quality, pretrained=True).to(device)
for param in model.parameters():
    param.requires_grad = False


# Load and preprocess the image
def load_image(image_path):
    """Load image and convert to tensor"""
    image = Image.open(image_path).convert('RGB')
    
    transform = transforms.Compose([
        transforms.ToTensor(),
    ])
    
    image_tensor = transform(image).unsqueeze(0).to(device)  # Add batch dimension
    return image_tensor, image

# FGSM Attack
def fgsm_attack(model, x, epsilon, criterion, y_comp=None):
    """
    Perform FGSM attack on the compression model
    
    Args:
        model: The compression model
        x: Original input image tensor
        epsilon: Attack strength (perturbation budget)
        criterion: Loss function
        y_comp: Optional target for the compressed representation (currently unused)
    """
    # Set model to evaluation mode but enable gradients for input
    model.eval()
    
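    # Enable gradient computation for input
    x.requires_grad = True

    # Forward pass through the compression model
    x_hat = model(x)["x_hat"]

    # Reconstruction error (e.g. MSE). FGSM ascends this loss, so it is used
    # with a positive sign: x_adv = x + epsilon * sign(grad_x loss)
    loss = criterion(x_hat, x)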
    # # Compression models typically return: (output, likelihoods) or similar
    # compressed_output = model.compress(x)
    
    # # If we have a target for the compressed representation, use it
    # # Otherwise, we'll maximize the distortion (minimize quality)
    # if y_comp is None:
    #     # For untargeted attack: maximize the distortion/loss
    #     # We can use the bitrate or reconstruction error as loss
    #     if isinstance(compressed_output, tuple):
    #         # Handle tuple output (common in compressai)
    #         output, likelihoods = compressed_output
    #         # Use negative bits per pixel as loss to maximize bitrate
    #         loss = -torch.log(likelihoods).sum() / (x.shape[2] * x.shape[3] * x.shape[0])
    #     else:
    #         # Fallback: use MSE between input and decompressed output
    #         decompressed = model.decompress(compressed_output)
    #         loss = -criterion(decompressed['x_hat'], x)  # Negative to maximize error
    # else:
    #     # Targeted attack: make compression match y_comp
    #     if isinstance(compressed_output, tuple):
    #         output, likelihoods = compressed_output
    #         loss = criterion(output, y_comp)
    #     else:
    #         loss = criterion(compressed_output, y_comp)
    
    # Compute gradients
    model.zero_grad()
    loss.backward()
    
    # Get gradient sign
    gradient_sign = x.grad.data.sign()
    
    # Create adversarial example
    x_adv = x + epsilon * gradient_sign
    
    # Clip to valid image range [0, 1]
    x_adv = torch.clamp(x_adv, 0, 1)
    
    # Detach the result, then reset gradient state on the input so that
    # repeated calls do not accumulate gradients
    x_adv = x_adv.detach()
    x.grad = None
    x.requires_grad = False
    
    return x_adv

# PGD Attack
def pgd_attack(model, x, epsilon, alpha, num_iter, criterion, y_comp=None, random_start=True):
    """
    Perform PGD attack on the compression model
    
    Args:
        model: The compression model
        x: Original input image tensor
        epsilon: Maximum perturbation budget
        alpha: Step size for each iteration
        num_iter: Number of PGD iterations
        criterion: Loss function
        y_comp: Optional target for compressed representation (currently unused)
        random_start: Whether to start from random point within epsilon-ball
    """
    model.eval()
    
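    # Start from the original or from a random point within the epsilon-ball.
    # Detach so that x_adv is a leaf tensor whose .grad gets populated below.
    if random_start:
        x_adv = x.detach() + torch.empty_like(x).uniform_(-epsilon, epsilon)
        x_adv = torch.clamp(x_adv, 0, 1)
    else:
        x_adv = x.clone().detach()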
    
    for i in range(num_iter):
        x_adv.requires_grad = True
        
        # Forward pass
        x_hat = model(x_adv)["x_hat"]
        # Reconstruction error against the clean image; the update below
        # ascends this loss, so it is used with a positive sign
        loss = criterion(x_hat, x)
        # # Calculate loss
        # if y_comp is None:
        #     # Untargeted attack: maximize distortion
        #     if isinstance(compressed_output, tuple):
        #         output, likelihoods = compressed_output
        #         loss = -torch.log(likelihoods).sum() / (x.shape[2] * x.shape[3] * x.shape[0])
        #     else:
        #         decompressed = model.decompress(compressed_output)
        #         loss = -criterion(decompressed['x_hat'], x)
        # else:
        #     # Targeted attack
        #     if isinstance(compressed_output, tuple):
        #         output, likelihoods = compressed_output
        #         loss = criterion(output, y_comp)
        #     else:
        #         loss = criterion(compressed_output, y_comp)
        
        # Compute gradients
        model.zero_grad()
        loss.backward()
        
        # Get gradient sign
        gradient_sign = x_adv.grad.data.sign()
        
        # Update adversarial example
        x_adv = x_adv + alpha * gradient_sign
        
        # Project back to epsilon-ball around original image
        delta = torch.clamp(x_adv - x, min=-epsilon, max=epsilon)
        x_adv = torch.clamp(x + delta, 0, 1)
        
        x_adv = x_adv.detach()
    
    return x_adv

# Visualization function
def visualize_attack(original, adversarial, original_compressed, adversarial_compressed, epsilon):
    """Visualize original and adversarial images with their compressed versions"""
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    
    # Convert tensors to numpy for plotting
    original_np = original.detach().squeeze(0).cpu().permute(1, 2, 0).numpy()
    adversarial_np = adversarial.detach().squeeze(0).cpu().permute(1, 2, 0).numpy()
    
    # Plot original and adversarial
    axes[0, 0].imshow(original_np)
    axes[0, 0].set_title('Original Image')
    axes[0, 0].axis('off')
    
    axes[0, 1].imshow(adversarial_np)
    axes[0, 1].set_title(f'Adversarial Image (ε={epsilon})')
    axes[0, 1].axis('off')
    
    # Plot compressed versions
    if 'x_hat' in original_compressed:
        orig_comp_np = original_compressed['x_hat'].squeeze(0).cpu().permute(1, 2, 0).numpy()
        adv_comp_np = adversarial_compressed['x_hat'].squeeze(0).cpu().permute(1, 2, 0).numpy()
        
        axes[1, 0].imshow(np.clip(orig_comp_np, 0, 1))
        axes[1, 0].set_title('Compressed Original')
        axes[1, 0].axis('off')
        
        axes[1, 1].imshow(np.clip(adv_comp_np, 0, 1))
        axes[1, 1].set_title('Compressed Adversarial')
        axes[1, 1].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    # Calculate and print metrics
    perturbation = torch.abs(adversarial - original)
    max_perturbation = perturbation.max().item()
    avg_perturbation = perturbation.mean().item()
    
    print(f"Attack parameters: ε={epsilon}")
    print(f"Maximum perturbation: {max_perturbation:.4f}")
    print(f"Average perturbation: {avg_perturbation:.4f}")

# Main execution
def main():
    # Load your model (you already have this)
    from compressai.zoo import models
    model_name = "cheng2020-anchor"
    quality = 6
    model_class = models[model_name]
    
    torch.cuda.empty_cache()
    model = model_class(quality=quality, pretrained=True).to(device)
    
    # Freeze model parameters
    for param in model.parameters():
        param.requires_grad = False
    
    # Load image
    image_path = "../kodim/kodim01.png"
    x_original, pil_image = load_image(image_path)
    
    # Define loss function
    criterion = nn.MSELoss()
    
    # Attack parameters
    epsilon = 8/255.0  # Common attack strength
    alpha = 2/255.0    # PGD step size
    num_iter = 10      # PGD iterations
    
    print("Performing FGSM attack...")
    # FGSM Attack
    x_adv_fgsm = fgsm_attack(model, x_original, epsilon, criterion)
    
    print("Performing PGD attack...")
    # PGD Attack
    x_adv_pgd = pgd_attack(model, x_original, epsilon, alpha, num_iter, criterion)
    
    # Compress the images to see the effect
    print("Compressing images...")
    with torch.no_grad():
        # Original compression
        comp_original = model.compress(x_original)
        decomp_original = model.decompress(comp_original['strings'], comp_original['shape'])
        
        # FGSM adversarial compression
        comp_fgsm = model.compress(x_adv_fgsm)
        decomp_fgsm = model.decompress(comp_fgsm['strings'], comp_fgsm['shape'])
        
        # PGD adversarial compression
        comp_pgd = model.compress(x_adv_pgd)
        decomp_pgd = model.decompress(comp_pgd['strings'], comp_pgd['shape'])
    
    # Visualize results
    print("FGSM Attack Results:")
    visualize_attack(x_original, x_adv_fgsm, decomp_original, decomp_fgsm, epsilon)
    
    print("PGD Attack Results:")
    visualize_attack(x_original, x_adv_pgd, decomp_original, decomp_pgd, epsilon)
    
    # Calculate quantitative metrics
    mse_original = criterion(decomp_original['x_hat'], x_original).item()
    mse_fgsm = criterion(decomp_fgsm['x_hat'], x_original).item()
    mse_pgd = criterion(decomp_pgd['x_hat'], x_original).item()
    
    print(f"\nReconstruction MSE:")
    print(f"Original: {mse_original:.6f}")
    print(f"After FGSM: {mse_fgsm:.6f}")
    print(f"After PGD: {mse_pgd:.6f}")
    print(f"FGSM Increase: {mse_fgsm/mse_original:.2f}x")
    print(f"PGD Increase: {mse_pgd/mse_original:.2f}x")

if __name__ == "__main__":
    main()


Performing FGSM attack...
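
The MSE values printed by `main()` are often easier to compare as PSNR, and the size of the compressed strings gives the bitrate. Below is a minimal sketch of both. It assumes pixel values in [0, 1] and a single-image batch, and it assumes `strings` has the nested layout returned by `model.compress` in the code above (a list of per-latent lists, each holding one `bytes` object per image). `psnr_from_mse` and `bits_per_pixel` are hypothetical helper names, and the numbers in the example are made up rather than results from this notebook.

```python
import math


def psnr_from_mse(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_value ** 2 / mse)


def bits_per_pixel(strings, height, width):
    """Total compressed size in bits divided by the pixel count of one image."""
    total_bytes = sum(len(s) for latent in strings for s in latent)
    return total_bytes * 8 / (height * width)


# Example with made-up numbers (not results from the notebook)
example_psnr = psnr_from_mse(0.001)                 # about 30 dB
fake_strings = [[b"\x00" * 3000], [b"\x00" * 72]]   # stand-in for comp["strings"]
example_bpp = bits_per_pixel(fake_strings, 512, 768)

print(f"PSNR: {example_psnr:.2f} dB")
print(f"bpp:  {example_bpp:.4f}")                   # 0.0625
```

In `main()`, these could be called as `psnr_from_mse(mse_fgsm)` and `bits_per_pixel(comp_fgsm['strings'], x_original.shape[2], x_original.shape[3])` to report how the attack shifts both distortion and rate.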

