# Interpretability in Convolutional Neural Networks

Interpretability in Convolutional Neural Networks (CNNs) is essential for understanding the decisions made by a model, detecting potential biases, and improving robustness against perturbations or distributional shifts. Because of their depth and the combination of convolutions with nonlinear activation functions, CNNs behave as highly complex systems whose internal mechanisms are difficult to inspect directly. For this reason, specific techniques have been developed to visualize which regions or features of the input contribute most to a prediction.

This document describes and implements several of the most widely used methods for interpreting CNNs: saliency maps, Grad-CAM, Guided Grad-CAM (based on Guided Backpropagation), occlusion analysis, and Integrated Gradients. A complete, functional implementation on CIFAR-10 using an adapted ResNet-18 model is then presented, organized linearly for step-by-step execution and easily convertible into a Jupyter Notebook.

## Saliency Maps

Saliency maps rely on computing the gradient of the model output with respect to each input pixel. Intuitively, if a small variation in a pixel produces a significant change in the output associated with a specific class, that pixel is considered important for the decision. The absolute value of this gradient is used as a local relevance measure.

Given a model $f(\cdot)$ and an input image $x$, the saliency map for a class $c$ is defined as

$$S = \left| \frac{\partial f_c(x)}{\partial x} \right|.$$

When the network processes inputs with multiple channels (for example, RGB images), it is common to aggregate the channel-wise information into a two-dimensional map. A simple strategy is to take the maximum over the channel dimension:

$$S_{i,j} = \max_{k} \left| \frac{\partial f_c(x)}{\partial x_{k,i,j}} \right|.$$

This map provides, for each spatial position $(i,j)$, a measure of how sensitive the class score $f_c$ is to perturbations of the corresponding pixels. Saliency maps are conceptually simple and computationally efficient, since they need only one forward and one backward pass; however, the resulting visualizations are often noisy and do not always align clearly with semantically interpretable regions of the image.

The code below initializes the environment and defines the basic configuration. The class that generates gradient-based saliency maps, together with a visualization function that overlays the resulting map on the original image, is implemented once the data and the model have been prepared.
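
As a dependency-free illustration of the two formulas above, the following sketch estimates the gradient by central finite differences on a tiny two-channel, $2 \times 2$ "image" and then applies the channel-wise maximum. The score function `toy_score`, its `WEIGHTS`, and the helper names are assumptions made for this example only; the implementation in this document relies on automatic differentiation rather than finite differences.

```python
# Toy illustration of S_{i,j} = max_k |d f_c / d x_{k,i,j}|.
# toy_score, WEIGHTS and the helper names are hypothetical.

WEIGHTS = [
    [[0.5, -2.0], [0.0, 1.0]],   # channel 0
    [[-1.5, 0.25], [3.0, 0.0]],  # channel 1
]


def toy_score(x):
    """Linear class score: sum of weight * pixel over channels and positions."""
    return sum(
        WEIGHTS[k][i][j] * x[k][i][j]
        for k in range(2) for i in range(2) for j in range(2)
    )


def finite_difference_gradient(f, x, eps=1e-4):
    """Central-difference estimate of df/dx for a nested-list input [C][H][W]."""
    grad = [[[0.0] * len(row) for row in channel] for channel in x]
    for k, channel in enumerate(x):
        for i, row in enumerate(channel):
            for j in range(len(row)):
                original = x[k][i][j]
                x[k][i][j] = original + eps
                f_plus = f(x)
                x[k][i][j] = original - eps
                f_minus = f(x)
                x[k][i][j] = original  # restore the pixel
                grad[k][i][j] = (f_plus - f_minus) / (2 * eps)
    return grad


def saliency_from_gradient(grad):
    """Maximum of |gradient| over channels, giving an [H][W] map."""
    channels, height, width = len(grad), len(grad[0]), len(grad[0][0])
    return [
        [max(abs(grad[k][i][j]) for k in range(channels)) for j in range(width)]
        for i in range(height)
    ]


image = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
]
saliency = saliency_from_gradient(finite_difference_gradient(toy_score, image))
print([[round(v, 3) for v in row] for row in saliency])
```

Because the toy score is linear, its gradient equals the weights, so each entry of the map is simply the largest absolute weight at that position across the two channels.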

In [None]:
"""
Interpretability in Convolutional Neural Networks
Complete functional implementation with CIFAR-10

Implemented techniques:
1. Saliency Maps
2. Grad-CAM
3. Guided Grad-CAM (based on Guided Backpropagation)
4. Occlusion Analysis
5. Integrated Gradients
"""

# IMPORTS
import warnings

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models
from scipy.ndimage import zoom
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

warnings.filterwarnings("ignore")

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}\n")

# GLOBAL CONFIGURATION
CONFIG = {
    "device": "cuda" if torch.cuda.is_available() else "cpu",
    "batch_size": 32,
}

CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck"
]

print(f"Configuration: {CONFIG}\n")

## CIFAR-10 Data Preparation

To illustrate the interpretability techniques, the CIFAR-10 dataset is used. CIFAR-10 contains color images of size $32 \times 32$ belonging to ten different classes. The following function downloads and prepares the test set, applying the per-channel mean and standard deviation normalization that is widely used for this dataset.

In [None]:
def prepare_cifar10_data():
    """
    Downloads and prepares the CIFAR-10 test set
    with standard normalization.
    """
    print("Preparing CIFAR-10...")

    # Per-channel mean and standard deviation of CIFAR-10
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(
            (0.4914, 0.4822, 0.4465),
            (0.2470, 0.2435, 0.2616)
        )
    ])

    test_dataset = datasets.CIFAR10(
        root="./data",
        train=False,
        download=True,
        transform=transform
    )

    test_loader = DataLoader(
        test_dataset,
        batch_size=CONFIG["batch_size"],
        shuffle=False,
        num_workers=2
    )

    print(f"Test: {len(test_dataset)} images\n")
    return test_loader, test_dataset

## ResNet-18 Model Adapted to CIFAR-10

A ResNet-18 model pretrained on ImageNet is used as the base and adapted to the characteristics of CIFAR-10. The adaptation consists of replacing the first convolutional layer with a $3 \times 3$, stride-1 convolution and removing the initial max-pooling, so that $32 \times 32$ images are not downsampled too early, and of adjusting the final fully connected layer to the ten classes of CIFAR-10.

Although most of the network keeps its pretrained ImageNet weights, the replaced first convolution and the final layer are initialized randomly, so the performance will be poor without fine-tuning. However, this limitation does not affect the main purpose of the code, which is to illustrate interpretability techniques in a functional manner.
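
The effect of this adaptation on spatial resolution can be checked with the usual convolution output-size formula, $\lfloor (n + 2p - k)/s \rfloor + 1$. The sketch below assumes the standard torchvision ResNet-18 layout (a $7 \times 7$ stride-2 stem with padding 3, a $3 \times 3$ stride-2 max-pool with padding 1, and four stages of which the last three halve the resolution); the helper names are hypothetical and only serve this illustration.

```python
def conv_output_size(n, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


def resnet18_stage_sizes(input_size, adapted):
    """Feature-map side length after the stem and after each residual stage."""
    sizes = {}
    if adapted:
        # 3x3 stride-1 stem; the max-pool is replaced by an identity
        n = conv_output_size(input_size, kernel=3, stride=1, padding=1)
    else:
        # Original ImageNet stem: 7x7 stride-2 conv, then 3x3 stride-2 max-pool
        n = conv_output_size(input_size, kernel=7, stride=2, padding=3)
        n = conv_output_size(n, kernel=3, stride=2, padding=1)
    sizes["stem"] = n
    for name, stride in [("layer1", 1), ("layer2", 2), ("layer3", 2), ("layer4", 2)]:
        # The first 3x3 conv of each stage carries the stride
        n = conv_output_size(n, kernel=3, stride=stride, padding=1)
        sizes[name] = n
    return sizes


original = resnet18_stage_sizes(32, adapted=False)
adapted = resnet18_stage_sizes(32, adapted=True)
print("original:", original)
print("adapted: ", adapted)
```

Under these assumptions, the unmodified network reduces a $32 \times 32$ image to a single spatial position at `layer4`, whereas the adapted network keeps a $4 \times 4$ grid there, which is what makes a spatial heatmap at that layer meaningful.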

In [None]:
def load_pretrained_model():
    """
    Loads a ResNet-18 pretrained on ImageNet and adapts it to CIFAR-10.
    """
    print("Loading pretrained model...")

    # `weights=` replaces the deprecated `pretrained=True` (torchvision >= 0.13)
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

    # Adapt the stem to 32x32 images: 3x3 stride-1 convolution, no max-pooling.
    # The new conv1 is randomly initialized (the pretrained 7x7 kernel is discarded).
    model.conv1 = nn.Conv2d(
        3, 64,
        kernel_size=3,
        stride=1,
        padding=1,
        bias=False
    )
    model.maxpool = nn.Identity()

    # Adapt final layer for 10 CIFAR-10 classes (also randomly initialized)
    model.fc = nn.Linear(model.fc.in_features, 10)

    model = model.to(CONFIG["device"])
    model.eval()

    print(f"Model loaded on {CONFIG['device']}\n")
    return model

## Saliency Maps: Implementation and Visualization

The following implementation computes saliency maps via gradients and includes a visualization routine that facilitates the direct analysis of which image regions contribute most to the model’s prediction.

In [None]:
print("=" * 70)
print("1. SALIENCY MAPS")
print("=" * 70)
print("""
Saliency maps compute the gradient of the output with respect
to each image pixel, indicating which regions have the largest
influence on the prediction.

Advantages:
- Simple and efficient computation.
- Shows the direct influence of pixels.

Limitations:
- Visualizations are often noisy.
- Do not always align with semantically clear regions.
""")


class SaliencyMapGenerator:
    """Generates saliency maps using gradients."""

    def __init__(self, model: nn.Module, device: str = "cuda") -> None:
        self.model = model.to(device)
        self.model.eval()
        self.device = device

    def generate_saliency(self, image: torch.Tensor, target_class: int | None = None):
        """
        Computes the saliency map for a single image.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            target_class: Target class index; if None, the model prediction is used.

        Returns:
            2D saliency map (numpy array).
        """
        # Work on a detached leaf copy so that image.grad is populated
        image = image.detach().clone().to(self.device).requires_grad_(True)

        output = self.model(image)
        if target_class is None:
            target_class = output.argmax(dim=1).item()

        self.model.zero_grad()
        output[0, target_class].backward()

        saliency = image.grad.detach().abs()
        # Channel aggregation: maximum along the channel axis
        saliency, _ = torch.max(saliency, dim=1)
        return saliency.squeeze().cpu().numpy()

    def visualize_saliency(
        self,
        image: torch.Tensor,
        original_image: np.ndarray,
        target_class: int | None = None
    ) -> None:
        """
        Visualizes the saliency map and its overlay on the original image.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            original_image: Denormalized image [H, W, 3] in [0, 1].
            target_class: Target class; if None, the model prediction is used.
        """
        saliency = self.generate_saliency(image, target_class)
        # Normalize to [0, 1] for visualization
        saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)

        fig, axes = plt.subplots(1, 3, figsize=(15, 5))

        axes[0].imshow(original_image)
        axes[0].set_title("Original Image", fontsize=12, fontweight="bold")
        axes[0].axis("off")

        axes[1].imshow(saliency, cmap="hot")
        axes[1].set_title("Saliency Map", fontsize=12, fontweight="bold")
        axes[1].axis("off")

        axes[2].imshow(original_image)
        axes[2].imshow(saliency, cmap="hot", alpha=0.5)
        axes[2].set_title("Overlay", fontsize=12, fontweight="bold")
        axes[2].axis("off")

        plt.tight_layout()
        plt.show()

## Grad-CAM (Gradient-weighted Class Activation Mapping)

Grad-CAM generates heatmaps that localize the regions of an image that contribute most strongly to the prediction for a specific class. Instead of operating directly on the pixels, Grad-CAM works on the activation maps of an internal convolutional layer, which tends to produce spatial relevance maps that are more structured and semantically interpretable.

Let $A^k \in \mathbb{R}^{H \times W}$ denote the activation map associated with channel $k$ of a selected convolutional layer. For a class $c$, importance coefficients are computed by performing a global average pooling of the gradients over the spatial dimensions:

$$\alpha_k = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \frac{\partial f_c}{\partial A_{ij}^k}.$$

Using these coefficients, a class-specific weighted activation map is constructed as

$$L_c^{\text{Grad-CAM}} = \mathrm{ReLU}\left( \sum_k \alpha_k A^k \right).$$

The ReLU function is applied to retain only positive contributions, under the assumption that activations that increase the class score are those to be highlighted. The spatial resolution of the Grad-CAM map is limited by the size of the activation maps of the chosen layer; therefore, the resulting map is often interpolated to match the size of the original image.

The implementation below uses hooks to capture activations and gradients at the target layer and generates the corresponding Grad-CAM map.
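
Before turning to the hook-based implementation, the two formulas can be traced by hand on a tiny example. The sketch below uses two hypothetical $2 \times 2$ activation maps and made-up gradients (all values and the function name `grad_cam` are assumptions for illustration) and computes the coefficients $\alpha_k$ and the ReLU-rectified map with plain Python lists.

```python
def grad_cam(activations, gradients):
    """
    Toy Grad-CAM on nested lists of shape [K][H][W].

    Returns the per-channel weights alpha_k and the rectified map.
    """
    height, width = len(activations[0]), len(activations[0][0])

    # alpha_k: global average of the gradients over the spatial positions
    alphas = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]

    # ReLU of the alpha-weighted sum of activation maps
    cam = [
        [
            max(0.0, sum(a * act[i][j] for a, act in zip(alphas, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    return alphas, cam


# Hypothetical activations and gradients for two channels
activations = [
    [[1.0, 0.0], [2.0, 1.0]],  # channel 0
    [[0.0, 3.0], [1.0, 0.0]],  # channel 1
]
gradients = [
    [[0.25, 0.25], [1.0, 0.5]],      # averages to 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],    # averages to -1.0
]

alphas, cam = grad_cam(activations, gradients)
print("alphas:", alphas)
print("cam:   ", cam)
```

In this example channel 1 receives a negative weight, so its strong activation at position $(0, 1)$ pushes the weighted sum below zero and the ReLU removes it; only positions where the positively weighted channel dominates remain in the map.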

In [None]:
print("=" * 70)
print("2. GRAD-CAM (Gradient-weighted Class Activation Mapping)")
print("=" * 70)
print("""
Grad-CAM generates heatmaps that highlight the regions of the image
that are most important for a specific class, using gradients with
respect to an internal convolutional layer.

Advantages:
- More interpretable maps than basic saliency maps.
- Localizes relevant object regions.

Limitations:
- Depends on the choice of the target layer.
- Resolution is limited by the resolution of that layer.
""")


class GradCAM:
    """Grad-CAM implementation for a target layer of a CNN."""

    def __init__(self, model: nn.Module, target_layer: str, device: str = "cuda") -> None:
        self.model = model.to(device)
        self.target_layer = target_layer
        self.device = device
        self.gradients: torch.Tensor | None = None
        self.activations: torch.Tensor | None = None
        self._register_hooks()

    def _register_hooks(self) -> None:
        """
        Registers hooks on the target layer to capture activations and gradients
        during forward and backward passes.
        """
        def forward_hook(module, inputs, output):
            self.activations = output.detach()

        def backward_hook(module, grad_input, grad_output):
            self.gradients = grad_output[0].detach()

        for name, module in self.model.named_modules():
            if name == self.target_layer:
                module.register_forward_hook(forward_hook)
                module.register_full_backward_hook(backward_hook)
                return

        # Fail early instead of crashing later with gradients set to None
        raise ValueError(f"Layer '{self.target_layer}' not found in the model")

    def generate_cam(self, image: torch.Tensor, target_class: int | None = None):
        """
        Generates the Grad-CAM map for an image and a target class.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            target_class: Target class; if None, the model prediction is used.

        Returns:
            cam: Normalized 2D Grad-CAM map (numpy array).
            target_class: Class used for the explanation.
        """
        self.model.eval()
        image = image.to(self.device)

        output = self.model(image)
        if target_class is None:
            target_class = output.argmax(dim=1).item()

        self.model.zero_grad()
        output[0, target_class].backward()

        # Weights: global average of gradients over H x W
        weights = torch.mean(self.gradients, dim=(2, 3), keepdim=True)
        cam = torch.sum(weights * self.activations, dim=1, keepdim=True)
        cam = torch.relu(cam)

        cam = cam.squeeze().cpu().numpy()
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
        return cam, target_class

    def visualize_cam(
        self,
        image: torch.Tensor,
        original_image: np.ndarray,
        target_class: int | None = None
    ):
        """
        Visualizes the Grad-CAM map and its overlay on the original image.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            original_image: Denormalized image [H, W, 3] in [0, 1].
            target_class: Target class; if None, the model prediction is used.
        """
        cam, pred_class = self.generate_cam(image, target_class)

        # Resize the map to the original image size via interpolation
        cam_resized = zoom(
            cam,
            (
                original_image.shape[0] / cam.shape[0],
                original_image.shape[1] / cam.shape[1]
            )
        )

        fig, axes = plt.subplots(1, 3, figsize=(15, 5))

        axes[0].imshow(original_image)
        axes[0].set_title("Original Image", fontsize=12, fontweight="bold")
        axes[0].axis("off")

        axes[1].imshow(cam_resized, cmap="jet")
        axes[1].set_title(
            f"Grad-CAM (Class: {CIFAR10_CLASSES[pred_class]})",
            fontsize=12, fontweight="bold"
        )
        axes[1].axis("off")

        axes[2].imshow(original_image)
        axes[2].imshow(cam_resized, cmap="jet", alpha=0.5)
        axes[2].set_title("Overlay", fontsize=12, fontweight="bold")
        axes[2].axis("off")

        plt.tight_layout()
        plt.show()

        return cam_resized, pred_class

## Guided Backpropagation and Guided Grad-CAM

Guided Backpropagation modifies the gradient flow through ReLU units: a gradient is propagated only where the forward activation is positive and the incoming gradient is also positive, and it is set to zero everywhere else. This filtering yields sharper gradient maps that focus on features considered relevant.

Guided Grad-CAM combines the global localization capability of Grad-CAM with the pixel-level detail of Guided Backpropagation. The usual procedure consists of three sequential steps: first, a Grad-CAM map is computed for the target class; second, guided gradients with respect to the input image are obtained; finally, the Grad-CAM map is upsampled to the input resolution and multiplied elementwise by the guided gradients. The result is a high-resolution visualization in which edges and fine details inside the Grad-CAM-relevant regions are emphasized.

The following code implements Guided Backpropagation. Its output can be combined with the `GradCAM` class to build Guided Grad-CAM by multiplying the resized Grad-CAM map by the guided gradients.
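
The ReLU rule and the final multiplication step can be illustrated without any framework. The sketch below is a simplification under stated assumptions: it works on plain lists, uses a single-channel gradient map, and upsamples the coarse map by nearest-neighbor repetition rather than interpolation. All function names and numeric values are hypothetical.

```python
def relu_backward(forward_input, grad_output):
    """Standard ReLU backward: keep the gradient where the forward input was positive."""
    return [g if x > 0 else 0.0 for x, g in zip(forward_input, grad_output)]


def guided_relu_backward(forward_input, grad_output):
    """Guided rule: additionally discard negative incoming gradients."""
    return [g if (x > 0 and g > 0) else 0.0 for x, g in zip(forward_input, grad_output)]


def upsample_nearest(coarse_map, factor):
    """Repeats every cell of a 2D map `factor` times along both axes."""
    return [
        [coarse_map[i // factor][j // factor] for j in range(len(coarse_map[0]) * factor)]
        for i in range(len(coarse_map) * factor)
    ]


def guided_grad_cam(coarse_cam, guided_gradients):
    """Elementwise product of the upsampled CAM and the guided gradients."""
    factor = len(guided_gradients) // len(coarse_cam)
    upsampled = upsample_nearest(coarse_cam, factor)
    return [
        [c * g for c, g in zip(cam_row, grad_row)]
        for cam_row, grad_row in zip(upsampled, guided_gradients)
    ]


# ReLU rule on four hypothetical units
forward_input = [1.0, -2.0, 3.0, 0.5]
grad_output = [0.7, 0.9, -0.4, 0.2]
vanilla = relu_backward(forward_input, grad_output)
guided = guided_relu_backward(forward_input, grad_output)

# Combination step on a 2x2 CAM and a 4x4 guided-gradient map
coarse_cam = [[1.0, 0.0], [0.0, 0.5]]
guided_gradients = [
    [0.2, 0.0, 0.9, 0.1],
    [0.0, 0.4, 0.3, 0.0],
    [0.5, 0.0, 0.0, 0.6],
    [0.1, 0.7, 0.8, 0.2],
]
combined = guided_grad_cam(coarse_cam, guided_gradients)
print("vanilla:", vanilla)
print("guided: ", guided)
for row in combined:
    print(row)
```

The third unit shows the difference between the two rules: its forward input is positive, so the standard backward pass keeps the negative gradient, while the guided rule drops it. In the combined map, fine detail survives only inside the quadrants where the coarse map is nonzero.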

In [None]:
print("=" * 70)
print("3. GUIDED GRAD-CAM")
print("=" * 70)
print("""
Guided Grad-CAM combines Grad-CAM with Guided Backpropagation
to obtain high-resolution visualizations that are both
spatially precise and detailed at the pixel level.

This script implements Guided Backpropagation,
which can be combined with Grad-CAM maps.
""")


class GuidedBackprop:
    """Guided Backpropagation implementation for a CNN."""

    def __init__(self, model: nn.Module, device: str = "cuda") -> None:
        self.model = model.to(device)
        self.device = device
        self._handles = []
        self._register_hooks()

    def _register_hooks(self) -> None:
        """
        Registers hooks on ReLU layers to filter negative gradients
        during the backward pass.
        """
        def backward_hook(module, grad_input, grad_output):
            # grad_input is already zero where the forward input was non-positive
            # (standard ReLU backward); clamping also removes negative gradients.
            if len(grad_input) > 0 and grad_input[0] is not None:
                return (torch.clamp(grad_input[0], min=0.0),)
            return grad_input

        for module in self.model.modules():
            if isinstance(module, nn.ReLU):
                # Full backward hooks do not work with in-place ReLU, which
                # torchvision ResNets use by default, so it is disabled here.
                module.inplace = False
                self._handles.append(module.register_full_backward_hook(backward_hook))

    def remove_hooks(self) -> None:
        """Removes the ReLU hooks so that other methods see unmodified gradients."""
        for handle in self._handles:
            handle.remove()
        self._handles.clear()

    def generate_gradients(self, image: torch.Tensor, target_class: int | None = None):
        """
        Generates guided gradients with respect to the input image.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            target_class: Target class; if None, the model prediction is used.

        Returns:
            Guided gradients as a numpy array [3, H, W].
        """
        self.model.eval()
        # Work on a detached leaf copy so that image.grad is populated
        image = image.detach().clone().to(self.device).requires_grad_(True)

        output = self.model(image)
        if target_class is None:
            target_class = output.argmax(dim=1).item()

        self.model.zero_grad()
        output[0, target_class].backward()

        gradients = image.grad.detach().cpu().numpy()[0]
        return gradients

## Occlusion Analysis

Occlusion analysis adopts a complementary viewpoint to gradient-based methods. Instead of exploring the internal sensitivity of the model, it modifies the input explicitly. Small regions (patches) of the image are systematically occluded, and the effect on the probability assigned to a given class is measured. When occluding a region significantly decreases the probability, that region is interpreted as important for the prediction.

Formally, for each position $(i,j)$ of a sliding window, an occluded version of the image $x^{(i,j)}$ is constructed, and the difference

$$\Delta p_c^{(i,j)} = p_c(x) - p_c\bigl(x^{(i,j)}\bigr)$$

is evaluated, where $p_c(x)$ denotes the model probability assigned to class $c$. The resulting sensitivity map directly quantifies the importance of each region in terms of its impact on the model’s confidence. This technique is independent of gradients and specific architectural details, although its computational cost increases with image resolution, because one model evaluation is required per window position.

The class below implements a simple occlusion analysis, allowing the patch size and stride of the sliding window to be adjusted.
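
The sliding-window procedure and its cost can be sketched on a toy problem before applying it to a network. In the example below, `toy_probability` is a hypothetical stand-in for a classifier whose confidence depends only on the top-left $2 \times 2$ block of a $4 \times 4$ image; averaging the drops of overlapping windows is one possible aggregation choice, assumed here for simplicity.

```python
import math


def toy_probability(image):
    """Hypothetical classifier: confidence grows with the brightness of the top-left 2x2 block."""
    evidence = sum(image[i][j] for i in range(2) for j in range(2))
    return 1.0 / (1.0 + math.exp(-(evidence - 2.0)))


def count_windows(size, patch_size, stride):
    """Number of window positions (model evaluations) for a square image."""
    per_axis = (size - patch_size) // stride + 1
    return per_axis * per_axis


def occlusion_map(image, prob_fn, patch_size, stride, fill=0.0):
    """Average probability drop per pixel over all windows that cover it."""
    height, width = len(image), len(image[0])
    baseline_prob = prob_fn(image)
    drop_sum = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]

    for i in range(0, height - patch_size + 1, stride):
        for j in range(0, width - patch_size + 1, stride):
            # Copy the image and overwrite the current window with the fill value
            occluded = [row[:] for row in image]
            for di in range(patch_size):
                for dj in range(patch_size):
                    occluded[i + di][j + dj] = fill
            drop = baseline_prob - prob_fn(occluded)
            for di in range(patch_size):
                for dj in range(patch_size):
                    drop_sum[i + di][j + dj] += drop
                    counts[i + di][j + dj] += 1

    return [
        [drop_sum[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]


image = [[1.0] * 4 for _ in range(4)]
sensitivity = occlusion_map(image, toy_probability, patch_size=2, stride=2)
for row in sensitivity:
    print([round(v, 3) for v in row])

# Cost for a 32x32 image with a 4-pixel patch and stride 2
print("windows for CIFAR-10 settings:", count_windows(32, patch_size=4, stride=2))
```

Only the window covering the top-left block changes the toy confidence, so that block is the only region with a nonzero drop. The last line shows why the method is comparatively expensive: a $32 \times 32$ image with a 4-pixel patch and stride 2 already requires 225 forward passes.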

In [None]:
print("=" * 70)
print("4. OCCLUSION ANALYSIS")
print("=" * 70)
print("""
Systematically occludes regions of the image to observe
how the prediction changes, revealing which areas are critical.

Advantages:
- Direct interpretation at the input level.
- Does not require gradients or internal access to the architecture.

Limitations:
- High computational cost.
- Sensitive to patch size and stride.
""")


class OcclusionAnalysis:
    """Occlusion analysis for obtaining sensitivity maps."""

    def __init__(self, model: nn.Module, device: str = "cuda") -> None:
        self.model = model.to(device)
        self.model.eval()
        self.device = device

    def analyze(
        self,
        image: torch.Tensor,
        target_class: int | None = None,
        patch_size: int = 4,
        stride: int = 2
    ):
        """
        Computes a sensitivity map via systematic occlusion.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            target_class: Target class; if None, the model prediction is used.
            patch_size: Side length of the square occlusion patch in pixels.
            stride: Stride of the sliding occlusion window.

        Returns:
            2D sensitivity map (numpy array).
        """
        image = image.to(self.device)

        with torch.no_grad():
            output = self.model(image)
            if target_class is None:
                target_class = output.argmax(dim=1).item()
            baseline_prob = torch.softmax(output, dim=1)[0, target_class].item()

        _, _, h, w = image.shape
        sensitivity_sum = np.zeros((h, w))
        counts = np.zeros((h, w))

        for i in range(0, h - patch_size + 1, stride):
            for j in range(0, w - patch_size + 1, stride):
                occluded_image = image.clone()
                # Zero in normalized space corresponds to the dataset mean color
                occluded_image[:, :, i:i + patch_size, j:j + patch_size] = 0

                with torch.no_grad():
                    output = self.model(occluded_image)
                    prob = torch.softmax(output, dim=1)[0, target_class].item()

                # Accumulate the probability drop on every pixel of the window
                sensitivity_sum[i:i + patch_size, j:j + patch_size] += baseline_prob - prob
                counts[i:i + patch_size, j:j + patch_size] += 1

        # Average over overlapping windows; pixels never covered stay at 0
        return sensitivity_sum / np.maximum(counts, 1)

    def visualize(
        self,
        image: torch.Tensor,
        original_image: np.ndarray,
        target_class: int | None = None,
        patch_size: int = 4,
        stride: int = 2
    ) -> None:
        """
        Visualizes the sensitivity map obtained via occlusion.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            original_image: Denormalized image [H, W, 3] in [0, 1].
            target_class: Target class; if None, the model prediction is used.
            patch_size: Occlusion patch size.
            stride: Sliding window stride.
        """
        print(f"Analyzing with patch_size={patch_size}, stride={stride}...")
        sensitivity = self.analyze(image, target_class, patch_size, stride)
        sensitivity = (sensitivity - sensitivity.min()) / \
            (sensitivity.max() - sensitivity.min() + 1e-8)

        fig, axes = plt.subplots(1, 3, figsize=(15, 5))

        axes[0].imshow(original_image)
        axes[0].set_title("Original Image", fontsize=12, fontweight="bold")
        axes[0].axis("off")

        axes[1].imshow(sensitivity, cmap="hot")
        axes[1].set_title("Sensitivity Map", fontsize=12, fontweight="bold")
        axes[1].axis("off")

        axes[2].imshow(original_image)
        axes[2].imshow(sensitivity, cmap="hot", alpha=0.5)
        axes[2].set_title("Overlay", fontsize=12, fontweight="bold")
        axes[2].axis("off")

        plt.tight_layout()
        plt.show()

## Integrated Gradients

Integrated Gradients is a theoretically grounded method to attribute a model prediction to input features. Instead of considering the gradient only at the point $x$, this method integrates gradients along a continuous path that connects a baseline $x'$ (for example, a completely black image) to the actual image $x$. This approach mitigates gradient saturation issues and satisfies desirable attribution axioms such as sensitivity and implementation invariance.

Let $f_c$ denote the score for class $c$ (for example, the pre-softmax output). Integrated Gradients for dimension $i$ is defined as

$$\mathrm{IG}_i(x) = (x_i - x'_i) \int_{\alpha=0}^{1} \frac{\partial f_c\bigl(x' + \alpha (x - x')\bigr)}{\partial x_i} \, d\alpha.$$

In practice, the integral is approximated by a discrete sum over $m$ uniformly spaced steps:

$$\mathrm{IG}_i(x) \approx (x_i - x'_i) \cdot \frac{1}{m} \sum_{k=1}^{m} \frac{\partial f_c\bigl(x' + \tfrac{k}{m}(x - x')\bigr)}{\partial x_i}.$$

Aggregating the absolute attributions over channels yields a spatial relevance map that is typically smoother and more stable than basic saliency maps, at the cost of requiring multiple model evaluations along the path between the baseline and the original image.

The following implementation computes Integrated Gradients for a single image, allowing the baseline and the number of integration steps to be specified.
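
The saturation argument can be made concrete with a small worked example. The sketch below applies the discrete sum above to a hypothetical two-input score, $f(x) = 1 - \mathrm{ReLU}\bigl(1 - (x_0 + 2x_1)\bigr)$, which becomes flat once $x_0 + 2x_1 \ge 1$. The function, its analytic gradient, and the chosen input are assumptions made purely for illustration.

```python
def toy_score(x):
    """f(x) = 1 - ReLU(1 - (x0 + 2*x1)); saturates once x0 + 2*x1 >= 1."""
    return 1.0 - max(0.0, 1.0 - (x[0] + 2.0 * x[1]))


def toy_gradient(x):
    """Analytic gradient of toy_score (zero in the saturated region)."""
    if x[0] + 2.0 * x[1] < 1.0:
        return [1.0, 2.0]
    return [0.0, 0.0]


def integrated_gradients(x, baseline, grad_fn, steps):
    """Riemann-sum approximation using the points k/m for k = 1..m."""
    dims = len(x)
    grad_sum = [0.0] * dims
    for k in range(1, steps + 1):
        alpha = k / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        for d in range(dims):
            grad_sum[d] += grad[d]
    # (x_i - x'_i) times the average gradient along the path
    return [(xi - b) * g / steps for xi, b, g in zip(x, baseline, grad_sum)]


x = [1.0, 1.0]
baseline = [0.0, 0.0]

plain_gradient = toy_gradient(x)
attributions = integrated_gradients(x, baseline, toy_gradient, steps=300)
completeness_gap = sum(attributions) - (toy_score(x) - toy_score(baseline))

print("gradient at x:       ", plain_gradient)
print("integrated gradients:", [round(a, 3) for a in attributions])
print("completeness gap:    ", round(completeness_gap, 3))
```

At the chosen input the plain gradient is exactly zero, so a saliency map would assign no importance to either feature, whereas Integrated Gradients recovers attributions close to $1/3$ and $2/3$. Their sum approaches $f(x) - f(x')$ as the number of steps grows, which is a convenient sanity check when choosing `steps` in practice.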

In [None]:
print("=" * 70)
print("5. INTEGRATED GRADIENTS")
print("=" * 70)
print("""
Method that attributes the prediction to input features by
integrating gradients along a path from a baseline
to the actual image.

Advantages:
- Strong theoretical foundation.
- Mitigates gradient saturation issues.

Limitations:
- Requires multiple model evaluations.
- Depends on the choice of baseline.
""")


class IntegratedGradients:
    """Integrated Gradients implementation for PyTorch models."""

    def __init__(self, model: nn.Module, device: str = "cuda") -> None:
        self.model = model.to(device)
        self.model.eval()
        self.device = device

    def generate(
        self,
        image: torch.Tensor,
        target_class: int | None = None,
        baseline: torch.Tensor | None = None,
        steps: int = 50
    ):
        """
        Computes Integrated Gradients for an image and target class.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            target_class: Target class; if None, the model prediction is used.
            baseline: Tensor [1, 3, H, W] used as reference. If None, a zero tensor is used.
            steps: Number of points along the integration path.

        Returns:
            Numpy array [C, H, W] with per-channel attributions.
        """
        if baseline is None:
            # Zero in normalized space is the dataset mean image, not a black image
            baseline = torch.zeros_like(image)

        baseline = baseline.to(self.device)
        image = image.to(self.device)

        with torch.no_grad():
            output = self.model(image)
            if target_class is None:
                target_class = output.argmax(dim=1).item()

        # Linear path between baseline and image, sampled at k/steps for k = 1..steps
        scaled_inputs = [
            baseline + (float(k) / steps) * (image - baseline)
            for k in range(1, steps + 1)
        ]
        scaled_inputs = torch.cat(scaled_inputs, dim=0)
        scaled_inputs.requires_grad = True

        # All path points are evaluated as a single batch
        output = self.model(scaled_inputs)
        self.model.zero_grad()

        target_output = output[:, target_class]
        target_output.backward(torch.ones_like(target_output))

        gradients = scaled_inputs.grad
        avg_gradients = torch.mean(gradients, dim=0, keepdim=True)
        integrated_grads = (image - baseline) * avg_gradients

        return integrated_grads.squeeze(0).detach().cpu().numpy()

    def visualize(
        self,
        image: torch.Tensor,
        original_image: np.ndarray,
        target_class: int | None = None,
        steps: int = 50
    ) -> None:
        """
        Visualizes spatially aggregated Integrated Gradients and its overlay.

        Args:
            image: Tensor [1, 3, H, W] normalized.
            original_image: Denormalized image [H, W, 3] in [0, 1].
            target_class: Target class; if None, the model prediction is used.
            steps: Number of points along the integration path.
        """
        print(f"Computing Integrated Gradients ({steps} steps)...")
        ig = self.generate(image, target_class, steps=steps)

        # Aggregate absolute attributions over channels and normalize to [0, 1]
        ig_aggregated = np.sum(np.abs(ig), axis=0)
        ig_aggregated = (ig_aggregated - ig_aggregated.min()) / \
            (ig_aggregated.max() - ig_aggregated.min() + 1e-8)

        fig, axes = plt.subplots(1, 3, figsize=(15, 5))

        axes[0].imshow(original_image)
        axes[0].set_title("Original Image", fontsize=12, fontweight="bold")
        axes[0].axis("off")

        axes[1].imshow(ig_aggregated, cmap="hot")
        axes[1].set_title("Integrated Gradients", fontsize=12, fontweight="bold")
        axes[1].axis("off")

        axes[2].imshow(original_image)
        axes[2].imshow(ig_aggregated, cmap="hot", alpha=0.5)
        axes[2].set_title("Overlay", fontsize=12, fontweight="bold")
        axes[2].axis("off")

        plt.tight_layout()
        plt.show()

## Visualization Utilities

To interpret the results properly, CIFAR-10 images should be denormalized before visualization. The function below reverses the standard normalization applied during preprocessing and returns an image in a format suitable for `matplotlib`.
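
The arithmetic behind this step is a per-channel affine map and its inverse. The following sketch applies both to a single RGB pixel using the mean and standard deviation values from the preprocessing code; the helper names `normalize_pixel` and `denormalize_pixel` are hypothetical and exist only for this illustration.

```python
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)


def normalize_pixel(rgb):
    """Forward transform applied during preprocessing: (value - mean) / std."""
    return tuple((v - m) / s for v, m, s in zip(rgb, CIFAR10_MEAN, CIFAR10_STD))


def denormalize_pixel(normalized):
    """Inverse transform, clamped to the displayable range [0, 1]."""
    return tuple(
        min(1.0, max(0.0, v * s + m))
        for v, m, s in zip(normalized, CIFAR10_MEAN, CIFAR10_STD)
    )


pixel = (0.2, 0.5, 0.9)
round_trip = denormalize_pixel(normalize_pixel(pixel))
zero_pixel = denormalize_pixel((0.0, 0.0, 0.0))

print("round trip:", tuple(round(v, 4) for v in round_trip))
print("zero maps to:", tuple(round(v, 4) for v in zero_pixel))
```

The second result is worth keeping in mind when reading the other sections: a value of zero in normalized space corresponds to the dataset mean color rather than to black, which affects how a zero-valued occlusion patch or a zero baseline should be interpreted.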

In [None]:
def denormalize_cifar10(tensor: torch.Tensor) -> np.ndarray:
    """
    Denormalizes a CIFAR-10 tensor for visualization.

    Args:
        tensor: Tensor [3, H, W] normalized with CIFAR-10 mean and std.

    Returns:
        Image as numpy array [H, W, 3] with values in [0, 1].
    """
    mean = torch.tensor([0.4914, 0.4822, 0.4465]).view(3, 1, 1)
    std = torch.tensor([0.2470, 0.2435, 0.2616]).view(3, 1, 1)

    # Detach and move to CPU so that GPU or grad-tracking tensors are also accepted
    denorm = tensor.detach().cpu() * std + mean
    denorm = torch.clamp(denorm, 0, 1)

    # Channels-last layout expected by matplotlib
    return denorm.permute(1, 2, 0).numpy()

## Complete Interpretability Pipeline

Finally, all components are integrated into a coherent workflow that applies the different interpretability techniques to a test image from CIFAR-10. The pipeline includes data loading, model loading, sample selection, and the sequential execution of saliency maps, Grad-CAM, occlusion analysis, and Integrated Gradients. Guided Backpropagation is implemented but not run here; it can be used to construct Guided Grad-CAM if one wishes to extend the pipeline.

In [None]:
def run_complete_pipeline() -> None:
    """
    Executes all interpretability techniques in an integrated way
    on a single CIFAR-10 image.
    """
    print("\n" + "=" * 70)
    print("COMPLETE PIPELINE: INTERPRETABILITY IN CNNs")
    print("=" * 70 + "\n")

    # Data
    test_loader, _ = prepare_cifar10_data()

    # Model
    model = load_pretrained_model()

    # Select a test image
    print("Selecting test image...")
    images, labels = next(iter(test_loader))
    image = images[0:1]
    label = labels[0].item()
    original_image = denormalize_cifar10(images[0].clone())
    print(f"True class: {CIFAR10_CLASSES[label]}\n")

    # 1. Saliency Maps
    print("\n" + "=" * 70)
    print("RUNNING: Saliency Maps")
    print("=" * 70 + "\n")
    saliency_gen = SaliencyMapGenerator(model, CONFIG["device"])
    saliency_gen.visualize_saliency(image.clone(), original_image)

    # 2. Grad-CAM
    print("\n" + "=" * 70)
    print("RUNNING: Grad-CAM")
    print("=" * 70 + "\n")
    grad_cam = GradCAM(model, target_layer="layer4", device=CONFIG["device"])
    grad_cam.visualize_cam(image.clone(), original_image)

    # 3. Guided Grad-CAM is not run in this pipeline (see GuidedBackprop)

    # 4. Occlusion Analysis
    print("\n" + "=" * 70)
    print("RUNNING: Occlusion Analysis")
    print("=" * 70 + "\n")
    occlusion = OcclusionAnalysis(model, device=CONFIG["device"])
    occlusion.visualize(image.clone(), original_image, patch_size=4, stride=2)

    # 5. Integrated Gradients
    print("\n" + "=" * 70)
    print("RUNNING: Integrated Gradients")
    print("=" * 70 + "\n")
    ig = IntegratedGradients(model, device=CONFIG["device"])
    ig.visualize(image.clone(), original_image)


if __name__ == "__main__":
    run_complete_pipeline()

This complete pipeline provides a practical framework for exploring interpretability in CNNs on CIFAR-10. Although the ResNet-18 model is not explicitly fine-tuned on this dataset within the script, the code structure allows the same analysis workflow to be reused with a model trained specifically on CIFAR-10 by simply replacing the model loading function with a version that retrieves weights adapted to the domain of interest.