Hi @bilalsal @NarineK @sarahtranfb @vivekmig @aobo-y,
First of all, thank you for the amazing work on Captum! I'm currently exploring the capabilities of Captum for analyzing a convolutional neural network trained on audio data, specifically for estimating COVID probabilities from cough recordings. The model takes Mel Spectrograms as input.
As part of my exploration, I wanted to understand the relative importance of different layers in the network for driving the model's predictions. To do this, I used the Internal Influence algorithm to compute attributions for layers comprising convolutional and linear blocks. I feed the model inputs of shape (1, 64, 44). The first convolutional layer in the model is defined as:
```python
nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, stride=1, padding=2),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)
```
This layer produces an output of shape (1, 16, 33, 23), and I noticed that the attributions returned by Internal Influence also have this shape. Based on my reading of the documentation, I had the impression that the attribution shape should match the input to the layer, not its output. So I'm unsure if I'm misinterpreting something here.
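For concreteness, here is a minimal sketch of that block (assuming the spectrogram is given an explicit channel dimension, so the Conv2d sees a (1, 1, 64, 44) tensor) that reproduces the shapes I described:

```python
import torch
import torch.nn as nn

conv1 = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, stride=1, padding=2),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)

x = torch.zeros(1, 1, 64, 44)  # one Mel spectrogram with an explicit channel dim
out = conv1(x)

# Conv2d: (64 + 2*2 - 3) + 1 = 66 and (44 + 2*2 - 3) + 1 = 46, then MaxPool2d(2) halves both
print(out.shape)  # torch.Size([1, 16, 33, 23])
```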
Here’s how I computed the attributions (with a zero baseline):

```python
lc = InternalInfluence(model, layer)
attributions = lc.attribute(input_tensor, baselines=baseline)
```
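For completeness, here is a self-contained toy version of the setup; the ToyNet architecture, the two-class head, and target=0 are placeholders for illustration, not my actual model:

```python
import torch
import torch.nn as nn
from captum.attr import InternalInfluence

class ToyNet(nn.Module):
    """Placeholder network -- only the first block matches the model described above."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(16 * 33 * 23, 2))

    def forward(self, x):
        return self.head(self.conv1(x))

model = ToyNet().eval()
input_tensor = torch.rand(1, 1, 64, 44)    # one Mel spectrogram
baseline = torch.zeros_like(input_tensor)  # zero baseline, same shape as the input

lc = InternalInfluence(model, model.conv1)
attributions = lc.attribute(input_tensor, baselines=baseline, target=0)
print(attributions.shape)  # torch.Size([1, 16, 33, 23]) -- matches the layer's *output* shape
```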
To analyze the importance of channels, I aggregated the attributions as follows:
```python
channel_scores = attributions.abs().sum(dim=(0, 2, 3))
```
I passed over 100 distinct cough samples through this setup and computed the mean of attribution scores across samples to estimate each channel’s overall importance. My goal was to identify potentially redundant channels that could be pruned. However, I observed that all channels across all layers (conv1 to conv4) had fairly uniform attribution scores, as shown in the attached plots.
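Concretely, the per-channel averaging looks roughly like this (a simplified sketch of my pipeline: samples stands in for my preprocessed cough spectrograms, and lc is the InternalInfluence object from the sketch above):

```python
import torch

# Placeholder for my ~100 preprocessed Mel spectrograms, each of shape (1, 1, 64, 44)
samples = [torch.rand(1, 1, 64, 44) for _ in range(100)]

per_sample_scores = []
for x in samples:
    attr = lc.attribute(x, baselines=torch.zeros_like(x), target=0)
    # Sum of absolute attributions over batch and spatial dims -> one score per channel
    per_sample_scores.append(attr.abs().sum(dim=(0, 2, 3)))

# Average over samples -> estimated importance of each channel in this layer
channel_importance = torch.stack(per_sample_scores).mean(dim=0)
print(channel_importance.shape)  # torch.Size([16]) for conv1
```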
This leads me to a few questions I’d appreciate guidance on:
- Is Internal Influence an appropriate algorithm for my use case (layer-wise analysis of channel importance)? Are there other algorithms in Captum that you would recommend for this purpose?
- The attribution shape I'm getting matches the output shape of the convolutional layer, whereas the documentation says the attribution should match the layer's input shape. Is this behavior expected?
- I'm aggregating attribution values with a sum over the spatial dimensions to compare channel-wise importance. Is this a sound approach, or would you recommend another aggregation method?
- Finally, is it valid to use attribution scores to guide channel pruning? Are there any existing examples or best practices in this direction that you could point me to?
Thanks again for your excellent work on this library. I'm looking forward to any insights you can share!
