The v0.9.0 release of Captum adds NumPy 2.x support, retires the long-deprecated Captum Insights, introduces new multimodal image-mask attribution primitives, adds remote LLM attribution via vLLM, and promotes cross-tensor attribution to a default-on feature for perturbation-based methods. This release also raises the minimum supported Python and PyTorch versions to 3.10 and 1.13, respectively.
Upgrading from v0.8 should be drop-in for most users. The exceptions to watch for: any code that imports from captum.insights or installs the [insights] extra (removed — see below), environments on Python < 3.10 or PyTorch < 1.13 (no longer supported), and any code relying on the previous default behavior of FeatureAblation / FeaturePermutation treating input tensors independently (cross-tensor attribution is now default-on).
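The upgrade exceptions above can be turned into a quick pre-flight check. The sketch below is only illustrative: the helper names `parse_major_minor` and `upgrade_blockers` are hypothetical and not part of Captum, and the check for `captum.insights` is a plain substring search over source text rather than a real import analysis.

```python
import sys
from typing import List, Tuple

MIN_PYTHON = (3, 10)  # minimum Python version stated in these notes
MIN_TORCH = (1, 13)   # minimum PyTorch version stated in these notes


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '2.1.0+cu118'."""
    parts = version.split(".")
    minor_text = parts[1] if len(parts) > 1 else "0"
    # Keep only the leading digits of the minor component, e.g. '13rc1' -> 13.
    digits = ""
    for ch in minor_text:
        if not ch.isdigit():
            break
        digits += ch
    return int(parts[0]), int(digits or "0")


def upgrade_blockers(python_version: Tuple[int, int],
                     torch_version: str,
                     source: str = "") -> List[str]:
    """Return human-readable reasons why an environment is not ready for v0.9.0."""
    problems = []
    if tuple(python_version) < MIN_PYTHON:
        problems.append("Python older than 3.10 is no longer supported")
    if parse_major_minor(torch_version) < MIN_TORCH:
        problems.append("PyTorch older than 1.13 is no longer supported")
    if "captum.insights" in source:
        problems.append("captum.insights was removed in v0.9.0")
    return problems


# Check the running interpreter against a hypothetical installed torch version.
print(upgrade_blockers(sys.version_info[:2], "2.1.0+cu118"))
```

The behavior change in FeatureAblation / FeaturePermutation cannot be detected this way; code that depends on per-tensor treatment needs to be reviewed by hand.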
LLM & Multimodal Attribution
Multimodal Image-Segment Attribution
This release adds a new interpretable input type for image-segment attribution, ImageMaskInput, useful for vision and vision-language models where features correspond to user-defined image regions. ImageMaskInput is now exposed from captum.attr and is compatible with perturbation-based attribution methods and with LLMAttribution.
Key functionality:
- Support for mask_list to describe multiple segment sets with overlapped pixels (PR #1749)
- Mask-segmentation visualization utilities, with legends on the overlay: plot_mask_overlay (PRs #1752, #1753, #1758, #1756)
- New plot helper plot_image_heatmap for rendering pixel-level attributions on an ImageMaskInput (PRs #1739, #1684)
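To make the idea of segment-level features concrete, here is a minimal, library-free sketch of ablating image regions defined by a mask. It does not use or reproduce the ImageMaskInput API: the functions `ablate_segment`, `segment_attributions`, and `toy_forward` are hypothetical, the image is a nested list rather than a tensor, and the mask assigns exactly one segment id per pixel (the overlapping case handled by mask_list is not covered).

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # toy grayscale image: rows of pixel values
Mask = List[List[int]]     # same shape as the image: one segment id per pixel


def ablate_segment(image: Image, mask: Mask, segment_id: int,
                   baseline: float = 0.0) -> Image:
    """Return a copy of the image with one segment replaced by the baseline."""
    return [
        [baseline if seg == segment_id else px for px, seg in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def segment_attributions(forward: Callable[[Image], float], image: Image,
                         mask: Mask, baseline: float = 0.0) -> Dict[int, float]:
    """Attribute to each segment the output drop caused by ablating it."""
    full_output = forward(image)
    segment_ids = sorted({seg for row in mask for seg in row})
    return {
        seg: full_output - forward(ablate_segment(image, mask, seg, baseline))
        for seg in segment_ids
    }


def toy_forward(image: Image) -> float:
    """Stand-in for a model: the sum of all pixel values."""
    return sum(sum(row) for row in image)


image = [[1.0, 2.0], [3.0, 4.0]]
mask = [[0, 0], [1, 2]]  # top row is segment 0; bottom pixels are segments 1 and 2
attributions = segment_attributions(toy_forward, image, mask)
print(attributions)
```

Each segment is treated as a single feature, so the number of forward passes grows with the number of segments rather than the number of pixels.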
A new tutorial, Multimodal_Image_Segment_Attribution.ipynb, demonstrates end-to-end usage, using SAM (Meta Segment Anything Model) to interpret a multimodal LLM (Gemma-4) (PRs #1811, #1812).
LLM Attribution Improvements
Building on the LLM attribution work from v0.7 and v0.8, this release substantially expands what LLMAttribution can attribute over and how attributions are consumed.
- New RemoteLLMAttribution wrapper and VLLMProvider allow perturbation-based attribution to be computed against a remotely hosted LLM serving endpoint, removing the requirement that the model fit in the client process (PR #1544). This enables attribution over much larger models than was previously feasible.
- New boolean forward_in_tokens argument to LLMAttribution.attribute, to choose between replicating token-by-token output decoding and directly forwarding the output sequence in one shot (PRs #1740, #1741, #1742, #1744)
- Dict-like model_input support (PR #1698)
- skip_tokens is no longer accepted as a target argument, and target encoding no longer adds special tokens (PRs #1685, #1686)
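The two scoring strategies that forward_in_tokens selects between can be illustrated without any LLM. The sketch below is an assumption-laden toy, not Captum's implementation: `BIGRAM` is a hypothetical lookup table standing in for a model, and `next_token_probs` / `all_position_probs` are hypothetical stand-ins for "one forward call per generated token" and "one forward call over the whole sequence".

```python
import math
from typing import Dict, List

# Hypothetical bigram "model": P(next token | previous token).
BIGRAM: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.5, "a": 0.5},
    "the": {"cat": 0.25, "dog": 0.75},
    "cat": {"sat": 1.0},
}


def next_token_probs(prefix: List[str]) -> Dict[str, float]:
    """One forward call: the distribution for the token that follows `prefix`."""
    return BIGRAM.get(prefix[-1], {})


def all_position_probs(sequence: List[str]) -> List[Dict[str, float]]:
    """One forward call over a full sequence: entry i predicts sequence[i + 1]."""
    return [BIGRAM.get(tok, {}) for tok in sequence]


def logprob_token_by_token(prompt: List[str], target: List[str]) -> float:
    """Replay decoding: one forward call per target token."""
    total = 0.0
    context = list(prompt)
    for tok in target:
        total += math.log(next_token_probs(context)[tok])
        context.append(tok)
    return total


def logprob_one_shot(prompt: List[str], target: List[str]) -> float:
    """Forward prompt + target once and read off the target positions."""
    dists = all_position_probs(list(prompt) + list(target))
    start = len(prompt) - 1  # dists[start] predicts the first target token
    return sum(math.log(dists[start + i][tok]) for i, tok in enumerate(target))


prompt, target = ["<s>"], ["the", "cat", "sat"]
stepwise = logprob_token_by_token(prompt, target)
one_shot = logprob_one_shot(prompt, target)
print(stepwise, one_shot)
```

In this toy both paths give the same sequence log-probability; the difference is the number of forward calls, which matters when every perturbed input has to be scored.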
Captum Attribution Enhancement
Cross-Tensor Attribution
Cross-tensor attribution, the ability to group, ablate, or permute features across multiple input tensors simultaneously, was introduced incrementally over the v0.8 cycle and is now default-on for perturbation-based methods (commit 38230a70). This release finalizes the feature surface:
- Feature grouping across input tensors for FeatureAblation (PR #1497)
- Cross-tensor permutation in FeaturePermutation (PR #1507)
- Support for multiple perturbations per eval when masking across tensors (PR #1530)
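As a conceptual illustration of grouping features across tensors, the following sketch ablates every element that shares a group id, regardless of which input it lives in. It is written in plain Python with lists standing in for tensors; `cross_tensor_ablation` and `toy_forward` are hypothetical names, and the code is not how FeatureAblation is implemented.

```python
from typing import Callable, Dict, List


def cross_tensor_ablation(forward: Callable[..., float],
                          inputs: List[List[float]],
                          masks: List[List[int]],
                          baseline: float = 0.0) -> Dict[int, float]:
    """Ablate each feature group across all inputs at once.

    `masks` has the same shapes as `inputs`; an id that appears in several
    masks marks a single feature group spanning those inputs.
    """
    full_output = forward(*inputs)
    group_ids = sorted({g for mask in masks for g in mask})
    attributions = {}
    for group in group_ids:
        ablated = [
            [baseline if g == group else x for x, g in zip(values, mask)]
            for values, mask in zip(inputs, masks)
        ]
        attributions[group] = full_output - forward(*ablated)
    return attributions


def toy_forward(a: List[float], b: List[float]) -> float:
    """Stand-in for a two-input model."""
    return sum(a) + 2 * sum(b)


a = [1.0, 2.0]
b = [3.0, 4.0, 5.0]
mask_a = [0, 1]     # group 0 and group 1 both appear in `a` ...
mask_b = [0, 1, 1]  # ... and in `b`, so each group spans both inputs
attr = cross_tensor_ablation(toy_forward, [a, b], [mask_a, mask_b])
print(attr)
```

Treating the inputs independently would instead ablate the group-0 element of `a` and the group-0 element of `b` in separate evaluations and report them as separate features.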
Perturbation-Based Method Improvements
- FeatureAblation and FeaturePermutation gained a min_examples_per_batch argument and now skip feature groups when permuting features if any group has batch size ≤ 1 (PRs #1533, #1539)
- Occlusion was migrated to the new ablated-batch construction path used by FeatureAblation (PR #1616)
- Shapley Value perturbation construction performance improved (PR #1635), with further micro-optimization when formatting total_attrib (PR #1648)
- Aggregate mode is now enabled whenever perturbations_per_eval == 1 (PR #1525)
- Avoided unnecessary tensor construction when creating input masks for permutation/ablation (PR #1527)
- Output validity for perturbations_per_eval > 1 is now checked via a dedicated method (PR #1666)
- Scalar and 1-D tensor model outputs are handled explicitly (PR #1521)
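One plausible reason for skipping permutation when a batch has a single example is that shuffling one value across one row changes nothing. The sketch below demonstrates that with plain lists; `permute_feature` and `can_permute` are hypothetical helpers, and the `min_examples` parameter is an illustrative threshold rather than the documented semantics of min_examples_per_batch.

```python
import random
from typing import List


def permute_feature(batch: List[List[float]], feature_idx: int,
                    rng: random.Random) -> List[List[float]]:
    """Shuffle one feature column across the examples of a batch."""
    column = [row[feature_idx] for row in batch]
    rng.shuffle(column)
    return [
        row[:feature_idx] + [value] + row[feature_idx + 1:]
        for row, value in zip(batch, column)
    ]


def can_permute(batch: List[List[float]], min_examples: int = 2) -> bool:
    """A permutation is only informative with at least two examples."""
    return len(batch) >= min_examples


rng = random.Random(0)
single = [[1.0, 2.0]]
several = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# With one example the "permuted" batch is identical to the input.
print(permute_feature(single, 0, rng) == single, can_permute(single))

# With several examples the column values are preserved but may be reordered.
permuted = permute_feature(several, 0, rng)
print(sorted(row[0] for row in permuted), can_permute(several))
```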
Layer & Neuron Attribution
- Reduced GPU OOM in layer gradient computation by offloading intermediate tensors to CPU (PR #1796)
- Layer gradient attributor now supports a list-of-tensors output (PR #1629)
Captum Insights Retirement
As announced in the v0.8.0 release, Captum Insights has been retired in v0.9.0. The Insights frontend, backend, Sphinx API pages, and packaging extras (pip install captum[insights]) have all been removed (PRs #1795, #1782, #1784, #1693). Users who still need the Insights UI can pin to the v0.8.0 tag.
Bug Fixes
- Ensure mask tensor is placed on the same device as inputs (PR #1809)
- Ensure eval diff and mask tensor are on the same device (PR #1542)
- Move pred to the same device as lm.classes() in LLM attribution (PR #1189)
- Fix long-tensor behavior in torch.gather inside interpretable input (PR #1678)
- Fix UnpicklingError in DLRM tutorial for PyTorch 2.6+ (PR #1765)
- Lime output-dimension bug in batched forward fixed (PR #1513)
- Fixed index adjustment in LayerAttributor mask for individual neurons (PR #1531)