Captum v0.9.0 Release

@aobo-y released this 17 Apr 06:36

The v0.9.0 release of Captum adds NumPy 2.x support, retires the long-deprecated Captum Insights, introduces new multimodal image-mask attribution primitives, adds remote LLM attribution via vLLM, and promotes cross-tensor attribution to a default-on feature for perturbation-based methods. This release also raises minimum supported Python and PyTorch versions to 3.10 and 1.13, respectively.

Upgrading from v0.8 should be drop-in for most users. The exceptions to watch for: any code that imports from captum.insights or installs the [insights] extra (removed — see below), environments on Python < 3.10 or PyTorch < 1.13 (no longer supported), and any code relying on the previous default behavior of FeatureAblation / FeaturePermutation treating input tensors independently (cross-tensor attribution is now default-on).
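The version floors above can be checked before upgrading. The following is a minimal sketch using only the standard library; the helper names `parse_major_minor` and `upgrade_blockers` are hypothetical and not part of Captum, and the version parsing is deliberately simple (major and minor only).

```python
import sys

# Minimum versions as stated in these release notes.
MIN_PYTHON = (3, 10)
MIN_TORCH = (1, 13)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.3.1+cu121'."""
    major, minor = version.split(".")[:2]
    # Keep only the leading digits of the minor part (handles e.g. '13rc1').
    digits = ""
    for ch in minor:
        if not ch.isdigit():
            break
        digits += ch
    return int(major), int(digits)


def upgrade_blockers(python_version: tuple[int, int], torch_version: str) -> list[str]:
    """Return human-readable reasons an environment falls below the v0.9.0 floors."""
    problems = []
    if python_version < MIN_PYTHON:
        problems.append(f"Python {python_version[0]}.{python_version[1]} < 3.10")
    if parse_major_minor(torch_version) < MIN_TORCH:
        problems.append(f"PyTorch {torch_version} < 1.13")
    return problems


# Check the running interpreter against a PyTorch version string of your choice.
print(upgrade_blockers(sys.version_info[:2], "1.12.1"))
```

An empty list means neither floor is violated. The PyTorch version string could be read from `importlib.metadata.version("torch")` in a real environment.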

LLM & Multimodal Attribution

Multimodal Image-Segment Attribution

This release adds a new interpretable input type for image-segment attribution, ImageMaskInput, useful for vision and vision-language models where features correspond to user-defined image regions. It is exposed from captum.attr and is compatible with perturbation-based attribution methods and with LLMAttribution.

Key functionality:

  • Support for mask_list to describe multiple segment sets with overlapped pixels (PR #1749)
  • Mask-segmentation visualization utilities, including plot_mask_overlay with legends on the overlay (PRs #1752, #1753, #1758, #1756)
  • New plot helper plot_image_heatmap for rendering pixel-level attributions on an ImageMaskInput (PRs #1739, #1684)
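To make the idea of segment-level features concrete, here is a minimal pure-Python sketch of ablating image regions described by a list of masks that may overlap. It is not ImageMaskInput's API or implementation; `ablate`, `segment_attributions`, and `toy_model` are hypothetical names, and the "model" is just a sum of pixel values.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values
Mask = List[List[bool]]    # True where the pixel belongs to the segment


def ablate(image: Image, mask: Mask, baseline: float = 0.0) -> Image:
    """Return a copy of the image with the masked pixels set to the baseline."""
    return [
        [baseline if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def segment_attributions(
    model: Callable[[Image], float], image: Image, masks: List[Mask]
) -> List[float]:
    """Attribution of a segment = drop in model output when it is ablated."""
    full_output = model(image)
    return [full_output - model(ablate(image, mask)) for mask in masks]


def toy_model(image: Image) -> float:
    """Stand-in model: the sum of all pixel intensities."""
    return sum(sum(row) for row in image)


image = [[1.0, 2.0], [3.0, 4.0]]
masks = [
    [[True, True], [False, False]],  # top row
    [[False, True], [False, True]],  # right column, overlaps the top row at (0, 1)
]
print(segment_attributions(toy_model, image, masks))  # [3.0, 6.0]
```

Because each segment is ablated independently, the shared pixel contributes to both attributions, which is why overlapping segments need a list of masks rather than a single integer label map.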

A new tutorial, Multimodal_Image_Segment_Attribution.ipynb, demonstrates end-to-end usage leveraging SAM (Meta Segment Anything Model) to interpret a multimodal LLM (Gemma-4) (PRs #1811, #1812).

LLM Attribution Improvements

Building on the LLM attribution work from v0.7 and v0.8, this release substantially expands what LLMAttribution can attribute over and how attributions are consumed.

  • New RemoteLLMAttribution wrapper and VLLMProvider allow perturbation-based attribution to be computed against a remotely hosted LLM serving endpoint, removing the requirement that the model fit in the client process (PR #1544). This enables attribution over much larger models than was previously feasible.
  • New boolean forward_in_tokens argument to LLMAttribution.attribute, to choose between replicating token-by-token output decoding and forwarding the whole output sequence in one shot (PRs #1740, #1741, #1742, #1744)
  • Dict-like model_input support (PR #1698)
  • skip_tokens is no longer accepted as a target argument, and target encoding no longer adds special tokens (PRs #1685, #1686)
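The two scoring strategies behind forward_in_tokens can be illustrated with a toy causal model. This sketch is an assumption-laden illustration only: it does not use Captum, does not say which value of forward_in_tokens selects which path, and every function name in it is hypothetical. It shows that, for a causal model, scoring a target sequence token by token and scoring it from one pass over the full sequence give the same log-probability.

```python
import math
from typing import List

VOCAB_SIZE = 4


def next_token_logprobs(prefix: List[int]) -> List[float]:
    """Toy causal LM: log-probabilities of the next token given a prefix."""
    scores = [((sum(prefix) + 1) * (tok + 1)) % 5 for tok in range(VOCAB_SIZE)]
    norm = math.log(sum(math.exp(s) for s in scores))
    return [s - norm for s in scores]


def forward_all_positions(tokens: List[int]) -> List[List[float]]:
    """Stand-in for one forward pass returning next-token log-probs per position."""
    return [next_token_logprobs(tokens[: i + 1]) for i in range(len(tokens))]


def target_logprob_token_by_token(prompt: List[int], target: List[int]) -> float:
    """Replicate decoding: one model call per target token."""
    total, seq = 0.0, list(prompt)
    for tok in target:
        total += next_token_logprobs(seq)[tok]
        seq.append(tok)
    return total


def target_logprob_one_shot(prompt: List[int], target: List[int]) -> float:
    """Forward prompt + target once and read off the target positions."""
    seq = prompt + target
    per_position = forward_all_positions(seq[:-1])
    start = len(prompt) - 1  # position whose prediction is the first target token
    return sum(per_position[start + i][tok] for i, tok in enumerate(target))


prompt, target = [1, 2], [3, 0, 1]
print(target_logprob_token_by_token(prompt, target))
print(target_logprob_one_shot(prompt, target))
```

The one-shot path needs a single model call per perturbed input, while the token-by-token path needs one call per target token, which is the trade-off a flag like this exposes.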

Captum Attribution Enhancement

Cross-Tensor Attribution

Cross-tensor attribution, the ability to group, ablate, or permute features across multiple input tensors simultaneously, was introduced incrementally over the v0.8 cycle and is now default-on for perturbation-based methods (commit 38230a70). This release finalizes the feature surface:

  • Feature grouping across input tensors for FeatureAblation (PR #1497)
  • Cross-tensor permutation in FeaturePermutation (PR #1507)
  • Support for multiple perturbations per eval when masking across tensors (PR #1530)
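As a rough illustration of what grouping features across input tensors means, the sketch below ablates every element that shares a group ID, regardless of which input it lives in. It uses plain Python lists instead of tensors and is not Captum's implementation; `cross_tensor_ablation` and `toy_model` are hypothetical names.

```python
from typing import Callable, Dict, List, Sequence


def cross_tensor_ablation(
    model: Callable[..., float],
    inputs: Sequence[List[float]],
    feature_masks: Sequence[List[int]],
    baseline: float = 0.0,
) -> Dict[int, float]:
    """Ablate each feature group across all inputs at once.

    feature_masks[i][j] is the group ID of inputs[i][j]; the same ID may
    appear in several inputs, and all of its elements are ablated together.
    """
    full_output = model(*inputs)
    group_ids = sorted({g for mask in feature_masks for g in mask})
    attributions = {}
    for group in group_ids:
        ablated = [
            [baseline if g == group else v for v, g in zip(values, mask)]
            for values, mask in zip(inputs, feature_masks)
        ]
        attributions[group] = full_output - model(*ablated)
    return attributions


def toy_model(a: List[float], b: List[float]) -> float:
    """Stand-in model taking two inputs."""
    return sum(a) + 2 * sum(b)


a, b = [1.0, 2.0], [3.0, 4.0, 5.0]
masks = ([0, 1], [0, 1, 1])  # group 0 and group 1 each span both inputs
print(cross_tensor_ablation(toy_model, (a, b), masks))  # {0: 7.0, 1: 20.0}
```

Treating the inputs independently would instead produce one attribution per group per input, so code that relied on that behavior should be reviewed when upgrading.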

Perturbation-Based Method Improvements

  • FeatureAblation and FeaturePermutation gained a min_examples_per_batch argument and now skip feature groups when permuting features if any group has batch size ≤ 1 (PRs #1533, #1539)
  • Occlusion was migrated to the new ablated-batch construction path used by FeatureAblation (PR #1616)
  • Shapley Value perturbation construction performance improved (PR #1635), with further micro-optimization when formatting total_attrib (PR #1648)
  • Aggregate mode is now enabled whenever perturbations_per_eval == 1 (PR #1525)
  • Avoided unnecessary tensor construction when creating input masks for permutation/ablation (PR #1527)
  • Output validity for perturbations_per_eval > 1 is now checked via a dedicated method (PR #1666)
  • Scalar and 1-D tensor model outputs are handled explicitly (PR #1521)
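The batch-size condition on permutation is easiest to see in a small example: shuffling a feature across a batch of one example changes nothing, so there is no signal to measure. The sketch below is a pure-Python illustration under that assumption; it does not reproduce the semantics of min_examples_per_batch, and `permute_feature` and `permutation_importance` are hypothetical names.

```python
import random
from typing import Callable, List, Optional

Row = List[float]


def permute_feature(batch: List[Row], feature_idx: int, rng: random.Random) -> List[Row]:
    """Shuffle one feature column across the examples of a batch."""
    column = [row[feature_idx] for row in batch]
    rng.shuffle(column)
    return [
        row[:feature_idx] + [value] + row[feature_idx + 1:]
        for row, value in zip(batch, column)
    ]


def permutation_importance(
    model: Callable[[Row], float],
    batch: List[Row],
    feature_idx: int,
    rng: random.Random,
) -> Optional[List[float]]:
    """Per-example output change after permuting one feature, or None if skipped."""
    if len(batch) <= 1:
        # With a single example the permutation is the identity, so skip it.
        return None
    original = [model(row) for row in batch]
    permuted = [model(row) for row in permute_feature(batch, feature_idx, rng)]
    return [o - p for o, p in zip(original, permuted)]


rng = random.Random(0)
batch = [[1.0, 5.0], [2.0, 6.0], [3.0, 7.0]]
print(permutation_importance(lambda row: row[0] + row[1], batch, 1, rng))
print(permutation_importance(lambda row: row[0] + row[1], batch[:1], 1, rng))  # None
```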

Layer & Neuron Attribution

  • Reduced GPU OOM in layer gradient computation by offloading intermediate tensors to CPU (PR #1796)
  • Layer gradient attributor now supports a list-of-tensors output (PR #1629)

Captum Insights Retirement

As announced in the v0.8.0 release, Captum Insights has been retired in v0.9.0. The Insights frontend, backend, Sphinx API pages, and packaging extras (pip install captum[insights]) have all been removed (PRs #1795, #1782, #1784, #1693). Users who still need the Insights UI can pin to the v0.8.0 tag.
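Before upgrading, it can help to locate any remaining imports of the removed package. The following is a minimal sketch using the standard-library `ast` module; `find_insights_imports` is a hypothetical helper, not something Captum ships, and it only catches static import statements (not dynamic imports via `importlib`).

```python
import ast


def find_insights_imports(source: str) -> list[int]:
    """Return the line numbers of imports that reference captum.insights."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # `from captum import insights` has module 'captum', alias 'insights'.
            module = node.module or ""
            names = [module] + [f"{module}.{alias.name}" for alias in node.names]
        else:
            continue
        if any(
            name == "captum.insights" or name.startswith("captum.insights.")
            for name in names
        ):
            hits.append(node.lineno)
    return sorted(hits)


example = (
    "import torch\n"
    "from captum.insights import AttributionVisualizer\n"
    "from captum.attr import FeatureAblation\n"
    "from captum import insights\n"
)
print(find_insights_imports(example))  # [2, 4]
```

Running this over each file in a project gives a quick list of places that need to be changed or pinned to v0.8.0.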

Bug Fixes

  • Ensure mask tensor is placed on the same device as inputs (PR #1809)
  • Ensure eval diff and mask tensor are on the same device (PR #1542)
  • Move pred to the same device as lm.classes() in LLM attribution (PR #1189)
  • Fix long-tensor behavior in torch.gather inside interpretable input (PR #1678)
  • Fix UnpicklingError in DLRM tutorial for PyTorch 2.6+ (PR #1765)
  • Fix Lime output-dimension bug in batched forward (PR #1513)
  • Fix index adjustment in LayerAttributor mask for individual neurons (PR #1531)

Changes to Requirements

  • Minimum Python version raised to 3.10 (PR #1660)
  • Minimum PyTorch version raised to 1.13 (PR #1617)
  • Captum now supports NumPy 2.x (PR #1580). Users who were previously held back by a NumPy 1.x pin can now install and use Captum alongside NumPy 2.x. Existing NumPy 1.x environments remain supported.