Official repository for the paper Faithful Vision-Language Interpretation via Concept Bottleneck Models (ICLR 2024).
Authors: Songning Lai, Lijie Hu, Junxiao Wang, Laure Berti-Equille, Di Wang
An overview of our pipeline for creating FVLC. The concept set from GPT-3 is encoded by CLIP to obtain E_T; the image is processed by the backbone and the CLIP image encoder. The activation matrix M and the mappings g, W_c, and W_F are learned. The losses L₁–L₄ are used to obtain the faithful mapping W̃_c and g̃(x). Noise is introduced into the concept set, text encoder, and input to validate faithfulness.
Places365 example: input image, concept output without perturbation, and concept output after concept-word and input perturbations. Concept positions and ranks change (e.g. the concept “surgery”); the prediction also changes under slight perturbations.
The demand for transparency in high-stakes domains such as healthcare and finance has driven interest in interpretable machine learning (IML) models, notably Concept Bottleneck Models (CBMs), valued for combining strong performance with human-understandable insight into deep neural networks. However, CBMs' reliance on manually annotated concept data poses challenges. Label-free CBMs have emerged to address this, but they remain unstable under perturbations, which undermines their faithfulness as explanatory tools.
We introduce a formal definition of a Faithful Vision-Language Concept (FVLC) and a methodology for constructing an FVLC that satisfies four critical properties. Our experiments on four benchmark datasets (CIFAR-10, CIFAR-100, CUB, Places365) demonstrate that FVLC outperforms baselines in stability under input and concept-set perturbations (WP1, WP2, IP), with minimal accuracy degradation compared to the vanilla Label-free CBM.
- Four properties of a faithful concept: (i) significant top-k overlap for interpretability; (ii) stability of the concept vector under noise and concept-set perturbations; (iii) prediction distribution close to vanilla CBM; (iv) stable output distribution under perturbations.
- Objective: Min-max formulation (Eq. 7 in the paper) with losses L₁–L₄ (overlap, concept stability, prediction closeness, output stability). We use PGD to find worst-case perturbations δ (input) and ρ (concept words), then update the projection layer.
- Metrics: TCPC (Top Concept Prediction Change) and TOPC (Top Overlap Prediction Change); lower is more stable.
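The inner maximization above can be sketched as a standard PGD loop on the input (the ε, step size, number of steps, and the model/loss interface are illustrative assumptions, not the repo's exact API in attack.py):

```python
import torch

def pgd_input_perturbation(model, x, y, loss_fn, eps=8/255, alpha=2/255, steps=7):
    """Find a worst-case input perturbation delta via projected gradient descent."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            # Gradient ascent step, then project back into the L-inf ball of radius eps.
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return delta.detach()
```

In the paper's min-max objective, a loop of this shape supplies the worst-case δ (and analogously ρ on the concept-word embeddings) before each update of the projection layer.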
Accuracy (%) under different perturbation settings:

| Method | CIFAR10 | CIFAR100 | CUB | Places365 |
|---|---|---|---|---|
| Standard (no interpretability) | 88.80 | 70.10 | 76.70 | 48.56 |
| P-CBM (CLIP) | 84.50 | 56.00 | N/A | N/A |
| Label-free CBM | 86.32 | 65.42 | 74.23 | 43.63 |
| WP1(10%) – base | 86.25 | 65.09 | 73.97 | 43.67 |
| WP1(10%) – FVLC | 86.39 | 64.90 | 73.92 | 43.62 |
| WP2 – base | 86.41 | 65.16 | 73.96 | 43.54 |
| WP2 – FVLC | 86.22 | 65.34 | 74.44 | 44.55 |
| IP – base | 86.62 | 65.36 | 74.39 | 43.64 |
| IP – FVLC | 86.88 | 65.29 | 74.01 | 43.71 |
FVLC keeps accuracy on par or better while greatly improving stability (see Table 2).
Stability, reported as TCPC / TOPC (lower is better; Table 2):

| Method | CIFAR10 | CIFAR100 | CUB | Places365 |
|---|---|---|---|---|
| WP1(10%) – base | 1.99E-01 / 8.36E-02 | 1.94E-01 / 1.31E-01 | 2.32E-01 / 3.41E-01 | 2.26E-01 / 1.14E-01 |
| WP1(10%) – FVLC | 1.19E-03 / 7.40E-03 | 3.67E-03 / 4.55E-03 | 1.19E-02 / 1.53E-03 | 1.39E-03 / 1.25E-03 |
| WP2 – base | 1.53E-01 / 4.99E-02 | 1.36E-01 / 6.67E-02 | 1.43E-01 / 1.73E-01 | 1.40E-01 / 6.37E-02 |
| WP2 – FVLC | 1.10E-02 / 8.72E-03 | 3.35E-03 / 4.55E-03 | 1.05E-02 / 1.53E-03 | 1.55E-03 / 1.29E-03 |
| IP – base | 1.68E-01 / 6.28E-02 | 1.38E-01 / 8.81E-02 | 1.71E-01 / 2.23E-01 | 1.73E-01 / 8.09E-02 |
| IP – FVLC | 8.02E-03 / 8.29E-03 | 3.24E-03 / 4.56E-03 | 1.04E-02 / 1.59E-03 | 1.50E-03 / 1.25E-03 |
Ablation shows that L₂, L₃, and L₄ all contribute; using all three (✓✓✓) gives the best TCPC/TOPC.
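To illustrate the kind of quantities TCPC/TOPC capture (the exact definitions are in metric.py and the paper; this sketch only shows a top-k concept overlap and a prediction-change rate under assumed tensor shapes):

```python
import torch

def topk_overlap(c_clean, c_pert, k=5):
    """Fraction of top-k concepts shared before/after perturbation (1.0 = identical)."""
    top_clean = c_clean.topk(k, dim=-1).indices
    top_pert = c_pert.topk(k, dim=-1).indices
    overlaps = []
    for a, b in zip(top_clean, top_pert):
        overlaps.append(len(set(a.tolist()) & set(b.tolist())) / k)
    return sum(overlaps) / len(overlaps)

def prediction_change_rate(logits_clean, logits_pert):
    """Fraction of samples whose predicted class flips under perturbation."""
    return (logits_clean.argmax(-1) != logits_pert.argmax(-1)).float().mean().item()
```

A faithful model should keep the top-k concept sets largely overlapping and the change rate near zero under WP1/WP2/IP, which is what the small FVLC values in the table reflect.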
Concept weights and final-layer weights on one sample per dataset: input x, concept without perturbation (c), with perturbation (c+δ), and optimized with perturbation (c̃+δ).
# Clone this repository
git clone https://github.com/xll0328/FVLC.git
cd FVLC
# Create environment and install dependencies
pip install -r requirements.txt

- Python: 3.8+
- PyTorch: with CUDA if available.
- CLIP: used for text/image encoders; concept sets are under data/concept_sets/.
bash download_cub.sh # CUB-200-2011
bash download_models.sh # pretrained backbone / CLIP
bash download_rn18_places.sh # ResNet-18 for Places365

- Train Label-free CBM (base):
python train_cbm.py --dataset cifar10
(See train_cbm.py for --concept_set, --backbone, --clip_name, etc.)
- Train FVLC (with faithfulness losses):
python train_fcbm_all.py --dataset cifar10
or projection-only:
python train_fcbm_projonly.py --dataset cifar10
(Uses PGD for δ and ρ; see attack.py and metric.py for TCPC/TOPC and overlap losses.)
- Evaluation:
python eval.py --dataset cifar10 --model_path <path>
(Adjust --model_path and dataset flags as needed.)
- Attack/perturbation evaluation:
python attack.py (see script args for perturbation types: WP1, WP2, IP).
Example commands are also listed in training_commands.txt.
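A word-level perturbation in the spirit of WP1, replacing a fraction of the concept words with random words from a pool, can be sketched as follows (the function name, replacement pool, and rate are illustrative assumptions, not the repo's exact implementation):

```python
import random

def perturb_concept_set(concepts, pool, rate=0.10, seed=0):
    """Return a copy of the concept list with `rate` of its entries swapped
    for random words from `pool` (WP1-style word perturbation)."""
    rng = random.Random(seed)
    perturbed = list(concepts)
    n_swap = max(1, int(rate * len(perturbed)))
    for i in rng.sample(range(len(perturbed)), n_swap):
        perturbed[i] = rng.choice(pool)
    return perturbed
```

Re-encoding the perturbed concept list with CLIP and comparing concept outputs against the clean run is what the WP1(10%) rows in the tables above measure.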
- train_cbm.py: train base Label-free CBM
- train_fcbm_all.py: train full FVLC (all losses)
- train_fcbm_projonly.py: train FVLC projection layer only
- attack.py: PGD attacker for input/concept perturbations
- metric.py: top-k overlap, TCPC/TOPC, robust losses
- cbm.py: CBM model definition
- data_utils.py, utils.py: data and backbone/CLIP helpers
- data/concept_sets/: concept sets per dataset
- clip/: CLIP model and tokenizer
If you use this code or the paper, please cite:
@inproceedings{lai2023faithful,
title={Faithful Vision-Language Interpretation via Concept Bottleneck Models},
author={Lai, Songning and Hu, Lijie and Wang, Junxiao and Berti-Equille, Laure and Wang, Di},
booktitle={The Twelfth International Conference on Learning Representations (ICLR)},
year={2024}
}

- Paper: OpenReview
- Project page: https://xll0328.github.io/fvlc/


