Code for reproducing the experiments from the paper "Aggregate Models, Not Explanations: Improving Feature Importance Estimation".
@inproceedings{paillard2026aggregate,
title={Aggregate Models, Not Explanations: Improving Feature Importance Estimation},
author={Paillard, Joseph and Reyero Lobo, Angel and Engemann, Denis A. and Thirion, Bertrand},
booktitle={International Conference on Machine Learning (ICML)},
year={2026},
}

All required packages can be installed using pip:
pip install -e .
For the TabICL foundation model experiment:
pip install -e ".[tabicl]"
For generating figures:
pip install -e ".[plots]"

ensemble_vim/simulation.py runs variable importance estimation on synthetic datasets.
ensemble_vim/asymptotic.py computes asymptotic ground-truth importance at large sample size.
python ensemble_vim/simulation.py \
--n_samples 128 256 512 1024 2048 \
--seed 1 \
--n_jobs 4 \
--n_splits 5 \
--snr 1 \
--n_ensemble 10 \
--results_dir ./results \
--dataset_name friedman1 \
--n_features 20 \
--model_name mlp \
--ensemble bagging \
--sage
Available models: mlp, mlp256, rf, linear, tabicl.
Available datasets: friedman1, ishigami, g_function, nonlinear.
Importance methods computed: LOCO, CFI, PFI (always), SAGE (with --sage flag).
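As an illustration of the distinction the paper title draws, here is a minimal sketch of LOCO (refit without one feature and measure the increase in test loss) computed in two ways for a bagged ensemble: on the aggregated predictions, and as the average of per-sub-model scores. It is not the code in ensemble_vim/simulation.py. The linear base learner, the squared-error loss, and the names `fit_linear`, `predict_linear`, `mse` and `loco_bagging` are all assumptions made for this example.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def mse(y, pred):
    return float(np.mean((y - pred) ** 2))


def loco_bagging(X_train, y_train, X_test, y_test, j, n_ensemble=10, seed=0):
    """Return (ensemble LOCO, mean of sub-model LOCO) for feature j."""
    rng = np.random.default_rng(seed)
    keep = [k for k in range(X_train.shape[1]) if k != j]
    full_preds, reduced_preds = [], []
    for _ in range(n_ensemble):
        # Bootstrap resample of the training set for this sub-model.
        idx = rng.integers(0, len(X_train), size=len(X_train))
        Xb, yb = X_train[idx], y_train[idx]
        full = fit_linear(Xb, yb)
        reduced = fit_linear(Xb[:, keep], yb)
        full_preds.append(predict_linear(full, X_test))
        reduced_preds.append(predict_linear(reduced, X_test[:, keep]))
    full_preds = np.array(full_preds)
    reduced_preds = np.array(reduced_preds)
    # Aggregate the models first, then compute a single importance score.
    loco_ensemble = (mse(y_test, reduced_preds.mean(axis=0))
                     - mse(y_test, full_preds.mean(axis=0)))
    # Compute one score per sub-model, then average the scores.
    loco_submodels = float(np.mean(
        [mse(y_test, r) - mse(y_test, f)
         for r, f in zip(reduced_preds, full_preds)]
    ))
    return loco_ensemble, loco_submodels
```

Both quantities use the same fitted sub-models, so any difference between them comes only from the order of aggregation.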
python ensemble_vim/simulation.py \
--dataset_name nonlinear --n_features 100 --model_name mlp256 \
--ensemble bagging --n_ensemble 10 --snr 1 --seed 1
python ensemble_vim/simulation.py \
--model_name tabicl --dataset_name friedman1 \
--ensemble bagging --n_ensemble 5 --n_samples 512 --seed 1
Requires pip install -e ".[tabicl]".
ensemble_vim/run_brca.py compares LOCO importance computed on the ensemble with LOCO importance computed on its sub-models, using the TCGA BRCA gene expression dataset with 10 validated driver genes as ground truth.
python ensemble_vim/run_brca.py --model_name mlp --seed 0
python ensemble_vim/run_brca.py --model_name logreg_l2 --seed 0
The BRCA dataset (572 patients, 50 genes) can be downloaded from Catav et al. 2021 and placed at ./data/BRCA.csv.
The scripts ensemble_vim/run_simulation.slurm and ensemble_vim/run_asymptotic.slurm can be used to submit the
experiments to a SLURM cluster. They use job arrays to parallelize over random seeds.
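Without a cluster, the same seed sweep can be approximated locally. The sketch below only builds and prints one command per seed, using flags that appear in the simulation example above. The helper name `build_command`, the seed range, and the assumption that omitted flags fall back to script defaults are illustrative and not taken from the SLURM scripts.

```python
import shlex
import sys


def build_command(seed, dataset_name="friedman1", model_name="mlp",
                  results_dir="./results"):
    """Build one simulation.py invocation for a given random seed."""
    return [
        sys.executable, "ensemble_vim/simulation.py",
        "--seed", str(seed),
        "--dataset_name", dataset_name,
        "--model_name", model_name,
        "--ensemble", "bagging",
        "--n_ensemble", "10",
        "--results_dir", results_dir,
    ]


if __name__ == "__main__":
    for seed in range(1, 6):  # hypothetical seed range
        # Dry run: print the command instead of executing it.
        print(shlex.join(build_command(seed)))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)` from the repository root.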
Results are saved in the specified results_dir:
results/
├── <dataset>_<model>_n<n>_p<d>_<ensemble><B>/
│ ├── models/
│ ├── scores_<dataset>_<seed>.csv
│ ├── support_<dataset>_<seed>.npy
│ ├── loco_<dataset>_<seed>.csv
│ ├── cfi_<dataset>_<seed>.csv
│ ├── pfi_<dataset>_<seed>.csv
│ └── sage_<dataset>_<seed>.csv
└── ...
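For post-processing, a small helper can locate the per-seed files in this layout. This is a sketch based only on the naming pattern shown above: the function names `run_dir` and `importance_files` are hypothetical, and the exact formatting of the placeholders (for example whether numbers are zero-padded) is an assumption to check against your own results directory.

```python
from pathlib import Path


def run_dir(results_dir, dataset, model, n, p, ensemble, n_ensemble):
    """Directory following <dataset>_<model>_n<n>_p<d>_<ensemble><B>."""
    name = f"{dataset}_{model}_n{n}_p{p}_{ensemble}{n_ensemble}"
    return Path(results_dir) / name


def importance_files(directory, method, dataset):
    """Map seed (as a string) to the file <method>_<dataset>_<seed>.csv."""
    prefix = f"{method}_{dataset}_"
    files = {}
    for path in Path(directory).glob(f"{prefix}*.csv"):
        seed = path.stem[len(prefix):]  # text between the prefix and ".csv"
        files[seed] = path
    return dict(sorted(files.items()))
```

For example, `importance_files(run_dir("./results", "friedman1", "mlp", 512, 20, "bagging", 10), "loco", "friedman1")` would collect the LOCO files of one configuration, one entry per seed.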
Figure scripts are in ensemble_vim/figures/. Set the results_dir variable at the top of each script to point to your results directory.
figure_2.py, figure_3.py, figure_4.py: main paper figures
figure_supplement.py: supplementary learning curves (LOCO, SAGE, CFI)
plot_brca.py: BRCA driver gene recovery
compute_stability.py: Spearman rank correlation stability
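To make the stability measure concrete, the following sketch computes the Spearman rank correlation between importance vectors from different runs and averages it over all pairs. Whether compute_stability.py aggregates pairs in this way is an assumption; the function names `average_ranks`, `spearman` and `stability` are hypothetical, and the code uses only the standard library.

```python
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5


def stability(runs):
    """Mean pairwise Spearman correlation across importance vectors."""
    pairs = list(combinations(runs, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)
```

Here `runs` is a list of importance vectors, one per seed, each with one value per feature. A value near 1 means the feature ranking is nearly identical across seeds.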
The UK Biobank experiment code is in ensemble_vim/script_ukbb.py. It requires access to the UK Biobank proteomics data.