This is the code associated with the ICLR 2023 paper "Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for Deep Neural Networks". If you use this code, please consider citing:
```bibtex
@inproceedings{lin2023taking,
    title={Taking a Step Back with {KC}al: Multi-Class Kernel-Based Calibration for Deep Neural Networks},
    author={Zhen Lin and Shubhendu Trivedi and Jimeng Sun},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=p_jIy5QFB7}
}
```
The environment in which we ran the experiments is exported to `env.yml`.
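Assuming `env.yml` is a conda environment export, the environment can be recreated with:

```
conda env create -f env.yml
```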
Coming soon.
The exact steps to reproduce our experiments are detailed below:
First, the healthcare datasets require preprocessing. For PN2017, we follow https://github.com/hsd1503/MINA. For ISRUC, the dataloader already performs all preprocessing steps. For IIIC, the data can be shared upon request.
For this step, run `scripts/final_run_all.py` (invocation shown below); it trains all the DNNs and kernels needed for the experiments.
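Assuming the script is run directly with Python from the repository root:

```
python scripts/final_run_all.py
```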
The package `persist_to_disk` will cache the important results.
It is advised to set `persist_path` to somewhere with enough storage (~30GB for all experiments, mostly due to ImageNet) in the config generated by:

```python
import persist_to_disk as ptd
ptd.config.generate_config()
```
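For context, `persist_to_disk` caches the outputs of decorated functions under `persist_path`, so re-runs reuse earlier results. A minimal sketch of the package's documented decorator usage, with a hypothetical function standing in for an expensive computation:

```python
import persist_to_disk as ptd

@ptd.persistf()  # outputs are cached under persist_path and reused on re-runs
def expensive_result(dataset: str, seed: int):
    # hypothetical placeholder for an expensive step (e.g., training a model)
    return {"dataset": dataset, "seed": seed}
```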
For this step, use the notebooks in `notebook`. An example is given in `notebook/main_exp.ipynb`.
Note that the "keys" (used to uniquely identify the kernel and DNN) will differ from those used in the paper if you train your own models. (You can download the model weights used in the paper from here.)
Once the training is done, you need to change the corresponding keys in `_TRAINED_KEYS` and `_KERNEL_KEYS` in `_settings.py` and access them in the notebooks.
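As a purely hypothetical illustration (the actual dataset names and key formats are defined in `_settings.py`), the update might look like:

```python
# Hypothetical sketch -- see _settings.py for the actual structure and names.
# Replace the placeholder values with the keys produced by your own runs.
_TRAINED_KEYS = {
    'ISRUC': 'dnn-20230101_120000',     # placeholder key identifying a trained DNN
}
_KERNEL_KEYS = {
    'ISRUC': 'kernel-20230102_090000',  # placeholder key identifying a trained kernel
}
```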
Baselines' performance can be computed by using their official repositories together with the evaluation method shown in `main_exp.ipynb`.
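For reference, here is a minimal sketch of a standard expected calibration error (ECE) computation on a baseline's predicted probabilities. This is a generic illustration, not the repository's exact evaluation code; follow `main_exp.ipynb` for that.

```python
import numpy as np

def expected_calibration_error(probs: np.ndarray, labels: np.ndarray, n_bins: int = 15) -> float:
    """Top-label ECE: bin samples by confidence, then average |accuracy - confidence|."""
    conf = probs.max(axis=1)               # confidence of the predicted class
    pred = probs.argmax(axis=1)            # predicted class
    acc = (pred == labels).astype(float)   # 1 if correct, 0 otherwise
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (conf > lo) & (conf <= hi)  # samples whose confidence falls in this bin
        if mask.any():
            ece += mask.mean() * abs(acc[mask].mean() - conf[mask].mean())
    return float(ece)
```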