[Paper] [Code] [Video] [DeepREAL Lab]
This repository holds the PyTorch implementation of Beyond Accuracy: Ensuring Correct Predictions with Correct Rationales by Tang Li, Mengmeng Ma, and Xi Peng. If you find our paper and code useful in your research, please consider citing:
@inproceedings{li2024beyond,
title={Beyond Accuracy: Ensuring Correct Predictions with Correct Rationales},
author={Li, Tang and Ma, Mengmeng and Peng, Xi},
booktitle={Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS)},
year={2024}
}
Can we trust Large Foundation Models (LFMs) in their predictions? Our findings say NO! There are many examples of unsafe predictions:
To address these issues, we propose Double-Correct Predictions (DCP). Please refer to our paper for method details.
- Fine-tuned on ImageNet: DCP-ViT-B/32
This repository reproduces our results on the ImageNet, CIFAR-10/100, CUB, Caltech101, OxfordPets, Food101, SUN397, and Stanford Cars datasets; please download these datasets as needed. Our code is built on Python 3 and PyTorch v2.0.1 on Ubuntu 18.04. Please install all required packages by running:
pip install -r requirements.txt
Our structured rationales capture the major attributes and their sub-attributes that lead to the recognition of objects. Our dataset offers over 4,000 unique rationales covering all 1,000 categories from ImageNet. The dataset is in .JSON format:
./DCP/Rationale Dataset/rationale_imagenet.json
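A minimal sketch of loading and inspecting the rationale file with the standard `json` module. The nested schema shown here (category → major attribute → sub-attributes) is an illustrative assumption; check the released `rationale_imagenet.json` for the exact structure.

```python
import json

# Hypothetical mini-example mirroring the kind of nesting the rationale
# dataset may use (the real schema may differ): each category maps to
# its major attributes, and each attribute to a list of sub-attributes.
sample = {
    "goldfish": {
        "body": ["orange coloration", "rounded shape"],
        "fins": ["fan-like tail", "transparent edges"],
    }
}
with open("rationale_sample.json", "w") as f:
    json.dump(sample, f, indent=2)

# Load and walk the file, as you would with the released dataset.
with open("rationale_sample.json") as f:
    rationales = json.load(f)

for category, attributes in rationales.items():
    for attr, sub_attrs in attributes.items():
        print(f"{category} -> {attr}: {', '.join(sub_attrs)}")
```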
To curate customized rationale datasets, you will need to add your OpenAI API token and run the following notebook. Note that the notebook showcases our best prompt for this task; you can switch to any category list or modify the prompts as needed.
./DCP/generate_graph.ipynb
OpenAI may update their API library; please modify the code accordingly if needed.
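For orientation, the query loop can be sketched as below. The prompt wording, category list, and model name are illustrative assumptions; the repository's best-performing prompts live in `generate_graph.ipynb`. The live API call is commented out so the sketch runs without a key.

```python
# Illustrative sketch of querying an LLM for structured rationales.
# build_rationale_prompt is a hypothetical helper, not part of the repo.

def build_rationale_prompt(category: str) -> str:
    """Assemble an example prompt asking for attributes and sub-attributes."""
    return (
        f"List the major visual attributes of a {category}, and for each "
        "attribute list its sub-attributes, as a JSON object."
    )

categories = ["goldfish", "tabby cat"]  # replace with your own category list
prompts = [build_rationale_prompt(c) for c in categories]
print(prompts[0])

# With OPENAI_API_KEY set in the environment, each prompt could be sent
# via the official client (commented out to avoid a live API call):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o",  # any available chat model
#     messages=[{"role": "user", "content": prompts[0]}],
# )
# print(response.choices[0].message.content)
```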
Before pretraining, please replace the paths in load.py with your own dataset paths and run:
sh run_cross_recon.sh
Note that we parse the ontology graphs in the rationale dataset into visual concepts in ./DCP/descriptors/my_imagenet.json.
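A sketch of how an ontology graph of rationales can be flattened into plain visual-concept strings, similar in spirit to what the repository stores in ./DCP/descriptors/my_imagenet.json. The input/output formats and the "attribute with sub-attribute" phrasing here are assumptions for illustration.

```python
# Flatten {category: {attribute: [sub-attributes]}} into
# {category: [descriptor strings]} usable as text-side concepts.

def flatten_rationales(graph: dict) -> dict:
    """Turn a nested rationale graph into flat per-category descriptor lists."""
    descriptors = {}
    for category, attributes in graph.items():
        concepts = []
        for attr, sub_attrs in attributes.items():
            for sub in sub_attrs:
                concepts.append(f"{attr} with {sub}")
        descriptors[category] = concepts
    return descriptors

graph = {"goldfish": {"fins": ["fan-like tail"], "body": ["orange coloration"]}}
print(flatten_rationales(graph))
# -> {'goldfish': ['fins with fan-like tail', 'body with orange coloration']}
```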
We provide example code for reproducing zero-shot prediction accuracy and rationale disentanglability:

To evaluate the zero-shot prediction accuracy, please run:
./DCP/evaluation.ipynb
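The core of CLIP-style zero-shot prediction can be illustrated with a toy NumPy sketch: normalize image and per-class text embeddings, then predict the class with the highest cosine similarity. The random features below are stand-ins; the notebook uses the actual model's embeddings.

```python
import numpy as np

# Toy zero-shot classification via cosine similarity. Embeddings are
# random stand-ins for real image/text features.
rng = np.random.default_rng(0)
num_classes, dim = 10, 512
text_feats = rng.normal(size=(num_classes, dim))
image_feat = text_feats[3] + 0.1 * rng.normal(size=dim)  # image near class 3

# L2-normalize so dot products equal cosine similarities.
text_feats /= np.linalg.norm(text_feats, axis=1, keepdims=True)
image_feat /= np.linalg.norm(image_feat)

logits = text_feats @ image_feat
pred = int(np.argmax(logits))
print(pred)  # -> 3, the class whose text embedding is closest
```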
To evaluate rationale disentanglability, please run:
./DCP/disentanglability.ipynb
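As a rough intuition only: one simple proxy for how disentangled a set of rationale embeddings is, is their mean pairwise cosine similarity (lower means the rationales occupy more separated directions). This toy function is NOT the paper's metric; see disentanglability.ipynb for the actual definition.

```python
import numpy as np

def mean_offdiag_cos(feats: np.ndarray) -> float:
    """Mean off-diagonal cosine similarity between row embeddings."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = feats @ feats.T
    n = len(feats)
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))

rng = np.random.default_rng(0)
entangled = np.ones((5, 64)) + 0.01 * rng.normal(size=(5, 64))  # nearly identical rows
disentangled = rng.normal(size=(5, 64))                          # random directions

# Nearly identical embeddings score close to 1; separated ones near 0.
print(mean_offdiag_cos(entangled) > mean_offdiag_cos(disentangled))  # -> True
```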
Part of our code is borrowed from the following repositories.
- Visual Classification via Description from Large Language Models
- Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
- Interpreting CLIP's Image Representation via Text-Based Decomposition
We thank the authors for releasing their code. Please also consider citing their works.
