
ConformalLLM

Extending Conformal Prediction to LLMs

Read our paper: Conformal Prediction with Large Language Models for Multi-Choice Question Answering

Code Contributors: Charles Lu and Bhawesh Kumar

Code Organization

conformal_llm_scores.py is the Python script for classification using 1-shot question prompts. It outputs three files (see the loading sketch after this list):

  1. The softmax scores for each subject, for each of the 10 prompts
  2. The accuracy for MMLU-based 1-shot prompts, as a dictionary whose keys are subject names and whose values are lists of accuracies for each of the 10 prompts
  3. The accuracy for GPT-4-based 1-shot prompts, as a dictionary with the same structure
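For orientation, here is a minimal sketch of how one of these outputs could be inspected, assuming it is saved as a pickled dictionary; the file name below is hypothetical and may differ from what the script actually writes:

```python
import pickle

# Hypothetical file name; adjust to whatever conformal_llm_scores.py writes.
with open("mmlu_accuracy.pkl", "rb") as f:
    mmlu_acc = pickle.load(f)

# Keys are subject names; values are lists of 10 per-prompt accuracies.
for subject, accs in mmlu_acc.items():
    print(f"{subject}: mean accuracy = {sum(accs) / len(accs):.3f}")
```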

conformal.ipynb contains the results for all conformal prediction experiments and the GPT-4 vs. MMLU prompt comparison. It requires the three files output by conformal_llm_scores.py. To run the experiments, download llm_probs_gpt.zip, unzip it into your working directory, and then run conformal.ipynb.
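As a rough illustration of what such an experiment computes, the sketch below implements a generic split conformal prediction step over softmax scores. It is not the notebook's exact code; the function name, the nonconformity score, and the default alpha are our own choices:

```python
import numpy as np

def conformal_prediction_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Build prediction sets targeting (1 - alpha) marginal coverage.

    cal_probs:  (n, k) softmax scores on a held-out calibration split
    cal_labels: (n,) indices of the true answer choices
    test_probs: (m, k) softmax scores on the test split
    """
    n = len(cal_labels)
    # Nonconformity score: 1 minus the softmax mass on the true answer.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected empirical quantile of the calibration scores.
    level = min(1.0, np.ceil((n + 1) * (1.0 - alpha)) / n)
    qhat = np.quantile(scores, level, method="higher")
    # A question's prediction set contains every answer choice whose
    # nonconformity score falls at or below the calibrated threshold.
    return [np.where(1.0 - probs <= qhat)[0] for probs in test_probs]
```

This uses the simple "1 minus true-class softmax" score; other nonconformity scores (e.g., cumulative-softmax variants) plug into the same quantile-and-threshold recipe.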

If you would like to run the experiments from scratch, apply for LLaMA access here, convert the original LLaMA weights to the Hugging Face format (refer here for instructions), and then run the conformal_llm_scores.py script.
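Once converted, the weights can be loaded with the transformers library along these lines; the local checkpoint path and model size are assumptions, not values fixed by the script:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Assumed local path to the converted checkpoint; adjust to your setup.
model_path = "./llama-7b-hf"

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory
    device_map="auto",          # requires the accelerate package
)
model.eval()
```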
