🎉 Our paper has been accepted to AAAI 2026!
Studying external and internal uncertainty of LLMs. This repository provides the code for running the Semantic Volume method to detect both external uncertainty (query ambiguity) and internal uncertainty (response uncertainty) in LLMs.
The original CLAMBER data can be downloaded from https://github.com/zt991211/CLAMBER. The code for query augmentation and embedding generation is provided in extend_questions.py and generate_embeddings.py.
The code to run the Semantic Volume calculation for query ambiguity detection is in detect_query_ambiguity.py.
Please put the original Trivia10K data (a 10K subset of the original TriviaQA data: https://nlp.cs.washington.edu/triviaqa/) in a data folder. The code for sampling candidate responses and generating their embeddings is provided in sample_llama_answers.py.
The code to run the Semantic Volume calculation for response uncertainty detection is in detect_response_uncertainty.py.
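For readers who want a feel for what a volume-style score measures before opening the scripts, here is a minimal sketch. It assumes the score is the log-determinant of the Gram matrix of L2-normalized embedding vectors (one vector per augmented query or sampled response). The function name `semantic_volume`, the `eps` regularizer, and the toy data are illustrative assumptions, not the repository's API; the actual scripts may add steps such as dimension reduction.

```python
import numpy as np


def semantic_volume(embeddings, eps=1e-10):
    """Hypothetical helper: log-determinant of the Gram matrix of
    L2-normalized row vectors. Larger values mean more dispersed vectors."""
    X = np.asarray(embeddings, dtype=np.float64)
    # Normalize each row so only direction (not magnitude) contributes.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.clip(norms, 1e-12, None)
    # Gram matrix of pairwise cosine similarities, lightly regularized
    # (assumption) so the determinant stays strictly positive.
    gram = X @ X.T + eps * np.eye(X.shape[0])
    # slogdet avoids underflow when the determinant is tiny.
    _, logdet = np.linalg.slogdet(gram)
    return logdet


# Toy data (assumption): 8 near-identical vectors vs. 8 unrelated ones.
rng = np.random.default_rng(0)
base = rng.normal(size=(1, 64))
tight = base + 0.01 * rng.normal(size=(8, 64))   # consistent set
spread = rng.normal(size=(8, 64))                # dispersed set

vol_tight = semantic_volume(tight)
vol_spread = semantic_volume(spread)
print(f"tight:  {vol_tight:.2f}")
print(f"spread: {vol_spread:.2f}")
```

Under this assumption, a set of nearly identical embeddings yields a strongly negative score, while a dispersed set yields a score near zero, so a higher score would indicate greater ambiguity or uncertainty.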
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.