Code for "Prediction-Powered Ranking of Large Language Models", NeurIPS 2024.
ranking-algorithm
llm-eval
llm-evaluation
llm-evaluation-framework
prediction-powered-inference
rank-sets
Updated Oct 28, 2024 - Jupyter Notebook