PanCanBench is a benchmark of 282 de-identified, authentic pancreatic cancer patient questions paired with 3,130 expert-designed rubrics for evaluating large language models (LLMs).
Data Availability
The question-rubrics are available on
Citation
About
PanCanBench is a clinically-grounded benchmark designed to evaluate the utility and safety of Large Language Models (LLMs) in responding to real-world patient inquiries about pancreatic cancer.