Website of the paper "Evaluating Large Language Models for Health-related Queries with Presuppositions" (Kaur et al., ACL Findings 2024)

FLAIR-IISc/uphill

UPHILL

As corporations rush to integrate large language models (LLMs) into their search offerings, it is critical that they provide factually accurate information that is robust to any presuppositions a user may express. We introduce UPHILL, a benchmark for Understanding Presuppositions for Health-related Inquiries to LLMs. It comprises 9725 health-related queries with 5 different levels of presupposition. For each query, it includes the veracity of the (presupposed) claim, model responses, and entailment predictions for agreement between the model responses and the claim within the query.
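To make the description above concrete, here is a minimal sketch of how one benchmark entry might be represented and how agreement could be tallied per presupposition level. The class name, every field name, the 0 to 4 level encoding, and the "agree" / "disagree" / "neutral" label set are assumptions made for illustration; they are not taken from the released data, so check the actual files for the real schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UphillRecord:
    # Hypothetical schema: all field names below are illustrative assumptions.
    query: str                 # health-related query shown to the model
    presupposition_level: int  # assumed encoding: 0..4 for the 5 levels
    claim_veracity: str        # veracity label of the presupposed claim
    model_response: str        # text returned by the model
    entailment: str            # assumed labels: "agree", "disagree", "neutral"


def agreement_rate_by_level(records):
    """Return {level: fraction of responses labelled 'agree'} for the given records."""
    totals = defaultdict(int)
    agrees = defaultdict(int)
    for record in records:
        totals[record.presupposition_level] += 1
        if record.entailment == "agree":
            agrees[record.presupposition_level] += 1
    # Only levels that actually occur appear in the result, so no division by zero.
    return {level: agrees[level] / totals[level] for level in totals}
```

Grouping by level is one possible way to see whether a model agrees with the embedded claim more often as the presupposition gets stronger; filtering the records by `claim_veracity` first would restrict the same tally to false claims only.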

Acknowledgements

Our design and code are inspired by https://writing-assistant.github.io/.
