Popular repositories
- chatgpt-evaluation Public
This repository contains the code for extracting the test samples we used in our paper: "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity"
Repositories
- belief-revision Public
Belief-R tests LMs' ability to revise their beliefs when presented with new evidence. Inspired by how humans suppress prior inferences, this task assesses LMs within the delta reasoning (ΔR) framework. Belief-R features sequences of premises designed to simulate scenarios where additional information could necessitate revising prior conclusions drawn by LMs.
- llm-political-bias Public
- long-biomedical-model Public
How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling
- sensational_headline Public
This is the repo for sensational headline generation from our paper published at EMNLP 2019.
- InstructAlign Public
- cantonese-asr Public