Repository for "Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics" by Yuhan Zhang, Ted Gibson, and Forrest Davis to appear in CoNLL 2023
The folder `data` includes three files, one per illusion. Each file stores the sentences we tested and the corresponding metrics (whole-sentence perplexity and critical-region surprisal) from each language model (BERT, RoBERTa, GPT-2, and GPT-3).
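For readers unfamiliar with the two metrics, the sketch below shows how they are standardly defined from per-token probabilities. This is an illustration only, not the repo's actual extraction code, and the probability values are hypothetical.

```python
import math

def surprisal(p):
    """Surprisal of a token in bits: -log2 of its probability.
    The critical-region surprisal is this quantity for the token(s)
    in the region of interest."""
    return -math.log2(p)

def perplexity(token_probs):
    """Whole-sentence perplexity: exponential of the mean
    negative log-likelihood over the sentence's tokens."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical per-token probabilities for a short sentence
probs = [0.2, 0.05, 0.5, 0.1]
print(surprisal(0.25))      # a token with probability 0.25 -> 2.0 bits
print(perplexity(probs))
```

Lower perplexity means the model found the sentence more predictable overall; higher critical-region surprisal means the model was more "surprised" at the illusion-triggering region.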
The folder `processing_scripts` includes four R files with the code to reproduce our statistical analyses.
The conference paper is named `paper_CoNLL_2023`.
We thank Ankana Saha, Carina Kauf, and Hayley Ross for helpful discussions about the project. Any errors are ours, and we appreciate your feedback!
Corresponding email: yuz551@g.harvard.edu.