Lost in the Middle: How Language Models Use Long Contexts, Nelson F. Liu+, N/A, arXiv'23 #793

AkihikoWatanabe · 2023-07-11T11:09:37Z

URL

While recent language models have the ability to take long contexts as input, relatively little is known about how well the language models use longer context. We analyze language model performance on two tasks that require identifying relevant information within their input contexts: multi-document question answering and key-value retrieval. We find that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts. Furthermore, performance substantially decreases as the input context grows longer, even for explicitly long-context models. Our analysis provides a better understanding of how language models use their input context and provides new evaluation protocols for future long-context models.

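Recent language models can accept long contexts as input, but little is known about how well they actually use those long contexts.
We analyzed language model performance on two tasks that require identifying relevant information within the input context: multi-document question answering and key-value retrieval.
We found that performance is often highest when the relevant information is at the beginning or end of the input context, and that it degrades significantly when the model must access relevant information in the middle of a long context.
Furthermore, performance drops substantially as the input context grows longer, even for explicitly long-context models.
Our analysis offers a better understanding of how language models use their input context and provides new evaluation protocols for future long-context models.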

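Recent language models can take long contexts as input, but it is still not well understood how effectively they use them. This study analyzes language model performance on two tasks: multi-document question answering and key-value retrieval. The results show that performance tends to be highest when the relevant information sits at the beginning or end of the input context, and that it degrades significantly when the relevant information sits in the middle of a long context. Performance also drops substantially as the input context grows longer, even for models explicitly designed for long contexts. The analysis improves our understanding of how language models use their input context and provides new evaluation protocols for future long-context models.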
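To make the key-value retrieval setup concrete, here is a minimal sketch of how such a position-controlled evaluation could be built. It is not the authors' code: the prompt wording, the function names `build_kv_prompt` and `accuracy_by_position`, the use of UUID-formatted strings as keys and values, and the substring-match scoring are all assumptions made for illustration. `model_fn` is a hypothetical callable that maps a prompt string to the model's output string.

```python
import json
import random
import uuid


def build_kv_prompt(num_pairs, target_index, seed=0):
    """Build a synthetic retrieval prompt with the queried pair at target_index.

    Returns (prompt, query_key, expected_value). The prompt wording is an
    illustrative assumption, not taken from the paper.
    """
    if not 0 <= target_index < num_pairs:
        raise ValueError("target_index must be in [0, num_pairs)")
    rng = random.Random(seed)

    def new_id():
        # Random UUID-formatted string, deterministic for a given seed.
        return str(uuid.UUID(int=rng.getrandbits(128)))

    pairs = [(new_id(), new_id()) for _ in range(num_pairs)]
    query_key, expected_value = pairs[target_index]
    # dict and json.dumps both preserve insertion order, so the queried
    # pair appears at exactly target_index in the serialized context.
    context = json.dumps(dict(pairs), indent=1)
    prompt = (
        "Extract the value corresponding to the specified key "
        "in the JSON object below.\n\n"
        f"{context}\n\n"
        f'Key: "{query_key}"\n'
        "Corresponding value:"
    )
    return prompt, query_key, expected_value


def accuracy_by_position(model_fn, num_pairs, positions, trials=5):
    """Measure retrieval accuracy as a function of the queried pair's position.

    model_fn is a hypothetical callable: prompt string -> output string.
    An answer counts as correct if the expected value appears in the output.
    """
    results = {}
    for pos in positions:
        correct = 0
        for trial in range(trials):
            # Same seed across positions keeps the pairs fixed, so only
            # the position of the queried pair changes.
            prompt, _, expected = build_kv_prompt(num_pairs, pos, seed=trial)
            correct += expected in model_fn(prompt)
        results[pos] = correct / trials
    return results
```

Calling `accuracy_by_position(model_fn, 75, [0, 37, 74])` with a real model wrapper would give one accuracy per position; the finding described above would correspond to a lower value for the middle position than for the first and last.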

AkihikoWatanabe · 2023-07-11T11:10:28Z

This paper brings together some very important findings.

AkihikoWatanabe · 2023-07-11T11:12:19Z

AkihikoWatanabe added the Pocket label Jul 11, 2023

AkihikoWatanabe changed the title あ Lost in the Middle: How Language Models Use Long Contexts, Nelson F. Liu+, N/A, arXiv'23 Jul 11, 2023

AkihikoWatanabe added LanguageModel In-ContextLearning MachineLearning NLP and removed Pocket labels Jul 11, 2023

AkihikoWatanabe mentioned this issue Oct 31, 2023

Re-Reading Improves Reasoning in Language Models, Xiaohan Xu+, N/A, arXiv'23 #1110
