Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning, ACL'23 #832

AkihikoWatanabe · 2023-07-15T22:15:13Z

https://virtual2023.aclweb.org/paper_P2358.html

AkihikoWatanabe · 2023-07-22T15:39:51Z

Recent works on instruction tuning (IT) have achieved great performance with zero-shot generalizability to unseen tasks. With additional context (e.g., task definition, examples) provided to models for fine-tuning, they achieved much higher performance than untuned models. Despite impressive performance gains, what models learn from IT remains understudied. In this work, we analyze how models utilize instructions during IT by comparing model training with altered vs. original instructions. Specifically, we create simplified task definitions by removing all semantic components and only leaving the output space information, and delusive examples that contain incorrect input-output mappings. Our experiments show that models trained on simplified task definitions or delusive examples can achieve comparable performance to the ones trained on the original instructions and examples. Furthermore, we introduce a random baseline to perform zero-shot classification tasks, and find it achieves similar performance (42.6% exact-match) as IT does (43% exact-match) in a low-resource setting, while both methods outperform naive T5 significantly (30% exact-match). Our analysis provides evidence that the impressive performance gain of current IT models can come from picking up superficial patterns, such as learning the output format and guessing. Our study highlights the urgent need for more reliable IT methods and evaluation.
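To make the two alterations described in the abstract concrete, here is a minimal sketch of how a simplified task definition and delusive examples might be constructed. Everything in it is an assumption for illustration: the `Example` class, the `simplify_definition` and `make_delusive` helpers, and the "Output space: ..." wording are hypothetical and are not taken from the paper's code.

```python
import random
from dataclasses import dataclass


@dataclass
class Example:
    text: str   # input shown to the model
    label: str  # output paired with that input


def simplify_definition(labels):
    """Hypothetical simplified task definition: drop all task semantics
    and keep only the output-space information (the allowed labels)."""
    return "Output space: " + ", ".join(labels)


def make_delusive(examples, labels, rng):
    """Hypothetical delusive examples: pair each input with a label that
    differs from its original one, so the input-output mapping is wrong."""
    if len(set(labels)) < 2:
        raise ValueError("need at least two distinct labels to build a wrong mapping")
    delusive = []
    for ex in examples:
        wrong_choices = [lab for lab in labels if lab != ex.label]
        delusive.append(Example(ex.text, rng.choice(wrong_choices)))
    return delusive


if __name__ == "__main__":
    labels = ["positive", "negative"]
    demos = [Example("great movie", "positive"), Example("terrible plot", "negative")]
    print(simplify_definition(labels))
    for ex in make_delusive(demos, labels, random.Random(0)):
        print(ex.text, "->", ex.label)
```

Training on prompts built this way and comparing against training on the original definitions and examples is the kind of comparison the abstract describes; the paper's actual prompt templates may differ.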

Translation (by gpt-3.5-turbo)

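Recent work on instruction tuning (IT) has achieved excellent performance with zero-shot generalization to unseen tasks. Fine-tuning models with additional context (task definitions, examples, etc.) yields much higher performance than untuned models. However, what models learn from IT has not yet been sufficiently studied. In this work, we analyze how models use instructions during IT by comparing training with altered instructions against training with the original instructions. Specifically, we create simplified task definitions that remove all semantic components and leave only the output-space information, and misleading examples that contain incorrect input-output mappings. Experiments show that models trained on simplified task definitions or misleading examples achieve performance comparable to models trained on the original instructions and examples. Furthermore, we introduce a random baseline for zero-shot classification tasks and find that, in a low-resource setting, it achieves performance similar to IT (42.6% exact match), while both methods substantially outperform plain T5 (30% exact match). Our analysis suggests that the impressive performance gains of current IT models may come from picking up superficial patterns, such as learning the output format and guessing. Our study highlights the urgent need for more reliable IT methods and evaluation.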

Summary (by gpt-3.5-turbo)

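Recent work on instruction tuning (IT) has achieved strong zero-shot generalization by fine-tuning models with additional context. However, how models actually use instructions during IT has not yet been sufficiently studied. This work analyzes that question by comparing training with altered instructions against training with the original instructions. Experiments show that models trained on simplified task definitions or misleading examples achieve performance comparable to models trained on the original instructions, and that a random baseline achieves performance similar to IT in a low-resource setting. The study highlights the urgent need for more reliable IT methods and evaluation.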
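The random-baseline comparison can be illustrated with a short sketch. It assumes the baseline knows only each task's label space and guesses uniformly from it, and that exact match is computed after trimming whitespace and lowercasing; both are assumptions for illustration, and the paper's exact procedure may differ. The `exact_match` and `random_baseline` names are hypothetical.

```python
import random


def exact_match(preds, golds):
    """Percentage of predictions equal to the gold answer after a simple
    normalization (strip + lowercase; the normalization is an assumption)."""
    if not golds:
        return 0.0
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(preds, golds))
    return 100.0 * hits / len(golds)


def random_baseline(label_spaces, rng):
    """Guess one label uniformly at random from each instance's output space."""
    return [rng.choice(labels) for labels in label_spaces]


if __name__ == "__main__":
    # Toy data, not from the paper: each instance carries its own label space.
    label_spaces = [["yes", "no"], ["yes", "no"], ["A", "B", "C"], ["A", "B", "C"]]
    golds = ["yes", "no", "B", "C"]
    preds = random_baseline(label_spaces, random.Random(0))
    print(f"random-baseline exact match: {exact_match(preds, golds):.1f}%")
```

On toy data like this the score will not match the 42.6% reported in the abstract; the point is only that a guesser restricted to the correct output format can already score well above a model that does not know the format.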

AkihikoWatanabe added the translation_required label Jul 22, 2023

AkihikoWatanabe changed the title ~~Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning~~ Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning, ACL'23 Oct 22, 2023

AkihikoWatanabe added InstructionTuning NLP Analysis LanguageModel labels Oct 22, 2023
