Orca 2: Teaching Small Language Models How to Reason, Arindam Mitra+, N/A, arXiv'23 #1148

AkihikoWatanabe · 2023-11-21T10:00:53Z

URL

https://arxiv.org/abs/2311.11045

Affiliations

Arindam Mitra, N/A
Luciano Del Corro, N/A
Shweti Mahajan, N/A
Andres Codas, N/A
Clarisse Simoes, N/A
Sahaj Agrawal, N/A
Xuxi Chen, N/A
Anastasia Razdaibiedina, N/A
Erik Jones, N/A
Kriti Aggarwal, N/A
Hamid Palangi, N/A
Guoqing Zheng, N/A
Corby Rosset, N/A
Hamed Khanpour, N/A
Ahmed Awadallah, N/A

Abstract

Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We contend that excessive emphasis on imitation may restrict the potential of smaller models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. We evaluate Orca 2 using a comprehensive set of 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts). Orca 2 significantly surpasses models of similar size and attains performance levels similar or better to those of models 5-10x larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. We open-source Orca 2 to encourage further research on the development, evaluation, and alignment of smaller LMs.

Translation (by gpt-3.5-turbo)

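Orca 1 learns from rich signals such as explanation traces and outperforms conventional instruction-tuned models on benchmarks such as BigBench Hard and AGIEval.
In Orca 2, we further explore how improved training signals can enhance the reasoning abilities of small language models.
Research on training small language models has often relied on imitation learning to replicate the output of more capable models.
We argue that placing excessive emphasis on imitation may limit the potential of small models.
We aim to teach small language models to use different solution strategies for different tasks. This is because, while a larger model may provide a direct answer to a complex task, a small model may not have the same capacity.
In Orca 2, we teach the model various reasoning techniques, such as step-by-step, recall then generate, recall-reason-generate, and direct answer.
More importantly, we help the model learn how to determine the most effective solution strategy for each task.
We evaluated Orca 2 using 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts).
Orca 2 significantly outperforms models of similar size and achieves performance equal to or better than that of models 5-10x larger on complex tasks that test advanced reasoning abilities in zero-shot settings.
We open-source Orca 2 to encourage further research on the development, evaluation, and alignment of smaller language models.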

Summary (by gpt-3.5-turbo)

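Orca 1 learns from rich signals and outperforms conventional models. Orca 2 aims to improve the reasoning abilities of small language models by teaching them different solution strategies. Orca 2 uses a variety of reasoning techniques and was evaluated on 15 benchmarks. It significantly outperforms models of similar size and performs strongly on complex tasks that require advanced reasoning. Orca 2 is open-sourced to promote research on small language models.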

AkihikoWatanabe added the Pocket label Nov 21, 2023

AkihikoWatanabe changed the title from "あ" to "Orca 2: Teaching Small Language Models How to Reason, Arindam Mitra+, N/A, arXiv'23" Nov 21, 2023
