We continue the investigation into the power of smaller Transformer-based language models as initiated by \textbf{TinyStories} -- a 10 million parameter model that can produce coherent English -- and the follow-up work on \textbf{phi-1}, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art. The latter work proposed to use existing Large Language Models (LLMs) to generate ``textbook quality'' data as a way to enhance the learning process compared to traditional web data. We follow the ``Textbooks Are All You Need'' approach, focusing this time on common sense reasoning in natural language, and create a new 1.3 billion parameter model named \textbf{phi-1.5}, with performance on natural language tasks comparable to models 5x larger, and surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding. More generally, \textbf{phi-1.5} exhibits many of the traits of much larger LLMs, both good -- such as the ability to ``think step by step'' or perform some rudimentary in-context learning -- and bad, including hallucinations and the potential for toxic and biased generations -- encouragingly though, we are seeing improvement on that front thanks to the absence of web data. We open-source \textbf{phi-1.5} to promote further research on these urgent topics.