We present evidence that language models can learn meaning despite being trained only to perform next-token prediction on text, specifically a corpus of programs. Each program is preceded by a specification in the form of (textual) input-output examples. Working with programs enables us to precisely define concepts relevant to meaning in language (e.g., correctness and semantics), making program synthesis well-suited as an intermediate testbed for characterizing the presence (or absence) of meaning in language models.

We first train a Transformer model on the corpus of programs, then probe the trained model's hidden states as it completes a program given a specification. Despite providing no inductive bias toward learning the semantics of the language, we find that a linear probe is able to extract abstractions of both current and future program states from the model states. Moreover, there is a strong, statistically significant correlation between the accuracy of the probe and the model's ability to generate a program that implements the specification. To evaluate whether the semantics are represented in the model states rather than learned by the probe, we design a novel experimental procedure that intervenes on the semantics of the language while preserving the lexicon and syntax. We also demonstrate that the model learns to generate correct programs that are, on average, shorter than those in the training set, which is evidence that language model outputs may differ from the training distribution in semantically meaningful ways. In summary, this paper does not propose any new techniques for training language models, but develops an experimental framework for, and provides insights into, the acquisition and representation of (formal) meaning in language models.
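The core measurement in the abstract is a linear probe: a linear classifier fit on the model's hidden states to predict abstractions of program state. Below is a minimal sketch of that idea, not the paper's actual code. The hidden states `H` and labels `y` here are synthetic stand-ins (in the paper they would be Transformer activations and abstract interpreter states), and the probe is a simple ridge regression onto one-hot labels; these choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the probing setup: n hidden states of
# dimension d, each labeled with one of k abstract program states.
# In the paper these would be activations from the trained Transformer
# as it completes a program; here they are synthetic.
d, k, n = 32, 4, 500
W_true = rng.normal(size=(d, k))
H = rng.normal(size=(n, d))            # "hidden states" (probe inputs)
y = (H @ W_true).argmax(axis=1)        # "abstract program state" labels

# Linear probe: ridge regression onto one-hot labels, predict by argmax.
Y = np.eye(k)[y]
W = np.linalg.solve(H.T @ H + 1e-3 * np.eye(d), H.T @ Y)
acc = float(((H @ W).argmax(axis=1) == y).mean())
```

High probe accuracy is then taken as evidence that the semantic abstraction is linearly decodable from the representations; the paper's intervention experiment is what separates "represented in the states" from "computed by the probe".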
URL
Affiliations
Abstract
Translation (by gpt-3.5-turbo)
We first train a Transformer model on a corpus of programs, then probe the trained model's hidden states as it completes a program given a specification. Despite providing no inductive bias toward learning the semantics of the language, we find that a linear probe can extract abstractions of both current and future program states from the model's states. Moreover, there is a strong, statistically significant correlation between the accuracy of the probe and the model's ability to generate a program that implements the specification. We also show that the model learns to generate correct programs that are, on average, shorter than those in the training set, which indicates that language model outputs may differ from the training distribution in semantically meaningful ways. In summary, this paper does not propose any new techniques for training language models, but develops an experimental framework for, and provides insights into, the acquisition and representation of (formal) meaning in language models.
Summary (by gpt-3.5-turbo)