DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models, Yung-Sung Chuang+, N/A, arXiv'23 #1040

AkihikoWatanabe · 2023-09-13T10:24:33Z

URL

https://arxiv.org/abs/2309.03883

Affiliations

Yung-Sung Chuang, N/A
Yujia Xie, N/A
Hongyin Luo, N/A
Yoon Kim, N/A
James Glass, N/A
Pengcheng He, N/A

Abstract

Despite their impressive capabilities, large language models (LLMs) are prone to hallucinations, i.e., generating content that deviates from facts seen during pretraining. We propose a simple decoding strategy for reducing hallucinations with pretrained LLMs that does not require conditioning on retrieved external knowledge nor additional fine-tuning. Our approach obtains the next-token distribution by contrasting the differences in logits obtained from projecting the later layers versus earlier layers to the vocabulary space, exploiting the fact that factual knowledge in an LLMs has generally been shown to be localized to particular transformer layers. We find that this Decoding by Contrasting Layers (DoLa) approach is able to better surface factual knowledge and reduce the generation of incorrect facts. DoLa consistently improves the truthfulness across multiple choices tasks and open-ended generation tasks, for example improving the performance of LLaMA family models on TruthfulQA by 12-17% absolute points, demonstrating its potential in making LLMs reliably generate truthful facts.

Translation (by gpt-3.5-turbo)

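Large language models (LLMs), despite their impressive capabilities, are prone to hallucination, i.e., generating content that deviates from the facts seen during pretraining.
We propose a simple decoding strategy for mitigating hallucinations in pretrained LLMs that requires neither retrieving external knowledge nor additional fine-tuning.
Our approach obtains the next-token distribution by contrasting the difference between the logits obtained by projecting later layers and earlier layers into the vocabulary space.
This exploits the fact that factual knowledge in LLMs is generally localized to particular transformer layers.
We find that this Decoding by Contrasting Layers (DoLa) approach surfaces factual knowledge more clearly and reduces the generation of incorrect facts.
DoLa consistently improves truthfulness on multiple-choice tasks and open-ended generation tasks.
For example, it improves the performance of LLaMA family models on TruthfulQA by 12-17 absolute percentage points, showing its potential to make LLMs reliably generate truthful facts.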

Summary (by gpt-3.5-turbo)

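We propose a simple decoding strategy for mitigating hallucinations in pretrained large language models (LLMs).
The approach obtains the next-token distribution by contrasting differences in logits, which surfaces factual knowledge more clearly and reduces the generation of incorrect facts.
It is shown to improve truthfulness on multiple-choice tasks and open-ended generation tasks.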

AkihikoWatanabe · 2023-11-19T12:29:13Z

[Note: the following is based on a work-in-progress version of the paper, so the content may change in the future.]

Overview

This work observes that factual information is localized to particular Transformer layers, and exploits that observation to generate text with better factual consistency.

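The figure visualizes the distribution of token generation probabilities while a piece of text is being generated. What is plotted is the Jensen-Shannon divergence (JSD) between each layer and the final layer (presumably N=32). However, the JSD should take values in [0, 1], yet the values in the figure fall outside that range, so it is unclear what computation is actually being done...
As an explanation of the figure, the paper notes that there are two patterns:

Important named entities and dates (Wole Soyinka, 1986, etc.; tokens that require factual knowledge) still have a high JSD in the higher layers, meaning the prediction is still being changed in the higher layers (important information is injected there).
For "easy" tokens such as function words and words copied from the question (Nigerian, Nobel Prize, etc.), the JSD is already small by the middle layers, meaning the token to output has already been decided in the early layers.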
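To make the quantity in the figure concrete, here is a minimal sketch of computing the JSD between each layer's next-token distribution and the final layer's. The per-layer logits are made-up toy values over a 4-token vocabulary, and the names `softmax`, `jsd`, and `layer_logits` are hypothetical, not taken from the paper's code. With log base 2 the JSD is bounded by 1 (with the natural log, by ln 2 ≈ 0.693), so plotted values outside that range would suggest some rescaling of the raw divergence.

```python
import math

def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def jsd(p, q):
    """Jensen-Shannon divergence with log base 2, so the result lies in [0, 1]."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing by convention.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical per-layer logits (assumption: one row per layer, last row = final layer).
layer_logits = [
    [2.0, 0.0, 0.0, 0.0],  # early layer: prefers token 0
    [1.0, 1.0, 0.0, 0.0],  # middle layer: undecided
    [0.0, 2.0, 0.1, 0.0],  # late layer: close to the final prediction
    [0.0, 2.5, 0.0, 0.0],  # final layer: prefers token 1
]
layer_probs = [softmax(row) for row in layer_logits]
jsd_to_final = [jsd(probs, layer_probs[-1]) for probs in layer_probs]

for layer, value in enumerate(jsd_to_final):
    print(f"layer {layer}: JSD to final = {value:.4f}")
```

In this toy setup the divergence shrinks as the layer index approaches the final layer, which corresponds to the "easy token" pattern; a token that needs factual knowledge would instead keep a large value until late layers.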

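Method overview

The takeaway from this observation is that information about important facts tends to change the distribution near the final layer, whereas this does not seem to be the case in the lower layers. The proposed method is built on the premise that contrasting the final layer against a layer whose distribution resembles the final layer's, but in which the factual information has not yet gained a notably high generation probability (the premature layer), should reveal the factual information that ought to be generated. Concretely, the method selects the single layer whose JSD from the final layer is the largest, though I have some doubts about whether this selection rule actually realizes the intuition described above.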
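The selection-and-contrast step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the per-layer logits are toy values, the function names (`log_softmax`, `jsd`, `contrast_layers`) and the candidate-layer list are hypothetical, and the contrast is taken as a plain difference of log-probabilities. Any additional filtering or normalization the paper may apply is not covered by these notes and is left out.

```python
import math

def log_softmax(logits):
    """Log-probabilities for a list of logits."""
    top = max(logits)
    log_total = top + math.log(sum(math.exp(x - top) for x in logits))
    return [x - log_total for x in logits]

def jsd(log_p, log_q):
    """Jensen-Shannon divergence (natural log) between two log-probability lists."""
    total = 0.0
    for lp, lq in zip(log_p, log_q):
        p, q = math.exp(lp), math.exp(lq)
        m = 0.5 * (p + q)
        if p > 0:
            total += 0.5 * p * (lp - math.log(m))
        if q > 0:
            total += 0.5 * q * (lq - math.log(m))
    return total

def contrast_layers(layer_logits, candidate_layers):
    """Pick the candidate layer with the largest JSD from the final layer
    (the "premature" layer) and return per-token contrast scores."""
    log_probs = [log_softmax(row) for row in layer_logits]
    final = log_probs[-1]  # assumption: the last row is the final layer
    premature = max(candidate_layers, key=lambda i: jsd(final, log_probs[i]))
    # Contrast: tokens whose probability grew between the two layers score high.
    scores = [f - e for f, e in zip(final, log_probs[premature])]
    return premature, scores

# Toy logits over a 4-token vocabulary (hypothetical values).
layer_logits = [
    [2.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 0.1, 0.0],
    [0.0, 2.5, 0.0, 0.0],  # final layer
]
premature, scores = contrast_layers(layer_logits, candidate_layers=[0, 1, 2])
best_token = max(range(len(scores)), key=lambda t: scores[t])
print("premature layer:", premature)
print("token with the highest contrast score:", best_token)
```

Note that the max-JSD rule picks the layer that is least similar to the final layer among the candidates, which is exactly the tension raised above: the intuition asks for a layer that resembles the final layer except on factual tokens, while the rule favors overall dissimilarity.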

AkihikoWatanabe added the Pocket label Sep 13, 2023

AkihikoWatanabe changed the title あ DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models, Yung-Sung Chuang+, N/A, arXiv'23 Sep 13, 2023

AkihikoWatanabe added NLP LanguageModel Hallucination FactualConsistency labels Nov 19, 2023
