SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning, Ning Miao+, N/A, arXiv'23 #924

AkihikoWatanabe · 2023-08-08T11:08:02Z

URL

The recent progress in large language models (LLMs), especially the invention of chain-of-thoughts (CoT) prompting, makes it possible to solve reasoning problems. However, even the strongest LLMs are still struggling with more complicated problems that require non-linear thinking and multi-step reasoning. In this work, we explore whether LLMs have the ability to recognize their own errors, without resorting to external resources. In particular, we investigate whether they can be used to identify individual errors within a step-by-step reasoning. To this end, we propose a zero-shot verification scheme to recognize such errors. We then use this verification scheme to improve question-answering performance, by using it to perform weighted voting on different generated answers. We test the method on three math datasets (GSM8K, MathQA, and MATH) and find that it successfully recognizes errors and, in turn, increases final predictive performance.

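Recent large language models (LLMs) are a promising approach to solving reasoning problems, but they still struggle with complex ones. This work explores whether LLMs can recognize their own errors and proposes a zero-shot verification scheme. Experiments confirm that using this scheme to perform weighted voting over different answers improves question-answering performance.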
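
To make the weighted-voting idea in the abstract concrete, here is a minimal sketch. It assumes each sampled solution has already been reduced to a final answer plus a confidence score from some verification step; how the paper actually computes that score is not stated here, so the scores, the `weighted_vote` name, and the sample data are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(candidates):
    """Return the answer with the largest total confidence.

    `candidates` is a list of (answer, confidence) pairs. The confidence
    is assumed to come from a checker that scored the reasoning behind
    each answer; that checker is not implemented here.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    totals = defaultdict(float)
    for answer, confidence in candidates:
        totals[answer] += confidence  # sum the weights per distinct answer
    # On a tie, max() keeps the answer that was seen first.
    return max(totals, key=totals.get)


# Hypothetical samples: two low-confidence chains agree on "20",
# one high-confidence chain says "18".
samples = [("20", 0.2), ("20", 0.3), ("18", 0.9)]
print(weighted_vote(samples))
```

With these made-up scores, plain majority voting would pick "20", while the weighted vote prefers "18" because its single supporting chain carries more total confidence (0.9 versus 0.5).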

AkihikoWatanabe · 2023-08-08T11:11:16Z

This looks interesting. I'll read it later.

AkihikoWatanabe added the Pocket label Aug 8, 2023

AkihikoWatanabe changed the title あ SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning, Ning Miao+, N/A, arXiv'23 Aug 8, 2023