Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
Abstract

In this study, we investigated the effects of self-reflection in large language models (LLMs) on problem-solving performance. We first instructed nine popular LLMs to answer a series of multiple-choice questions to establish a performance baseline. For each incorrectly answered question, we then instructed eight types of self-reflecting LLM agents to reflect on their mistakes and provide themselves with guidance for improving their problem-solving. Each self-reflecting agent then used this guidance to re-answer the same questions. Our results indicate that LLM agents can significantly improve their problem-solving performance through self-reflection. In addition, we compared the various types of self-reflection to determine each type's individual contribution to performance.

Documents

Code

  1. Solve with Baseline - answers all questions using the baseline agent
  2. Reflect on Solution - self-reflects on incorrectly answered problems given the correct answer
  3. Save Reflections - separates reflections by type, redacts answers, and saves reflection text
  4. Solve with Reflection - re-answers all incorrectly answered questions using the reflections
  5. Plot Accuracy - plots the accuracy for each agent
  6. Plot Accuracy by Model and Agent - plots the accuracy by model and agent
  7. Plot Accuracy by Exam and Agent - plots the accuracy for each model by exam and agent
  8. Analyze Details - performs the McNemar test and creates a table of the results
  9. Analyze Keywords - analyzes the error keywords produced by the self-reflections
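
Steps 1–4 above form a solve, reflect, and re-solve loop. The sketch below is a minimal illustration of that loop, not the repository's actual code; the `model` callable and the question schema are hypothetical stand-ins.

```python
# Minimal sketch of the pipeline: answer every question, reflect on the
# misses given the correct answer, then retry only the missed questions
# with the reflection text as guidance.

def run_pipeline(questions, model):
    """questions: list of {"text": str, "answer": str} MCQA items.
    model(prompt, guidance) -> answer string; guidance is "" on the first pass.
    Returns (baseline_accuracy, retry_accuracy_on_missed)."""
    missed = []
    correct = 0
    for q in questions:
        # Step 1: baseline attempt with no guidance.
        if model(q["text"], "") == q["answer"]:
            correct += 1
        else:
            # Step 2: reflect on the mistake, given the correct answer.
            guidance = model(
                f"Reflect: the correct answer to '{q['text']}' was {q['answer']}.", ""
            )
            missed.append((q, guidance))
    baseline = correct / len(questions)

    # Step 4: re-answer only the previously missed questions, with guidance.
    retry_correct = sum(1 for q, g in missed if model(q["text"], g) == q["answer"])
    retry = retry_correct / len(missed) if missed else 1.0
    return baseline, retry
```

With a toy model that only answers correctly when guidance is present, the retry pass recovers the questions missed at baseline.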

Data

  • Details - the low-level details for each question answered in CSV format
  • Dialogs - the dialog for each question answered in JSON format
  • Exams - the exams containing MCQA problems in JSONL format
  • Logs - the log files for each question answered and self-reflection in plain-text format
  • Plots - the data visualizations of the results in PDF format
  • Reflections - the text generated during the self-reflection process stored as plain text files
  • Results - the results from the experiment in CSV format
  • Tables - the tabular results of the analysis stored as CSV files
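
The exams are JSONL files with one MCQA problem per line, and the significance test in step 8 is McNemar's test on paired baseline/reflection outcomes. The sketch below shows both; the field names in the test fixture are assumptions, and the p-value computation is a generic exact two-sided McNemar test, not necessarily the repository's implementation.

```python
import json
from math import comb

def load_exam(path):
    """Read a JSONL exam file: each non-empty line is one MCQA problem."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from the discordant counts:
    b = baseline correct but reflection wrong,
    c = baseline wrong but reflection correct.
    Under the null, min(b, c) ~ Binomial(b + c, 0.5)."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)
```

For example, if reflection flipped five answers from wrong to correct and none from correct to wrong, `mcnemar_exact(0, 5)` gives p = 0.0625.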
