[ICLR 2026] When Agents “Misremember” Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems
This is the official implementation of "When Agents 'Misremember' Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems" (ICLR 2026).
The Mandela Effect is a phenomenon where groups collectively misremember verifiable facts, arising from social reinforcement and internalized misinformation. This repository introduces ManBench, a comprehensive benchmark designed to evaluate the Mandela Effect in LLM-based multi-agent systems.
git clone https://github.com/bluedream02/Mandela-Effect.git
cd Mandela-Effect
conda create -n mandela python=3.10 -y
conda activate mandela
pip install -r requirements.txt

Set up your API keys as environment variables:
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://api.openai.com/v1"
# Optional: for local models via Ollama
# Make sure Ollama is installed and running locally
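Before launching a full run, it can help to confirm the credentials are actually visible to the client. The snippet below is a minimal sanity check (not part of the repository) using the official `openai` Python SDK, which reads `OPENAI_API_KEY` and `OPENAI_BASE_URL` from the environment; the test prompt is arbitrary.

```python
# Minimal sanity check (not part of the repo): confirm the API key and base URL
# exported above are picked up by the openai>=1.0 client before running eval.py.
import os
from openai import OpenAI

assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

client = OpenAI()  # reads OPENAI_API_KEY / OPENAI_BASE_URL from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Reply with OK."}],
)
print(resp.choices[0].message.content)
```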
Basic evaluation on a single task:

python eval.py --task_subset disambiguation_qa --max_samples 10 \
--save_path output/disambiguation_qa_test \
--model gpt-4o-mini --total_agents 5

Evaluate multiple tasks:
python eval.py --task_subset "disambiguation_qa,auto_categorization" \
--max_samples 10 \
--save_path output/multi_task_test \
--model gpt-4o-mini --total_agents 5

Key arguments for eval.py:
- `--task_subset`: Task(s) to evaluate, comma-separated (default: all)
- `--max_samples`: Maximum number of samples per task (default: all)
- `--save_path`: Output directory
- `--model`: Model name (default: `gpt-4o-mini`)
- `--total_agents`: Number of agents (default: `5`)
- `--use_cache`: Use cached results to avoid redundant API calls during development (default: `True`)
- `--data_folder`: Data folder path (default: `bbh_all_small`)
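To sweep over many tasks with separate output directories, a small wrapper script can shell out to eval.py using the flags above. This is only a convenience sketch, not part of the repository; the task list, sample budget, and output paths are placeholders.

```python
# Convenience sketch (not part of the repo): run eval.py once per task,
# writing each run to its own output directory. Adjust TASKS/MODEL as needed.
import subprocess

TASKS = ["disambiguation_qa", "auto_categorization"]  # placeholder task subset
MODEL = "gpt-4o-mini"

for task in TASKS:
    subprocess.run(
        [
            "python", "eval.py",
            "--task_subset", task,
            "--max_samples", "10",
            "--save_path", f"output/{task}_test",
            "--model", MODEL,
            "--total_agents", "5",
        ],
        check=True,  # abort the sweep if any single run fails
    )
```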
Evaluate defense strategies:
python eval_defense.py --task_subset disambiguation_qa --max_samples 10 \
--save_path output/defense_test \
--model gpt-4o-mini --total_agents 5

After running evaluation, analyze the results:
# Analyze with specific model and directory
python analyze.py --model gpt-4o-mini --input-dir output/disambiguation_qa_test
# Auto-detect output directories
python analyze.py --model gpt-4o-mini
# Results saved to output_evaluation/{model_name}.xlsx

Key arguments for analyze.py:
- `--model`: Model name to analyze
- `--input-dir`: Input directory (auto-detects `output/*` if not specified)
- `--output-dir`: Output directory (default: `output_evaluation`)
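The generated spreadsheet can also be inspected programmatically. The snippet below is a sketch that assumes the default output location and model name; the exact sheet and column layout depend on analyze.py and may differ.

```python
# Sketch (not part of the repo): load the aggregated results written by analyze.py.
# Assumes the default output path; reading .xlsx with pandas requires openpyxl.
import pandas as pd

results = pd.read_excel("output_evaluation/gpt-4o-mini.xlsx")
print(results.head())     # preview the first few aggregated rows
print(results.columns)    # see which metrics analyze.py reported
```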
This work builds upon several excellent open-source projects and related works:
- Do as We Do, Not as You Think: the Conformity of Large Language Models (ICLR 2025) - Paper | GitHub
- Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View (ICLR 2025) - Paper | GitHub
- Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View (ACL 2024) - Paper | GitHub
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models (TMLR) - Paper | GitHub
We thank the authors for their valuable contributions to the community.
If you find this work useful for your research, please cite our paper:
@misc{xu2026agentsmisremembercollectivelyexploring,
title={When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems},
author={Naen Xu and Hengyu An and Shuo Shi and Jinghuai Zhang and Chunyi Zhou and Changjiang Li and Tianyu Du and Zhihui Fu and Jun Wang and Shouling Ji},
year={2026},
eprint={2602.00428},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2602.00428},
}