Mutual evolution by closed-loop self-referential optimization.
Escher-Loop is a research codebase for studying self-referential optimization: the system evolves task-solving programs while also allowing the optimization procedure itself to evolve.
The central idea is to turn task improvement into a training signal for optimizers. Task agents are evaluated by objective scores; those same outcomes are reused as relative feedback for the optimizer agents that produced them. Optimization ability is not trained against a separate hand-written objective; as task performance improves, it emerges through the same loop.
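The loop above can be sketched in miniature. This is an illustrative toy, not the escher API: "optimizer agents" are reduced to mutation step sizes, and each agent is credited with the objective improvement of the candidates it produces, so better optimizers are selected through the same task-level signal.

```python
import random

def task_score(x: float) -> float:
    # Toy objective with its optimum at x = 3.
    return -(x - 3.0) ** 2

def run_loop(iterations: int = 500, seed: int = 0):
    rng = random.Random(seed)
    best_x = 0.0
    best_score = task_score(best_x)
    # "Optimizer agents": step size -> accumulated credit.
    agents = {0.01: 0.0, 0.1: 0.0, 1.0: 0.0}
    for _ in range(iterations):
        # Mostly exploit the most-credited agent, sometimes explore.
        if rng.random() < 0.2:
            scale = rng.choice(list(agents))
        else:
            scale = max(agents, key=agents.get)
        candidate = best_x + rng.gauss(0.0, scale)
        score = task_score(candidate)
        # Relative feedback: the task improvement is reused as the
        # optimizer agent's reward.
        agents[scale] += max(0.0, score - best_score)
        if score > best_score:
            best_x, best_score = candidate, score
    return best_x, best_score

best_x, best_score = run_loop()
```

The real system evolves programs and prompt-sampling optimizers rather than scalar step sizes, but the credit flow is the same: no separate objective for the optimizer, only reused task outcomes.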
The code is built on top of an OpenEvolve-style evolutionary programming engine. We keep that relationship explicit: OpenEvolve-derived components are attributed in NOTICE, and the relationship is documented in docs/PROVENANCE.md.
Across three geometric optimization tasks, Escher-Loop raises the best-so-far performance ceiling under a fixed token budget and avoids several plateaus that static optimizer baselines fall into. The examples mirror these task families: Kissing Number, Circle Packing, and Heilbronn Triangle.
The reported curves follow a fixed evaluation protocol. Kissing Number (KN) and Circle Packing (CP) are each evaluated with three independent runs per method. Heilbronn Triangle (HT) uses a stochastic evaluator, so we report six independent runs per method for a more stable comparison. All trajectories come from the same experimental batch; the plot is not a post-hoc selection of the best-looking runs. These results should be read as matched-budget empirical evidence rather than as a claim about the absolute performance ceiling of either Escher-Loop or the static baseline.
| Path | Purpose |
|---|---|
| `escher/` | Core Python package for configuration, evolution runtime, prompt sampling, evaluation, program storage, and self-referential optimizer utilities. |
| `examples/` | Compact runnable task packages for circle packing, kissing number, Heilbronn triangle, and self-referential optimizer evolution. |
| `scripts/` | Local helper scripts for smoke tests and short example runs. |
| `tests/` | Offline tests for config handling, evaluator scaffolding, dynamic loading, and self-referential utilities. |
| `docs/` | Architecture notes and provenance documentation. |
| `LICENSES/` | Third-party license text for OpenEvolve-derived components. |
Runs write local artifacts under `outputs/` by default, including checkpoints, logs, token traces, and exported best programs. These runtime outputs are git-ignored so repeated experiments do not change the source tree.
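An illustrative layout for a single-task run follows. Only `checkpoints/` is confirmed by the paths used elsewhere in this README; the other names are hypothetical stand-ins for the artifact categories listed above:

```text
outputs/
└── circle_packing/
    ├── checkpoints/       # resumable snapshots (e.g. checkpoint_50)
    ├── logs/              # event logs              (illustrative name)
    ├── token_traces/      # token usage records     (illustrative name)
    └── best_program.py    # exported best program   (illustrative name)
```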
```bash
git clone https://github.com/scaling-group/escher-loop.git
cd escher-loop
python3 --version  # make sure this is Python 3.10+
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[examples]"
```

Set an OpenAI-compatible model provider before running model-backed evolution:

```bash
export OPENAI_API_KEY="your-api-key"
export OPENAI_API_BASE="https://api.openai.com/v1"
```

The configs read provider settings from environment variables, so any OpenAI-compatible endpoint can be used.
Run all deterministic evaluator smoke tests:

```bash
./scripts/smoke_test.sh
```

Run a short single-task evolution job:

```bash
./scripts/run_example.sh circle_packing
```

The helper defaults to 5 iterations so that the first run stays inexpensive. It exercises the standard path: LLM generation, program evaluation, archive update, event logging, token logging, and best-program export.
Run a baseline-style single-task job with a separate output tree:

```bash
ITERATIONS=10 ./scripts/run_baseline.sh kissing_number
```

`run_baseline.sh` is a thin wrapper around `run_example.sh` that writes to `outputs/baseline/<task>` by default. Use it when you want baseline-style single-task runs without mixing their artifacts with Escher-Loop runs.
Run self-referential optimizer evolution:

```bash
./scripts/run_self_referential.sh
```

This entry point evolves optimizer agents rather than task programs. It uses the three benchmark tasks as downstream evaluations and is intentionally implemented as a separate runner because it must pass a dynamic `sampler_provider` into the runtime. The runner supports checkpoint resume and dynamic mentor selection:

```bash
ITERATIONS=100 ./scripts/run_self_referential.sh
python examples/self_referential/run.py \
  --checkpoint outputs/self_referential/checkpoints/checkpoint_50 \
  --iterations 50
```

Equivalent direct CLI:
```bash
escher-loop-run \
  examples/circle_packing/initial_program.py \
  examples/circle_packing/evaluator.py \
  --config examples/circle_packing/config.yaml \
  --output outputs/circle_packing \
  --iterations 5
```

The example set contains three self-contained paper benchmark tasks and one optimizer-agent evolution entry point:
| Task | Candidate entry point | Evaluator metric |
|---|---|---|
| `circle_packing` | `run_packing()` | Sum of radii normalized by 2.635 |
| `kissing_number` | `kissing_number11()` | Valid point count normalized by 593 |
| `heilbronn_triangle` | `heilbronn_triangle11()` | Minimum triangle area normalized by 0.036529889880030156 |
| `self_referential` | `PromptSampler` optimizer agent | Relative benchmark result over the benchmark tasks |
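The task metrics in the table are simple ratios against fixed reference values, so scores are comparable across tasks. A minimal illustration (the reference constants come from the table; the raw inputs are made up):

```python
# Reference normalizers from the table above.
REFS = {
    "circle_packing": 2.635,                     # reference sum of radii
    "kissing_number": 593,                       # reference valid point count
    "heilbronn_triangle": 0.036529889880030156,  # reference minimum triangle area
}

def normalized(task: str, raw: float) -> float:
    # A candidate matching the reference value scores exactly 1.0.
    return raw / REFS[task]

score = normalized("kissing_number", 593)  # → 1.0
```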
Each paper benchmark task directory contains:
- `initial_program.py`: seed program for evolution.
- `evaluator.py`: deterministic validation and scoring.
- `config.yaml`: conservative OpenAI-compatible run configuration.
- `best_program.py`: representative task program artifact.
- `requirements.txt`: task-specific dependencies.
Available helper runs:

```bash
./scripts/run_example.sh circle_packing
./scripts/run_example.sh kissing_number
./scripts/run_example.sh heilbronn_triangle
./scripts/run_baseline.sh kissing_number
```

Override iteration count and output directory:

```bash
ITERATIONS=20 OUT_DIR=outputs/debug_cp ./scripts/run_example.sh circle_packing
```

Important fields in `examples/*/config.yaml`:
- `llm.models`: model names and ensemble weights.
- `llm.temperature`, `llm.max_tokens`, `llm.timeout`: generation behavior.
- `database.population_size`: candidate archive size.
- `database.enable_map_elites`: disabled in the compact example configs.
- `evaluator.timeout`: per-candidate evaluation limit.
- `prompt.system_message`: task-specific optimization instructions.
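A hypothetical `config.yaml` fragment wiring these fields together (all values are illustrative, not the shipped defaults):

```yaml
llm:
  models:
    - name: gpt-4o-mini   # illustrative model name
      weight: 1.0
  temperature: 0.7
  max_tokens: 4096
  timeout: 120
database:
  population_size: 64
  enable_map_elites: false   # disabled in the compact example configs
evaluator:
  timeout: 90
prompt:
  system_message: "Improve the packing program while keeping it valid."
```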
Useful environment variables:
- `OPENAI_API_KEY`: API key for the selected provider.
- `OPENAI_API_BASE`: OpenAI-compatible API base URL.
- `ESCHER_OUTPUT_DIR`: shared output directory for self-referential runs.
- `OPENAI_EMBEDDING_API_BASE`: explicit endpoint for non-default embedding models.
- `ENABLE_ARTIFACTS`: set to `false` to disable artifact capture.
Configuration export is secret-aware: `Config.to_yaml()` redacts API keys before writing YAML.
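The redaction behavior can be sketched as follows. This is a hedged approximation of the idea, not the actual `Config.to_yaml()` internals; `redact`, `SECRET_MARKERS`, and the placeholder string are all hypothetical:

```python
# Hypothetical sketch of secret-aware export: any field whose name looks
# like a credential is replaced with a placeholder before serialization.
SECRET_MARKERS = ("api_key", "token", "secret")

def redact(config: dict) -> dict:
    out = {}
    for key, value in config.items():
        if isinstance(value, dict):
            out[key] = redact(value)  # recurse into nested sections
        elif any(marker in key.lower() for marker in SECRET_MARKERS):
            out[key] = "REDACTED"
        else:
            out[key] = value
    return out

cfg = {"llm": {"api_key": "sk-live-123", "temperature": 0.7}}
safe = redact(cfg)  # safe["llm"]["api_key"] == "REDACTED"
```

Note the sketch returns a new dict rather than mutating in place, so the live configuration keeps its real credentials for subsequent API calls.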
Self-referential optimizer experiments may use the optional MP Fixer to repair invalid optimizer code. This feature is disabled by default. It was also disabled in the reported experiments because automatic repair introduces additional token consumption and would change the matched-budget comparison.
Enable it only when you explicitly want automatic repair:
```yaml
meta_evolution_settings:
  enable_mp_fixer: true
```

Original Escher-Loop code is released under the MIT License. See LICENSE.
This codebase includes OpenEvolve-derived components. The referenced OpenEvolve commit is Apache-2.0 licensed; attribution is recorded in NOTICE, and the Apache-2.0 license text is included under LICENSES/Apache-2.0.txt.
Some files also contain inline notices for code adapted from SakanaAI/ShinkaEvolve and google-deepmind/alphaevolve_results under Apache-2.0. Keep those notices intact when modifying the derived files.
```bibtex
@misc{liu2026escherloopmutualevolutionclosedloop,
      title={Escher-Loop: Mutual Evolution by Closed-Loop Self-Referential Optimization},
      author={Ziyang Liu and Xinyan Guo and Xuchen Wei and Han Hao and Liu Yang},
      year={2026},
      eprint={2604.23472},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.23472},
}
```

- Open a GitHub issue for code questions.
- Ziyang Liu: newzil1225@gmail.com

