Mutual evolution by closed-loop self-referential optimization.
Escher-Loop is a research codebase for studying self-referential optimization: the system evolves task-solving programs while also allowing the optimization procedure itself to evolve.
The central idea is to turn task improvement into a training signal for optimizers. Task agents are evaluated by objective scores; those same outcomes are reused as relative feedback for the optimizer agents that produced them. Optimization ability is not trained against a separate hand-written objective; as task performance improves, it emerges through the same loop.
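The loop above can be sketched in miniature. This is an illustrative toy, not the escher API: "optimizer agents" are reduced to mutation step sizes, and each agent is credited with the objective improvement of the candidates it produces, so better optimizers are selected through the same task-level signal.

```python
import random

def task_score(x: float) -> float:
    # Toy objective with its optimum at x = 3.
    return -(x - 3.0) ** 2

def run_loop(iterations: int = 500, seed: int = 0):
    rng = random.Random(seed)
    best_x = 0.0
    best_score = task_score(best_x)
    # "Optimizer agents": step size -> accumulated credit.
    agents = {0.01: 0.0, 0.1: 0.0, 1.0: 0.0}
    for _ in range(iterations):
        # Mostly exploit the most-credited agent, sometimes explore.
        if rng.random() < 0.2:
            scale = rng.choice(list(agents))
        else:
            scale = max(agents, key=agents.get)
        candidate = best_x + rng.gauss(0.0, scale)
        score = task_score(candidate)
        # Relative feedback: the task improvement is reused as the
        # optimizer agent's reward.
        agents[scale] += max(0.0, score - best_score)
        if score > best_score:
            best_x, best_score = candidate, score
    return best_x, best_score

best_x, best_score = run_loop()
```

The real system evolves programs and prompt-sampling optimizers rather than scalar step sizes, but the credit flow is the same: no separate objective for the optimizer, only reused task outcomes.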
The code is built on top of an OpenEvolve-style evolutionary programming engine. We keep that relationship explicit: OpenEvolve-derived components are attributed in NOTICE, and the relationship is documented in docs/PROVENANCE.md.
Across three geometric optimization tasks, Escher-Loop raises the best-so-far performance ceiling under a fixed token budget and avoids several plateaus that static optimizer baselines fall into. The examples mirror these task families: Kissing Number, Circle Packing, and Heilbronn Triangle.
The reported curves follow a fixed evaluation protocol. Kissing Number (KN) and Circle Packing (CP) are each evaluated with three independent runs per method. Heilbronn Triangle (HT) uses a stochastic evaluator, so we report six independent runs per method for a more stable comparison. All trajectories come from the same experimental batch; the plot is not a post-hoc selection of the best-looking runs. These results should be read as matched-budget empirical evidence rather than as a claim about the absolute performance ceiling of either Escher-Loop or the static baseline.
| Path | Purpose |
|---|---|
| `escher/` | Core Python package for configuration, evolution runtime, prompt sampling, evaluation, program storage, and self-referential optimizer utilities. |
| `examples/` | Compact runnable task packages for circle packing, kissing number, Heilbronn triangle, and self-referential optimizer evolution. |
| `scripts/` | Local helper scripts for smoke tests and short example runs. |
| `tests/` | Offline tests for config handling, evaluator scaffolding, dynamic loading, and self-referential utilities. |
| `docs/` | Architecture notes and provenance documentation. |
| `LICENSES/` | Third-party license text for OpenEvolve-derived components. |
Runs write local artifacts under `outputs/` by default, including checkpoints, logs, token traces, and exported best programs. These runtime outputs are git-ignored so repeated experiments do not change the source tree.
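An illustrative layout for a single-task run follows. Only `checkpoints/` is confirmed by the paths used elsewhere in this README; the other names are hypothetical stand-ins for the artifact categories listed above:

```text
outputs/
└── circle_packing/
    ├── checkpoints/       # resumable snapshots (e.g. checkpoint_50)
    ├── logs/              # event logs              (illustrative name)
    ├── token_traces/      # token usage records     (illustrative name)
    └── best_program.py    # exported best program   (illustrative name)
```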
```bash
git clone https://github.com/scaling-group/escher-loop.git
cd escher-loop
python3 --version  # make sure this is Python 3.10+
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[examples]"
```

Set an OpenAI-compatible model provider before running model-backed evolution:

```bash
export OPENAI_API_KEY="your-api-key"
export OPENAI_API_BASE="https://api.openai.com/v1"
```

The configs read provider settings from environment variables, so any OpenAI-compatible endpoint can be used.
Run all deterministic evaluator smoke tests:

```bash
./scripts/smoke_test.sh
```

Run a short single-task evolution job:

```bash
./scripts/run_example.sh circle_packing
```

The helper defaults to 5 iterations so that the first run stays inexpensive. It exercises the standard path: LLM generation, program evaluation, archive update, event logging, token logging, and best-program export.
Run a baseline-style single-task job with a separate output tree:

```bash
ITERATIONS=10 ./scripts/run_baseline.sh kissing_number
```

`run_baseline.sh` is a thin wrapper around `run_example.sh` that writes to `outputs/baseline/<task>` by default. Use it when you want baseline-style single-task runs without mixing their artifacts with Escher-Loop runs.
Run self-referential optimizer evolution:

```bash
./scripts/run_self_referential.sh
```

This entry point evolves optimizer agents rather than task programs. It uses the three benchmark tasks as downstream evaluations and is intentionally implemented as a separate runner because it must pass a dynamic `sampler_provider` into the runtime. The runner supports checkpoint resume and dynamic mentor selection:

```bash
ITERATIONS=100 ./scripts/run_self_referential.sh
python examples/self_referential/run.py \
  --checkpoint outputs/self_referential/checkpoints/checkpoint_50 \
  --iterations 50
```

Equivalent direct CLI:
```bash
escher-loop-run \
  examples/circle_packing/initial_program.py \
  examples/circle_packing/evaluator.py \
  --config examples/circle_packing/config.yaml \
  --output outputs/circle_packing \
  --iterations 5
```

The example set contains three self-contained paper benchmark tasks and one optimizer-agent evolution entry point:
| Task | Candidate entry point | Evaluator metric |
|---|---|---|
| `circle_packing` | `run_packing()` | Sum of radii normalized by 2.635 |
| `kissing_number` | `kissing_number11()` | Valid point count normalized by 593 |
| `heilbronn_triangle` | `heilbronn_triangle11()` | Minimum triangle area normalized by 0.036529889880030156 |
| `self_referential` | `PromptSampler` optimizer agent | Relative benchmark result over the benchmark tasks |
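The task metrics in the table are simple ratios against fixed reference values, so scores are comparable across tasks. A minimal illustration (the reference constants come from the table; the raw inputs are made up):

```python
# Reference normalizers from the table above.
REFS = {
    "circle_packing": 2.635,                     # reference sum of radii
    "kissing_number": 593,                       # reference valid point count
    "heilbronn_triangle": 0.036529889880030156,  # reference minimum triangle area
}

def normalized(task: str, raw: float) -> float:
    # A candidate matching the reference value scores exactly 1.0.
    return raw / REFS[task]

score = normalized("kissing_number", 593)  # → 1.0
```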
Each paper benchmark task directory contains:
- `initial_program.py`: seed program for evolution.
- `evaluator.py`: deterministic validation and scoring.
- `config.yaml`: conservative OpenAI-compatible run configuration.
- `best_program.py`: representative task program artifact.
- `requirements.txt`: task-specific dependencies.
Available helper runs:

```bash
./scripts/run_example.sh circle_packing
./scripts/run_example.sh kissing_number
./scripts/run_example.sh heilbronn_triangle
./scripts/run_baseline.sh kissing_number
```

Override iteration count and output directory:

```bash
ITERATIONS=20 OUT_DIR=outputs/debug_cp ./scripts/run_example.sh circle_packing
```

Important fields in `examples/*/config.yaml`:
- `llm.models`: model names and ensemble weights.
- `llm.temperature`, `llm.max_tokens`, `llm.timeout`: generation behavior.
- `database.population_size`: candidate archive size.
- `database.enable_map_elites`: disabled in the compact example configs.
- `evaluator.timeout`: per-candidate evaluation limit.
- `prompt.system_message`: task-specific optimization instructions.
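A hypothetical `config.yaml` fragment wiring these fields together (all values are illustrative, not the shipped defaults):

```yaml
llm:
  models:
    - name: gpt-4o-mini   # illustrative model name
      weight: 1.0
  temperature: 0.7
  max_tokens: 4096
  timeout: 120
database:
  population_size: 64
  enable_map_elites: false   # disabled in the compact example configs
evaluator:
  timeout: 90
prompt:
  system_message: "Improve the packing program while keeping it valid."
```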
Useful environment variables:
- `OPENAI_API_KEY`: API key for the selected provider.
- `OPENAI_API_BASE`: OpenAI-compatible API base URL.
- `ESCHER_OUTPUT_DIR`: shared output directory for self-referential runs.
- `OPENAI_EMBEDDING_API_BASE`: explicit endpoint for non-default embedding models.
- `ENABLE_ARTIFACTS`: set to `false` to disable artifact capture.
Configuration export is secret-aware: `Config.to_yaml()` redacts API keys before writing YAML.
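The redaction behavior can be sketched as follows. This is a hedged approximation of the idea, not the actual `Config.to_yaml()` internals; `redact`, `SECRET_MARKERS`, and the placeholder string are all hypothetical:

```python
# Hypothetical sketch of secret-aware export: any field whose name looks
# like a credential is replaced with a placeholder before serialization.
SECRET_MARKERS = ("api_key", "token", "secret")

def redact(config: dict) -> dict:
    out = {}
    for key, value in config.items():
        if isinstance(value, dict):
            out[key] = redact(value)  # recurse into nested sections
        elif any(marker in key.lower() for marker in SECRET_MARKERS):
            out[key] = "REDACTED"
        else:
            out[key] = value
    return out

cfg = {"llm": {"api_key": "sk-live-123", "temperature": 0.7}}
safe = redact(cfg)  # safe["llm"]["api_key"] == "REDACTED"
```

Note the sketch returns a new dict rather than mutating in place, so the live configuration keeps its real credentials for subsequent API calls.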
Self-referential optimizer experiments may use the optional MP Fixer to repair invalid optimizer code. This feature is disabled by default. It was also disabled in the reported experiments because automatic repair introduces additional token consumption and would change the matched-budget comparison.
Enable it only when you explicitly want automatic repair:
```yaml
meta_evolution_settings:
  enable_mp_fixer: true
```

Original Escher-Loop code is released under the MIT License. See LICENSE.
This codebase includes OpenEvolve-derived components. The referenced OpenEvolve commit is Apache-2.0 licensed; attribution is recorded in NOTICE, and the Apache-2.0 license text is included under LICENSES/Apache-2.0.txt.
Some files also contain inline notices for code adapted from SakanaAI/ShinkaEvolve and google-deepmind/alphaevolve_results under Apache-2.0. Keep those notices intact when modifying the derived files.
```bibtex
@misc{liu2026escherloopmutualevolutionclosedloop,
      title={Escher-Loop: Mutual Evolution by Closed-Loop Self-Referential Optimization},
      author={Ziyang Liu and Xinyan Guo and Xuchen Wei and Han Hao and Liu Yang},
      year={2026},
      eprint={2604.23472},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.23472},
}
```

- Open a GitHub issue for code questions.
- Ziyang Liu: newzil1225@gmail.com

