CodeEvolve

An open-source framework that combines large language models with evolutionary algorithms to discover and optimize high-performing code solutions.

CodeEvolve democratizes algorithmic discovery by making LLM-driven evolutionary search transparent, reproducible, and accessible. Whether you're tackling combinatorial optimization, discovering novel algorithms, or optimizing computational kernels, CodeEvolve provides a modular foundation for automated code synthesis guided by quantifiable metrics.

Why CodeEvolve?

State-of-the-art performance with transparency. CodeEvolve matches or exceeds the performance of closed-source systems like Google DeepMind's AlphaEvolve on established algorithm-discovery benchmarks, while remaining fully open and reproducible.

Cost-effective solutions. Open-weight models like Qwen often match or outperform expensive closed-source LLMs at a fraction of the compute cost, making cutting-edge algorithmic discovery accessible to researchers and practitioners with limited budgets.

Designed for real problems. CodeEvolve addresses meta-optimization tasks where you need to discover programs that solve complex optimization problems—from mathematical constructions to scientific discovery.

Key Features

Islands-based Genetic Algorithm

Multiple populations evolve independently and periodically exchange top performers, maintaining diversity while propagating successful solutions across the search space. This parallel architecture enables efficient exploration and scales naturally to concurrent evaluation.

Modular Evolutionary Operators

Inspiration-based Crossover: Contextual recombination that combines successful solution patterns while preserving semantic coherence. Parent solutions are presented to the LLM along with high-performing "inspiration" programs, allowing it to synthesize novel combinations.

Meta-prompting Exploration: Evolves the prompts themselves, enabling the LLM to reflect on and rewrite its own instructions for more diverse search trajectories. The system maintains a population of prompts that co-evolve with solutions.

Depth-based Exploitation: Targeted refinement mechanism that makes precise edits to promising solutions by maintaining conversation history. The LLM sees the full evolutionary lineage, enabling incremental improvements while preserving working components.

Quality-Diversity Optimization

MAP-Elites Integration: Optional quality-diversity archive that maintains behavioral diversity across the solution space. Supports both grid-based and CVT-based (Centroidal Voronoi Tessellation) feature maps.

Flexible LLM Integration

Ensemble Support: Mix and match multiple LLMs with weighted selection. Use different models for exploration vs. exploitation phases, or combine open-weight and proprietary models.

OpenAI-Compatible APIs: Works with any OpenAI-compatible endpoint including vLLM, Ollama, Together AI, and cloud providers.

How It Works

CodeEvolve operates as a distributed evolutionary algorithm where code itself is the evolving entity:

The Evolutionary Loop

Initialization: Start with an initial code template and system prompt that defines the task
Selection: Choose parent programs based on fitness
- Exploration mode: Random or uniform selection for broad search
- Exploitation mode: Tournament or roulette selection for refinement
Variation: Generate new candidates through LLM-driven operations
- Exploration: Broad modifications with meta-prompting
- Exploitation: Targeted improvements with conversation history and inspiration programs
Code Generation: LLM produces SEARCH/REPLACE diffs
- Only specified code blocks are modified (between markers)
- Preserves working code outside evolution zones
- Applies structured diffs rather than regenerating entire files
Evaluation: Execute in sandboxed environment
- Resource limits (time, memory)
- Capture metrics and errors
- Extract fitness from evaluation results
Migration: Periodically exchange top solutions between islands
- Maintains diversity while spreading innovations
- Prevents premature convergence
Archiving (optional): MAP-Elites maintains diverse solutions
- Preserves behavioral variety
- Enables multi-objective optimization

Exploration vs Exploitation

The system dynamically balances exploration and exploitation:

Phase	Selection	LLM Context	Operators
Exploration	Random/uniform	No history	Meta-prompting enabled, no inspirations
Exploitation	Tournament/fitness	Full lineage	Inspiration programs, deep conversation

The exploration rate is controlled by a scheduler (e.g., exponential decay) and can adapt based on fitness improvements.

Distributed Islands Architecture

┌─────────┐     ┌─────────┐     ┌─────────┐
│Island 0 │────▶│Island 1 │────▶│Island 2 │
│Pop: 20  │     │Pop: 20  │     │Pop: 20  │
└────┬────┘     └────┬────┘     └────┬────┘
     │               │               │
     └───────────────┴───────────────┘
        Periodic Migration (Ring Topology)
        
Each island maintains:
- Solution population
- Prompt population (if meta-prompting enabled)
- Local fitness rankings
- Migration history

Architecture

CodeEvolve operates through an iterative process at each epoch:

Population Management: Each island maintains populations of prompts and solutions, evaluated against user-defined fitness metrics
Evolutionary Operators: Generate new candidates through crossover, mutation, and meta-prompting
LLM Ensemble: Transforms operator instructions into executable code modifications via structured SEARCH/REPLACE diffs
Selection & Migration: Top performers are retained and periodically migrated between islands
Archive: MAP-Elites-based archive preserves behavioral diversity across the search

Execution feedback and fitness signals guide the entire loop, translating LLM proposals into testable, executable artifacts.

Core Components

Program Database (`database.py`)

Manages populations with genealogical tracking
Implements selection strategies: random, tournament, roulette, best
Optional MAP-Elites integration (Grid or CVT-based)
Automatic population size management

Evolution Engine (`evolution.py`)

Main evolutionary loop coordinating all components
Handles parent selection, variation, and evaluation
Manages exploration/exploitation scheduling
Checkpoint creation and restoration

Islands Coordinator (`islands.py`)

Distributed execution with multiprocessing
Synchronous or asynchronous migration
Shared global best tracking
Coordinated early stopping

LLM Interface (`lm.py`)

OpenAI-compatible API wrapper
Ensemble with weighted random selection
Automatic retry with exponential backoff
Embedding generation support

Prompt Sampler (`prompt/sampler.py`)

Builds conversation histories from program lineages
Incorporates inspiration programs for crossover
Meta-prompting for prompt evolution
Dynamic depth control

Evaluator (`evaluator.py`)

Sandboxed program execution
Memory and timeout monitoring
Process tree management
Metrics extraction from JSON results

Performance Highlights

CodeEvolve demonstrates superior performance on several benchmarks previously used to assess AlphaEvolve:

Competitive or better results across diverse algorithm-discovery tasks including autocorrelation inequalities, packing problems, and Heilbronn problems
Open-weight models (e.g., Qwen) matching closed-source performance at significantly lower cost
Extensive ablations quantifying each component's contribution to search efficiency

For comprehensive evaluation details and specific results, see our technical report.

Quick Start

Installation

Clone this repository and create the conda environment:

git clone https://github.com/inter-co/science-codeevolve.git
cd science-codeevolve
conda env create -f environment.yml
conda activate codeevolve

Basic Usage

Configure your LLM provider by setting environment variables:

export API_KEY=your_api_key_here
export API_BASE=your_api_base_url

Run CodeEvolve via the command line:

codeevolve \
  --inpt_dir=INPT_DIR \
  --cfg_path=CFG_PATH \
  --out_dir=RESULTS_DIR \
  --load_ckpt=LOAD_CKPT \
  --terminal_logging

Arguments:

--inpt_dir: Directory containing initial solution and evaluation script
--cfg_path: Path to YAML configuration file (required for new runs)
--out_dir: Directory where results will be saved
--load_ckpt: Checkpoint to load (0 for new run, -1 for latest, or specific epoch)
--terminal_logging: Enable live progress display (optional)

The scripts/run.sh provides a bash script for running CodeEvolve with taskset to limit CPU usage. See src/codeevolve/cli.py for further details.

Minimal Example

Here's a complete minimal example for optimizing a simple function:

1. Create problem directory:

mkdir -p my_problem/input

2. Create initial solution (my_problem/input/solution.py):

# EVOLVE-BLOCK-START
def objective(x):
    """Function to maximize."""
    return -(x - 3)**2 + 10
# EVOLVE-BLOCK-END

if __name__ == '__main__':
    x = 0.0  # Initial guess
    result = objective(x)
    print(f"Result: {result}")

3. Create evaluator (my_problem/input/evaluate.py):

import sys
import json
import subprocess

def evaluate(code_path, results_path):
    # Run the solution
    result = subprocess.run(
        [sys.executable, code_path],
        capture_output=True,
        text=True,
        timeout=10
    )
    
    # Extract fitness from output
    fitness = float(result.stdout.split(':')[1].strip())
    
    # Save results
    with open(results_path, 'w') as f:
        json.dump({'fitness': fitness}, f)

if __name__ == '__main__':
    evaluate(sys.argv[1], sys.argv[2])

4. Create config (my_problem/config.yaml):

SEED: 42

CODEBASE_PATH: "."
EVAL_FILE_NAME: "evaluate.py"

INIT_FILE_DATA:
  filename: "solution.py"
  language: "python"

SYS_MSG: |
  # PROMPT-BLOCK-START
  You are an expert optimization algorithm designer. Your goal is to modify
  the given code to maximize the objective function.
  # PROMPT-BLOCK-END

ENSEMBLE:
  - model_name: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"
    temp: 0.8
    weight: 1

SAMPLER_AUX_LM:
  model_name: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"
  temp: 0.7

EVOLVE_CONFIG:
  num_epochs: 50
  num_islands: 2
  migration_topology: "directed_ring"
  migration_interval: 10
  
  selection_policy: "tournament"
  selection_kwargs:
    tournament_size: 3
  
  num_inspirations: 2
  exploration_rate: 0.3
  
  fitness_key: "fitness"
  ckpt: 10
  early_stopping_rounds: 20

5. Run:

codeevolve \
  --inpt_dir=my_problem/input \
  --cfg_path=my_problem/config.yaml \
  --out_dir=my_problem/results \
  --terminal_logging

Customizing for Your Problem

CodeEvolve is designed for algorithmic problems with quantifiable metrics. To apply it to your domain:

Define your evaluation function that measures solution quality
- Must accept: code file path, results file path
- Must output: JSON with at least one metric field
Specify the initial codebase or problem structure
- Mark evolution zones with # EVOLVE-BLOCK-START and # EVOLVE-BLOCK-END
- Code outside these blocks is preserved
Configure evolutionary parameters
- Population size, mutation rates, selection policy
- Exploration/exploitation balance
- Migration topology and frequency
Choose your LLM ensemble composition
- Single model or multiple models
- Separate ensembles for exploration vs exploitation
- Weight distribution across models

See problems/problem_template for a general template. Comprehensive tutorials and example notebooks will be released soon.

Use Cases

The framework is suitable for any domain where solutions can be represented as code and evaluated programmatically:

Mathematical Discovery

Finding solutions to open problems in mathematics
Discovering new inequalities or bounds
Constructing optimal geometric configurations

Algorithm Design

Optimizing computational kernels and scheduling algorithms
Discovering novel heuristics for NP-hard problems
Automated algorithm configuration

Scientific Discovery

Exploring hypothesis spaces expressed as executable code
Parameter optimization for scientific models
Automated experimental design

Software Optimization

Performance tuning of critical code paths
Automatic parallelization strategies
Resource allocation optimization

Reproducing Research Results

For complete experimental configurations, benchmark implementations, and step-by-step examples demonstrating how to run CodeEvolve on various problems, visit our experiments repository:

github.com/inter-co/science-codeevolve-experiments

This companion repository contains all code necessary to reproduce the results from our technical report, including:

All benchmark problem implementations
Experimental configurations for each problem
Raw results and checkpoints from paper runs
Analysis notebooks with visualizations
Statistical comparisons with AlphaEvolve

Documentation

Configuration Reference

Key configuration parameters in your YAML file:

EVOLVE_CONFIG:
  # Basic settings
  num_epochs: 100              # Total iterations
  num_islands: 4               # Parallel populations
  init_pop: 20                 # Initial population per island
  max_size: 50                 # Max population size (null = unlimited)
  
  # Selection
  selection_policy: "tournament"  # or "roulette", "random", "best"
  selection_kwargs:
    tournament_size: 3
  
  # Exploration/Exploitation
  exploration_rate: 0.3        # Probability of exploration
  use_scheduler: true          # Use adaptive scheduling
  type: "ExponentialDecayScheduler"
  scheduler_kwargs:
    decay_rate: 0.995
  
  # Operators
  num_inspirations: 3          # Inspiration programs for crossover
  meta_prompting: true         # Enable prompt evolution
  max_chat_depth: 5            # Conversation history depth
  
  # Migration
  migration_topology: "directed_ring"  # Island topology
  migration_interval: 20       # Epochs between migrations
  migration_rate: 0.1          # Fraction to migrate
  
  # Quality-Diversity (optional)
  use_map_elites: false        # Enable MAP-Elites
  
  # Checkpointing
  ckpt: 10                     # Checkpoint frequency
  early_stopping_rounds: 50    # Stop after N epochs without improvement
  
  # Fitness
  fitness_key: "fitness"       # Metric name from evaluation JSON

Command-Line Reference

codeevolve [OPTIONS]

Required:
  --inpt_dir PATH          Input directory with solution and evaluator
  --out_dir PATH           Output directory for results

Optional:
  --cfg_path PATH          Config file (required for new runs)
  --load_ckpt INT          Checkpoint: 0=new, -1=latest, N=epoch N
  --terminal_logging       Show live progress in terminal

Output Structure

output_directory/
├── config.yaml              # Copy of configuration used
├── 0/                       # Island 0 results
│   ├── results.log          # Detailed execution log
│   ├── best_sol.py          # Best solution found
│   ├── best_prompt.txt      # Best prompt evolved
│   └── ckpt/                # Checkpoints
│       ├── ckpt_10.pkl
│       ├── ckpt_20.pkl
│       └── ...
├── 1/                       # Island 1 results
│   └── ...
└── ...

Releases

We organize the different versions of CodeEvolve as releases in both this repository and its companion experiments repository. Currently, we have the following releases:

v0.1.0: Initial version of CodeEvolve, corresponds to v1 of technical report
v0.2.0 / v0.2.1: Most recent release, corresponds to v3 of technical report with minor bug fixes

Contributing

We welcome contributions from the community! Here's how to get involved:

Start with an issue: Browse existing issues or create a new one describing your proposed change
Submit a pull request:
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Reference the issue in your PR description
Keep PRs focused: Avoid massive changes—smaller, well-tested contributions are easier to review
Maintain quality: Ensure code is tested, documented, and follows existing style

Please refer to CONTRIBUTING.md for detailed guidelines.

Areas for Contribution

New selection policies or evolutionary operators
Additional LLM providers and integrations
Benchmark problems from your domain
Documentation improvements and tutorials
Performance optimizations
Bug fixes and test coverage

Citation

If you use CodeEvolve in your research, please cite our paper:

@article{assumpção2025codeevolveopensourceevolutionary,
      title={CodeEvolve: An open source evolutionary coding agent for algorithm discovery and optimization},
      author={Henrique Assumpção and Diego Ferreira and Leandro Campos and Fabricio Murai},
      year={2025},
      eprint={2510.14150},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2510.14150},
}

Acknowledgements

The authors thank Bruno Grossi for his continuous support during the development of this project. We thank Fernando Augusto and Tiago Machado for useful conversations about possible applications of CodeEvolve. We also thank the OpenEvolve community for their inspiration and discussion about evolutionary coding agents.

License and Disclaimer

All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0.

This is not an official Inter product.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
assets		assets
problems		problems
scripts		scripts
src/codeevolve		src/codeevolve
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

CodeEvolve

Table of Contents

Why CodeEvolve?

Key Features

Islands-based Genetic Algorithm

Modular Evolutionary Operators

Quality-Diversity Optimization

Flexible LLM Integration

How It Works

The Evolutionary Loop

Exploration vs Exploitation

Distributed Islands Architecture

Architecture

Core Components

Program Database (database.py)

Evolution Engine (evolution.py)

Islands Coordinator (islands.py)

LLM Interface (lm.py)

Prompt Sampler (prompt/sampler.py)

Evaluator (evaluator.py)

Performance Highlights

Quick Start

Installation

Basic Usage

Minimal Example

Customizing for Your Problem

Use Cases

Mathematical Discovery

Algorithm Design

Scientific Discovery

Software Optimization

Reproducing Research Results

Documentation

Configuration Reference

Command-Line Reference

Output Structure

Releases

Contributing

Areas for Contribution

Citation

Acknowledgements

License and Disclaimer

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Program Database (`database.py`)

Evolution Engine (`evolution.py`)

Islands Coordinator (`islands.py`)

LLM Interface (`lm.py`)

Prompt Sampler (`prompt/sampler.py`)

Evaluator (`evaluator.py`)

Packages