cairoeth/coptimal

coptimal

Optimization agent orchestrator powered by Anthropic Managed Agents.

Point it at any optimization challenge, and it spins up Claude agents in the cloud to solve it autonomously - no Docker, no local compute, no babysitting. Close your laptop and come back to solutions.

coptimal achieves top scores on Optimization Arena challenges, in some cases reaching the theoretical maximum. The orchestrator's optimization loop keeps agents from getting stuck in local maxima. During initialization, frontier research is gathered via web search and embedded into the agent's context, which is critical for domains with scarce information such as AMM design and quantitative algorithms.

[Screenshot: coptimal sessions running in Claude Console]

[Demo GIF: coptimal in action]

Key Features

  • Multi-agent parallel search - run 5+ agents simultaneously to explore different strategy spaces, increasing the chance of finding global optima vs a single agent grinding one path
  • Fire and forget - agents run in Anthropic's cloud infrastructure with no local process needed. Your machine can sleep, lose power, or disconnect - agents keep working
  • Init once, run many - the expensive analysis step (web search + LLM) happens once. Every subsequent run reuses the blueprint and environment (packages cached), so spinning up new agents is fast and cheap
  • External service integration - agents can authenticate to GPU providers (Modal for H100s), external APIs, or other services via --secret-file, enabling challenges that require specialized hardware
  • Automatic solution persistence - agents save their best solutions to cloud storage after every improvement, not just at the end. Even if a session crashes or times out, the last best solution is retrievable
  • Cost-efficient iteration - agents use the pre-built toolset (bash, file ops, web search) with built-in prompt caching and context compaction managed by Anthropic, keeping token usage efficient over long runs
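
To illustrate why parallel search helps, here is a minimal sketch in which seeded hill climbers stand in for agents. Everything in it (the toy `score` function, `hill_climb`, `best_of`) is hypothetical and not part of coptimal; it only shows that taking the best of several independent searches can never do worse than a single one.

```python
import math
import random


def score(x: float) -> float:
    """Toy objective with several local maxima (illustrative only)."""
    return math.sin(3 * x) + 0.5 * math.cos(7 * x) - 0.05 * (x - 2) ** 2


def hill_climb(seed: int, steps: int = 500, step_size: float = 0.05) -> tuple[float, float]:
    """One 'agent': a greedy local search from a seed-dependent starting point."""
    rng = random.Random(seed)
    x = rng.uniform(-5.0, 5.0)
    best = score(x)
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        candidate_score = score(candidate)
        if candidate_score > best:  # accept only strict improvements
            x, best = candidate, candidate_score
    return best, x


def best_of(count: int) -> tuple[float, float]:
    """Run `count` independent searches (seeds 0..count-1) and keep the best."""
    return max(hill_climb(seed) for seed in range(count))


if __name__ == "__main__":
    print("single agent:", hill_climb(0))
    print("best of 5:   ", best_of(5))
```

Each climber can stall on whichever local maximum is nearest to its start, so different seeds end up on different peaks; the best-of-N result is the highest of them.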

The Problem

Optimization challenges (AMM fee strategies, CUDA kernels, trading algorithms) require hours of autonomous iteration: read code, implement, evaluate, improve, repeat. Running this locally means managing Docker containers, GPU access, and keeping your machine alive for hours.

coptimal eliminates all of that. It analyzes the challenge with Claude + web search, builds an agent blueprint, uploads the source to Anthropic's cloud, and launches agents that run independently in managed containers. Each agent gets 8GB RAM, pre-installed toolchains (Python, Rust, Node, Go), unrestricted networking, and bash/file/web tools - all managed by Anthropic.

Future Ideas

  • Agent payments via MPP - let agents autonomously purchase compute, API credits, and infrastructure access. An agent solving an attention kernel challenge could buy H100 time on Modal, or pay for premium dataset access, without human intervention
  • Outcome-based evaluation - use Anthropic's Outcomes API (research preview) to have a separate grader model evaluate solution quality against a rubric, triggering automatic revision cycles
  • Cross-session memory - use Anthropic's Memory Stores (research preview) to carry learnings across runs. An agent that discovered "volatility-responsive fees beat static fees" in run 1 would start run 2 already knowing that
  • Multi-agent coordination - use Anthropic's Multi-Agent Sessions (research preview) to have a coordinator agent delegate to specialized sub-agents (one for reading simulator code, one for implementing strategies, one for parameter tuning)
  • Auto-submission - integrate with challenge submission APIs to automatically submit solutions when they beat the current best score, closing the loop from "agent finds improvement" to "leaderboard updated"
  • Tournament mode - run agents with different models (Opus, Sonnet), different system prompts, or different parameter seeds, and automatically compare results to find the best configuration
  • Cost tracking and budgets - track per-run API costs (input/output tokens, cache hits) and enforce spend limits across runs

Quick Start

# Install
uv sync && uv tool install -e .
cp .env.example .env  # add your ANTHROPIC_API_KEY

# Init a challenge (analyzes repo, uploads source, creates environment)
coptimal init https://github.com/user/challenge --name my-challenge

# Launch 5 agents to solve it for 8 hours
coptimal run my-challenge --budget 8h --count 5

# Come back later and grab the best solution
coptimal download my-challenge -o ./solutions

How It Works

1. Init - Build the Blueprint

coptimal init https://github.com/user/challenge --name my-challenge

Analyzes the challenge repo with Claude + web search to understand the problem, scoring, and environment requirements. Produces:

  • System prompt with challenge details, evaluation commands, research findings, and an optimization loop strategy
  • Environment config with packages to pre-install (pip, apt, cargo, npm)
  • Source archive uploaded to Anthropic's Files API
  • Cloud environment created via the Environments API (reused across all runs)

Everything is saved to ~/.coptimal/workspaces/<name>/. Init once, run many times.
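
The environment config could be modeled roughly as below. This is a sketch under assumptions: the class name `EnvironmentConfig`, the `packages` key, and the JSON layout are hypothetical, and the real schema of environment.json may differ. Only the four package managers (pip, apt, cargo, npm) come from the description above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EnvironmentConfig:
    """Packages to pre-install in the cloud environment (hypothetical shape)."""

    pip: list[str] = field(default_factory=list)
    apt: list[str] = field(default_factory=list)
    cargo: list[str] = field(default_factory=list)
    npm: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # The "packages" wrapper key is an assumption for this sketch.
        return json.dumps({"packages": asdict(self)}, indent=2)


if __name__ == "__main__":
    config = EnvironmentConfig(pip=["numpy"], apt=["build-essential"])
    print(config.to_json())
```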

2. Run - Launch Agents

coptimal run my-challenge --budget 8h --count 5

Each run creates:

  • A managed agent with the optimization system prompt and full toolset (bash, file ops, web search)
  • A session with the source archive mounted at /workspace/

The agent extracts the source, reads the codebase, implements strategies, evaluates them, and iterates, all autonomously in the cloud. It tracks remaining time in bash by comparing the current epoch (date +%s) against the deadline epoch, and saves solutions to /mnt/session/outputs/ after every improvement.
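
The loop described above (evaluate, keep the best, persist after every improvement, stop at the deadline) can be sketched in Python. The real agent does this through its bash and file tools; the function name `optimize_until`, the file name `best_solution.json`, and the JSON layout are assumptions made for this sketch.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable, Iterable, Optional


def optimize_until(
    deadline_epoch: float,
    candidates: Iterable[Any],
    evaluate: Callable[[Any], float],
    outputs_dir: Path,
    now: Callable[[], float] = time.time,
) -> Optional[float]:
    """Evaluate candidates until the deadline, saving each new best immediately."""
    outputs_dir.mkdir(parents=True, exist_ok=True)
    best_score: Optional[float] = None
    for candidate in candidates:
        if now() >= deadline_epoch:  # same check as `date +%s` vs the deadline epoch
            break
        candidate_score = evaluate(candidate)
        if best_score is None or candidate_score > best_score:
            best_score = candidate_score
            # Persist right away so a crash or timeout still leaves the last best on disk.
            payload = {"score": candidate_score, "solution": candidate}
            (outputs_dir / "best_solution.json").write_text(json.dumps(payload))
    return best_score
```

In a real session `outputs_dir` would correspond to /mnt/session/outputs/; passing it as a parameter keeps the sketch runnable anywhere.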

Agents are fully self-contained. No local process needed. Close your terminal - they keep running.

3. Monitor and Download

coptimal status                              # overview of all workspaces
coptimal status my-challenge                 # runs, tokens, session IDs
coptimal watch my-challenge                  # stream live events
coptimal download my-challenge -o ./solutions  # grab solution files

CLI Reference

coptimal init

| Flag | Default | Description |
|---|---|---|
| SOURCE (required) | - | Git URL, local directory, zip file, or YAML spec |
| --name, -n | challenge slug | Workspace name |
| --output, -o | ~/.coptimal/workspaces/<name> | Workspace directory |
| --model, -m | claude-opus-4-6 | Model for agent runs |
| --analysis-model | claude-opus-4-6 | Model for the analysis step |
| --effort | high | Analysis effort (low, medium, high, max) |

coptimal run

| Flag | Default | Description |
|---|---|---|
| WORKSPACE | cwd | Workspace name or path |
| --budget, -b | 2h | Time budget (2h, 30m, 90s) |
| --model, -m | claude-opus-4-6 | Claude model for the agent |
| --count, -c | 1 | Number of parallel agents |
| --fast | off | Fast mode (Opus 4.6 only, 1.5x speed, 2x cost) |
| --secret-file | - | .env file with secrets mounted at /workspace/.env.secrets |
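
A budget string such as 2h, 30m, or 90s has to become a deadline epoch before the agent can compare it against date +%s. The following is a minimal sketch of that conversion; `parse_budget` and `deadline_epoch` are hypothetical names, and the actual CLI may accept other formats.

```python
import re
import time
from typing import Optional

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_budget(text: str) -> int:
    """Convert a budget like '2h', '30m', or '90s' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized budget: {text!r} (expected e.g. 2h, 30m, 90s)")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def deadline_epoch(budget: str, start: Optional[float] = None) -> int:
    """Epoch second at which a run started at `start` (default: now) should stop."""
    start = time.time() if start is None else start
    return int(start) + parse_budget(budget)


if __name__ == "__main__":
    print(parse_budget("8h"), deadline_epoch("8h"))
```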

coptimal watch

| Flag | Default | Description |
|---|---|---|
| WORKSPACE | cwd | Workspace name or path |
| --run | latest | Specific run ID to watch |

coptimal download

| Flag | Default | Description |
|---|---|---|
| WORKSPACE | cwd | Workspace name or path |
| --run | latest | Specific run ID |
| --output, -o | . | Directory to save files to |

coptimal status

Show all workspaces (no args) or a specific workspace with run details.

coptimal clean

| Flag | Default | Description |
|---|---|---|
| WORKSPACE | - | Workspace to clean |
| --archive | off | Archive cloud sessions and delete uploaded files |
| --delete | off | Delete workspace directory from disk |
| --all | off | Clean all registered workspaces |

Architecture

~/.coptimal/workspaces/<name>/
  .coptimal/
    config.yaml           # workspace config (model, source_file_id, environment_id)
    analysis.json         # raw LLM analysis results
    system_prompt.md      # agent system prompt (challenge + optimization loop)
    environment.json      # cloud environment packages config
    state.yaml            # run history (session IDs, status, tokens)
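
As a rough illustration of this layout, the sketch below resolves the blueprint file paths for a workspace and reports which ones are missing. The helper names `blueprint_paths` and `missing_files` are hypothetical and not part of coptimal; only the directory structure and file names come from the listing above.

```python
from pathlib import Path
from typing import Optional

BLUEPRINT_FILES = (
    "config.yaml",
    "analysis.json",
    "system_prompt.md",
    "environment.json",
    "state.yaml",
)


def blueprint_paths(name: str, root: Optional[Path] = None) -> dict[str, Path]:
    """Map each blueprint file name to its path under <root>/<name>/.coptimal/."""
    root = root if root is not None else Path.home() / ".coptimal" / "workspaces"
    base = root / name / ".coptimal"
    return {filename: base / filename for filename in BLUEPRINT_FILES}


def missing_files(name: str, root: Optional[Path] = None) -> list[str]:
    """List blueprint files that do not exist yet, e.g. before init has run."""
    return [f for f, path in blueprint_paths(name, root).items() if not path.exists()]


if __name__ == "__main__":
    print(missing_files("my-challenge"))
```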

Init Flow

Source (git/dir/zip)
  -> Clone & stage locally
  -> LLM analysis with web search (discovers eval commands, packages, research)
  -> Archive source as .tar.gz -> upload to Files API
  -> Create cloud environment (packages cached across runs)
  -> Save blueprint
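
The archiving step in this flow can be sketched with the standard library's tarfile module. The function name `archive_source` and the choice to skip the top-level .git directory are assumptions for this sketch; the upload to the Files API is left out.

```python
import tarfile
from pathlib import Path


def archive_source(source_dir: Path, archive_path: Path) -> Path:
    """Pack the staged challenge source into a .tar.gz, relative to source_dir."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for entry in sorted(source_dir.iterdir()):
            if entry.name == ".git":  # assumption: git history is not needed by the agent
                continue
            # arcname keeps paths relative, so extraction does not recreate local parents.
            tar.add(entry, arcname=entry.name)
    return archive_path
```

Write the archive outside `source_dir`; otherwise the partially written file would be picked up by the directory listing.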

Run Flow

Blueprint
  -> Create agent (model + system prompt + toolset)
  -> Create session (agent + environment + source file + optional secrets)
  -> Send initial message (extract workspace, set deadline, start optimizing)
  -> Agent runs autonomously in cloud container
  -> Saves best solutions to /mnt/session/outputs/ (retrievable via download)

What the Agent Gets

| Resource | Details |
|---|---|
| Container | Ubuntu 22.04, x86_64, 8GB RAM, 10GB disk |
| Runtimes | Python 3.12+, Node.js 20+, Rust 1.77+, Go 1.22+ |
| Tools | bash, read, write, edit, glob, grep, web_fetch, web_search |
| Networking | Unrestricted outbound (routed through Anthropic egress proxy) |
| Source | Challenge repo extracted at /workspace/ |
| Secrets | Optional .env file at /workspace/.env.secrets |
| Outputs | Persist at /mnt/session/outputs/ (downloadable via Files API) |
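
Inside the container, a script could read the mounted secrets file with a few lines of Python. This is a minimal sketch assuming simple KEY=VALUE lines; `parse_env` is a hypothetical helper, and the variable names in the example are placeholders rather than anything coptimal defines.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments, and malformed lines."""
    secrets: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        key = key.strip().removeprefix("export ").strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        secrets[key] = value
    return secrets


if __name__ == "__main__":
    example = 'export EXAMPLE_TOKEN_ID=abc\nEXAMPLE_TOKEN_SECRET="s3cret"\n'
    print(sorted(parse_env(example)))
```

In a session the text would come from /workspace/.env.secrets, for example via `Path("/workspace/.env.secrets").read_text()`.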

Project Structure

src/coptimal/
  cli.py                  # typer CLI (init, run, watch, status, download, clean)
  models.py               # data models (ChallengeSpec, WorkspaceConfig, RunRecord)
  state.py                # workspace state persistence (load/save/init)
  registry.py             # workspace name registry (~/.coptimal/workspaces.yaml)
  llm.py                  # Anthropic Messages API client (analysis with web search)
  agents/
    client.py             # Managed Agents API (create agent/env/session, stream events)
    prompt.py             # system prompt builder (optimization loop + challenge details)
  analysis/
    analyzer.py           # challenge analysis (collect files, call LLM, parse JSON)
    generate.py           # generate system prompt and environment config from analysis
    schema.py             # structured analysis output schema
  challenges/
    parser.py             # resolve challenge source (git URL, directory, zip, YAML)
  workspace/
    generator.py          # workspace creation, source archiving, Files API upload
